Elon Musk's Grok AI chatbot gets new vision and multilingual capabilities

Indian Express, 23-04-2025

Elon Musk's Grok AI has introduced a new vision feature, endowing the chatbot with the ability to 'see' the world in real-time using a smartphone's camera. This is similar to OpenAI's ChatGPT and Google Gemini, which can analyse images and visuals in real-time.
On Tuesday, April 22, xAI announced Grok Vision, the feature that allows users to point their smartphone cameras at objects, signs, documents, etc., and ask questions about them. The new feature can be accessed using the Grok app for iOS. However, the feature is yet to be introduced in Grok's Android app.
xAI also introduced capabilities such as multilingual audio and real-time search in Grok's voice mode. The feature is available to those subscribed to the SuperGrok plan.
Grok's memory feature
Last week, a memory feature was added to Grok 3 that lets it remember past conversations with users and come up with more personalised responses. Put simply, if a user mentions their health routine, Grok can later suggest a diet plan tailored to their past habits.
xAI claimed Grok's memory feature differs from those of other chatbots because its memories are 'transparent': users can see exactly what Grok knows and choose what it should forget. xAI reportedly plans to introduce a 'forget' button for Grok users on Android that will let them exclude specific chats from its memory.
Earlier this month, Grok gained a Canvas-like feature for editing and creating documents as well as developing basic AI apps in Grok Studio. The features are available for free and paying users on Grok.com. 'Grok can now generate documents, code, reports, and browser games. Grok Studio will open your content in a separate window, allowing both you and Grok to collaborate on the content together,' the company said on its official X handle.

Rednote joins wave of Chinese firms releasing open-source AI models

Mint

28 minutes ago

BEIJING, June 9 (Reuters) - China's Rednote, one of the country's most popular social media platforms, has released an open-source large language model, joining a wave of Chinese tech firms making their artificial intelligence models freely available.
The approach contrasts with many U.S. tech giants like OpenAI and Google, which have kept their most advanced models proprietary, though some American firms including Meta have also released open-source models.
Open sourcing allows Chinese companies to demonstrate their technological capabilities, build developer communities and spread influence globally at a time when the U.S. has sought to stymie China's tech progress with export restrictions on advanced semiconductors.
Rednote's model is available for download on developer platform Hugging Face. A company technical paper describing it was uploaded on Friday. In coding tasks, the model performs comparably to Alibaba's Qwen 2.5 series, though it trails more advanced models such as DeepSeek-V3, the technical paper said.
Rednote, also known by its Chinese name Xiaohongshu, is an Instagram-like platform where users share photos, videos, text posts and live streams. The platform gained international attention earlier this year when some U.S. users flocked to the app amid concerns over a potential TikTok ban.
The company has invested in large language model development since 2023, not long after OpenAI's release of ChatGPT in late 2022. It has accelerated its AI efforts in recent months, launching Diandian, an AI-powered search application that helps users find content on Xiaohongshu's main platform.
Other companies that are pursuing an open-source approach include Alibaba, which launched Qwen 3, an upgraded version of its model, in April.
Earlier this year, startup DeepSeek released its low-cost R1 model as open-source software, shaking up the global AI industry due to its competitive performance despite being developed at a fraction of the cost of Western rivals.

Sam Altman's eye-scanning orb lands in UK to fight Deepfakes and reward you with Crypto

India Today

33 minutes ago

OpenAI CEO Sam Altman's ambitious eye-scanning project has now landed in the United Kingdom. The venture, known as World (formerly Worldcoin), uses a shiny, spherical device called the Orb to scan people's irises and generate a unique digital ID to prove they are human – not a bot or deepfake.
Starting this week, people in London will begin to spot the Orb in some high street shops and malls. In the coming months, the rollout will expand to Manchester, Birmingham, Cardiff, Belfast, and Glasgow, according to a report by Bloomberg. Reportedly, the company also plans to work with some retailers in the country to install self-serve Orbs, similar to how ATMs work independently.
The system works like this: you stare into the Orb, which photographs your iris and face, then generates a unique 'World ID' that proves your 'humanness.' This World ID can be used to log into apps such as Telegram, Minecraft, Reddit, and Discord without revealing personal data. As a bonus, users are rewarded with a few units of the project's cryptocurrency, Worldcoin (WLD).
World is operated by Tools for Humanity, a startup Altman co-founded in 2019. The company says its mission is to address a growing issue: how do we know who's real online, especially as artificial intelligence – including Altman's own ChatGPT – makes it easier to fake people's identities?
In an interview with CNBC, Adrian Ludwig, who is the chief architect at Tools for Humanity, said the project is moving from a 'science project to a real network,' with growing demand from both governments and companies. 'The idea is no longer just theoretical. It's something that's real and affecting them every single day,' Ludwig said.
World's approach has raised red flags around the globe. Privacy regulators in Germany, Argentina, and Kenya have launched investigations.
Spain and Hong Kong have banned the project outright, and South Korea recently fined the company over $800,000 for privacy violations.
Tools for Humanity insists it doesn't store personal or biometric data. It says the iris scans are converted into an encrypted code, and that the original images are deleted immediately. The user's World ID is then stored locally on their phone. Ludwig argues this makes the system safer and more private than others. (For instance, Aadhaar is a similar concept, with people's biometric data stored against a unique ID for each individual; it has faced multiple data breaches over the years.)
Currently, there are reportedly around 1,500 Orbs in circulation globally, but the company apparently aims to ship 12,000 more over the next year. The UK rollout follows the Orb's earlier arrival in six US cities, including San Francisco, Miami, and Atlanta, in May 2025, where users lined up to exchange their iris scans for crypto and a digital identity.

A 60-member team of mostly IIT grads plans to become the Adobe of India — with a little help from Apple

India Today

34 minutes ago

Photo and video editing aren't new tricks, even if some of the tools that make these things possible today have changed remarkably since the early days. And they seem to be constantly evolving at lightning-fast speed in the age of large language models (LLMs) and generative AI. At a time when OpenAI, Google, and virtually every other big tech company have started making these tools available to users at the flick of a button, one wonders what will happen to dedicated photo and editing apps, particularly from smaller players. Are their days numbered? Amid all the risk and uncertainty, a quietly ambitious company from Noida is charting a course to redefine editing, with its sights set on becoming the 'Adobe of the subcontinent.'
At the helm of this uphill task is Sharad Shankar, founder and CEO of AndOr Communications Pvt. Ltd., whose flagship product, LightX, is rapidly gaining traction in India and abroad. It is not only challenging the status quo but also inspiring a generation of budding developers to dream big and work hard. We sat down with Shankar to delve into the company's journey, its bold aspirations, and its deep integration with Apple's ecosystem, which is helping LightX stand out in the burgeoning field of AI-driven content creation.
AndOr Communications was founded in December 2015, a pivotal time in the mobile technology space. 'At that point of time, we realised that iPhones were getting really powerful in terms of processing power,' Shankar says. This foresight formed the basis for LightX. 'We realised early on that content creation is something which is going to be used widely in the coming years, and we thought of building a solution which could do the same task that Photoshop can do on a desktop.'
LightX is a comprehensive suite of photo and video editing tools designed for the mobile-first creator.
'You can do all kinds of photo and video editing tasks like cutting, cropping, trimming, cleaning up, removing backgrounds, take your selfie and apply different kinds of makeup, change the expression of your face and things like that,' Shankar says.
Beyond traditional editing, LightX provides a vast array of templates for designing graphics like cards and posters. More recently, the company has pivoted significantly towards artificial intelligence. It claims to offer unprecedented control over images, letting users change hairstyles and outfits in portrait shots, turn them into caricatures, and replace specific items within them, all with prompt input.
'With AI, you can generate images really fast and speed up the creative process,' Shankar says.
The brains behind LightX
Shankar is an IIT Kanpur alumnus. 'Initially, we were providing and creating a complete mobile ecosystem for print systems. We also worked with a company based out of California to develop their complete mobile engineering system for a customer CRM based program on mobile.' One thing led to another.
The talent pool at AndOr Communications is notable. 'Most of my core 60-member team comes from IIT,' he says proudly, highlighting the strong academic foundation of his engineering team.
The initial foray of LightX into the market was met with significant success. 'We rolled out the first version of LightX in May 2016. It was a paid app priced differently in different territories, from USD 1 to USD 5,' Shankar recalls. 'We sold nearly 1.2 million copies at the initial launch of LightX with the global rating of 4.7 out of five and we were one of the top paid apps globally in 2016 and 2017 in the US and most of the European countries.'
Recognising the evolving app economy, the company strategically shifted its monetisation model. 'A one-time purchase model was not feasible for us in the long term, so we moved to a subscription-based model where you pay USD 4 a month or USD 48 a year.
It has been downloaded 7.55 million times globally since then, and we still maintain a 4.7 out of 5 rating out of 138K ratings that we have received.' This transition has ensured sustained growth and engagement in the competitive app market.
Help from Apple
LightX's success is intricately linked with its deep integration into the Apple ecosystem. 'Apple has been helping us a lot by featuring us during different events. It gives great visibility to our software and the things that this software can do,' Shankar says.
Beyond visibility, Apple's technologies have been instrumental in LightX's performance. 'We have been integrating with the technologies that Apple offers on the device, especially around speeding up your workflow, because most people want to see results instantly. We have been integrating the Metal framework, and we have also been writing our own shaders on top of that.' The Metal framework, specifically, allows LightX to achieve high-performance rendering and seamless user experiences, critical for real-time image and video editing.
Machine learning models that run locally on the device itself enhance speed while addressing privacy concerns. 'Apple has a very good ecosystem for deploying machine learning models on devices themselves. So, whatever machine learning models that we train can be easily deployed on iPhones using Core ML models, and we use their Vision frameworks for computer vision-related tasks like detecting faces and poses.'
LightX also leverages the App Store's promotional tools effectively. 'We have been using all the features that the Apple App Store provides, particularly the App Store events. We also have different custom product pages for different marketing channels, which helps in better discovery of the product.'
This strategic use of App Store features maximises their reach among users. Shankar mentions that most of the downloads they get come from organic channels through keyword search, and 'it works in a way where we get a good number of users from all countries, otherwise it's a very expensive process to get downloads globally.'
The future of editing
Shankar details several ambitious plans for LightX's AI capabilities, particularly in semantic and generative editing. 'We are working towards enhancing the experience of editing faces where, if you have a group of five people and one is looking in another direction, you can adjust and make this person look in the correct direction. Similarly, you can apply different touch-up options.'
Moreover, the power of natural language prompts is set to revolutionise user interactions. 'We already have prompts on iOS which give you the power to do generative editing. We are working to build even more advanced semantic photo editing where you can just say you want to change the sky to blue, or you want to move a person backwards, correct their face, make them smile - you can do all this by typing a prompt and it will happen. It is more intuitive than finding the right settings manually.'
This extends to graphic design as well. 'Similarly, we are working to build designs using prompts. Suppose you have a daughter named Sarah who is 10 years old, and she loves hiking. So, you can say: create a birthday card for Sarah, she is into hiking and is ten years old, so it will automatically generate the text and theme for the birthday card. Ultimately, we are trying to make LightX a one-stop solution for all kinds of creative needs.' The idea, if you read between the lines, is to democratise sophisticated editing tools.
'We use open-source models which are based on Stable Diffusion because it takes a lot of effort in training foundation models. So right now, we are taking the foundation models as they are.
We retrain the system, or you can say, distill the system for our specific needs,' Shankar says.
But he acknowledges the challenges. 'We understand that open-source models only work to a certain degree and proprietary models from say, OpenAI, are more advanced, so it becomes difficult at times to compete with those players. We are in the process of training our own models which can give us more native and India-specific content.'
The ambition extends to adding multilingual support. 'Eventually, we plan to integrate with Siri, and make the models understand and respond to queries in Hindi, Tamil, and Telugu to create graphic designs. We are working towards more optimisation of those channels, which can give a more native experience to the users in India. We have started the training process for different languages in which images can be generated as well. Obviously, in text there are lots of models from OpenAI and others. But we are working more towards image generation based on your command prompts.'
Apple's commitment to developers is a significant advantage for companies like LightX. 'Apple already provides SDKs around voice that reduce the cycle time of developing things. We use their SDKs, and they also give us prompts. We have our own LLMs at the back end which understand these prompts and then create the image. Apple helps us in terms of rolling out the features that developers can use rather than building everything from scratch. It's difficult to build each piece of the puzzle. So, they are providing us a very good intelligence system which we use to derive our own systems.' This symbiotic relationship with Apple allows LightX to focus on its core expertise in imaging while leveraging the iPhone maker's world-renowned ecosystem.
Privacy and battling Deepfakes
In an era of increasing concerns about data privacy and the misuse of AI, Shankar outlines LightX's robust policies.
'We don't use user images for training purposes.'
He elaborates on their image retention and content filtering measures. 'We delete all the images within 24 hours of acceptance and if a user wants to keep any generated photo, then only they can keep it. Otherwise, we automatically delete all kinds of user input images. Other than that, we have a strict check on NSFW content. Not only do we do content filtering, once the image is generated, we analyse it further if it contains any kind of nudity. If we detect anything inappropriate, then we reject it promptly.'
Compliance with global regulations is also a priority. 'Also, we do setups in GDPR. That is for Europe because they want their data not to be transferred to any other country so that there is no kind of data violation.'
Competing with giants
The rise of large language models and generative AI from tech giants like OpenAI poses a significant challenge for smaller players. Shankar accepts this reality but says LightX can still make it because of its unique advantage.
'It's a concern, but we can still make it work. OpenAI will take INR 5 or INR 8 for photo editing, but players like us who are doing their own technology, building their own stack, are more specialised, and we do it in just 20 or 30 paise. So, we have a huge window of cost opportunity where we can innovate,' he says. This cost advantage is what gives LightX the bandwidth 'to do more innovation' that is also highly customised, and to not just stand out but 'definitely win' against bigger companies.
'It's challenging, but eventually, I think people who can sustain through this and can keep innovating will come out as survivors. So, we constantly focus on building our own tech stack and expect it will keep us alive for a long time.'