Indian AI startup launches Sarvam-M model: What is it, why is everyone talking about it
India Today

26-05-2025

India's homegrown AI startup Sarvam has launched its newest language model, Sarvam-M, which is making waves in the tech community for both good and not-so-good reasons. The model is being praised for its focus on Indian languages, maths, and programming tasks, but it is also facing criticism for not being 'good enough.' The drama surrounding the AI company has sparked even more interest. If you have questions too, here's a breakdown of what the Sarvam-M model is, why it matters, and why the AI company is facing a backlash.

What exactly is Sarvam-M?

Sarvam-M is a large language model, or LLM, developed by Indian startup Sarvam AI. These types of models are trained to understand and generate human-like text, and they power tools like chatbots, translation software, and educational apps. Sarvam-M is built on top of Mistral Small, an open-weight model, and has 24 billion parameters, which are basically the knobs and dials that help it process language and learn from data.

In simple terms, Sarvam-M is like a very smart AI assistant that can handle a wide range of tasks, from answering complex maths questions to understanding and responding in Indian languages like Hindi, Bengali, Gujarati, and more. What makes it different is that it has been built with India in mind, supporting 10 local languages and offering strong performance in tasks involving both language and reasoning.

How was it built?

Sarvam-M was trained using a three-step process:

  • Supervised Fine-Tuning (SFT): This stage involved feeding the model high-quality questions and answers to help it learn. The team made sure the responses were relevant, less biased, and culturally appropriate. This helped the model get good at both everyday conversations and more complex problem-solving.
  • Reinforcement Learning with Verifiable Rewards (RLVR): In this step, Sarvam-M was further improved using data related to instructions, programming, and mathematics. It was taught to follow instructions better and think more logically using feedback loops and carefully designed rewards.
  • Inference Optimisation: This final stage involved making the model run faster and more efficiently. Techniques like FP8 quantisation (storing the model's numbers at lower precision with little loss of accuracy) and better decoding methods helped improve the model's speed and performance, though there were still some issues with handling high concurrency.

What can Sarvam-M do?

The model has been built to power various real-world applications. It can be used for:

  • Conversational AI, meaning it can power chatbots and virtual assistants.
  • Machine translation, where it could be used to translate between English and Indian languages.
  • Education, given its ability to solve maths problems, answer science questions, or even help students prepare for competitive exams like JEE.

In fact, one of the Sarvam team members shared results showing that Sarvam-M's "Think" mode correctly answered several JEE Advanced-level questions in Hindi, which could be a major step in making such tools useful for Indian students.

How does it compare with other models?

Sarvam-M has shown impressive results in certain areas. In a test that combined maths with romanised Indian languages, the model achieved over 86 per cent improvement, beating out some other well-known models. It performed better than Meta's Llama-4 Scout on many benchmarks and was on par with much larger models like Llama-3.3 70B and Google's Gemma 3.

However, it did slightly underperform in English knowledge tests, with about 1 per cent lower accuracy compared to others. Still, the model stands out for its Indian language skills and reasoning.

So why the backlash?

Despite all the technical achievements, the model didn't get the warm welcome one might expect. On Hugging Face, a platform where developers can download and test AI models, Sarvam-M was downloaded only 334 times in the first two days. Some critics saw this as a sign of weak interest.

Deedy Das, an investor at Menlo Ventures, called the response 'embarrassing,' saying there's little interest in this kind of work. He compared it to a different model created by two Korean college students, which got nearly 200,000 downloads quickly.

This sparked a debate. Supporters of Sarvam-M, including Aashay Sachdeva from the company, defended the model, highlighting its benchmark results and customisation process. He even posted proof of the model's performance on social media. Another user, who works at AI4Bharat, added that the real achievement was not just the model, but the method used to train it. He said it sets a strong foundation for other Indian developers to build on.

Meanwhile, Sarvam's co-founder, Vivek Raghavan, called Sarvam-M a 'stepping stone' toward building India's own AI systems. The company is one of the few chosen under the Indian government's IndiaAI Mission to develop a sovereign LLM for the country.

Zoho founder Sridhar Vembu also urged people not to focus only on instant success. He said that most products take time to find their place, and praised Sarvam for its efforts. 'Keep fighting the good fight,' he encouraged.
