Latest news with #Sarvam-M

Under fire, Sarvam AI co-founder says worries about Indic GenAI model premature

Time of India

a day ago

Chennai/Bengaluru: India's latest home-grown generative AI language model, Sarvam-M, has drawn fire from sections of the developer community for what they describe as "underwhelming" performance. But Pratyush Kumar, co-founder of Sarvam AI, insists the scepticism is premature and betrays a misunderstanding of how frontier AI models mature.

"The ecosystem is early and people are worried too early," he tells TOI. "We are scrambling amongst ourselves when the world moves fast. We want to create an AI ecosystem where more people can positively collaborate."

Released this month, the relatively small 24-billion-parameter Sarvam-M model was trained to reason across ten Indian languages while tackling maths and coding tasks. Kumar says benchmarks on Hugging Face (a platform for hosting and sharing machine-learning models) show the model matching or outscoring popular open-source rivals (like Meta's Llama, Mistral Small and Gemma 3) in mathematics, programming and Indic-language comprehension.

"With this we want to show that we cracked post-training (the process of refining and optimising a machine learning model after its initial training phase) problems and our methodology is comparable with other models," he explains. "We open-sourced this because we want to show that such a model can be built and encourage other people to do it."

Much of the social-media push-back has centred on relatively modest early-stage download numbers and the perception that Sarvam-M offers few breakthrough capabilities. Kumar counters that India's sovereign-AI ambitions demand more than one blockbuster release. "These things involve both scientific explorations and resource consumption," he says. "I think we are on the path to building state-of-the-art models."

Sarvam AI is the first startup chosen to build a frontier model under the government's IndiaAI Mission, which is funding compute, data and research partnerships to reduce reliance on overseas platforms. Although the latest model is a private effort separate from the IndiaAI Mission, Kumar says the initiative will benefit everyone. He declines to give a timeline for the AI Mission-backed foundation model, noting that the company has yet to receive graphics-processing units (GPUs) from government suppliers. "We will open-source the foundational model," he says, but warns that schedules depend on hardware access and collaborative research cycles.

Industry weighs in

Seasoned AI practitioners say early criticism overlooks the scale of what Sarvam is attempting. "Building a 24-billion-parameter model in India is not easy, especially when deep research isn't encouraged in most universities or companies," says Jaspreet Bindra, co-founder of consultancy AI&Beyond. "Sarvam-M demonstrates robust multilingual reasoning by supporting ten Indian languages; no other model in the world has such a strong Indic component."

Sourabh Deorah, CEO & co-founder of an AI-powered employee engagement and rewards platform, says that as someone deeply involved in machine learning, he understands how challenging it is to create a 24-billion-parameter model that not only handles reasoning tasks like maths and programming but also delivers high-quality performance across multiple Indian languages, many of which have long been underserved in the AI space.

Piyush Goel, CEO & founder of IT consulting company Beyond Key, says that the new model's potential to drive agentic AI in education, healthcare, and automation is exciting. Agentic AI is a type of AI that makes decisions and takes actions based on context and objectives without constant human intervention.

Karthikeyan G, senior director of engineering architecture at software company Ascendion, says Sarvam-M's architecture will enable AI agents to interact among themselves (to take complex decisions) thanks to the standardised protocols being used. This will be crucial for the next stage of the AI wave.

Indian AI startup launches Sarvam-M model: What is it, why is everyone talking about it

India Today

26-05-2025

India's homegrown AI startup Sarvam has launched its newest language model, Sarvam-M, which is making waves in the tech community for both good and not-so-good reasons. The model is being praised for its focus on Indian languages, maths, and programming tasks, but it is also facing criticism for not being 'good enough.' The drama surrounding the AI company has sparked even more interest. If you have questions too, here's a breakdown of what the Sarvam-M model is, why it matters, and why the AI company is facing criticism.

What exactly is Sarvam-M?

Sarvam-M is a large language model, or LLM, developed by Indian startup Sarvam AI. These types of models are trained to understand and generate human-like text, and they power tools like chatbots, translation software, and educational apps. Sarvam-M is built on top of an existing model called Mistral Small and has 24 billion parameters, which are basically the knobs and dials that help it process language and learn.

In simple terms, Sarvam-M is like a very smart AI assistant that can handle a wide range of tasks, from answering complex maths questions to understanding and responding in Indian languages like Hindi, Bengali, Gujarati, and more. What makes it different is that it has been built with India in mind, supporting 10 local languages and offering strong performance in tasks involving both language and reasoning.

How was it built?

Sarvam-M was trained using a three-step process:

Supervised Fine-Tuning (SFT): This stage involved feeding the model high-quality questions and answers to help it learn. The team made sure the responses were relevant, less biased, and culturally appropriate. This helped the model get good at both everyday conversations and more complex problem-solving.

Reinforcement Learning with Verifiable Rewards (RLVR): In this step, Sarvam-M was further improved using data related to instructions, programming, and mathematics. It was taught to follow instructions better and think more logically using feedback loops and carefully designed rewards.

Inference Optimisation: This final stage involved making the model run faster and more efficiently. Techniques like FP8 quantisation (storing the model's numbers at lower precision with negligible loss in accuracy) and better decoding methods helped improve the model's speed and performance, though there were still some issues with handling high concurrency.

What can Sarvam-M do?

The model has been built to power various real-world applications. It can be used for:

Conversational AI: it can power chatbots and virtual assistants.

Machine translation: it could be used to translate between English and Indian languages.

Education: it can solve maths problems, answer science questions, or even help students prepare for competitive exams like JEE. In fact, one of the Sarvam team members shared results showing that Sarvam-M's "Think" mode correctly answered several JEE Advanced-level questions in Hindi, which could be a major step in making such tools useful for Indian students.

How does it compare with other models?

Sarvam-M has shown impressive results in certain areas. In a test that combined maths with romanised Indian languages, the model achieved over 86 per cent improvement, beating out some other well-known models. It performed better than Meta's Llama-4 Scout on many benchmarks and was on par with much larger models like Llama-3.3 70B and Google's Gemma 3 27B. However, it did slightly underperform in English knowledge tests, with about 1 per cent lower accuracy compared to others. Still, the model stands out for its Indian language skills and reasoning.

Why the backlash?

Despite all the technical achievements, the model didn't get the warm welcome one might expect. On Hugging Face, a platform where developers can download and test AI models, Sarvam-M was downloaded only 334 times in the first two days. Some critics saw this as a sign of weak interest. Das, an investor at Menlo Ventures, called the response 'embarrassing,' saying there's little interest in this kind of work. He compared it to a different model created by two Korean college students, which got nearly 200,000 downloads quickly.

This sparked a debate. Supporters of Sarvam-M, including Aashay Sachdeva from the company, defended the model, highlighting its benchmark results and customisation process. He even posted proof of the model's performance on social media. Another user, who works at AI4Bharat, added that the real achievement was not just the model, but the method used to train it. He said it sets a strong foundation for other Indian developers to build on.

Meanwhile, Sarvam's co-founder, Vivek Raghavan, called Sarvam-M a 'stepping stone' toward building India's own AI systems. The company is one of the few chosen under the Indian government's IndiaAI Mission to develop a sovereign LLM for the country.

Founder Sridhar Vembu also urged people not to focus only on instant success. He said that most products take time to find their place, and praised Sarvam for its efforts. 'Keep fighting the good fight,' he encouraged.

Sarvam AI debuts flagship open-source LLM with 24 billion parameters

Indian Express

24-05-2025

Indian AI startup Sarvam has unveiled its flagship Large Language Model (LLM), Sarvam-M. The LLM is a 24-billion-parameter open-weights hybrid language model built on top of Mistral Small. Sarvam-M has reportedly set new standards in mathematics, programming tasks, and even Indian language understanding. According to the company, the model has been designed for a broad range of applications. Conversational AI, machine translation, and educational tools are some of the notable use cases of Sarvam-M.

The open-source model is capable of performing reasoning tasks like maths and programming. According to the official blog post, the model has been enhanced through a three-step process: Supervised Fine-Tuning (SFT), Reinforcement Learning with Verifiable Rewards (RLVR), and Inference Optimisations.

When it comes to SFT, the team at Sarvam curated a wide set of prompts focused on quality and difficulty. They generated completions using permissible models, filtered them through custom scoring, and adjusted outputs to reduce bias and improve cultural relevance. The SFT process trained Sarvam-M to function in both 'think' mode, which handles complex reasoning, and 'non-think' mode, which handles general conversation.

With RLVR, on the other hand, Sarvam-M was further trained using a curriculum consisting of instruction following, programming datasets, and maths. The team used techniques like custom reward engineering and prompt sampling strategies to enhance the model's performance across tasks.

For inference optimisation, the model underwent post-training quantisation to FP8 precision, with negligible loss in accuracy. Techniques like lookahead decoding were implemented to boost throughput; however, challenges in supporting higher concurrency were noted.

Notably, in combined tasks with Indian languages and maths, such as the romanised Indian language GSM-8K benchmark, the model achieved an impressive 86% improvement. In most benchmarks, Sarvam-M outperformed Llama-4 Scout, and it is comparable to larger models like Llama-3.3 70B and Gemma 3 27B. However, it shows a slight drop (~1%) in English knowledge benchmarks like MMLU.

The Sarvam-M model is currently accessible via Sarvam's API and can be downloaded from Hugging Face for experimentation and integration.
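
For readers curious what a "verifiable reward" in RLVR might look like in practice, here is a minimal sketch in Python. It is not Sarvam's reward function, which the article does not describe; the function names `extract_final_number` and `verifiable_reward` are hypothetical, and the rule used (compare the last number in the model's answer with a known reference) is only one simple assumption about how a maths answer could be checked automatically.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_number(completion: str) -> Optional[Fraction]:
    """Return the last number found in the text, or None if there is none.

    Hypothetical helper: commas are dropped so that "1,250" reads as 1250.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return Fraction(matches[-1]) if matches else None


def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Score 1.0 if the final number equals the reference answer, else 0.0.

    The reward is "verifiable" because a program, not a human judge,
    decides whether the answer is correct.
    """
    predicted = extract_final_number(completion)
    if predicted is None:
        return 0.0
    return 1.0 if predicted == Fraction(reference_answer) else 0.0


if __name__ == "__main__":
    # A romanised-Hindi style answer to a simple word problem (made-up example).
    answer = "Toh kul milakar 5 + 13 = 18 seb hain."
    print(verifiable_reward(answer, "18"))
    print(verifiable_reward("The answer is 17.", "18"))
```

During reinforcement learning, a score like this would be fed back to the model so that answers that check out are reinforced. Real systems typically add further checks, such as running unit tests for programming tasks or validating output format for instruction following.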
