
Latest news with #AbhishekUpperwal

Inside India's two-track strategy to become an AI powerhouse

Mint · 2 days ago · Business

Bengaluru: At Google's annual I/O Connect event in Bengaluru this July, the spotlight was on India's AI ambitions. With over 1,800 developers in attendance, the recurring theme across panel discussions, product announcements and workshops was building AI capability for India's linguistic diversity. With 22 official languages and hundreds of spoken dialects, India faces a monumental challenge in building AI systems that work across this multilingual landscape.

In the demo area of the event, this challenge was front and centre, with startups showcasing how they are tackling it. Among them was Sarvam AI, demonstrating Sarvam-Translate, a multilingual model fine-tuned on Google's open-source large language model (LLM), Gemma. Next to it, CoRover demonstrated BharatGPT, a chatbot for public services such as the one used by the Indian Railway Catering and Tourism Corporation (IRCTC).

At the event, Google announced that AI startups Sarvam, Soket AI and Gnani are building the next generation of Indian AI models, fine-tuning them on Gemma. At first glance, this might seem contradictory: all three are among the four startups selected to build India's sovereign large language models under the ₹10,300 crore IndiaAI Mission, a government initiative to develop home-grown foundational models from scratch, trained on Indian data, languages and values. So, why Gemma?

Building competitive models from scratch is a resource-heavy task, and India does not have the luxury of doing it in isolation. With limited high-quality training datasets, an evolving compute infrastructure and urgent market demand, the more pragmatic path is to start with what is available. These startups are therefore taking a layered approach: fine-tuning open-source models to solve real-world problems today, while simultaneously building the data pipelines, user feedback loops and domain-specific expertise needed to train more indigenous and independent models over time. Fine-tuning involves taking an existing large language model, already trained on vast amounts of general data, and teaching it to specialize on focused, often local data, so that it performs better in those contexts.

Build and bootstrap

Project EKA, an open-source, community-driven initiative led by Soket, is a sovereign LLM effort being developed in partnership with IIT Gandhinagar, IIT Roorkee and IISc Bangalore. It is being built from scratch, with training code, infrastructure and data pipelines all sourced within India. A 7-billion-parameter model is expected in the next four to five months, with a 120-billion-parameter model planned over a 10-month cycle.

"We've mapped four key domains: agriculture, law, education and defence," says Abhishek Upperwal, co-founder of Soket AI. "Each has a clear dataset strategy, whether from government advisory bodies or public-sector use cases."

A key feature of the EKA pipeline is that it is entirely decoupled from foreign infrastructure. Training happens on India's GPU cloud, and the resulting models will be open-sourced for public use. The team, however, has taken a pragmatic approach, using Gemma to run initial deployments. "The idea is not to depend on Gemma forever," Upperwal clarifies. "It's to use what's there today to bootstrap and switch to sovereign stacks when ready."
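In practice, the bootstrap step Upperwal describes usually means parameter-efficient fine-tuning of an open checkpoint. The sketch below shows what that can look like with LoRA adapters on a Gemma-class model via Hugging Face libraries; the model ID, data file and hyperparameters are illustrative assumptions, not details disclosed by any of the startups.

```python
# Minimal sketch, not any startup's actual pipeline: LoRA fine-tuning of an open
# Gemma checkpoint on a local-language corpus. Model ID, data file and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "google/gemma-2-2b"          # hypothetical choice of open checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA trains small adapter matrices instead of all weights, which is what makes
# domain/language specialization feasible on modest hardware.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Assumed format: a JSONL file whose "text" field holds in-domain examples,
# e.g. Hindi agricultural advisories.
data = load_dataset("json", data_files="advisories_hi.jsonl")["train"]
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma-ft", per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=data,
    # mlm=False makes the collator build next-token (causal LM) labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

Because only the small adapter matrices are trained, the same base model can carry many domain or language adapters, which is one reason fine-tuning is the pragmatic first track.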
CoRover's BharatGPT is another example of this dual strategy in action. It currently runs on a fine-tuned model, offering conversational, agentic AI services in multiple Indian languages to government clients including IRCTC, Bharat Electronics Ltd and Life Insurance Corporation. "For applications in public health, railways and space, we needed a base model that could be fine-tuned quickly," says Ankush Sabharwal, CoRover's founder. "But we have also built our own foundational LLM with Indian datasets."

Like Soket, CoRover treats its current deployments as both service delivery and dataset creation. By pre-training and fine-tuning Gemma to handle domain-specific inputs, it is improving accessibility today while building a bridge to future sovereign deployments. "You begin with an open-source model. Then you fine-tune it, add language understanding, lower latency and expand domain relevance," Sabharwal explains. "Eventually, you'll swap out the core once your own sovereign model is ready."

Amlan Mohanty, a technology policy expert, calls India's approach an experiment in trade-offs: betting on models such as Gemma to enable rapid deployment without giving up the long-term goal of autonomy. "It's an experiment in reducing dependency on adversarial countries, ensuring cultural representation and seeing whether firms from allies like the US will uphold those expectations," he says. Mint reached out to Sarvam and Gnani with detailed queries regarding their use of Gemma and its relevance to their sovereign AI initiatives, but the companies did not respond.

Why local context is critical

For India, building its own AI capabilities is not just a matter of national pride or keeping up with global trends. It is about solving problems that no foreign model can adequately address today. Think of a migrant from Bihar working in a cement factory in rural Maharashtra who goes to a local clinic with a persistent cough. The doctor, who speaks Marathi, shows him a chest X-ray, while the AI tool assisting the doctor explains the findings in English, in a crisp Cupertino accent, using medical assumptions based on Western body types. The migrant understands only Hindi, and much of the nuance is lost. Far from being just a language problem, it is a mismatch in cultural, physiological and contextual grounding.

A rural frontline health worker in Bihar needs an AI tool that understands local medical terms in Maithili, just as a farmer in Maharashtra needs crop advisories that align with state-specific irrigation schedules. A government portal should be able to process citizen queries in 15 languages with regional variations. These are high-impact, everyday use cases where errors can directly affect livelihoods, the functioning of public services and health outcomes.

Fine-tuning open models gives Indian developers a way to address these urgent, ground-level needs right now, while building the datasets, domain knowledge and infrastructure that can eventually support a truly sovereign AI stack. This dual-track strategy is possibly one of the fastest ways forward, using open tools to bootstrap sovereign capacity from the ground up. "We don't want to lose the momentum. Fine-tuning models like Gemma lets us solve real-world problems today in applications such as agriculture or education, while we build sovereign models from scratch," says Soket AI's Upperwal. "These are parallel but separate threads. One is about immediate utility, the other about long-term independence. Ultimately, these threads will converge."
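Sabharwal's "swap out the core" approach and Upperwal's converging threads describe the same engineering pattern: the product layer talks to a narrow model interface, so the core can be replaced later without touching application code. A minimal sketch of that pattern, with entirely hypothetical class and function names:

```python
# Illustrative sketch (hypothetical names throughout) of the swap-out pattern:
# application code targets a thin interface, so a fine-tuned open model can later
# be replaced by a sovereign one without changing the product layer.
from typing import Protocol

class ChatModel(Protocol):
    def generate(self, prompt: str, language: str) -> str: ...

class FineTunedGemmaBackend:
    """Today: requests are served by a fine-tuned open-weights model."""
    def generate(self, prompt: str, language: str) -> str:
        return f"[gemma-ft/{language}] reply to: {prompt}"   # stub for a real call

class SovereignBackend:
    """Tomorrow: the home-grown model slots in behind the same interface."""
    def generate(self, prompt: str, language: str) -> str:
        return f"[sovereign/{language}] reply to: {prompt}"  # stub for a real call

def answer_citizen_query(model: ChatModel, query: str, language: str) -> str:
    # Routing, logging and product logic stay identical whichever core is active.
    return model.generate(query, language)

backend: ChatModel = FineTunedGemmaBackend()
# One-line swap once the sovereign model is ready:
# backend = SovereignBackend()
print(answer_citizen_query(backend, "Train ticket refund status?", "hi"))
```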
A strategic priority

The IndiaAI Mission is a national response to a growing geopolitical issue. As AI systems become central to education, agriculture, defence and governance, over-reliance on foreign platforms raises the risks of data exposure and loss of control. This was highlighted last month when Microsoft abruptly cut off cloud services to Nayara Energy after European Union sanctions on its Russian-linked operations. The disruption, reversed only after a court intervention, raised alarms about how foreign tech providers can become geopolitical pressure points. Around the same time, US President Donald Trump doubled tariffs on Indian imports to 50%, showing how trade and technology are increasingly being used as leverage.

Besides reducing dependence, sovereign AI systems are also important if India's critical sectors are to accurately reflect local values, regulatory frameworks and linguistic diversity. Most global AI models are trained on English-dominant, Western datasets, which leaves them poorly equipped to handle the realities of India's multilingual population or the domain-specific complexity of its systems. This becomes a challenge in applications such as interpreting Indian legal judgments, or accounting for local crop cycles and farming practices in agriculture.

Mohanty says that sovereignty in AI is not about isolation, but about who controls the infrastructure and who sets the terms. "Sovereignty is basically about choice and dependencies. The more choice you have, the more sovereignty you have." He adds that full-stack independence, from chips to models, is not feasible for any country, including India. Even global powers such as the US and China balance domestic development with strategic partnerships. "Nobody has complete sovereignty or control or self-sufficiency across the stack, so you either build it yourself or you partner with a trusted ally." Mohanty also points out that the Indian government has taken a pragmatic approach by staying agnostic about the foundational elements of its AI stack, a stance shaped less by ideology than by constraints: a lack of Indic data, compute capacity and ready-made open-source alternatives built for India.

India's data lacunae

Despite the momentum behind India's sovereign AI push, the lack of high-quality training data, particularly in Indian languages, remains one of its most fundamental roadblocks. While the country is rich in linguistic diversity, that diversity has not translated into digital data that AI systems can learn from. Manish Gupta, director of engineering at Google DeepMind India, cites internal assessments which found that, of 125 Indian languages with over 100,000 speakers each, 72 had virtually no digital presence. "Data is the fuel of AI and 72 out of those 125 languages had zero digital data," he says.

To address this linguistic challenge, Google launched Project Vaani in collaboration with the Indian Institute of Science (IISc), an initiative to collect voice samples across hundreds of Indian districts. The first phase captured over 14,000 hours of speech data from 80 districts, representing 59 languages, 15 of which previously had no digital datasets. The second phase expanded coverage to 160 districts, and future phases aim to reach all 773 districts in India.
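Raw field recordings like these only become training data after heavy filtering, a point Gupta makes next. As a toy illustration of the kind of checks involved (the record layout, thresholds and script test below are assumptions for the sketch, not details of the Vaani pipeline):

```python
# Illustrative clean-up pass over crowd-collected speech clips: drop clips that
# are implausibly short/long, near-silent, untranscribed, or transcribed in an
# unexpected script. All field names and thresholds are assumptions.
import unicodedata

def mostly_devanagari(text: str, min_ratio: float = 0.6) -> bool:
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    hits = sum("DEVANAGARI" in unicodedata.name(c, "") for c in letters)
    return hits / len(letters) >= min_ratio

def keep(clip: dict) -> bool:
    if not 1.0 <= clip["duration_s"] <= 30.0:     # implausibly short or long
        return False
    if clip["rms"] < 0.01:                        # near-silent recording
        return False
    if not clip["transcript"].strip():            # audio without a transcript
        return False
    return mostly_devanagari(clip["transcript"])  # transcript in expected script

raw_clips = [
    {"duration_s": 4.2, "transcript": "नमस्ते, आप कैसे हैं?", "rms": 0.12},
    {"duration_s": 0.3, "transcript": "hello", "rms": 0.20},  # filtered out
]
clean = [c for c in raw_clips if keep(c)]
print(f"kept {len(clean)} of {len(raw_clips)} clips")
```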
"There's a lot of work that goes into cleaning up the data, because sometimes the quality is not good," Gupta says, referring to the challenges of transcription and audio consistency. Google is also developing techniques to integrate these local-language capabilities into its large models. Gupta says learnings from widely spoken languages such as English and Hindi are helping improve performance in lower-resource languages such as Gujarati and Tamil, largely due to the cross-lingual transfer capabilities built into multilingual language models. The company's Gemma LLM incorporates Indian-language capabilities derived from this body of work.

Gemma ties into the LLM efforts of Indian startups through a combination of technical collaboration, infrastructure guidance and publicly released datasets. According to Gupta, the strategy is driven by both commercial and research imperatives. India is seen as a global testbed for multilingual, low-resource AI development. Supporting local-language AI, especially through partnerships with startups such as Sarvam, Soket AI and Gnani, allows Google to build inclusive tools that can scale beyond India to other linguistically complex regions in Southeast Asia and Africa. For India's sovereign AI builders, the lack of ready-made, high-quality Indic datasets means that model development and dataset creation must happen in parallel.

For the Global South

India's layered strategy, using open models now while concurrently building sovereign ones, also offers a roadmap for other countries navigating similar constraints. It is a blueprint for the Global South, where nations are wrestling with the same dilemma: how to build AI systems that reflect local languages, contexts and values without the luxury of vast compute budgets or mature data ecosystems. For these countries, fine-tuned open models offer a bridge to capability, inclusion and control. "Full-stack sovereignty in AI is a marathon, not a sprint," Upperwal says. "You don't build a 120-billion-parameter model in a vacuum. You get there by deploying fast, learning fast and shifting when ready." Singapore, Vietnam and Thailand are already exploring similar methods, using Gemma to kickstart their local LLM efforts.

By 2026, when India's sovereign LLMs, including EKA, are expected to be production-ready, Upperwal says the dual track will likely converge, with bootstrapped models fading as homegrown systems take their place. But even as these startups build on open tools such as Meta's Llama or Google's Gemma, engineered by global tech giants, the question of dependency continues to loom. Even for open-source models, control over architecture, training techniques and infrastructure support still leans heavily on Big Tech. While Google has open-sourced speech datasets, including Project Vaani, and extended partnerships with IndiaAI Mission startups, the terms of such openness are not always symmetrical. India's sovereign plans, therefore, depend not on shunning open models but on eventually outgrowing them. "If Google is directed by the US government to close down its weights (model parameters), or increase API (application programming interface) prices or change transparency norms, what would the impact be on Sarvam or Soket?" asks Mohanty, adding that while the current India-US tech partnership is strong, future policies could shift and jeopardize India's digital sovereignty.
In the years ahead, India and other nations in the Global South will face a critical question: can they convert this borrowed support into a complete, sovereign AI infrastructure before the terms of access shift or the window to act closes?

Beyond text: Why voice is emerging as India's next frontier for AI interaction

Mint · 16-07-2025 · Business

Voice is fast becoming the defining layer of human-AI interaction in India, despite being the most challenging to train for. Artificial intelligence (AI) startups are sharpening their focus on shaping this interaction with design, authentic emotion and intent. Yet India presents a unique challenge: the sheer diversity of its accents, languages and tonalities. Unlike text, which is relatively uniform, spoken language is richly layered, with cultural nuances, colloquialisms and emotion. Startups building voice-first AI models are now doubling down on one thing above all else: the depth and diversity of datasets.

Why voice is emerging as the frontline interface

In India, where oral tradition plays a pivotal role in communication, voice isn't just a convenience; it's a necessity. "We're not an English-first or even a text-first country. Even when we type in Hindi, we often use the English script instead of Devanagari. That's exactly why we need to build voice-first models—because oral tradition plays such a vital role in our culture," said Abhishek Upperwal, chief executive officer (CEO) of Soket AI Labs.

Voice is also proving critical for customer service and accessibility. "Voice plays a crucial role in bridging accessibility gaps, particularly for users with disabilities," said Mahesh Makhija, leader, technology consulting, at EY. "Many customers even prefer voicing complaints over typing, simply because talking feels more direct and human. Moreover, voice is far more frictionless than navigating mobile apps or interfaces—especially for users who are digitally illiterate, older, or not fluent in English," said Makhija, adding that "communicating in vernacular languages opens access to the next half a billion consumers, which is a major focus for enterprises."

Startups such as Gnani.ai are already deploying voice systems across banking and financial services to streamline customer support, assist with loan applications and eliminate virtual queues. "The best way to reach people—regardless of literacy levels or demographics—is through voice in the local language, so it's very important to capture the tonality of the conversations," said Ganesh Gopalan, CEO of Gnani.ai.

The hunt for rich, real-world data

As of mid-2025, India's AI landscape shows a clear tilt toward text-based AI, with over 90 Indian companies active in the space, compared with 57 in voice-based AI. Text-based platforms tend to focus on document processing, chat interfaces and analytics. In contrast, voice-based companies are concentrated in customer service, telephony and regional-language access, according to data from Tracxn. In terms of funding, voice-first AI startups have attracted larger rounds at later stages, while text AI startups show a broader distribution, especially at earlier stages. One voice-first AI firm, for example, has raised a total of $47.6 million across five funding rounds, while another has cumulatively secured around $102 million, including a major $78.15 million Series C round in 2021 that makes it one of the top-funded startups in voice AI, Tracxn data shows.

However, data remains the foundational challenge for voice models. Voice AI systems need massive, diverse datasets that cover not only different languages but also regional accents, slang and emotional tonality. Chaitanya C., co-founder and chief technology officer of Ozonetel Communications, put it simply: "The datasets matter the most—speaking as an AI engineer, I can say it's not about anything else; it's all about the data."
The IndiaAI Mission has allocated ₹199.55 crore for datasets, just about 2% of its total ₹10,300 crore budget, while 44% has gone to compute. "Investments solely in compute are inherently transient—their value fades once consumed. On the other hand, investments in datasets build durable, reusable assets that continue to deliver value over time," said Chaitanya.

He also emphasized the scarcity of rich, culturally relevant data in regional languages such as Telugu and Kannada. "The amount of data easily available in English, when compared with Telugu and Kannada or Hindi, it's not even comparable," he said. "Somewhere it's just not perfect, it wouldn't be as good as an English story, which is why I wouldn't want it to tell a Telugu story for my kid." "Some movie comes out, nobody's going to write it in government documents, but people are going to talk about it, and that is lost," he added, pointing out that government datasets often lack cultural nuance and everyday language.

Gopalan of Gnani.ai agreed. "The colloquial language is often very different from the written form. Language experts have a great career path ahead of them because they not only understand the language technically, but also know how to converse naturally and grasp colloquial nuances."

Startups are now employing creative methods to fill these gaps. "First, we collect data directly from the field using multiple methods—and we're careful with how we handle that data. Second, we use synthetic data in some cases. Third, we augment that synthetic data further. In addition, we also leverage a substantial amount of open-source data available from universities and other sources," Gopalan said. Synthetic data is artificially generated data that mimics real-world data for use in training, testing or validating models. Upperwal added that Soket AI uses a similar approach: "We start by training smaller AI models with the limited real voice data we have. Once these smaller models are reasonably accurate, we use them to generate synthetic voice data—essentially creating new, artificial examples of speech." (A sketch of this loop appears at the end of this article.)

Some, however, consciously stay away from synthetic data. Ankush Sabharwal, CEO and founder of CoRover AI, said the company relies exclusively on real data, deliberately avoiding synthetic data: "If I am a consumer and I am interacting with an AI bot, the AI bot will become intelligent by virtue of it interacting with a human like me."

The ethical labyrinth of voice AI

As companies begin to scale their data pipelines, the new Digital Personal Data Protection (DPDP) Act will shape how they collect and use voice data. "The DPDP law emphasizes three key areas: it mandates clear, specific and informed consent before collecting data. Second, it enforces purpose limitation—data can only be used for legitimate, stated purposes like KYC or employment, not unrelated model training. Third, it requires data localization, meaning critical personal data must reside on servers in India," said Makhija. He added, "Companies have begun including consent notices at the start of customer calls, often mentioning AI training. However, the exact process of how this data flows into model training pipelines is still evolving and will become clearer as DPDP rules are fully implemented."

Outsourcing voice data collection raises red flags, too. "For a deep-tech company like ours, voice data is one of the most powerful forms of IP (intellectual property) we have, and outsourcing it could compromise its integrity and ownership. What if someone is using copyrighted material?" said Gopalan.
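The bootstrap loop Upperwal describes, training a small model on scarce real speech and then using it to synthesize more, can be sketched as follows. Every function here is an illustrative stub under stated assumptions, not any company's pipeline:

```python
# Sketch of synthetic-data bootstrapping: train a small model on the real corpus,
# use it to generate candidates, filter them, and fold survivors back in.
import random

def train_small_model(corpus: list[str]):
    # Stand-in for training a compact speech model on whatever data exists so far.
    return lambda seed: seed + " (synthetic)"

def passes_quality_checks(sample: str) -> bool:
    # Stand-in for automatic filters, e.g. ASR round-trip agreement or SNR gates.
    return sample.endswith("(synthetic)")

real = ["ग्राहक: मेरा लोन स्टेटस क्या है?", "एजेंट: कृपया खाता संख्या बताइए।"]
corpus = list(real)

for _ in range(3):                               # a few bootstrap rounds
    model = train_small_model(corpus)
    candidates = [model(random.choice(real)) for _ in range(4)]
    corpus.extend(s for s in candidates if passes_quality_checks(s))

print(f"{len(real)} real + {len(corpus) - len(real)} synthetic training samples")
```

The quality gate is the design-critical step: without it, each round amplifies the small model's own errors instead of enlarging the usable corpus.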

Sarvam and three other AI firms in MeitY's LLM build out first shortlist

Time of India · 24-04-2025 · Business

Bengaluru-based Sarvam AI and three other artificial intelligence (AI) firms, including Soket AI Labs, Gnani.ai and Gan.ai, could be among the first cohort of companies selected by the ministry of IT (MeitY) to receive incentives for building frontier AI models under the Rs 10,000-crore IndiaAI Mission, sources said. Sarvam may receive Rs 200 crore worth of GPU (graphics processing unit) compute, provided by the government for free instead of any monetary award, people familiar with the matter told ET, adding that the final list of awardees is likely to be announced by minister Ashwini Vaishnaw in the next few days. "The committee has almost finalised the names. It should be announced soon," said an official, requesting not to be named.

Sarvam has pitched to develop a 70-billion-parameter multimodal AI model that supports Indian languages along with English, said a person privy to the decision. "The work has already started on it." Sarvam did not reply to ET's request for comment.

Soket AI Labs' founder Abhishek Upperwal said it has proposed to create a 120-billion-parameter open-source Indic LLM (large language model) trained on 2 trillion tokens under the 'EKA Project'. Upperwal did not share the funding requirement.

Meanwhile, Gnani.ai and Gan.ai have proposed to build small language models. Ganesh Gopalan, co-founder and chief executive of Gnani.ai, said "it would be appropriate to comment after receiving an official communication from the government". The conversational AI company has proposed to build speech-to-speech AI models focused on customer service use cases. San Francisco-based Gan.ai, which offers personalised video creation with AI, could not be reached for comment.

China's progress with the DeepSeek AI model earlier this year proved to be a wake-up call for the Indian government, academia and startups, and led to the fast-tracking of efforts to develop indigenous foundational models in India. MeitY announced an incentive allocation of Rs 1,500 crore for entities and individuals who proposed to build an AI model from the ground up. By February 15, the ministry had received 67 applications from Indian and global startups and researchers, followed by another 120 applications the following month. ET had reported that, owing to the overwhelming response, the government has paused accepting applications and is evaluating the existing prospects. "We should have three foundational models by the end of this year," Vaishnaw had said earlier.

Separately, the government has also empanelled 10 GPU-as-a-service providers to build a common compute facility, through which GPU compute can be accessed at less than $1 per hour, among the lowest rates globally.
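Those two figures, the Rs 200 crore compute grant and the sub-$1 per GPU-hour rate, give a rough sense of scale. A back-of-envelope calculation (the exchange rate is an assumption; the other numbers come from this report):

```python
# Back-of-envelope scale check using this report's figures. The exchange rate is
# an assumption; Rs 200 crore and the <$1/GPU-hour rate come from the article.
CRORE = 1e7
grant_inr = 200 * CRORE            # Sarvam's proposed GPU-compute grant
inr_per_usd = 85.0                 # assumed exchange rate
rate_usd_per_gpu_hour = 1.0        # "less than $1 per hour" upper bound

usd = grant_inr / inr_per_usd
gpu_hours = usd / rate_usd_per_gpu_hour
print(f"~${usd / 1e6:.1f}M buys ~{gpu_hours / 1e6:.1f}M GPU-hours")
print(f"that is roughly {gpu_hours / (24 * 365):,.0f} GPU-years of compute")
```

Under these assumptions the grant works out to roughly 23.5 million GPU-hours, on the order of a few thousand GPU-years, which indicates why compute subsidies dominate the mission's budget.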
