Latest news with #BharatGen


Times of India
a day ago
- Politics
- Times of India
Bharat Gen, AI-based multimodal LLM for Indian languages, launched
New Delhi: Science and Technology Minister Jitendra Singh launched 'Bharat Gen', an indigenously developed artificial intelligence-based multimodal Large Language Model (LLM) for Indian languages, here on Monday. Developed under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) and implemented through TIH Foundation for IoT (Internet of Things) and IoE (Internet of Everything) at IIT Bombay, Bharat Gen aims to revolutionize AI development across India's linguistic and cultural spectrum, Singh said. The initiative is supported by the Department of Science and Technology (DST) and brings together a consortium of leading academic institutions, experts, and innovators. Singh described Bharat Gen as a "national mission to create AI that is ethical, inclusive, multilingual, and deeply rooted in Indian values and ethos". The platform integrates text, speech, and image modalities, offering seamless AI solutions in 22 Indian languages. "This initiative will empower critical sectors such as healthcare, education, agriculture, and governance, delivering region-specific AI solutions that understand and serve every Indian," Singh said. The minister recounted a success story from his own constituency, Udhampur, where an AI doctor communicates fluently in the patient's native language. "It not only builds trust but has a placebo-like psychological effect, enabling better care in remote regions connected with superspeciality hospitals across India," he said. Singh emphasised the transformative role of Generative AI in grassroots governance, citing the integration of multilingual feedback systems into platforms like CPGRAMS to enhance citizen engagement and grievance redressal.

Business Standard
2 days ago
- Politics
- Business Standard
Everything to know about Bharat Gen, the AI-based LLM for Indian languages
Union Minister for Science and Technology, Jitendra Singh, launched 'Bharat Gen', an artificial intelligence (AI)-based multimodal Large Language Model (LLM) designed for Indian languages, in New Delhi on Monday. Developed under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) and implemented through TIH Foundation for IoT (Internet of Things) and IoE (Internet of Everything) at IIT Bombay, Bharat Gen aims to revolutionise AI development across India's linguistic and cultural spectrum, Singh said. Bharat Gen is backed by the Department of Science and Technology (DST) and brings together a consortium of top academic institutions, experts, and innovators to lead AI research and application. Describing the project, Singh said it represents a national-level effort to build AI that is 'ethical, inclusive, multilingual, and deeply rooted in Indian values and ethos.'
AI to support healthcare, education, and governance
"This initiative will empower critical sectors such as healthcare, education, agriculture, and governance, delivering region-specific AI solutions that understand and serve every Indian," Singh said. He also shared a real-life example from his own constituency of Udhampur, where an AI doctor communicates fluently in the patient's native language. "It not only builds trust but has a placebo-like psychological effect, enabling better care in remote regions connected with superspeciality hospitals across India," he explained. Singh highlighted the growing role of Generative AI in governance at the grassroots level, particularly through improved feedback systems in government platforms. "The integration of multilingual feedback systems into platforms like CPGRAMS helps enhance citizen engagement and grievance redressal," he said.
What is an AI-based multimodal LLM?
A multimodal Large Language Model (LLM) is an AI system that can understand and process several types of input, such as text, images, sound, and video. Unlike traditional language models that work only with text, multimodal models can combine different kinds of data. For example, they can look at an image and answer questions about it or watch a video and describe what's happening. These systems are trained on huge and diverse datasets that include more than just written words, which helps them carry out complex tasks across formats. By using multiple types of input, these models can interact in a way that's more similar to how humans understand the world. This makes them highly effective for practical use in areas like healthcare (such as reading scans and reports), education (like using visuals alongside text to aid learning), and accessibility (for instance, describing images to people with visual impairments). Multimodal LLMs are a big step forward in creating AI that's more adaptable, aware of context, and user-friendly.


NDTV
2 days ago
- Health
- NDTV
'Bharat Gen', Indigenous AI-Based Model For Indian Languages, Launched
New Delhi: Science and Technology Minister Jitendra Singh launched 'Bharat Gen', an indigenously developed artificial intelligence-based multimodal Large Language Model (LLM) for Indian languages, here on Monday. Developed under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) and implemented through TIH Foundation for IoT (Internet of Things) and IoE (Internet of Everything) at IIT Bombay, Bharat Gen aims to revolutionize AI development across India's linguistic and cultural spectrum, Mr Singh said. The initiative is supported by the Department of Science and Technology (DST) and brings together a consortium of leading academic institutions, experts, and innovators.
"Launched India's first-of-its-kind, indigenously developed, Artificial Intelligence (AI) driven, government-funded Multimodal 'Large Language Model' (LLM) for Indian languages. 'BharatGen' is not a mere technology venture but indeed a national mission to create AI that is…" (Dr Jitendra Singh, @DrJitendraSingh, June 2, 2025)
Mr Singh described Bharat Gen as a "national mission to create AI that is ethical, inclusive, multilingual, and deeply rooted in Indian values and ethos". The platform integrates text, speech, and image modalities, offering seamless AI solutions in 22 Indian languages. "This initiative will empower critical sectors such as healthcare, education, agriculture, and governance, delivering region-specific AI solutions that understand and serve every Indian," Mr Singh said. The minister recounted a success story from his own constituency Udhampur where an AI doctor communicates fluently in the patient's native language. "It not only builds trust but has a placebo-like psychological effect, enabling better care in remote regions connected with superspeciality hospitals across India," he said.
Mr Singh emphasised the transformative role of Generative AI in grassroots governance, citing the integration of multilingual feedback systems into platforms like CPGRAMS to enhance citizen engagement and grievance redressal.


Times of India
3 days ago
- Science
- Times of India
AI datasets by IIT-Bombay to simplify Indian texts, help in AI research
MUMBAI: For years, research in Indian knowledge systems, whose texts are often in Indian languages such as Sanskrit, was challenging for researchers. However, a data curation exercise carried out by IIT-Bombay, as part of its contribution to the central government's AIKosh portal, has simplified it to some extent by digitising 30 different textbooks. A dataset containing around 2.18 lakh (218,000) sentences with 1.5 million words from these textbooks, covering diverse topics such as astronomy, medicine, and mathematics, with some as much as 18 centuries old, is now available on the government portal. AIKosh, launched in March, is a source for datasets, models, toolkits, and more from diverse sources that aim to help AI-based innovation and research. IIT-Bombay, one of the leading contributors to the AIKosh platform, along with BharatGen, a consortium of seven institutes also led by IIT-Bombay, has contributed 37 diverse models and datasets on the portal so far. IIT-Bombay alone launched around 16 culturally significant datasets on the platform to contribute to the country's AI mission. BharatGen, funded through a section 8 company formed by the Department of Science and Technology with IIT-Bombay, IIT-Kanpur, IIT-Madras, IIT-Hyderabad, IIT-Mandi, IIM-Indore, and IIIT-Hyderabad as partners, launched 21 models on the portal. 'We are not only researching Large Language Models (LLMs) and other generative models for AI that are effective and data and compute efficient, but also building sovereign models for India from the ground up. We are creating datasets for training these models and fine-tuning them for downstream tasks such as conversation and question-answering, while creating benchmarking datasets towards calibrating the performance of these models,' said Prof Ganesh Ramakrishnan from IIT-Bombay, who is spearheading the project.
The team has not only put out datasets relevant to the Indian knowledge systems but also others that can help in audio-visual learning, such as tutorials capturing practical skills like waste-to-toy creation or organic farming. There is also one on Sanskrit translation for contemporary prose, a math word problems dataset in Hindi and English which will train the AI in mathematical reasoning, and culturally-grounded multi-lingual question-answering datasets, including questions and answers from historian Dharampal's books, among others. One of the datasets also enables the AI to answer questions about images using external knowledge, and another interesting one is on recognising text in videos with camera movements. Most of these models are trained from scratch, not just fine-tuned, said Prof Ramakrishnan. The models also uniquely balance Indian data alongside English data, ensuring relevance to our country, he said. 'We are creating benchmarks for the AI ecosystem in the country, but these can be pulled out by researchers, enterprisers, companies, or even academia and developed further,' he added.


Hindustan Times
3 days ago
- Science
- Hindustan Times
IIT-Bombay leads push for India-centric AI
Mumbai: Indian Institute of Technology (IIT) Bombay has released 16 new datasets on AIKosh, the central government's platform that provides a repository of datasets to enable artificial intelligence (AI) innovation. This is a major step in developing AI that understands India's linguistic and cultural landscape, said professor Ganesh Ramakrishnan from IIT Bombay. These datasets will support innovation and research in AI and machine learning (ML), especially in areas involving Indian languages, scripts, documents, media, and audiovisual content. The effort is part of BharatGen, a multilingual large language model (LLM) initiative led by IIT Bombay and funded by the Department of Science and Technology. So far, BharatGen has contributed 16 India-centric datasets and launched 21 AI models on AIKosh. The initiative includes top institutions such as the International Institute of Information Technology in Hyderabad, the IITs of Kanpur, Mandi, Madras, and Hyderabad, and IIM Indore. IIT Bombay's datasets are designed to build a solid foundation for developing Indian AI tools and applications. These include over 218,000 sentences for improving digitisation of Sanskrit texts, audio-visual data on practical skills like upcycling discarded materials into toys and organic farming, English-Sanskrit translations with 53,000 sentences for modern prose, over 78 hours of Sanskrit audio for speech recognition, multilingual question-answer sets in 11 Indian languages, including Hindi and English, math word problems in Hindi and English for AI reasoning, and table detection datasets in 14 Indian languages.
The datasets include visual question answering models (a system capable of answering questions related to an image), datasets to improve translation accuracy and recognize text in videos, a comprehensive overview of Indian Knowledge Systems (IKS), cross-lingual video and text retrieval in seven Indian languages (allowing AI to retrieve relevant information when the document is written in a different language from the query), and handwritten and printed text detection datasets. These datasets and models are part of a broader effort by IIT Bombay and BharatGen to build sovereign AI models for India aligned with the India AI Mission, a central government initiative that aims to build an ecosystem that allows AI innovation by enhancing data quality and facilitating access to computing resources. The team is not just fine-tuning existing models, but training new ones from scratch using Indian data. They are also building benchmarks to test these models for Indian use in conversation and education. A major highlight of this initiative is the launch of 'Param 1', a bilingual foundational language model with 2.9 billion parameters. It supports both English and Hindi and has been trained on 36% Indic language data, significantly more than international models like Meta's Llama, which had less than 0.01%. 'Pre-training (the initial stage of training a machine learning model on a large dataset) is an enormous undertaking and often a barrier for many. That's why we've taken on this challenge,' said professor Ramakrishnan, lead of BharatGen. Developers can now fine-tune Param 1 to build Indic chatbots, copilots (virtual assistants for research), and knowledge systems. 'We hope our efforts toward creating a sovereign Generative AI ecosystem and milestones such as the release of such LLM model checkpoints, serve as a foundation for India-specific solutions,' said professor Ramakrishnan. Alongside Param 1, BharatGen has launched over 20 speech models across 19 Indian languages.
These include speaker adaptive text-to-speech (TTS) systems that can mimic a speaker's voice in languages like Hindi, Tamil, Telugu, Marathi, and Bengali. Advanced speaker-conditioned TTS models and automatic speech recognition systems have also been developed to make voice-based applications more natural and inclusive. 'Our goal is not just to build AI models but to provide resources that startups and system integrators can leverage,' said professor Ramakrishnan.