Latest news with #Kandaswamy


The Print
27-07-2025
- Entertainment
'Villa Swagatam' announces 34 residents for third edition
New Delhi, Jul 27 (PTI) Poet Meena Kandaswamy, French author Maylis de Kerangal, choreographer Gayatri Shetty, and French artist Johanna de Clisson are among 34 artists and cultural practitioners from India and France selected for the third edition of the Villa Swagatam residency programme.

The initiative, spearheaded by the French Institute in India, seeks to foster cross-cultural dialogue and artistic collaboration between creatives from France and South Asia. The selected residents will spend between one and three months at partner residency spaces across India, Bangladesh, Sri Lanka, and France. The upcoming cycle of the residency will take place from August 2025 to August 2026.

'Translation is a form of transformation, and I want to discover what my poetry becomes when it breathes in French. I hope to also use this residency to create new work, letting the landscape and its spirit of resistance inspire a renewed poetic voice,' Kandaswamy, who will be visiting the literary centre Maison de la Poésie in Nantes, France, said in a statement.

Likewise, Clisson, a French ceramist, designer, and artistic director known for exploring the relationship between minimalism, architecture, and tactile materials, is eager to discover how 'earth meets textiles while ceramics and weaving become one' during her stay at Nila House, Jaipur. 'Trained in a European context where minimalism, sober lines and the predominance of white occupy a central place in my work, I wish to initiate a dialogue between these two aesthetic universes. My project at Nila House will focus on blending cultures and know-how,' she added.

Among the other selected residents are writer Ruchir Joshi, art critic Sukanya Deb, visual artist Sajid Wajid Shaikh, poet Selim-a Atallah Chettaoui, illustrator Daniele Pasin, and poet Monia Ben Romdan. The list also includes multidisciplinary artist Clémence Vazard, visual artist Marion Flament, designer Gabriel Hafner, children's book author Séraphine Menu and literary translator Subhashree Beeman.

Reflecting on the success of the previous two editions, French Ambassador to India Thierry Mathou remarked: 'Since its very inception, Villa Swagatam has been envisioned as a flagship initiative of our cultural cooperation with India, fostering a vibrant network of creative talents from both countries, with arts and crafts, and literature as key areas of exchange.'

This year's edition drew a total of 520 applications (353 from India and 167 from France), marking a significant rise in Indian participation, with more than twice as many Indian applicants as in the previous round.

This report is auto-generated from PTI news service. ThePrint holds no responsibility for its content.


Time of India
30-06-2025
- Science
Cut through clutter: Tamil dataset to train AI models
Netizens engaging with AI models in Tamil or any other regional language often come across incoherent translations, jumbled sentences, bizarre word choices and poor grammar, but overlook them as the babble of a budding ecosystem. Not so Raju Kandaswamy, a senior IT professional, who believes such errors undermine Tamil's linguistic integrity. "Current training datasets are heavily distributed in English language and therefore do not accurately represent Tamil language or its cultural context. Users over a period of time absorb and internalise these biases leading to slow erosion of cultural values," he said.

An increasing number of people, including senior citizens, rarely pick up books and consume Tamil content only through the internet or through speech. In a world increasingly mediated by LLMs, which are now involved in web searches, shopping and education, this creates a problem.

Kandaswamy is a principal consultant at Thoughtworks in Coimbatore and part of AI Tamil Nadu, a non-profit community aiming to improve how AI models work in Tamil. The team is building a large-scale Tamil language dataset to train AI models, and is collaborating with authors and other organisations to curate large, high-quality datasets. Their plan is to use these data repositories to fine-tune open-source models such as Meta's Llama and make them available for anyone to build Tamil-specific models. Kandaswamy believes that these models can be used to deliver government services, communicate welfare schemes, and bring vernacular education to the masses, especially the rural population.

Abinaya Mahendiran, a natural language processing expert and member of AI Tamil Nadu, is leading the initiative, named Vidhai. She too considers it crucial for preserving Tamil culture. "Access to high-quality datasets in less represented languages is limited, and Tamil is no exception. Machine-translated content is often inaccurate. So, we collect original Tamil texts such as books, essays and articles from various sources, clean them, and annotate them with the help of volunteers, students, linguists, retirees, and teachers. A trove of Tamil books and printed material is yet to be digitised," she said.

Today, many independent researchers and language enthusiasts are spending their own money to improve Tamil AI models. But as Abinaya notes, a lack of computing resources and difficulty in mobilising volunteers are major barriers.

The Tamil Virtual Academy (TVA) has a digital library of more than 1 lakh books containing around 1.5 crore pages, spanning subjects from science to history. It is also developing tools such as syntactic parsers, morphological analysers, and part-of-speech taggers, resources critical for NLP research. Yet fragmented efforts, siloed developments, and fuzzy copyright guidelines hinder collaboration. A senior official confirmed that TVA could collaborate with AI technologists, but ambiguity around copyright and fair use remains a bottleneck.

Navaneeth Malingan, founder of AI Tamil Nadu, is attempting to bridge the ecosystem by bringing together its various elements, from scouting for student volunteers and linguists to securing access to computing resources through corporate sponsorship. He says such models are crucial for delivering government services to locals, while commercial AI models will be useful for most business cases. "The govt can use it to fill forms through voice, give instructions to farmers and teach Tamil to the younger generation. Various stakeholders including govt and companies should be brought together to build these models suitable for the use cases," he asserted.

The community is currently fine-tuning existing AI models to improve their performance in Tamil, but has ambitions of building one from scratch, albeit a small, domain-focused model. That model will use a tokenisation method inspired by Nannul, the 13th-century Tamil grammar treatise, to better reflect the language's morphological structure, rather than the widely used Byte-Pair Encoding (BPE) method. Tokenisation refers to the process of breaking down text into smaller units called tokens.

From adopting the printing press in the 1500s (the first in India) to adopting Unicode for the internet, Tamil has consistently been an early adopter of new communication technologies. Now, it should be able to find its place in the AI age.
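
To make the tokenisation point concrete, here is a minimal Python sketch. It is neither BPE nor the community's Nannul-inspired method, whose details the article does not give. It only compares three ways of cutting up the word for "Tamil": raw UTF-8 bytes (the starting units of byte-level BPE variants), Unicode code points, and written letters, approximated by attaching each vowel sign or pulli to the consonant before it. The function name `split_letters` is hypothetical, and the grouping rule is a simplification, not a full Unicode grapheme-cluster implementation.

```python
import unicodedata
from typing import List


def split_letters(text: str) -> List[str]:
    """Group each base character with the combining marks that follow it.

    Assumption: this roughly approximates Tamil written letters
    (consonant + vowel sign or pulli); it is not a complete
    grapheme-cluster algorithm.
    """
    letters: List[str] = []
    for ch in text:
        # Unicode categories Mn/Mc cover Tamil vowel signs and the pulli.
        if letters and unicodedata.category(ch).startswith("M"):
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


# The word "Tamil" in Tamil script, written as escapes so the exact
# code points are unambiguous: tha, ma, vowel sign i, zha, pulli.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

utf8_bytes = list(word.encode("utf-8"))  # units a byte-level tokeniser starts from
code_points = list(word)                 # one entry per Unicode code point
letters = split_letters(word)            # one entry per written letter

print(len(utf8_bytes), "bytes")
print(len(code_points), "code points")
print(len(letters), "letters:", letters)
```

Run as written, this reports 15 bytes, 5 code points and 3 letters for a single short word. That gap suggests why a tokeniser built around the script's own units might represent Tamil more compactly than one that starts from bytes.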