
Latest news with #RadhaBasu

iMerit believes better-quality data, not more data, is the future of AI

Yahoo

09-07-2025

AI data platform iMerit believes the next step toward integrating AI tools at the enterprise level is not more data, but better data. And better data doesn't come from hordes of gig workers, but from experts across mathematics, medicine, healthcare, finance, autonomy, and other cognitive fields, the company says.

'What's become exceedingly important is the ability to attract and retain the best cognitive experts, because we have to take these large models and make them very customized towards solving enterprise AI problems,' Radha Basu, CEO and founder of iMerit, told TechCrunch.

The California- and India-based startup has for the past nine years quietly built itself into a trusted data annotation partner for companies working in computer vision, medical imaging, autonomous mobility, and other AI applications that require high-accuracy, human-in-the-loop labeling.

Now, iMerit is bringing its Scholars program out of beta, the company exclusively told TechCrunch. The goal of the program is to build a growing workforce of experts to fine-tune generative AI models for enterprise applications and, increasingly, foundational models.

iMerit already counts some of the top AI firms as customers, including three of the big seven generative AI companies, eight of the top autonomous vehicle companies, three large U.S. government agencies, and two of the top three cloud providers, according to the company.

The news comes as Scale AI, arguably the biggest name in AI data annotation, has lost its founder and CEO Alexandr Wang to Meta, which also acquired a 49% stake in the company. In the wake of Meta's investment, many of Scale's major clients pulled back, including Google, OpenAI, Microsoft, and xAI, out of concerns that Meta could gain access to their product roadmaps.

iMerit doesn't claim to replace Scale AI's core offering of high-throughput, developer-focused 'blitz data.' Instead, it's betting that now is the right moment to double down on expert-led, high-quality data, the kind that requires deep human judgment and domain-specific oversight.

'We're the adults in the room,' Rob Laing, iMerit's VP of global specialist workforce, told TechCrunch. 'A lot of money is being spent on AI right now. There are some very intelligent people building large platforms of human workforces. The output that they're getting from that mass approach and that very quick speed to market approach is not at the level of quality that enterprises need.'

Basu brought up the example of healthcare scribes that have come to market off the back of foundational large language models. 'If you don't have the expertise of the cardiologist or the physician, what you're doing is basically creating something that's maybe 50% or 60% accurate,' Basu said. 'You want that to be 99%. You want to question the model. You want to break it. You want to fix it. That is what expert-led AI is making possible for enterprise.'

iMerit's experts are tasked with fine-tuning, or 'tormenting,' enterprise and foundational AI models using the startup's proprietary platform Ango Hub. Ango allows iMerit's 'Scholars' to interact with the customer's model to generate and evaluate problems for the model to solve.

For iMerit, attracting and retaining cognitive experts is key to success because the experts aren't just doing a few tasks and disappearing; they're working on projects for multiple years. iMerit boasts a 91% retention rate, with 50% of its experts being women.

Laing, whose experience founding the human translation platform myGengo helped him understand how to crowdsource, said it's relatively easy to get warm bodies to perform menial tasks. Creating community requires a more human-centered approach.

'Instead of someone being a name on a database, when someone joins the Scholars program, they actually meet folks on the team,' Laing said. 'They have collaborative discussions. They're very much pushed to work at the highest possible level. And we are very, very, very selective about how we bring people in.'

'I think what we're going to see over the next couple of years is that companies like iMerit that are really focusing on that engagement, that retention, and that quality, are going to be the go-to companies for people to train the AI,' Laing added.

Today, iMerit works with over 4,000 Scholars and hopes to bring on more as it scales. Basu told TechCrunch that even though the company hasn't raised since 2020 (when it brought on investors like Khosla Ventures, Omidyar Network, and British International Investment), iMerit is sustainable and profitable. With its own cash reserves, iMerit can afford to scale to 10,000 experts, Basu said. To scale further would require more outside investment, which iMerit is open to, but not desperate for.

iMerit has been working on Scholars for the past year, mainly with a focus on healthcare. The goal is to grow across other enterprise applications, including finance and medicine.

Laing noted that generative AI is its fastest-growing area as top AI firms work with iMerit to improve their foundation models. 'The free data out there on the internet is gone, and the lower level of human input data has also become commoditized,' Laing said. 'Where these folks are going is really trying to tune these things to achieve AGI or superintelligence.'

iMerit believes better quality data, not more data, is the future of AI

TechCrunch

09-07-2025

AI data platform iMerit believes the next step towards integrating AI tools at the enterprise level is not more data, but better data. And better data doesn't come from hordes of gig workers, but from experts across mathematics, medicine, healthcare, finance, autonomy, and other cognitive fields.

'What's become exceedingly important is the ability to attract and retain the best cognitive experts, because we have to take these large models and make them very customized towards solving enterprise AI problems,' Radha Basu, CEO and founder of iMerit, told TechCrunch.

The California- and India-based startup has for the past nine years quietly built itself into a trusted data annotation partner for companies working in computer vision, medical imaging, autonomous mobility, and other AI applications that require high-accuracy, human-in-the-loop labeling.

Now, iMerit is bringing its Scholars program out of beta, the company exclusively told TechCrunch. The goal of the program is to build a growing workforce of experts to fine-tune generative AI models for enterprise applications and, increasingly, foundational models.

iMerit already counts some of the top AI firms as customers, including three of the big seven generative AI companies, eight of the top autonomous vehicle companies, three large U.S. government agencies, and two of the top three cloud providers, according to the company.

The news comes as Scale AI, arguably the biggest name in AI data annotation, has lost its founder and CEO Alexandr Wang to Meta, which also acquired a 49% stake in the company. In the wake of Meta's investment, many of Scale's major clients pulled back, including Google, OpenAI, Microsoft, and xAI, out of concerns that Meta could gain access to their product roadmaps.

iMerit doesn't claim to replace Scale AI's core offering of high-throughput, developer-focused 'blitz data.' Instead, it's betting that now is the right moment to double down on expert-led, high-quality data, the kind that requires deep human judgment and domain-specific oversight.

'We're the adults in the room,' Rob Laing, iMerit's VP of global specialist workforce, told TechCrunch. 'A lot of money is being spent on AI right now. There are some very intelligent people building large platforms of human workforces. The output that they're getting from that mass approach and that very quick speed to market approach is not at the level of quality that enterprises need.'

Basu brought up the example of healthcare scribes that have come to market off the back of foundational large language models. 'If you don't have the expertise of the cardiologist or the physician, what you're doing is basically creating something that's maybe 50% or 60% accurate,' Basu said. 'You want that to be 99%. You want to question the model. You want to break it. You want to fix it. That is what expert-led AI is making possible for enterprise.'

[Image: iMerit Scholars workflow example focused on mathematics. Image credits: iMerit]

iMerit's team of experts is tasked with fine-tuning, or 'tormenting,' enterprise and foundational AI models using the startup's proprietary platform Ango Hub. Ango allows iMerit's 'Scholars' to interact with the customer's model to generate and evaluate problems for the model to solve.

For iMerit, attracting and retaining cognitive experts is key to success because the experts aren't just doing a few tasks and disappearing; they're working on projects for multiple years. iMerit boasts a 91% retention rate, with 50% of its experts being women.

Laing, whose experience founding the human translation platform myGengo helped him understand how to crowdsource, said it's relatively easy to get warm bodies to perform menial tasks. Creating community requires a more human-centered approach.

'Instead of someone being a name on a database, when someone joins the Scholars program, they actually meet folks on the team,' Laing said. 'They have collaborative discussions. They're very much pushed to work at the highest possible level. And we are very, very, very selective about how we bring people in.'

'I think what we're going to see over the next couple of years is that companies like iMerit that are really focusing on that engagement, that retention and that quality, are going to be the go-to companies for people to train the AI,' Laing added.

Today, iMerit works with over 4,000 Scholars and hopes to bring on more as it scales. Basu told TechCrunch that even though the company hasn't raised since 2020 (when it brought on investors like Khosla Ventures, Omidyar Network, and British International Investment), iMerit is sustainable and profitable. With its own cash reserves, iMerit can afford to scale to 10,000 experts, Basu said. To scale further would require more outside investment, which iMerit is open to, but not desperate for.

iMerit has been working on Scholars for the past year, mainly with a focus on healthcare. The goal is to grow across other enterprise applications, including finance and medicine.

Laing noted that generative AI is its fastest-growing area as top AI firms work with iMerit to improve their foundation models. 'The free data out there on the internet is gone, and the lower level of human input data has also become commoditized,' Laing said. 'Where these folks are going is really trying to tune these things to achieve AGI or superintelligence.'

iMerit Unveils Scholars - A Handpicked Global Network of Cognitive Experts for Advanced GenAI Training

Cision Canada

09-07-2025

Invite-only community of expert talent is shaping the next wave of Generative AI and AGI

SAN JOSE, Calif., July 9, 2025 /CNW/ -- iMerit, a leader in software-delivered AI data, today announced wide availability of its Scholars program. The global network of subject matter experts, handpicked for their specialized knowledge and cognitive skills, serves a large and urgent ongoing need for secure, high-quality model tuning among Generative AI foundation model and applied AI customers.

Expertise as Infrastructure

The Generative AI arms race focuses on compute and algorithms, while iMerit focuses on the third pillar: expert-led data. iMerit Scholars curates and matches thousands of cognitive specialists, many with advanced degrees in computer science, medicine, biology, finance, law, and policy. They frequently have multilingual or multi-domain capabilities.

The launch comes at a pivotal moment as enterprises and research labs increasingly seek to differentiate their AI systems through specialized knowledge and human oversight. The last mile of specialization of these models relies on expert prompting and reasoning inputs by subject matter experts in domains like medicine, STEM, and coding. At the same time, outcomes are highly dependent on the quality and sophistication of the tuning data created by experts.

"Foundation models and their applications are moving into a crucial phase where they have to be tuned and validated by extremely clean expert-led data. Our Scholars team includes PhDs, MDs, lawyers, linguists, and engineers. Their exceptional problem-setting skills directly shape the quality and performance of the models," said Radha Basu, CEO of iMerit. "Our decade-long background in AI data gives us deep insights into how to assemble the most creative and engaged experts to lead this next wave."

Scholars is backed by Deep Reasoning Lab (DRL), a specialized generative AI module in iMerit's AI data software, Ango Hub. Ango DRL connects automated pipelines, model evaluators, and specialized human judgment at production scale. Ango DRL supports multi-domain and multimodal prompting, chain-of-thought reasoning and other interaction modes that allow the human expert to teach, question and correct models under development.

"The industry needs this capability at a large scale but is also highly sensitive to the quality and variety of the tuning data. Scholars curates the most elite AI teachers," said Robert Laing, VP of Global Specialist Workforce at iMerit. "It's not just about degrees but about motivation, engagement and about a cognitive toolbox of skills like meta-cognition, critical thinking, creativity and cultural empathy. That's the only way to be accountable for the mission-critical results iMerit is known for."

Recent projects include ambient scribe tuning, where physicians shape a model to perform better at creating clinical notes from an audio recording of a doctor-patient interaction. In another assignment, mathematicians participated in a chain-of-thought project, where they improved the ability of the model to solve complex problems by iteratively coaching it past the failed steps in its reasoning. iMerit has also helped create language-vision models which use speech generation to describe the actions of an autonomous vehicle, thus improving safety and explainability.

"From the start, I felt as though I had already been part of the team for a long time. Communication was seamless, everyone was approachable and supportive, and we handled challenges with ease," said Burak Ekseli, a Language Specialist based in Turkey. "It's been one of the most positive and enriching experiences I've had in this field, strengthening both my skills and my confidence."

To learn more about Scholars, visit or contact [email protected].

About iMerit

iMerit is a leading AI data company that powers advanced machine learning and artificial intelligence models. iMerit delivers high-quality data across industries such as autonomous mobility, medical AI, and high-tech, enabling trusted, ethical, and scalable AI through its software Ango Hub. iMerit is backed by Khosla Ventures, Omidyar Network, and British International Investment (BII). Learn more at

AI Development Issues Synthetic Data Can Help You Overcome

Forbes

07-07-2025

As AI systems become more sophisticated, the challenges of training them effectively, and responsibly, continue to grow. The use of real-world data often comes with concerns and roadblocks: privacy risks, inconsistent formats, gaps in edge cases and regulatory hurdles can all slow development or skew outcomes. Synthetic data offers a promising alternative, delivering clean, scalable and customizable datasets that can augment, or even replace, traditional data in key use cases.

Below, members of Forbes Technology Council share real-world challenges that come with training AI systems and how synthetic data can help address them. Their insights highlight how developers can overcome data-related barriers while building smarter, safer AI models.

1. Lack Of Edge Case Data

Synthetic data can help address the challenge of edge cases in your real-world data, which, by definition, doesn't have enough examples to create a training set. The real-world data can be used to identify an edge case your AI may encounter, but you leverage synthetic data to create variations of that edge case for machine learning. This hybrid approach is often most effective in terms of cost, time and so on. - Radha Basu, iMerit

2. Inconsistency And Lack Of Control

One of the major challenges is inconsistency and lack of control. Real-world data is messy, biased and often incomplete, making it hard to scale or use reliably in training high-performance models. Synthetic data solves this by offering precision, balance and control at scale. Synthetic data gives AI developers the ability to test, stress and scale models in ways real-world data simply can't match. - Alexandre de Vigan, Nfinite

3. Unpredictable Training Environments

Remember the debate about synthetic versus petroleum-based oil? Similar to how older engines were run with nonsynthetic oil, legacy businesses use messy, sensitive and unpredictable real-world data, resulting in poor AI. For smart, modern businesses, synthetic data, like synthetic oil, allows developers to train models in predictable and controllable environments, ensuring strong performance in the real world. - Robert Clark, Cloverleaf Analytics

4. Privacy And Scale Limitations

Synthetic data avoids sensitive personal information, which reduces privacy concerns, and can be generated efficiently at scale, making it ideal for training large models. Real-world data, like patient records, is often messy and unpredictable. However, training only on synthetic data can limit a model's ability to perform well in complex, real-world scenarios. - Tim O'Connell, emtelligent

5. Incomplete And Biased Datasets

One challenge with real-world data is that it can be biased or incomplete, like teaching someone to drive only on city streets but not highways or country roads. Synthetic data fills these gaps, adding detail and diversity that real-world data might lack. However, synthetic data only leads to smarter, fairer, more reliable AI if it's high-quality and generated with standards to minimize bias. - China Widener, Deloitte

6. Sensitive Industry Restrictions

AI developers face the significant challenge of addressing issues related to data privacy. Obtaining relevant information within sensitive industries presents unique difficulties, especially when dealing with regulated elements. Synthetic data helps alleviate this burden. - Michael Gargiulo,

7. 'Class Imbalance'

Real data has bias due to 'class imbalance.' Imagine a scenario where using AI for résumé screening fails since the model has been unintentionally trained on a dominant class (favoring male over female or giving more weight to a certain demographic), as that's what was available as historical data. Synthetic data can overcome this, as long as proper context is given for data generation. - Arjun Srinivasan, Wesco

8. Scarcity Of Rare Scenarios

I believe that AI developers often struggle with obtaining large, diverse datasets that include rare edge cases, such as unusual driving scenarios for autonomous vehicles, which are difficult and costly to capture in the real world. Synthetic data can generate these rare but critical conditions at scale, improving model robustness without compromising user privacy. - Mark Vena, SmartTech Research

9. Privacy And Distribution Barriers

Real-world data often comes with privacy constraints and regulatory friction. Synthetic data solves distribution gaps in training sets by generating edge cases for mission-critical systems. It allows AI developers to simulate realistic, diverse datasets without exposing sensitive information. - Andrey Kalyuzhnyy, 8allocate

10. Rare Event Modeling Needs

For AI models, the rule is often 'the more data, the better the model.' However, if you are trying to model events that rarely happen, such as communications involving insider trading, bribery, harassment and other such events, the only way to get enough data is to create it. By using synthetic data that is then reviewed by subject matter experts, you can have enough examples to create great models. - Vall Herard, Saifr

11. Voice Diversity Challenges

One key challenge with real-world voice data is obtaining sufficient diversity and volume, especially for rare accents, speaking styles or noisy environments. Synthetic voice data overcomes this by generating limitless, tailored examples, including difficult-to-capture scenarios, without privacy concerns. This enables training more robust AI models. - Harshal Shah

12. High Data Acquisition Costs

The simple acquisition of real-world data can be difficult and costly. Synthetic data can help, but it needs to be evaluated carefully for quality and assessed for its potential impact on the training of a model. - Leonard Lee, neXt Curve

13. Privacy-Conscious Experimentation

Real-world data often limits innovation due to privacy and regulatory barriers. Synthetic data helps AI developers simulate edge cases and future scenarios that don't yet exist. This enables safer experimentation, faster iteration and smarter models without compromising sensitive information. - Rishi Kumar, MatchingFit

14. Ignored Edge Users In CX Data

Real-world customer experience data often ignores edge users, the 'silent majority' who never complain; they just leave. Synthetic data enables you to simulate and operationalize a retention strategy before it's too late. It's not just a data problem; it's a CX risk. - April Ho-Nishimura, Infineon Technologies AG

15. Corrupted Or Low-Quality Real Data

The use of corrupted real-world datasets for training can silently compromise AI models and cause unreliable results. Synthetic data eliminates this risk by providing clean, controlled datasets when real-world data quality issues are affecting model performance. - Chongwei Chen, DataNumen, Inc.

16. Simulating Future Scenarios

Real-world data is stuck in yesterday's world: permissioned, fragmented and slow. Synthetic data isn't just a privacy workaround; it's a simulation engine. Developers can now model edge-case chaos, future scenarios or AI-on-AI interactions at scale, long before reality catches up. That's not a patch. That's evolution. - Akhilesh Sharma, A3Logics Inc.

17. Cross-Silo Collaboration Barriers

AI developers attempting to collaborate across silos (such as government agencies) where data sharing is challenging or explicitly forbidden are able to exchange synthetic datasets. This improves model stability and time to release by allowing multiple parties to share in the model evaluation process and reproduce bugs to broaden the troubleshooting audience. - Matthew Peters, CAI

18. Reactive Versus Proactive Modeling

A major challenge with real-world data is its stagnancy. It reflects what has been, not what could be. Synthetic data allows AI developers to generate rich, forward-looking scenarios that model emerging trends, unseen behaviors or disruptive events. It shifts AI from reactive to proactive, enabling systems to anticipate and adapt in a world that evolves faster than yesterday's data. - Sandipan Biswas

19. Inconsistent Labeling

Real-world data is messy. Labels are often inconsistent, even among experts, and that noise quietly limits how far your models can go. Synthetic data gives us clean, perfectly labeled ground truth. We use it to identify annotation errors and train models that handle uncertainty more effectively and surpass accuracy ceilings without incurring the costs of relabeling. - Gavita Regunath, Advancing Analytics
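As a rough illustration of the class-imbalance point (item 7), the sketch below pads an under-represented class with synthetic rows until every class is the same size. It is a simplified, SMOTE-style interpolation written for illustration only, not a method described by any of the contributors; the function name `oversample_minority`, the feature names and the toy data are all hypothetical, and it assumes every non-label feature is numeric.

```python
import random


def oversample_minority(rows, label_key, seed=0):
    """Return rows plus synthetic rows so that every class has equal size.

    Each synthetic row is a random interpolation between two real rows of
    the same class (the core idea of SMOTE, minus nearest-neighbor search).
    Assumes all non-label features are numeric.
    """
    rng = random.Random(seed)

    # Group the real rows by class label.
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    target_size = max(len(group) for group in by_label.values())
    balanced = list(rows)

    for label, group in by_label.items():
        for _ in range(target_size - len(group)):
            a, b = rng.choice(group), rng.choice(group)
            t = rng.random()  # interpolation weight in [0, 1)
            synthetic = {label_key: label, "synthetic": True}
            for key in a:
                if key != label_key:
                    synthetic[key] = a[key] + t * (b[key] - a[key])
            balanced.append(synthetic)
    return balanced


# Hypothetical toy data: class "A" has 8 rows, class "B" only 2.
data = [
    {"years_experience": float(i), "test_score": 60.0 + i, "label": "A"}
    for i in range(8)
] + [
    {"years_experience": 2.0, "test_score": 70.0, "label": "B"},
    {"years_experience": 4.0, "test_score": 80.0, "label": "B"},
]

balanced = oversample_minority(data, "label")
counts = {}
for row in balanced:
    counts[row["label"]] = counts.get(row["label"], 0) + 1
print(counts)  # {'A': 8, 'B': 8}
```

Interpolating only between rows of the same class keeps each synthetic row inside the range of its real examples. That also means it cannot add diversity the minority class never had, which is consistent with the caveats above about generating synthetic data with proper context and reviewing it for quality.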
