
Latest news with #RAG

Elastic named Leader in 2025 Gartner Magic Quadrant for observability

Techday NZ

18 hours ago


Elastic has been recognised as a Leader in the 2025 Gartner Magic Quadrant for Observability Platforms for the second consecutive year.

Gartner recognition

The company earned this placement for its Elastic Observability offering after an evaluation of its Completeness of Vision and Ability to Execute. The recognition acknowledges Elastic's work in developing AI-driven capabilities, support for open standards, and the scalability and cost-efficiency of its observability platform.

Santosh Krishnan, General Manager, Observability & Security at Elastic, commented on the company's approach to observability, saying: "Visibility alone isn't enough; customers need rapid context-rich insights to troubleshoot complex systems. We feel Elastic's recognition as a Leader in this year's Gartner Magic Quadrant reflects how our open, scalable architecture with AI-driven capabilities is transforming observability from a reactive tool into a solution for real-time investigations while keeping costs low."

Key features highlighted

The company stated that its differentiation lies in several areas, including native integration with OpenTelemetry, a built-in AI Assistant, and zero-configuration AIOps for anomaly detection. Elastic's AI Assistant uses Retrieval Augmented Generation (RAG) to connect with enterprise knowledge, supporting incident resolution through natural language queries and helping operational teams reduce time-to-insight across logs, metrics, and traces. Elastic's zero-config AIOps deploys machine learning out of the box to automatically detect anomalies, forecast trends, and reveal patterns within large datasets. The piped query language, ES|QL, aims to simplify large-scale IT investigations by enabling advanced queries across observability data.
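The article does not include a sample query, but to illustrate the piped style it describes, here is a minimal Python sketch that posts an ES|QL query to Elasticsearch's _query endpoint. The index pattern, field names, and local endpoint URL are assumptions for illustration, not details from the article.

```python
import requests

# A piped ES|QL query: filter recent error logs, count errors per
# service, and surface the noisiest services first. The index pattern
# and field names are illustrative assumptions.
ESQL = """
FROM logs-*
| WHERE @timestamp > NOW() - 1 hour AND log.level == "error"
| STATS errors = COUNT(*) BY service.name
| SORT errors DESC
| LIMIT 5
"""

# Assumes a local, unauthenticated Elasticsearch for the sketch.
resp = requests.post(
    "http://localhost:9200/_query",
    json={"query": ESQL},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()

# The ES|QL response carries column metadata plus row values.
for row in result["values"]:
    print(dict(zip([c["name"] for c in result["columns"]], row)))
```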
Krishnan stated that Elastic's placement in the Magic Quadrant demonstrates the effectiveness of continued investment in open standards and deployment flexibility, alongside scalable performance and cost optimisation, and described the solution's impact on organisations moving from reactive troubleshooting to real-time investigation of incidents and anomalies.

Enterprise adoption

Elastic's approach to observability has been adopted by enterprises seeking to consolidate monitoring tools and improve operational efficiency. Eva Ulicevic, Director, Technology, Architecture, Strategy, and Analytics at Telefónica Germany, shared the impact the platform has had within the organisation: "By using Elastic and consolidating multiple tools, we reduced our root cause analysis time by 80%. We also reduced incidents that could severely impact our business."

The platform is built on Elastic's Search AI Platform, supporting the monitoring and optimisation of applications, infrastructure, and end-user experience. Elastic's Search AI Lake is designed for petabyte-scale data retention, supporting efficient storage and search for structured and unstructured data.

Industry context

The Gartner Magic Quadrant evaluates vendors in the observability sector on criteria such as vision, innovation, ability to execute, and breadth of capabilities. Elastic's Leader listing for the second year underscores continued investment in tools that address the challenges of managing, searching, and analysing large volumes of operational data. Elastic's commitment to open standards is reflected in its native support for OpenTelemetry, which lets organisations standardise instrumentation and data collection without proprietary connectors. The platform is positioned to support organisations as they address the growing complexity of cloud-based architectures and meet increased demand for real-time performance monitoring, anomaly detection, and automated root cause analysis.

IIT-K, UP Police launch AI bot for instant access to info on theft, crime guidelines

Time of India

6 days ago


Lucknow: IIT Kanpur and UP Police have jointly introduced an AI-powered bot built on Retrieval-Augmented Generation (RAG) that provides quick access to information from Hindi police circulars. The system enables officers and the public to instantly retrieve details from over 1,000 circulars using natural language queries. Users can access the software online.

If a user wants to see the circulars related to elections, or a brief description of them, they enter an election-related prompt in the search bot. A page displaying a summary of the topic is returned, covering the different circulars and guidelines issued by UP Police during elections.

The collaboration, said Shubham Sahay, faculty at IIT-K's electrical engineering department, transformed a concept into a deployed solution that strengthens public safety and digital governance. "The idea was to help officers and the public get the relevant information that is stacked all over quickly. Students at the Science and Technology (S&T) Council digitised all Hindi circulars/notices using the optical character recognition (OCR) technique that brought 1,000 circulars related to governance, theft, crime at one place," Sahay said.

Om Shrivastava, institute secretary of the S&T Council, IIT-K, said: "We got a chance to collaborate with UP Police to build something that could make policing smarter and more citizen-friendly. What started as an idea to cut short the time spent on searching documents has now become an example of how technology can serve those who serve us, under pressure, with limited resources and with immense responsibility."

Mayank Pathak, ASP, Aligarh, who provided assistance and guidance in developing the RAGBOT, explained that the compiled circulars are those giving instructions to subordinate officers and to the public on law and order and investigation issues. "For instance, someone who wants to know steps taken in investigating a vehicle theft case or others seeking information on how a passport is verified by the police can seamlessly get details by using this RAGBOT," said Pathak, an IIT-K alumnus. "In both cases, the AI bot will list steps taken to make the process more transparent and also make the public more aware of the processes involved in different cases," he added.
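The article describes the pipeline only at a high level, but the general pattern it names (OCR the scanned Hindi circulars, embed the text, retrieve by semantic similarity for the LLM to answer over) can be sketched as below. This is a generic illustration, not the IIT-K system; the Tesseract Hindi language pack, the multilingual embedding model, and the file layout are all assumptions.

```python
import glob

import numpy as np
import pytesseract
from PIL import Image
from sentence_transformers import SentenceTransformer

# 1. Digitise scanned circulars with OCR ("hin" is Tesseract's Hindi
#    language pack; assumes it is installed locally).
circulars = []
for path in sorted(glob.glob("circulars/*.png")):
    text = pytesseract.image_to_string(Image.open(path), lang="hin")
    circulars.append({"path": path, "text": text})

# 2. Embed each circular with a multilingual sentence encoder so Hindi
#    documents and Hindi or English queries share one vector space.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
doc_vecs = model.encode([c["text"] for c in circulars], normalize_embeddings=True)

# 3. Retrieve: embed a natural-language query ("circulars related to
#    elections") and rank circulars by cosine similarity.
query_vec = model.encode(["चुनाव से संबंधित परिपत्र"], normalize_embeddings=True)
scores = np.dot(doc_vecs, query_vec[0])
for idx in np.argsort(scores)[::-1][:5]:
    print(f"{scores[idx]:.3f}  {circulars[idx]['path']}")
```

In a full RAG bot, the top-ranked circular texts would then be passed to an LLM as grounding context for the generated summary.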

Moroccan founder raises $4.2M for her YC-backed startup building the next layer of AI search

Yahoo

6 days ago


As generative AI reshapes industries, one of its most important yet invisible challenges is retrieval: the process of fetching the right data, with relevant context, from messy knowledge bases. Large language models (LLMs) are only as accurate as the information they can retrieve. That's where ZeroEntropy wants to make its mark. The San Francisco-based startup, co-founded by CEO Ghita Houir Alami and CTO Nicolas Pipitone, has raised $4.2 million in seed funding to help models retrieve relevant data quickly, accurately, and at scale. The round was led by Initialized Capital, with participation from Y Combinator, Transpose Platform, 22 Ventures, a16z Scout, and a long list of angels, including operators from OpenAI, Hugging Face, and Front.

ZeroEntropy joins a growing wave of infrastructure companies hoping to use retrieval-augmented generation (RAG) to power search for the next generation of AI agents, with competitors ranging from MongoDB's VoyageAI to fellow early-stage YC startups. "We've met a lot of teams building in and around RAG, but Ghita and Nicolas's models outperform everything we've seen," says Zoe Perret, partner at Initialized Capital. "Retrieval is undeniably a critical unlock in the next frontier of AI, and ZeroEntropy is building it."

RAG grabs data from external documents and has become a go-to architecture for AI agents, whether it's a chatbot surfacing HR policies or a legal assistant citing case law. Yet the ZeroEntropy founders believe that for many AI apps this layer is fragile: a cobbled-together collection of vector databases, keyword search, and re-ranking models. ZeroEntropy offers an API that manages ingestion, indexing, re-ranking, and evaluation. That means that, unlike a search product for enterprise employees such as Glean, ZeroEntropy is strictly a developer tool. It quickly grabs data, even across messy internal documents. Houir Alami likens her startup to a "Supabase for search," referring to the popular open-source platform that automates much of the work of database management.

"Right now, most teams are either stitching together existing tools from the market or dumping their entire knowledge base into an LLM's context window. The first approach is time-consuming to build and maintain," Houir Alami said. "The second approach can cause compounding errors. We're building a developer-first search infrastructure—think of it like a Supabase for search—designed to make deploying accurate, fast retrieval systems easy and efficient."

At its core is a proprietary re-ranker called ze-rank-1, which the company claims currently outperforms similar models from Cohere and Salesforce on both public and private retrieval benchmarks. It ensures that when an AI system looks for answers in a knowledge base, it grabs the most relevant information first. More than 10 early-stage companies building AI agents across verticals such as healthcare, law, customer support, and sales are already using ZeroEntropy, she adds.
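To see why the founders call this layer fragile, consider the plumbing a team typically stitches together by hand: a keyword scorer, a vector index, and some fusion of the two. The sketch below is that generic pattern in Python (BM25 via rank_bm25 plus a small sentence encoder, over a toy corpus); it illustrates the kind of work a managed retrieval API aims to absorb and is not ZeroEntropy's actual implementation.

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "Employees accrue 1.5 vacation days per month of service.",
    "Expense reports must be filed within 30 days of travel.",
    "The VPN requires multi-factor authentication for remote access.",
]

# Keyword leg: BM25 over whitespace-tokenised documents.
bm25 = BM25Okapi([d.lower().split() for d in docs])

# Vector leg: dense embeddings from a small sentence encoder.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def hybrid_search(query: str, alpha: float = 0.5):
    """Blend normalised BM25 and cosine scores; alpha weights the dense leg."""
    kw = np.array(bm25.get_scores(query.lower().split()))
    kw = kw / (kw.max() or 1.0)      # scale keyword scores into [0, 1]
    qv = encoder.encode([query], normalize_embeddings=True)[0]
    dense = doc_vecs @ qv            # cosine similarity (unit vectors)
    blended = alpha * dense + (1 - alpha) * kw
    return sorted(zip(blended, docs), reverse=True)

for score, doc in hybrid_search("how many days off do I get?"):
    print(f"{score:.3f}  {doc}")
```

Every knob here (tokenisation, score normalisation, the fusion weight) is a place where retrieval quality quietly degrades, which is the maintenance burden Houir Alami points at.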
Born and raised in Morocco, Houir Alami left home at 17 to pursue an engineering education in France, attending École Polytechnique, a prestigious military and mathematics-focused institution. There, she discovered her love for machine learning. She moved to California two years ago to complete a master's in mathematics at UC Berkeley, where she deepened her interest in building intelligent systems. Before founding ZeroEntropy, Houir Alami dabbled in building an AI assistant—her take on a conversational agent—before ChatGPT became mainstream. She says the insight gained from that attempt, particularly the realization of how important it is to provide the right context and information for an LLM to be useful, partly inspired her to start ZeroEntropy.

In a field often criticized for its lack of diversity, the 25-year-old Houir Alami is one of the few female CEOs building deep infrastructure for one of the hardest problems in AI. She hopes it doesn't stay that way for long. "There aren't many women in DevTools or AI infra," she said. "But I'd tell any young woman interested in technical problems: don't let that stop you. If you're drawn to complex, technical problems, don't let anyone make you feel like you're not capable of pursuing them. You should go for it." She also stays connected to her roots by giving talks at high schools and universities in Morocco, aiming to inspire more young girls to pursue STEM.

The Hidden Flaw in RAG Systems That's Costing You Accuracy: Cut RAG Hallucinations

Geeky Gadgets

6 days ago


What if the very systems designed to enhance accuracy were the ones sabotaging it? Retrieval-Augmented Generation (RAG) systems, hailed as a breakthrough in how large language models (LLMs) integrate external data, face a persistent and costly flaw: hallucinations. These errors often stem from irrelevant or noisy information slipping through the cracks of traditional re-ranking methods, leading to responses that are less reliable and sometimes outright misleading. The problem isn't just about prioritizing relevance; it's about eliminating irrelevance altogether. That's where context pruning comes into play, offering a sharper, more deliberate approach to managing retrieved data.

In this feature, Prompt Engineering explores why re-ranking alone isn't enough to tackle the hallucination problem and how context pruning can transform the way RAG systems handle information. You'll discover the Provenance model, an innovative solution that doesn't just rearrange data but actively removes the noise, making sure LLMs work with only the most relevant inputs. Along the way, we'll unpack the limitations of current methods, the mechanics of pruning, and the broader implications for efficiency and accuracy in LLM applications. By the end, you might just see why cutting away the excess is more powerful than merely reshuffling it.

Improving RAG with Context Pruning

Why Context Matters in RAG Systems

RAG systems rely on retrieving external information to supplement LLM outputs, and the quality of this retrieved context directly influences the accuracy and reliability of the system's responses. When irrelevant or noisy data is included, it not only increases the likelihood of hallucinations but also burdens the LLM with unnecessary processing, resulting in less accurate outputs and diminished trust in the system. Traditional RAG systems often employ re-ranking to prioritize retrieved data by relevance. While this helps surface useful information, it fails to eliminate irrelevant or partially noisy content, so large amounts of unnecessary data are still passed to the LLM, diluting the quality of the final response and increasing computational inefficiency.

The Limitations of Re-Ranking

Re-ranking is a widely used technique that reorders retrieved text chunks or documents based on their relevance to a query. However, it has several inherent shortcomings:

  • Even after re-ranking, irrelevant data often persists. For instance, a paragraph may contain a few relevant sentences surrounded by unrelated or distracting information.
  • Re-ranking does not address partial relevance. High-ranking chunks may still include tangential or noisy content, which can confuse the LLM and degrade the quality of its responses.

These limitations underscore the need for a more refined approach to context management, one that not only prioritizes relevance but actively removes irrelevant information.
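To make the baseline concrete, here is what plain re-ranking looks like with an off-the-shelf cross-encoder (a minimal sketch; the model choice, query, and chunks are illustrative, not from the article). Note that it can only reorder whole chunks: the off-topic sentence inside the top chunk travels with it.

```python
from sentence_transformers import CrossEncoder

query = "What is the refund window for annual plans?"
chunks = [
    # Mostly relevant, but carries an off-topic sentence that a
    # re-ranker cannot strip out.
    "Annual plans can be refunded within 30 days of purchase. "
    "Our offices are closed on public holidays.",
    "Monthly plans renew automatically on the billing date.",
    "Support tickets receive a first response within 24 hours.",
]

# Score each (query, chunk) pair and reorder; the whole chunk moves,
# noise included.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, c) for c in chunks])
for score, chunk in sorted(zip(scores, chunks), reverse=True):
    print(f"{score:+.2f}  {chunk}")
```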
This is where the Provenance model offers a compelling solution.

What Is the Provenance Model?

The Provenance model represents a significant advancement in context engineering for RAG systems. Unlike re-ranking, which merely rearranges retrieved data, the Provenance model actively prunes irrelevant parts of the text while preserving the overall context. By assigning relevance scores to individual sentences, it ensures that only the most pertinent information is retained. The model can be used in two primary ways:

  • As a secondary step after re-ranking, further refining the top-ranked chunks by removing irrelevant content.
  • As a standalone replacement for re-ranking, directly identifying and retaining only the most relevant sentences.

For example, if re-ranking identifies three paragraphs as the most relevant, the Provenance model can prune those paragraphs to retain only the sentences that directly address the query. This dual-layered refinement minimizes noise and ensures the LLM receives a cleaner, more focused input.
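The same cross-encoder from the earlier sketch can approximate this sentence-level pruning: split each retrieved chunk into sentences, score every sentence against the query, and keep only those above a threshold. This is a generic illustration of context pruning under an assumed score threshold, not the Provenance model itself.

```python
import re

from sentence_transformers import CrossEncoder

def prune(query: str, chunk: str, threshold: float = 0.0) -> str:
    """Keep only the sentences of `chunk` scoring above `threshold`
    against `query`; everything else is dropped before the LLM sees it."""
    scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", chunk) if s]
    scores = scorer.predict([(query, s) for s in sentences])
    return " ".join(s for s, sc in zip(sentences, scores) if sc > threshold)

query = "What is the refund window for annual plans?"
chunk = (
    "Annual plans can be refunded within 30 days of purchase. "
    "Our offices are closed on public holidays. "
    "Refunds are issued to the original payment method."
)
# The off-topic sentence falls below the threshold and is pruned away,
# so only query-relevant context remains.
print(prune(query, chunk))
```

In practice the threshold would be tuned on held-out queries; too aggressive a cut risks removing context the model actually needs.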
Performance and Efficiency Benefits

The Provenance model offers substantial performance improvements for RAG systems. By compressing input context by up to 80% without sacrificing relevance, it reduces the computational load on LLMs while improving response quality. For developers, this translates to faster processing times, reduced resource consumption, and more reliable outputs. Consider a scenario where a RAG system retrieves 10 paragraphs of text. Traditional re-ranking might prioritize the top three paragraphs, but these could still contain irrelevant sentences. The Provenance model goes further by pruning those paragraphs to retain only the most relevant sentences, producing a more concise and accurate input for the LLM and enhancing both efficiency and output quality.

How to Integrate the Provenance Model

The Provenance model is available on platforms like Hugging Face, complete with detailed documentation to guide implementation. While its current licensing restricts commercial use, the open source community is likely to develop similar alternatives in the near future, which makes this a good opportunity to experiment with context pruning. Integration is straightforward and can be tailored to your needs:

  • Use it as a post-re-ranking refinement step to further filter retrieved data.
  • Adopt it as a replacement for re-ranking, directly identifying the most relevant sentences.

This flexibility makes the Provenance model an attractive option for developers aiming to enhance the performance and reliability of their systems. By incorporating it, you can ensure that your RAG system delivers cleaner, more focused inputs to the LLM, ultimately improving the quality of its outputs.

Future Implications for RAG Systems

Context pruning is poised to become a standard feature in retrieval-augmented systems, driven by the growing demand for more accurate and efficient LLM-based applications. As the Provenance model and similar approaches gain traction, expect broader adoption across industries such as customer support, academic research, and content generation. By focusing on refining input context, RAG systems can achieve new levels of reliability and efficiency. For developers and users alike, this represents a significant step forward in addressing hallucinations and ensuring that LLMs deliver accurate, high-quality responses.

Redefining Standards in RAG Systems

The Provenance model's context-pruning approach addresses the limitations of traditional re-ranking in RAG systems. By actively removing irrelevant information while preserving the global context, it enhances response quality and reduces computational overhead. As this technology evolves, it has the potential to set a new standard for accuracy and efficiency in retrieval-augmented generation, paving the way for more reliable and effective applications.

Media Credit: Prompt Engineering
