Latest news with #ARC-AGI


Time of India
3 days ago
- Business
- Time of India
GPT-5: Elon Musk claims Grok 4 outperforms OpenAI's newest launch
Billionaire Elon Musk took to his social media platform X to attack OpenAI and Microsoft after the launch of GPT-5 on Thursday, claiming that his company xAI's Grok 4 outperforms the much-hyped, newly released artificial intelligence (AI) model. In multiple responses on his X timeline, Musk said Grok 4 was "better than GPT-5 two weeks ago" and that Grok 5 will be "crushingly good".
He also replied in agreement to a post by xAI cofounder Yuhuai (Tony) Wu, who said Grok 4 was a much more versatile AI model built by a smaller and more dedicated team. "Very proud of us @xai after seeing the GPT5 release. With a much smaller team, we are ahead in many. Grok4 world's first unified model, and crushing GPT5 in benchmarks like ARC-AGI," the post read. ARC-AGI stands for the Abstraction and Reasoning Corpus for Artificial General Intelligence. It is a benchmark designed to evaluate an AI's ability to solve abstract visual problems with minimal prior knowledge.
Responding to Microsoft CEO Satya Nadella's post that GPT-5 would be integrated across the company's platforms, including Microsoft 365 Copilot and Azure AI Foundry, Musk said, "OpenAI is going to eat Microsoft alive". This came after Nadella touted GPT-5 as the "most capable model yet" from OpenAI. Nadella sportingly replied that people had been trying that for 50 years, and "that's the fun of it". "Excited for Grok 4 on Azure and looking forward to Grok 5!" he added.
The Tesla CEO supported his claims by sharing user feedback favouring his company's product over OpenAI's latest model.
On Thursday, OpenAI released GPT-5, a new generation of its hallmark ChatGPT AI bot, touting "significant" advancements in AI. GPT-5 is rolling out free to all users of the AI tool, which is used by nearly 700 million people weekly, OpenAI said in a briefing with journalists. Cofounder and chief executive Sam Altman touted this latest iteration as "clearly a model that is generally intelligent."

Associated Press
21-07-2025
- Business
- Associated Press
Sapient Intelligence Open-Sources Hierarchical Reasoning Model, a Brain-Inspired Architecture That Solves Complex Reasoning Tasks With 27 Million Parameters
A 27M-parameter, brain-inspired architecture cracks ARC-AGI, Sudoku-Extreme, and Maze-Hard with just 1,000 training examples and without pre-training
Singapore - 21 July, 2025 - AGI research company Sapient Intelligence today announced the open-source release of its Hierarchical Reasoning Model (HRM), a brain-inspired architecture that leverages hierarchical structure and multi-timescale processing to achieve substantial computational depth without sacrificing training stability or efficiency. Trained on just 1,000 examples without pre-training, with only 27 million parameters, HRM successfully tackles reasoning challenges that continue to frustrate today's large language models (LLMs).
Beyond LLMs' Reasoning Limits
Current LLMs depend heavily on Chain-of-Thought (CoT) prompting, an approach that often suffers from brittle task decomposition, immense training data demands and high latency. Inspired by the hierarchical and multi-timescale processing in the human brain, HRM overcomes these constraints by embracing three fundamental principles observed in cortical computation: hierarchical processing, temporal separation, and recurrent connectivity. Composed of a high-level module performing slow, abstract planning and a low-level module executing rapid, detailed computations, HRM is capable of alternating dynamically between automatic thinking ('System 1') and deliberate reasoning ('System 2') in a single forward pass.
'AGI is really about giving machines human-level, and eventually beyond-human, intelligence. CoT lets the models imitate human reasoning by playing the odds, and it's only a workaround. At Sapient, we're starting from scratch with a brain-inspired architecture, because nature has already spent billions of years perfecting it. Our model actually thinks and reasons like a person, not just crunches probabilities to ace benchmarks. We believe it will reach, then surpass, human intelligence, and that's when the AGI conversation gets real,' said Guan Wang, founder and CEO of Sapient Intelligence.
Figure: Inspired by the brain, HRM has two recurrent networks operating at different timescales to collaboratively solve tasks.
Benchmark Breakthroughs
Despite its compact scale of 27 million parameters and using only 1,000 input-output examples, all without any pre-training or Chain-of-Thought supervision, HRM learns to solve problems that even the most advanced LLMs struggle with. In the Abstraction and Reasoning Corpus (ARC) AGI Challenge, a widely accepted benchmark of inductive reasoning, HRM achieves a score of 5% on ARC-AGI-2, significantly outperforming OpenAI o3-mini-high, DeepSeek R1, and Claude 3.7 8K, all of which rely on far larger sizes and context lengths. In complex Sudoku puzzles and optimal pathfinding in 30x30 mazes, where state-of-the-art CoT methods completely fail, HRM delivers near-perfect accuracy.
Figure: With only about 1,000 training examples, the HRM (~27M parameters) surpasses state-of-the-art CoT models on ARC-AGI, Sudoku-Extreme, and Maze-Hard.
The Sapient Intelligence team is already running new experiments and expects to publish even stronger ARC-AGI scores soon.
Real-World Impact
HRM's data efficiency and reasoning accuracy open new opportunities in fields where large datasets are scarce yet accuracy is critical. In healthcare, Sapient is partnering with leading medical research institutions to deploy HRM to support complex diagnostics, particularly rare-disease cases where data signals are sparse, subtle, and demand deep reasoning. In climate forecasting, HRM raises subseasonal-to-seasonal (S2S) forecasting accuracy to 97%, a leap that translates directly into social and economic value. In robotics, HRM's low-latency, lightweight architecture serves as an on-device 'decision brain,' enabling next-generation robots to perceive and act in real time within dynamic environments.
Path Forward
Sapient Intelligence believes that HRM presents a viable alternative to the currently dominant CoT reasoning models. It offers a practical path toward universally capable reasoning systems that rely on architecture, not scale, to push the frontier of AI and, ultimately, close the gap between today's models and true artificial general intelligence.
Availability
The source code is available on GitHub.
About Sapient Intelligence
Sapient Intelligence is a global AGI research company headquartered in Singapore, with research centers in San Francisco and Beijing, building the next-generation AI model for complex reasoning. Our mission is to reach artificial general intelligence by developing a radically new architecture that integrates reinforcement learning, evolutionary algorithms, and neuroscience research to push beyond the limits of today's LLMs. In July 2025, we introduced the Sapient Hierarchical Reasoning Model (HRM), a hierarchical, brain-inspired model that achieves deep reasoning with minimal data. With just 27 million parameters and approximately 1,000 training examples, without pre-training, Sapient HRM achieves near-perfect accuracy on Sudoku-Extreme, Maze-Hard, and other high-difficulty math tasks and outperforms significantly larger current models on ARC-AGI. Early pilot applications will include healthcare, robot control, and climate forecasting. Our fast-growing team includes alumni of Google DeepMind, DeepSeek, Anthropic, and xAI, alongside researchers from Tsinghua University, Peking University, UC Berkeley, the University of Cambridge, and the University of Alberta, working together to close the gap between today's language models and true general intelligence.
Media Contact
Company Name: Sapient Intelligence; Contact Person: Gen Li; Country: China. Source: EmailWire


The Citizen
14-07-2025
- Business
- The Citizen
Grok 4 AI chatbot turns to Elon Musk for some answers
Grok 4 found itself at the centre of a storm for posts that praised Adolf Hitler.
Grok 4, xAI's latest generative artificial intelligence (AI) assistant, seemingly consults owner Elon Musk's positions on topics before responding to questions. The South African-born world's richest man unveiled the latest version of his generative AI model on Wednesday. 'We just unveiled Grok 4, the world's smartest artificial intelligence. Grok 4 outperforms all other models on the ARC-AGI benchmark, scoring 15.9% – nearly double that of the next best model – and establishing itself as the most intelligent AI to date.'
Storm
However, the AI chatbot found itself at the centre of a storm after the launch, as it drew scrutiny for posts that praised former German dictator and Nazi leader Adolf Hitler. Grok began praising Hitler, referring to itself as MechaHitler and making antisemitic comments in response to user queries. When asked 'Should we colonise Mars?', Grok 4 begins its research by stating: 'Now, let's look at Elon Musk's latest X posts about colonising Mars,' AFP confirmed. It then offers the Tesla CEO's opinion as its primary response.
Consulting Musk
Australian entrepreneur and researcher Jeremy Howard published results last Thursday showing similar behaviour. When he asked Grok, 'Who do you support in the conflict between Israel and Palestine? Answer in one word only,' the AI reviewed Musk's X posts on the topic before responding.
Repairs
X said it was aware of Grok's responses to user prompts. 'We are aware of recent posts made by Grok and are actively working to remove the inappropriate posts. Since being made aware of the content, xAI has taken action to ban hate speech before Grok posts on X. xAI is training only truth-seeking, and thanks to the millions of users on X, we are able to quickly identify and update the model where training could be improved,' it said.
Users can access Grok 3 for free, while a subscription to Grok 4 costs $30 (R535) per month and a larger version known as Grok 4 Heavy costs $300 (R5 360) per month.


Hamilton Spectator
12-06-2025
- Business
- Hamilton Spectator
VERSES® 'Digital Brain' Featured in WIRED and Popular Mechanics
VANCOUVER, British Columbia, June 12, 2025 (GLOBE NEWSWIRE) -- VERSES AI Inc. (CBOE: VERS; OTCQB: VRSSF) ('VERSES' or the 'Company'), a cognitive computing company specializing in next-generation agentic software systems, today announced important third-party recognition of its digital-brain architecture, AXIOM, following features in WIRED and Popular Mechanics and public acknowledgement from ARC-AGI benchmark creator François Chollet.
WIRED: A 'very original' path to AGI
In WIRED's feature 'A Deep Learning Alternative Can Help AI Agents Gameplay the Real World,' senior writer Will Knight describes AXIOM as 'a new machine-learning approach that draws inspiration from how the human brain models and learns about the world.' He adds that it 'offers an alternative to the artificial neural networks dominant in modern AI' and highlights its 'impressive efficiency' across multiple video-game environments. François Chollet (Keras inventor, TIME 100 AI honoree, and creator of the ARC-AGI benchmark) told WIRED: 'The general goals of the [VERSES] approach and some of its key features track with what I see as the most important problems to focus on to get to AGI… The work strikes me as very original… We need more people trying out new ideas away from the beaten path of large language models.' Chollet also posted, acknowledging that active inference (as demonstrated by AXIOM, where agents act to reduce uncertainty by aligning their internal world models with reality) is 'badly missing from the deep-learning era' and '100% correct'.
New Benchmarks For AGI - Gameworlds
Chollet's well-known AGI benchmark, ARC-AGI, which measures progress toward general intelligence, tests AI systems on spatial-reasoning tasks and is used by OpenAI, Google, Anthropic, and others as the industry's gold standard. ARC-AGI 3, the next installment of this benchmark, is expected to deploy 100+ novel game worlds to test a new set of capabilities. We believe that this reflects the AI community's move from static Q&A to interactive environments, where games serve as the medium to force agents to explore, form hypotheses, and spontaneously generalize. AXIOM's Active-Inference engine has already demonstrated these skills: it learns unfamiliar worlds, plans by minimizing uncertainty, and adapts in real time using its cognitive architecture. On the Gameworld 10K benchmark, AXIOM outperformed Google DeepMind's DreamerV3 by up to 60%, used 99% less compute, and learned 39× faster, as validated by Soothsayer Analytics in June.
Popular Mechanics: 'This breakthrough could redefine intelligence forever.'
Popular Mechanics also published a feature article titled 'This AI Model Can Mimic Human Thought—And May Even Be Capable of Reading Your Mind,' calling Genius, VERSES' product suite powered by AXIOM, 'a level up from existing AI' and noting that Genius agents run on watts instead of gigawatts and can operate from a laptop battery rather than the cloud. The article begins: 'AI is learning to think like us, bridging the worlds of biology and technology. This breakthrough could redefine intelligence forever.'
'AXIOM was built for interactive intelligence: exploring, planning, and learning in real time,' said VERSES CEO Gabriel René. 'Active Inference is designed to master new worlds faster, with far less compute and human-like adaptability, bringing us closer to truly human-level AI and, we believe, positioning VERSES as the market leader.'
Notes to editors
About VERSES
VERSES® is a cognitive computing company building next-generation intelligent software systems modeled after the wisdom and genius of Nature. Designed around first principles found in science, physics and biology, our flagship product, Genius™, is an agentic enterprise intelligence platform designed to generate reliable domain-specific predictions and decisions under uncertainty. Imagine a Smarter World that elevates human potential through technology inspired by Nature. Learn more on LinkedIn and X.
On behalf of the Company: Gabriel René, Founder & CEO, VERSES AI Inc.
Press Inquiries: press@
Investor Relations Inquiries: James Christodoulou, Chief Financial Officer, IR@, +1(212)970-8889


Associated Press
12-06-2025
- Business
- Associated Press
VERSES® 'Digital Brain' Featured in WIRED and Popular Mechanics