
Latest news with #AlphaGeometry

Google AI system wins gold medal in International Mathematical Olympiad

The Star

24-07-2025

  • Science
  • The Star


SAN FRANCISCO: An artificial intelligence system built by Google DeepMind, the tech giant's primary AI lab, has achieved 'gold medal' status in the annual International Mathematical Olympiad, a premier math competition for high school students. It was the first time that a machine – which solved five of the six problems at the 2025 competition, held in Australia this month – reached that level of success, Google said in a blog post Monday.

The news is another sign that leading companies are continuing to improve their AI systems in areas such as math, science and computer coding. This kind of technology could accelerate the research of mathematicians and scientists and streamline the work of experienced computer programmers.

Two days before Google revealed its feat, an OpenAI researcher said in a social media post that the startup had built technology that achieved a similar score on this year's questions, although it did not officially enter the competition.

Both systems were chatbots that received and responded to the questions much like humans. Other AI systems have participated in the International Mathematical Olympiad, or IMO, but they could answer questions only after human experts translated them into a computer programming language built for solving math problems.

'We solved these problems fully in natural language,' Thang Luong, a senior staff research scientist at Google DeepMind, said in an interview. 'That means there was no human intervention – at all.'

After OpenAI started the AI boom with the release of ChatGPT in late 2022, the leading chatbots could answer questions, write poetry, summarise news articles, even write a little computer code. But they often struggled with math. Over the past two years, companies such as Google and OpenAI have built AI systems better suited to mathematics, including complex problems that the average person cannot solve.
Last year, Google DeepMind unveiled two systems that were designed for math: AlphaGeometry and AlphaProof. Competing in the IMO, these systems achieved 'silver medal' performance, solving four of the competition's six problems. It was the first time a machine reached silver-medal status. Other companies, including a startup called Harmonic, have built similar systems. But systems such as AlphaProof and Harmonic are not chatbots. They can answer questions only after mathematicians translate the questions into Lean, a computer programming language designed for solving math problems.

This year, Google entered the IMO with a chatbot that could read and respond to questions in English. This system is not yet available to the public. Called Gemini Deep Think, the technology is what scientists call a 'reasoning' system. This kind of system is designed to reason through tasks involving math, science and computer programming. Unlike previous chatbots, this technology can spend time thinking through complex problems before settling on an answer. Other companies, including OpenAI, Anthropic and China's DeepSeek, offer similar technologies.

Like other chatbots, a reasoning system initially learns its skills by analysing enormous amounts of text culled from across the internet. Then it learns additional behaviour through extensive trial and error in a process called reinforcement learning.

A reasoning system can be expensive, because it spends additional time thinking about a response. Google said Deep Think had spent the same amount of time with the IMO as human participants did: 4 1/2 hours. But the company declined to say how much money, processing power or electricity had been used to complete the test.

In December, an OpenAI system surpassed human performance on a closely watched reasoning test called ARC-AGI. But the company ran afoul of competition rules because it spent nearly US$1.5mil (RM6.3mil) in electricity and computing costs to complete the test, according to pricing estimates.

– ©2025 The New York Times Company. This article originally appeared in The New York Times.

OpenAI won gold at the world's toughest math exam. Why the Olympiad gold matters

India Today

21-07-2025

  • Science
  • India Today


In a jaw-dropping achievement for the world of artificial intelligence, OpenAI's latest experimental model has scored at the gold medal level at the International Mathematical Olympiad (IMO) -- one of the toughest math exams on the planet. It is the same event held on the Sunshine Coast in Australia where India won six medals this year and ranked 7th amongst 110 participating countries.

OPENAI HITS GOLD IN THE WORLD'S TOUGHEST MATH TEST

The IMO is no ordinary competition. Since its launch in 1959 in Romania, it has become the gold standard for testing mathematical genius among high school students globally. Over two intense days, participants face a gruelling four-and-a-half-hour paper with only three questions each day. These are not your average exam questions -- they demand deep logic, creativity and problem-solving skills. Despite that, OpenAI's model solved five out of six questions correctly -- under the same testing conditions as human participants.

EXPERTS DOUBTED AI COULD DO THIS -- UNTIL NOW

Even renowned mathematician Terence Tao -- an IMO gold medallist himself -- had doubts. In a podcast in June, he suggested that AI wasn't yet ready for the IMO level and should try simpler math contests first. But OpenAI has now proven otherwise.

"Also this model thinks for a *long* time. o1 thought for seconds. Deep Research for minutes. This one thinks for hours. Importantly, it's also more efficient with its thinking," Noam Brown from OpenAI wrote on LinkedIn.

"It's worth reflecting on just how fast AI progress has been, especially in math. In 2024, AI labs were using grade school math (GSM8K) as an eval in their model releases. Since then, we've saturated the (high school) MATH benchmark, then AIME, and now are at IMO gold," he added.

WHY THIS IS A BIG DEAL FOR GENERAL AI

This isn't just about math. OpenAI says this shows their AI model is breaking new ground in general-purpose reasoning.

Unlike Google DeepMind's AlphaGeometry -- built just for geometry -- OpenAI's model is a general large language model that happens to be great at math too.

"Typically for these AI results, like in Go/Dota/Poker/Diplomacy, researchers spend years making an AI that masters one narrow domain and does little else. But this isn't an IMO-specific model. It's a reasoning LLM that incorporates new experimental general-purpose techniques," Brown explained in his post.

OpenAI CEO Sam Altman called it 'a dream' when OpenAI began. 'This is a marker of how far AI has come in a decade.'

But before you get your hopes up, this high-performing AI isn't going public just yet. Altman confirmed it'll be 'many months' before this gold-level model is released.

QUESTIONS STILL REMAIN

Not everyone is fully convinced. AI expert Gary Marcus called the model's results 'genuinely impressive' -- but raised fair questions about training methods, how useful this is for the average person, and how much it all costs.

Still, the win marks a huge leap in what artificial intelligence can do -- and how fast it's improving.

OpenAI just won gold at the world's most prestigious math competition. Here's why that's a big deal.

Business Insider

19-07-2025

  • Science
  • Business Insider


OpenAI's latest experimental model is a math whiz, performing so well on an insanely difficult math exam that everyone's now talking about it.

"I'm excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world's most prestigious math competition — the International Math Olympiad (IMO)," Alexander Wei, a member of OpenAI's technical staff, said on X.

The International Math Olympiad is a global competition that began in 1959 in Romania and is now considered one of the hardest in the world. It is held over two days; on each day, participants sit a four-and-a-half-hour exam with three questions. Some famous winners include Grigori Perelman, who helped advance geometry, and Terence Tao, recipient of the Fields Medal, the highest honor in mathematics.

In June, Tao predicted on Lex Fridman's podcast that AI would not score high on the IMO. He suggested researchers shoot a bit lower. "There are smaller competitions. There are competitions where the answer is a number rather than a long-form proof," he said.

Yet OpenAI's latest model solved five out of six of the problems correctly, working under the same testing conditions as humans, Wei said. Wei's colleague, Noam Brown, said the model displayed a new level of endurance during the exam. "IMO problems demand a new level of sustained creative thinking compared to past benchmarks," he said. "This model thinks for a long time."

Wei said the model is an upgrade in general intelligence. The model's performance is "breaking new ground in general-purpose reinforcement learning," he said. DeepMind's AlphaGeometry, by contrast, is specifically designed just to do math. "This is an LLM doing math and not a specific formal math system; it is part of our main push towards general intelligence," OpenAI CEO Sam Altman said on X.
"When we first started openai, this was a dream but not one that felt very realistic to us; it is a significant marker of how far AI has come over the past decade," Altman wrote, referring to the model's performance at IOM. Altman added that a model with a "gold level of capability" will not be available to the public for "many months." The achievement is an example of how fast the technology is developing. Just last year, "AI labs were using grade school math" to evaluate models, Brown said. And tech billionaire Peter Thiel said last year it would take at least another three years before AI could solve US Math Olympiad problems. Still, there are always skeptics. Gary Marcus, a well-known critic of AI hype, called the model's performance "genuinely impressive" on X. But he also posed several questions about how the model was trained, the scope of its "general intelligence," the utility for the general population, and the cost per problem. Marcus also said that the IMO has not independently verified these results.

AI travel startup Airial has built a tool that can plan your holiday in seconds

Mint

02-07-2025

  • Business
  • Mint


Planning a holiday often means juggling flights, hotels, local transport and activities across multiple websites, or paying a premium to a travel agency to do it for you. Now, a wave of AI travel startups is working to change that. With artificial intelligence gaining traction in the travel industry, companies are racing to build tools that can handle end-to-end trip planning.

One such player is Airial, a travel-tech startup founded by two former Meta engineers, Archit Karandikar and Sanjeev Shenoy, that promises to simplify the process. Its AI-powered platform creates personalised itineraries within seconds, covering everything from bookings to restaurant suggestions, all in one place.

Airial lets users input basic travel details, such as starting and ending locations, to generate a full itinerary. The platform includes flights, hotel bookings, local transport, restaurant suggestions, and tourist attractions. Users can either set their preferences up front or make changes after viewing the suggested plan. The interface offers an overview of the entire trip, with clickable segments for daily plans. Users can explore each activity to view location, reviews, suggestions and alternatives. The tool also includes a map view that displays all the places scheduled for the day, helping users understand the distance and time required to travel between points. It also accounts for transit time, wait periods at stations and the possibility of nearby day trips. In addition, the platform can answer queries about specific places and tailor recommendations to user preferences.

Social media and creator integration

One of Airial's newer features allows users to add content from creators. A user can link a blog, TikTok or Instagram Reel and add locations mentioned in the content to their itinerary. The tool can also surface relevant TikTok videos based on the user's destination and preferences. Additionally, Airial supports trip sharing and collaborative planning.
It has also included cars and buses in multi-city travel options, and users can view trips created by friends and make modifications, bringing a social element to the planning process.

Founded by former Meta engineers

Airial was founded by Archit Karandikar and Sanjeev Shenoy, who were college friends in India. Karandikar previously worked in engineering roles at Meta, Google, and Waymo, focusing on AI-based products. Shenoy also worked at Meta, on the Instagram Reels team. The founders said their shared interest in travel led them to build a product that could function like a detailed digital travel agent.

'Most platforms just help you build a rough plan. We focus on logistics, connecting dozens of APIs and factoring in multiple parameters like transfer time, hotel proximity, and transit availability,' Karandikar said in an interview with TechCrunch.

The startup's AI model is based in part on research from a DeepMind paper called AlphaGeometry, which focuses on solving complex geometry problems. Airial combines this inference method with large language models (LLMs) to create personalised travel plans.

Airial has raised $3 million in seed funding led by Montage Ventures, with participation from South Park Commons, Peak XV (formerly Sequoia India) and angel investors from companies like Meta, Dropbox, and UiPath. The company claims to have 'tens of thousands' of monthly users. For now, its focus is on user growth rather than monetisation. In the near future, Airial plans to launch iOS and Android apps and add vertical search options for hotels, activities and influencer-generated content.

AI breakthrough: AlphaGeometry 2 and Symbolic AI outperform Maths Olympiad gold medallists

The Hindu

29-04-2025

  • Science
  • The Hindu


The world was astonished a year ago; now it is shocked. Last year, AlphaGeometry, an AI problem solver developed by Google DeepMind, astonished the world by performing at silver-medal level in the International Mathematical Olympiad (IMO). The DeepMind team now reports that its improved system, AlphaGeometry 2, has surpassed the performance of the typical gold medallist. The findings are detailed in a preprint available on arXiv.

International Mathematical Olympiad

The International Mathematical Olympiad (IMO) is the world's most prominent mathematics competition. The inaugural competition took place in Romania in 1959 among seven Soviet Bloc nations; the event grew swiftly, reaching 50 nations in 1989 and surpassing 100 countries for the first time in 2009. The competition has always aimed to help school-age mathematicians improve their problem-solving abilities.

In India, the Homi Bhabha Centre for Science Education (HBCSE) organises the Mathematical Olympiad Programme on behalf of the National Board for Higher Mathematics (NBHM) of the Department of Atomic Energy (DAE), Government of India. The Indian team for the international competition is chosen through a broad-based Indian Olympiad Qualifier in Mathematics (IOQM).

Questions are picked from four topic areas: algebra, combinatorics, geometry and number theory, with no requirement or expectation that students use calculus. The competition consists of six problems spread over two days, with three problems per day; each day, participants get four and a half hours to complete the three questions. Each problem is worth 7 points, for a maximum of 42 points.

AI in the race

In 2024, the IMO was hosted in Bath, United Kingdom, with 609 high school students from 108 countries participating. Chinese student Haojia Shi finished first in the individual rankings with a perfect score of 42 points.
In the country rankings, the United States team came out on top, and China came in second. The human problem-solvers won 58 gold medals, 123 silver and 145 bronze.

One of the event's highlights was the presence of two unofficial contestants: AlphaGeometry 2 and AlphaProof, both artificial intelligence systems built by Google DeepMind. Together, the two programs solved four of the six problems. Mathematician and Fields Medallist Timothy Gowers, a past IMO gold medallist, and mathematician Joseph K. Myers, another previous IMO gold medallist, evaluated the AI systems' solutions using the same criteria as for the human competitors. By these standards, the programs received an excellent 28 points out of a possible 42, equivalent to a silver medal. The AI thus came within one point of a gold medal, which was awarded for a score of 29 points or higher, and only about 60 students achieved higher scores. AlphaGeometry 2 solved the geometry problem in just 19 seconds, while AlphaProof solved one number theory and two algebra problems, including one that only five human participants managed to figure out.

Training the tools

Training an AI requires a large quantity of data. AlphaProof's training was restricted by the amount of mathematical material available in a formal mathematical language. The DeepMind researchers therefore used the Gemini AI tool to translate millions of problems that people had solved step by step in natural language on the Internet into the Lean programming language, allowing the proof assistant to learn from them. Using this huge dataset, AlphaProof was trained with reinforcement learning, the same approach previously used to teach AI systems to master chess, shogi and Go.

Reinforcement learning (RL) is similar to training a dog. The dog (agent) learns tricks by performing actions (such as sitting). If it sits correctly, it gets a treat (reward); otherwise, no treat. Over time, it learns which actions lead to rewards. Similarly, RL systems learn by trial and error to maximise rewards in tasks such as games or robotics. AlphaProof repeatedly competes with itself and improves step by step; if a line of reasoning does not lead to a solution, it is penalised and learns to explore alternative techniques.

What works for number theory or algebra does not work in geometry, necessitating a new methodology. As a result, DeepMind created AlphaGeometry, a separate AI system designed to solve geometry problems. The researchers first created an exhaustive list of geometric 'premises', the basic building blocks of geometry, such as the fact that a triangle has three sides. Like an architect studying a blueprint, AlphaGeometry's deduction engine evaluates the problem, then picks the appropriate blocks (premises) and assembles them step by step to build the house (the proof). The AI can also manipulate geometric objects on a 2D plane, for example adding a fourth point to turn a triangle into a quadrilateral, or moving a point to change a triangle's height. The proof is complete when all the components fit together and the house is sturdy. Unlike trial-and-error learning (RL), this is akin to following an instruction manual with unlimited LEGO parts.

Going for gold

The DeepMind team has now produced an improved version, AlphaGeometry 2, trained on more data and with a faster solving process. The system can now also handle linear equations. With the upgrade, the AI proved capable of answering 84% of all geometry problems set at IMOs over the last 25 years, compared with 54% for the previous version of AlphaGeometry. Future developments will include handling problems containing inequalities and nonlinear equations, which will be necessary to cover geometry completely.
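The trial-and-error loop in the dog-training analogy above (an agent tries actions, collects rewards, and gradually prefers the actions that paid off) can be sketched in a few lines of code. The corridor world, states and reward below are invented purely for illustration; this is a generic tabular Q-learning toy, not DeepMind's actual AlphaProof training setup:

```python
import random

# Toy reinforcement learning: a corridor of cells 0..4, start at 0,
# with a "treat" (reward of 1.0) only for reaching cell 4.
N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q[state][action] starts at zero: the agent knows nothing yet.
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # Explore occasionally; otherwise exploit the best-known action.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1
            s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0  # reward only at the goal
            # Nudge the estimate toward reward + discounted future value.
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train()
# The learned preference in each non-goal state: step left or right?
best = ["right" if Q[s][1] > Q[s][0] else "left" for s in range(N_STATES - 1)]
```

Over repeated episodes, the update rule propagates the goal reward backwards through the corridor, so earlier states learn that stepping right pays off; that backward flow of credit is the essence of the "penalise failures, reinforce successes" loop described above.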
A team of researchers, including IIIT Hyderabad's Ponnurangam Kumaraguru, has made a breakthrough with their 'Symbolic AI', outperforming AlphaGeometry's capabilities. The hybrid symbolic AI, which complements Wu's method with AlphaGeometry, outperformed human gold medallists on IMO geometry problems. This Symbolic AI system solves geometry problems by combining algebraic methods, primarily Wu's method, with synthetic approaches such as a deduction engine.

The heart of the technique, Wu's method, is analogous to systematically completing a gigantic jigsaw puzzle. A jigsaw is hard to finish when some pieces (the variables and equations of a geometry problem) are concealed behind clutter, so some initial decluttering, such as sorting the pieces by colour or edge, is valuable. Wu's method rearranges geometric equations into a more organised hierarchy: we can solve one equation at a time, just as we would place the corner pieces of a jigsaw first, then use the results to simplify the next. Wu's method turns complex geometry into a step-by-step assembly line.

'With very low computational requirements, this performs comparably to an IMO silver medalist. When combined with AlphaGeometry, the hybrid system successfully solves 27 out of 30 IMO problems,' according to Mr. Kumaraguru. 'This system is remarkably efficient -- on most consumer laptops, with no access to a GPU, it can solve these problems within a few seconds. It also requires no learning or training phase.'

China's endeavours are not far behind. TongGeometry, a system for proposing and solving Euclidean geometry problems that bridges numerical and spatial reasoning, was developed by scholars at the Beijing Institute for General Artificial Intelligence (BIGAI) and the Institute for Artificial Intelligence, Peking University. It has solved all International Mathematical Olympiad geometry problems from the last 30 years, outperforming gold medallists for the first time. 'Their work, including their analysis on novel problem generation, is quite interesting,' says Mr. Kumaraguru, though he declined to offer 'informed opinions' on it, given the lack of publicly available material.

Are mathematicians redundant?

Are we nearing the point where mathematicians are obsolete? Gowers thinks not: 'I would guess that we are still a breakthrough or two short of that.' While AlphaProof outperformed most human competitors, it took almost 60 hours to answer one problem, whereas the humans were given only 4 1/2 hours. If human competitors had been given that much time per task, they would surely have scored higher. Another caveat is that the problems were manually translated into the proof assistant Lean: humans, not the AI, did the formalisation, while the AI program did the necessary mathematics.

Autoformalisation converts ambiguous human language into precise logical or mathematical assertions that computers can reason about. For example, the plain-English sentence 'If it rains, then the ground gets wet' has to be translated into propositional logic, Rain → WetGround, or in first-order logic, ∀x (rain(x) → wetground(x)).

Furthermore, at the previous IMO, DeepMind's systems did not even attempt the combinatorics problems, since these were difficult to translate into languages like Lean, and AlphaGeometry could only handle Euclidean plane geometry problems in 2D. Another significant inherent difficulty with AI is 'hallucination': nonsensical or erroneous assertions that can arise, especially during intricate reasoning.

(T.V. Venkateswaran is a science communicator and visiting faculty member at the Indian Institute of Science Education and Research, Mohali.)
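As a footnote to the formalisation step the article describes, the rain example can be written down in Lean, the proof-assistant language these systems target. This is an illustrative toy statement (the names `Situation`, `rains` and `wet_ground` are invented here), not how IMO problems are actually encoded:

```lean
-- "If it rains, then the ground gets wet", stated over an abstract
-- type of situations; this mirrors the first-order form
-- ∀x (rain(x) → wetground(x)) given in the article.
variable (Situation : Type) (rains wet_ground : Situation → Prop)

def rain_implies_wet : Prop :=
  ∀ x : Situation, rains x → wet_ground x
```

An autoformalisation system would have to produce statements like this automatically from English; for the olympiad entries described above, humans performed that translation by hand.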
