How artificial intelligence is tackling mathematical problem-solving

The Hindu · 3 days ago
The International Mathematical Olympiad (IMO) is arguably the leading mathematical problem-solving competition. Every year, high school students from around the world attempt six problems over two days, in two sittings of four and a half hours each. Students whose scores cross a threshold, roughly corresponding to solving five of the six problems, obtain Gold medals, with Silver and Bronze medals for those crossing lower thresholds. The problems do not require advanced mathematical knowledge, but instead test mathematical creativity. They are always new, and the organisers take care that no similar problems exist online or in the literature.
The AI gold medallist
IMO 2025 had some unusual participants. Even before the Olympiad closed, OpenAI, the maker of ChatGPT, announced that an experimental reasoning model of theirs had answered the Olympiad at the Gold medal level, following the same time limits as the human participants. Remarkably, this was not a model specifically trained or designed for the IMO, but a general-purpose reasoning model with reasoning powers good enough for an IMO Gold.
The OpenAI announcement raised some issues. Many felt that announcing an AI result while the IMO had not concluded overshadowed the achievements of the human participants. Also, the Gold medal score was graded and given by former IMO medalists hired by OpenAI, and some disputed whether the grading was correct. However, a couple of days later, another announcement came. Google-DeepMind attempted the IMO officially, with an advanced version of Gemini Deep Think. Three days after the Olympiad, with the permission of the IMO organisers, they announced that they had obtained a score at the level of a Gold medal. The IMO president Prof. Gregor Dolinar stated, 'We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points — a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow.'
Stages of development
Even as it became a popular sensation, ChatGPT was infamous both for hallucinations (making up facts) and for simple arithmetic mistakes. Either flaw would make solving even modest mathematical problems nearly impossible.
The first advance that greatly reduced these errors, which came a few months after the launch of ChatGPT, was the use of so-called agents. Specifically, models were now able to use web searches to gather accurate information, and Python interpreters to run programs to perform calculations and check reasoning using numerical experiments. These made the models dramatically more accurate, and good enough to solve moderately hard mathematical problems. However, as a single error in a mathematical solution makes the solution invalid, these were not yet accurate enough to reach IMO (or research) level.
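As a rough illustration of the kind of numerical experiment a model might run in a Python interpreter, the sketch below tests a conjecture (that the sum of the first n odd numbers equals n squared) over many cases. The function names and the choice of conjecture are hypothetical and are not taken from any particular system. Passing such a check is evidence, not a proof, which is one reason tool use alone did not reach IMO level.

```python
def sum_of_first_odds(n):
    """Return 1 + 3 + 5 + ... for the first n odd numbers."""
    return sum(2 * k + 1 for k in range(n))


def find_counterexample(limit):
    """Return the smallest n in 1..limit where the conjecture fails, or None."""
    for n in range(1, limit + 1):
        if sum_of_first_odds(n) != n * n:
            return n
    return None


# No counterexample up to 1000: the conjecture survives the experiment,
# but this says nothing about n > 1000 without an actual proof.
result = find_counterexample(1000)
print(result)
```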
Greater accuracy can be obtained by pairing language models with formal proof systems such as the Lean prover, software that can understand and check proofs. Indeed, for IMO 2024 such a system from Google-DeepMind called AlphaProof obtained a silver medal score (but it ran for two days).
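To give a flavour of what checking by Lean means, here is a minimal Lean 4 sketch. The statements are deliberately trivial and are chosen only for illustration; they are not drawn from AlphaProof or from any Olympiad problem.

```lean
-- Accepted: the proof term really does establish the statement.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Accepted: both sides evaluate to the same number.
example : 2 + 2 = 4 := rfl

-- Rejected if uncommented: a false statement has no proof that type-checks.
-- example : 2 + 2 = 5 := rfl
```

The point is that acceptance is all-or-nothing: a proof that passes the checker contains no hidden error, however long it is.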
Finally, a breakthrough came with the so-called reasoning models, such as o3 from OpenAI and Google-DeepMind's Gemini-2.5-pro. These models are perhaps better described as internal monologue models. Before answering a complex question, they generate a monologue considering approaches, carrying them out, revisiting their proposed solutions, sometimes dithering and starting all over again, before finally giving a solution with which they are satisfied. It was such models, with some additional advances, that got Olympiad Gold medal scores.
Analogical reasoning and the combining of ingredients from different sources give language models some originality, but probably not enough for hard and novel problems. However, verification, either through the internal consistency checks of reasoning models or, better still, through checking by the Lean prover, allows training by trying a large number of things and seeing what works, in the same way that AI systems became chess champions starting with just the rules.
Such reinforcement learning has allowed recent models to go beyond training data by creating their own synthetic data.
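The idea of trying many things and keeping what a verifier accepts can be sketched in a few lines of Python. This is a toy under stated assumptions: random integer guesses stand in for a model's proposed answers, an exact arithmetic check stands in for Lean, and the names `verify` and `collect_verified` are hypothetical. Real systems are far more elaborate, but the verified pairs play the role of the synthetic data described above.

```python
import random


def verify(coeffs, x):
    """Exact check: does x satisfy x^2 + b*x + c == 0?"""
    b, c = coeffs
    return x * x + b * x + c == 0


def collect_verified(problems, attempts_per_problem, rng):
    """Try many guesses per problem and keep only those the verifier accepts."""
    verified = []
    for coeffs in problems:
        for _ in range(attempts_per_problem):
            guess = rng.randint(-10, 10)  # stand-in for a model's proposed answer
            if verify(coeffs, guess):
                verified.append((coeffs, guess))
    return verified


rng = random.Random(0)            # seeded so that runs are repeatable
problems = [(-5, 6), (1, -12)]    # x^2 - 5x + 6 and x^2 + x - 12
data = collect_verified(problems, 200, rng)

# Every retained pair is correct by construction, so it is safe to train on.
print(len(data), "verified (problem, solution) pairs")
```

Because wrong guesses are simply discarded, the quality of the collected data depends on the verifier rather than on the proposer, which is why a strict checker matters so much in this setting.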
The implications
Olympiad problems, for both humans and AIs, are not ends in themselves but tests of mathematical problem-solving ability. There are other aspects of research besides problem-solving.
Growing anecdotal experiences suggest that AI systems have excellent capabilities in many of these too, such as suggesting approaches and related problems.
However, the crucial difference between problem-solving and research/development is scale. Research involves working for months or years without errors creeping in, and without wandering off in fruitless directions. As mentioned earlier, coupling models with the Lean prover can prevent errors. Indications are that it is only a matter of time before this is successful.
In the meantime, these models can act as powerful collaborators with human researchers, greatly accelerating research and development in all areas involving mathematics. The era of the super-scientist is here.
Siddhartha Gadgil is a professor in the Department of Mathematics, IISc