Google and OpenAI's AI models win milestone gold at global math competition

Reuters, 5 days ago
July 21 (Reuters) - Alphabet's (GOOGL.O) Google and OpenAI said their artificial-intelligence models won gold medals at a global mathematics competition, signaling a breakthrough in math capabilities in the race to build powerful systems that can rival human intelligence.
The results marked the first time that AI systems crossed the gold-medal scoring threshold at the International Mathematical Olympiad for high-school students. Both companies' models solved five out of six problems, achieving the result using general-purpose "reasoning" models that processed mathematical concepts using natural language, in contrast to the previous approaches used by AI firms.
The achievement suggests AI is less than a year away from being used by mathematicians to crack unsolved research problems at the frontier of the field, according to Junehyuk Jung, a math professor at Brown University and visiting researcher in Google's DeepMind AI unit.
"I think the moment we can solve hard reasoning problems in natural language will enable the potential for collaboration between AI and mathematicians," Jung told Reuters.
The same idea can apply to research quandaries in other fields such as physics, said Jung, who won an IMO gold medal as a student in 2003.
Of the 630 students participating in the 66th IMO on the Sunshine Coast in Queensland, Australia, 67 contestants, or about 11%, achieved gold-medal scores.
Google's DeepMind AI unit last year achieved a silver medal score using AI systems specialized for math. This year, Google used a general-purpose model called Gemini Deep Think, a version of which was previously unveiled at its annual developer conference in May.
Unlike previous AI attempts that relied on formal languages and lengthy computation, Google's approach this year operated entirely in natural language and solved the problems within the official 4.5-hour time limit, the company said in a blog post.
OpenAI, which has its own set of reasoning models, similarly built an experimental version for the competition, according to a post by researcher Alexander Wei on social media platform X. He noted that the company does not plan to release anything with this level of math capability for several months.
This year marked the first time the competition coordinated officially with some AI developers, who have for years used prominent math competitions like IMO to test model capabilities. IMO judges certified the results of those companies, including Google, and asked them to publish results on July 28.
"We respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts and the students had rightly received the acclamation they deserved," Google DeepMind CEO Demis Hassabis said on X on Monday.
However, OpenAI, which did not work with the IMO, self-published its results on Saturday, allowing it to be first among AI firms to claim gold-medal status.
In turn, the competition on Monday allowed cooperating companies to publish results, Gregor Dolinar, president of IMO's board, told Reuters.

Related Articles

Competition shows humans are still better than AI at coding

The Guardian

32 minutes ago

Computers have taken the crown in chess, Go and poker, but when it comes to competitive coding, humans still have the edge – just.
Przemysław Dębiak, a Polish coder and mind sports champion, narrowly clinched a victory over OpenAI's entrant in the AtCoder World Tour Finals 2025 in Tokyo earlier this month.
The elite coder, however, who goes by the online name Psyho, predicts he may be the last human to win the prestigious title given the incredible pace of technological progress.
'That's probable,' said Psyho, 41, who worked at OpenAI before retiring five years ago. 'I would prefer not, mostly because I like these competitions and knowing there's this magical entity that can do it better than me would be a little bit frustrating.'
There is an irony, Psyho acknowledged, in the fact that coders have contributed to their own professional demise. 'Before the contest, I tweeted 'live by the sword, die by the sword',' he said. 'I helped developing AI and I would be the one who would be the loser of the match. Although I won, in the end, for now.'
The AtCoder Heuristic division included 11 human participants invited on the basis of world rankings and a coding algorithm designed by OpenAI, which finished in second place, 9.5% behind Psyho's winning score. Sam Altman, OpenAI's CEO, tweeted his congratulations.
The 10-hour contest involves solving a complex optimisation problem. A classic in the genre is the 'travelling salesman problem', where the salesman needs to figure out the shortest possible route between various cities, each visited once. These problems are simple to state, but finding an optimal solution is computationally very complex.
So while ChatGPT is now routinely used to write boilerplate code, the AI's performance on an open-ended logic problem will be viewed as impressive.
'At the current state, humans – top humans, to be clear – are still much better at reasoning and solving complex problems,' said Psyho. But humans are 'bottlenecked' by how quickly they can type code, while an AI can try out lots of small adjustments very rapidly.
'The model is like cloning a single human multiple times and working in parallel,' he said. 'AI might not be the smartest right now but it's definitely the fastest. And sometimes multiplying a single average person many many times produces a better result than a single, special human being.'
The result comes as major tech companies, including Meta and Microsoft, are turning to AI to write software code. The Anthropic CEO, Dario Amodei, said in May that AI could take 20% of white-collar jobs in the next one to five years.
'Every profession has this right now, more or less,' said Psyho. 'Some people have it coming right now – all of the white collar jobs. For manual jobs, robotics is lagging by several years.'
Like many in the industry, Psyho is ambivalent about the potential impact of ever more powerful AI models. 'We have a tonne of issues,' he said. 'Disinformation, social impact, humans not having a purpose in life. Historically society moves at a very slow pace. Technological progress right now is moving at a faster and faster and faster pace.'
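
The travelling salesman problem mentioned in the article can be sketched in a few lines of Python. The example below is only an illustration: the city coordinates and function names are invented, and it is neither the AtCoder contest task nor OpenAI's method. It contrasts an exhaustive search, whose cost grows factorially with the number of cities, with a simple "2-opt" heuristic that keeps applying small adjustments (reversing a stretch of the route) for as long as they shorten the trip.

```python
import itertools
import math

# Hypothetical city coordinates, chosen only for illustration.
CITIES = [(0, 0), (0, 3), (4, 3), (4, 0), (2, 5)]


def tour_length(order, cities=CITIES):
    """Total length of the closed route visiting cities in the given order."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def brute_force(cities=CITIES):
    """Try every ordering that starts at city 0 and keep the shortest."""
    rest = range(1, len(cities))
    return min(
        ((0,) + perm for perm in itertools.permutations(rest)),
        key=lambda order: tour_length(order, cities),
    )


def two_opt(order, cities=CITIES):
    """Heuristic: reverse segments of the route while that shortens it."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                # Small tolerance guards against floating-point ties.
                if tour_length(candidate, cities) < tour_length(order, cities) - 1e-12:
                    order = candidate
                    improved = True
    return order


if __name__ == "__main__":
    best = brute_force()
    quick = two_opt(range(len(CITIES)))
    print("exhaustive:", best, round(tour_length(best), 3))
    print("2-opt:     ", quick, round(tour_length(quick), 3))
```

With five cities the exhaustive search checks only 24 orderings, but the count grows so fast that it becomes impractical well before a few dozen cities, which is why contests of this kind reward good heuristics rather than guaranteed optimal answers. The 2-opt result is not guaranteed to be the shortest route.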

AI and the Future of Finance: Sam Altman at the Federal Reserve: By Vipin Kumar Sharma

Finextra

2 hours ago

Sam Altman joined the Federal Reserve this week to talk about AI's growing role in finance, and every FinTech leader should be paying attention.
On July 22, 2025, OpenAI CEO Sam Altman joined Federal Reserve Vice Chair Michelle Bowman in Washington, D.C. to explore how artificial intelligence is reshaping banking, security, and the future of financial services. Yes, the CEO of OpenAI walked into one of the most traditional institutions in global finance and painted a future that's anything but conventional.
Here are some key takeaways from their discussion:
  • AI voice cloning is becoming highly realistic. Altman pointed out that AI can now mimic voices so well that traditional tools, such as voice-based authentication, may no longer be reliable, especially for high-value transactions.
  • It's time to rethink security. Banks and regulators must develop new, more advanced methods to verify identities and prevent fraud as AI becomes increasingly sophisticated.
  • AI can drive real improvements. From faster credit checks to better customer service, Altman believes AI can help financial institutions become more efficient and cost-effective if implemented thoughtfully.
  • AI should be accessible, not exclusive. He emphasized the importance of 'democratizing AI,' ensuring that the benefits of this technology are widely shared and not concentrated among a few large players.
  • Job roles may shift, but opportunities can also change. While some roles will change due to automation, Altman stressed the need for retraining and smart policy to help workers transition and grow with new tools.
While no formal partnership was announced, both OpenAI and the Fed expressed interest in continued dialogue and collaboration.
AI in finance isn't just a future concept; it's already part of the conversation at the highest levels. Now is the time for leaders in finance, technology, and policy to collaborate and shape the future.

Doge reportedly using AI tool to create 'delete list' of federal regulations

The Guardian

2 hours ago

The 'department of government efficiency' (Doge) is using artificial intelligence to create a 'delete list' of federal regulations, according to a report, proposing to use the tool to cut 50% of regulations by the first anniversary of Donald Trump's second inauguration.
The 'Doge AI Deregulation Decision Tool' will analyze 200,000 government regulations, according to internal documents obtained by the Washington Post, and select those which it deems to be no longer required by law. Doge, which was run by Elon Musk until May, claims that 100,000 of those regulations can then be eliminated, following some staff feedback.
A PowerPoint presentation made public by the Post claims that the Department of Housing and Urban Development (HUD) used the AI tool to make 'decisions on 1,083 regulatory sections', while the Consumer Financial Protection Bureau used it to write '100% of deregulations'. The Post spoke to three HUD employees who told the newspaper AI had been 'recently used to review hundreds, if not more than 1,000, lines of regulations'.
During his 2024 campaign, Donald Trump claimed that government regulations were 'driving up the cost of goods' and promised the 'most aggressive regulatory reduction' in history. He repeatedly criticized rules which aimed to tackle the climate crisis, and as president he ordered the heads of all government agencies to undertake a review of all regulations in coordination with Doge.
Asked about the use of AI in deregulation by the Post, White House spokesperson Harrison Fields said 'all options are being explored' to achieve the president's deregulation promises. Fields said that 'no single plan has been approved or green-lit', and the work is 'in its early stages and is being conducted in a creative way in consultation with the White House'.
Fields added: 'The Doge experts creating these plans are the best and brightest in the business and are embarking on a never-before-attempted transformation of government systems and operations to enhance efficiency and effectiveness.'
Musk appointed a slew of inexperienced staffers to Doge, including Edward Coristine, a 19-year-old who was previously known by the online handle 'Big Balls'. Earlier this year, Reuters reported that Coristine was one of two Doge associates promoting the use of AI across the federal bureaucracy.
