
Explained: Why this mathematician thinks OpenAI isn't acing the International Mathematical Olympiad — and might be 'cheating' to win gold
AI isn't taking the real test: GPT models 'solving' International Mathematical Olympiad (IMO) problems are often operating under very different conditions—rewrites, retries, human edits.
Tao's warning: Fields Medalist Terence Tao says comparing these AI outputs to real IMO scores is misleading because the rules are entirely different.
Behind the curtain: Teams often cherry-pick successes, rewrite problems, and discard failures before showing the best output.
It's not cheating, but it's not fair play: The AI isn't sitting in silence under timed pressure—it's basically Iron Man in a school exam hall.
Main takeaway: Don't mistake polished AI outputs under ideal lab conditions for human-level reasoning under Olympiad pressure.
Led Zeppelin once sang, 'There's a lady who's sure all that glitters is gold.'
But in the age of artificial intelligence, even the shimmer of mathematical brilliance needs closer scrutiny. These days, social media lights up every time a language model like GPT-4 is said to have solved a problem from the International Mathematical Olympiad (IMO) — a competition so elite it makes Ivy League entrance exams look like warm-up puzzles.
'This AI solved an IMO question!'
'Superintelligence is here!'
'We're witnessing the birth of a digital Newton!'
Or so the chorus goes.
But one of the greatest living mathematicians isn't singing along. Terence Tao, a Fields Medal–winning professor at UCLA, has waded into the hype with a calm, clinical reminder: AI models aren't playing by the same rules. And if the rules aren't the same, the gold medal doesn't mean the same thing.
The Setup: What the IMO Actually Demands
The International Mathematical Olympiad is the Olympics of high school math. Students from around the world train for years to face six unspeakably hard problems over two days. They get 4.5 hours per day, no calculators, no internet, no collaboration — just a pen, a problem, and their own mind.
Solving even one problem in full is an achievement. Getting five perfect scores earns you gold. Solve all six and you enter the realm of myth — which, incidentally, is where Tao himself resides. He won a gold medal in the IMO at age 13.
So when an AI is said to 'solve' an IMO question, it's important to ask: under what conditions?
Enter Tao: The IMO, Rewritten (Literally)
In a detailed Mastodon post, Tao explains that many AI demonstrations that showcase Olympiad-level problem solving do so under dramatically altered conditions. He outlines a scenario that mirrors what's actually happening behind the scenes:
'The team leader… gives them days instead of hours to solve a question, lets them rewrite the question in a more convenient formulation, allows calculators and internet searches, gives hints, lets all six team members work together, and then only submits the best of the six solutions… quietly withdrawing from problems that none of the team members manage to solve.'
In other words: cherry-picking, rewording, retries, collaboration, and silence around failure.
It's not quite cheating — but it's not the IMO either. It's an AI-friendly reconstruction of the Olympiad, where the scoreboard is controlled by the people training the system.
From Bronze to Gold (If You Rewrite the Test)
Tao's criticism isn't just about fairness — it's about what we're really evaluating.
He writes,
'A student who might not even earn a bronze medal under the standard IMO rules could earn a 'gold medal' under these alternate rules, not because their intrinsic ability has improved, but because the rules have changed.'
This is the crux. AI isn't solving problems like a student. It's performing in a lab, with handlers, retries, and tools. What looks like genius is often a heavily scaffolded pipeline of failed attempts, reruns, and prompt rewrites. The only thing the public sees is the polished output.
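To see how selective reporting alone can inflate a scoreboard, here is a minimal, purely illustrative simulation. The 30% per-attempt success rate and the six retries are made-up numbers for the sake of the example, not figures from any actual lab or model:

```python
import random

def attempt(p_success=0.3):
    """One simulated attempt at a problem; succeeds with probability p_success.
    The 30% rate is an arbitrary illustrative assumption."""
    return random.random() < p_success

def best_of_n(n=6, p_success=0.3):
    """Report success if ANY of n retries succeeds -- failed runs are
    silently discarded, mirroring the 'only submit the best solution' setup."""
    return any(attempt(p_success) for _ in range(n))

random.seed(0)
trials = 100_000

# One attempt per problem, no retries: the "exam hall" rules.
solo = sum(attempt() for _ in range(trials)) / trials

# Six retries with failures discarded: the "lab" rules.
curated = sum(best_of_n() for _ in range(trials)) / trials

print(f"single attempt: {solo:.2%}")   # close to the underlying 30%
print(f"best-of-6:      {curated:.2%}")  # analytically 1 - 0.7**6, about 88%
```

The underlying ability never changes; only the reporting rules do. That is exactly the gap Tao is pointing at: a 30% solver looks like an 88% solver once you keep only the wins.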
Tao doesn't deny that AI has made remarkable progress. But he warns against blurring the lines between performance under ideal conditions and human-level problem-solving in strict, unforgiving settings.
Apples to Oranges — and Cyborg Oranges
Tao is careful not to throw cold water on AI research. But he urges a reality check.
'One should be wary of making apples-to-apples comparisons between the performance of various AI models (or between such models and the human contestants) unless one is confident that they were subject to the same set of rules.'
A tweet that says 'GPT-4 solved this problem' often omits what really happened:
– Was the prompt rewritten ten times?
– Did the model try and fail repeatedly?
– Were the failures silently discarded?
– Was the answer chosen and edited by a human?
Compare that to a teenager in an exam hall, sweating out one solution in 4.5 hours with no safety net. The playing field isn't level — it's two entirely different games.
The Bottom Line
Terence Tao doesn't claim that AI is incapable of mathematical insight. What he insists on is clarity of conditions. If AI wants to claim a gold medal, it should sit the same exam, with the same constraints, and the same risks of failure.
Right now, it's as if Iron Man entered a sprint race, flew across the finish line, and people started asking if he's the next Usain Bolt.
The AI didn't cheat. But someone forgot to mention it wasn't really racing.
And so we return to that Led Zeppelin lyric: 'There's a lady who's sure all that glitters is gold.' In 2025, that lady might be your algorithmic feed. And that gold? It's probably just polished scaffolding.
FAQ: AI, the IMO, and Terence Tao's Critique
Q1: What is the International Mathematical Olympiad (IMO)?
It's the world's toughest math competition for high schoolers, with six extremely challenging problems solved over two 4.5-hour sessions—no internet, no calculators, no teamwork.
Q2: What's the controversy with AI and IMO questions?
AI models like GPT-4 are shown to 'solve' IMO problems, but they do so with major help: problem rewrites, unlimited retries, internet access, collaboration, and selective publishing of only successful attempts.
Q3: Who raised concerns about this?
Terence Tao, one of the greatest mathematicians alive and an IMO gold medalist himself, called out this discrepancy in a Mastodon post.
Q4: Is this AI cheating?
Not exactly. But Tao argues that changing the rules makes it a different contest altogether—comparing lab-optimised AI to real students is unfair and misleading.
Q5: What's Tao's main point?
He urges clarity. If we're going to say AI 'solved' a problem, we must also disclose the conditions—otherwise, it's like comparing a cyborg sprinter to a high school track star and pretending they're equals.
Q6: Does Tao oppose AI?
No. He recognises AI's impressive progress in math, but wants honesty about what it means—and doesn't mean—for genuine problem-solving ability.
Q7: What should change?
If AI is to be judged against human benchmarks like the IMO, it must be subjected to the same constraints: time limits, no edits, no retries, no external tools.
Tao's verdict? If you want to claim gold, don't fly across the finish line in an Iron Man suit and pretend you ran.