
Humans beat AI gold-level score at top maths contest
Neither Google's nor OpenAI's AI model scored full marks -- unlike five young people at the International Mathematical Olympiad (IMO), a prestigious annual competition where participants must be under 20 years old.
Google said Monday that an advanced version of its Gemini chatbot had solved five out of the six maths problems set at the IMO, held in Australia's Queensland this month.
"We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points -- a gold medal score," the US tech giant cited IMO president Gregor Dolinar as saying.
"Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow."
Around 10 percent of human contestants won gold-level medals, and five received perfect scores of 42 points.
US ChatGPT maker OpenAI said that its experimental reasoning model had scored a gold-level 35 points on the test.
The result "achieved a longstanding grand challenge in AI" at "the world's most prestigious math competition", OpenAI researcher Alexander Wei wrote on social media.
"We evaluated our models on the 2025 IMO problems under the same rules as human contestants," he said.
"For each problem, three former IMO medalists independently graded the model's submitted proof."
Google achieved a silver-medal score at last year's IMO in the British city of Bath, solving four of the six problems.
That took two to three days of computation -- far longer than this year, when its Gemini model solved the problems within the 4.5-hour time limit, it said.
The IMO said tech companies had "privately tested closed-source AI models on this year's problems", the same ones faced by 641 competing students from 112 countries.
"It is very exciting to see progress in the mathematical capabilities of AI models," said IMO president Dolinar.
Related Articles


Mint
The new chips designed to solve AI's energy problem
"I can't wrap my head around it," says Andrew Wee, who has been a Silicon Valley data-center and hardware guy for 30 years.
The "it" that has him so befuddled (irate, even) is the projected power demands of future AI supercomputers, the ones that are supposed to power humanity's great leap forward.
Wee held senior roles at Apple and Meta, and is now head of hardware for cloud provider Cloudflare. He believes the current growth in energy required for AI, which the World Economic Forum estimates will be 50% a year through 2030, is unsustainable.
"We need to find technical solutions, policy solutions and other solutions that solve this collectively," he says.
To that end, Wee's team at Cloudflare is testing a radical new kind of microchip, from a startup founded in 2023, called Positron, which has just announced a fresh round of $51.6 million in investment.
These chips have the potential to be much more energy efficient than ones from industry leader Nvidia at the all-important task of inference, which is the process by which AI responses are generated from user prompts. While Nvidia chips will continue to be used to train AI for the foreseeable future, more efficient inference could collectively save companies tens of billions of dollars, and a commensurate amount of energy.
There are at least a dozen chip startups all battling to sell cloud-computing providers the custom-built inference chips of the future. Then there are the well-funded, multiyear efforts by Google, Amazon and Microsoft to build inference-focused chips to power their own internal AI tools, and to sell to others through their cloud services.
The intensity of these efforts, and the scale of the cumulative investment in them, show just how desperate every tech giant, along with many startups, is to provide AI to consumers and businesses without paying the "Nvidia tax." That's Nvidia's approximately 60% gross margin, the price of buying the company's hardware.
Nvidia is very aware of the growing importance of inference and concerns about AI's appetite for energy, says Dion Harris, a senior director at Nvidia who sells the company's biggest customers on the promise of its latest AI hardware. Nvidia's latest Blackwell systems are between 25 and 30 times as efficient at inference, per watt of energy pumped into them, as the previous generation, he adds.
To accomplish their goals, makers of novel AI chips are using a strategy that has worked time and again: They are redesigning their chips, from the ground up, expressly for the new class of tasks that is suddenly so important in computing. In the past, that was graphics, and that's how Nvidia built its fortune. Only later did it become apparent that graphics chips could be repurposed for AI, but arguably it's never been a perfect fit.
Jonathan Ross is chief executive of chip startup Groq, and previously headed Google's AI chip development program. He says he founded Groq (no relation to Elon Musk's xAI chatbot) because he believed there was a fundamentally different way of designing chips, solely to run today's AI models.
Groq claims its chips can deliver AI much faster than Nvidia's best chips, and for between one-third and one-sixth as much power as Nvidia's. This is due to their unique design, which embeds memory in the chips rather than keeping it separate.
While the specifics of how Groq's chips perform depend on any number of factors, the company's claim that it can deliver inference at a lower cost than is possible with Nvidia's systems is credible, says Jordan Nanos, an analyst at SemiAnalysis who spent a decade working for Hewlett Packard Enterprise.
Positron is taking a different approach to delivering inference more quickly. The company, which has already delivered chips to customers including Cloudflare, has created a simplified chip with a narrower range of abilities, in order to perform those tasks more quickly.
The company's latest funding round came from Valor Equity Partners, Atreides Management and DFJ Growth, and brings the total amount of investment in the company to $75 million.
Positron's next-generation system will compete with Nvidia's next-generation system, known as Vera Rubin. Based on Nvidia's road map, Positron's chips will have two to three times better performance per dollar, and three to six times better performance per unit of electricity pumped into them, says Positron CEO Mitesh Agrawal.
Competitors' claims about beating Nvidia at inference often don't reflect all of the things customers take into account when choosing hardware, says Harris. Flexibility matters, and what companies do with their AI chips can change as new models and use cases become popular.
Nvidia's customers "are not necessarily persuaded by the more niche applications of inference," he adds.
Cloudflare's initial tests of Positron's chips were encouraging enough to convince Wee to put them into the company's data centers for longer-term tests, which are continuing. It's something that only one other chip startup's hardware has warranted, he says.
"If they do deliver the advertised metrics, we will open the spigot and allow them to deploy in much larger numbers globally," he adds.
By commoditizing AI hardware, and allowing Nvidia's customers to switch to more-efficient systems, the forces of competition might bend the curve of future AI power demand, says Wee.
"There is so much FOMO right now, but eventually, I think reason will catch up with reality," he says.
One truism of the history of computing is that whenever hardware engineers figure out how to do something faster or more efficiently, coders (and consumers) figure out how to use all of the new performance gains, and then some.
Mark Lohmeyer is vice president of AI and computing infrastructure for Google Cloud, where he provides both Google's own custom AI chips, and Nvidia's, to Google and its cloud customers.
He says that consumer and business adoption of new, more demanding AI models means that no matter how much more efficiently his team can deliver AI, there is no end in sight to growth in demand for it.
Like nearly all other big AI providers, Google is making efforts to find radical new ways to produce energy to feed that AI, including both nuclear power and fusion.
The bottom line: While new chips might help individual companies deliver AI more efficiently, the industry as a whole remains on track to consume ever more energy. As a recent report from Anthropic notes, that means energy production, not data centers and chips, could be the real bottleneck for future development of AI.
Write to Christopher Mims at


Mint
Weekly Tech Recap: Apple releases iOS 26 Beta update, GPT-5 launch date leaked and more
With news coming in throughout the week, it can be difficult to sift out the important updates from the noise. To keep readers up to date, we've compiled the Weekly Tech Recap, where we take a look at the top news that shook up the world of technology. This week, Apple released the first public beta for iOS 26, GitHub launched its natural language app creation tool, new leaks revealed the GPT-5 launch date, and more.
After releasing four rounds of developer beta updates for its iOS 26 software, Apple finally released its first public beta this week, giving a look at the new Liquid Glass interface along with a number of other features.
While the Cupertino-based tech giant rolled out the iOS 26 Developer Beta almost immediately after its WWDC keynote, public betas are typically released in July after Apple addresses initial issues. The company rolled out the fourth developer beta for iPad and iPhone earlier this week as well.
Notably, developer betas are primarily meant for app developers and advanced users, giving them time to test their apps ahead of the official release. Meanwhile, public betas are aimed at a wider audience, allowing them to try out pre-release iOS versions to identify bugs and provide feedback. Public betas are usually a few versions behind developer betas, suggesting they should be more stable.
While GitHub has been slowly adding AI-powered features for developers over the last couple of years, the Microsoft-owned company has gone a step further with its latest feature, called GitHub Spark, which allows users to create an app by simply giving prompts in natural language.
The new feature, currently an experiment under the GitHub Next labs, gives users the choice between using an OpenAI GPT model or a Claude Sonnet model. Notably, while OpenAI has tuned its latest models for developers, Claude's Sonnet models continue to generate buzz in developer circles for their technical reasoning and debugging abilities.
With Spark, GitHub allows users to quickly build a small web app or 'micro app' using natural language. Unlike other GitHub tools, Spark doesn't just generate code for the app but also runs it and displays an interactive preview that can be further refined through additional prompts.
OpenAI CEO Sam Altman has sounded alarm bells around sharing too much personal data with an AI system, given that there are currently no frameworks in place to safeguard a user's privacy.
Notably, there has been a growing trend among young AI users to share their personal problems with AI chatbots, seeking relationship, life, or legal advice, largely because these generative AI systems have access to a wide knowledge base.
Speaking on a podcast with Theo Von, Altman said, 'I think we will certainly need a legal or a policy framework for AI. One example that we've been thinking about a lot, this is like a maybe not quite what you're asking, this is like a very human-centric version of that question: people talk about the most personal shit in their lives to ChatGPT. It's, you know, people use it, young people especially, like, use it as a therapist, a life coach, uh, having these relationship problems, what should I do? And right now, if you talk to a therapist or a lawyer or a doctor about those problems, there's like legal privilege for it, you know, like it's, there's doctor-patient confidentiality, there's legal confidentiality.'
OpenAI's much-awaited GPT-5 model could finally make its debut in early August, according to a report by The Verge. GPT-5 will be the latest model running the company's ChatGPT AI bot and will be the company's first LLM to come with unified reasoning capabilities.
Sam Altman had recently announced in a post on X that GPT-5 will be launching soon. The OpenAI CEO made the announcement in a post about an AI model from the company that achieved gold-level performance at the 2025 IMO competition.
However, he noted that GPT-5 won't have IMO gold-level capabilities for 'many months'. In a recent podcast, Altman teased new capabilities of GPT-5, noting that the model helped him answer a difficult email that he should have been able to answer himself but couldn't.
Elon Musk's Starlink satellite internet service suffered a major global outage this week that left tens of thousands of users across the US, UK, Germany, Zimbabwe, Romania, and beyond without internet access for over two hours on Thursday.


Indian Express
How Lovable became a successful AI-powered app builder
Lovable, a Swedish AI startup, has crossed $100 million in annual recurring revenue, reportedly reaching that milestone faster than most other software firms, including OpenAI. Currently, the firm has more than 2.3 million active users, and last reported 180,000 paying subscribers. Here is a look at Lovable, and how it became such a big company so quickly.
Lovable is essentially a company that offers an AI-powered app development platform, allowing users to build entire web applications using natural language prompts. It was founded in November 2023 by Anton Osika and Fabian Hedin with the aim of democratising software development by enabling non-coders to turn their ideas into reality.
The company shot to fame after creating GPT Engineer, an open-source tool that showcased the ability of large language models (LLMs) to write functional code from simple prompts. LLMs are AI models, trained on massive amounts of text data, that can understand and generate human language. Following the success of GPT Engineer, Lovable launched GPT, which was meant to be used by non-technical users.
At the heart of Lovable's success lies its goal of allowing anyone to create web apps with natural language, without the need to code. All one has to do is have a vision and give instructions to the GPT.
'The app eliminates the complexity of traditional app-creation environments by combining coding, deployment, and collaboration in a single interface,' according to a report by Contrary Research, a hub for research and analysis of private tech companies.
Users can build a wide range of products, from simple websites to complex web apps, with the help of Lovable. The company also provides a user-friendly interface, which has been key to its success.