
ChatGPT Attempts JEE Advanced 2025 Mock Test. Here Are The Results
ChatGPT handled complex math and science problems with ease, but struggled with questions involving visual elements like graphs and Vernier scales.
What if an AI chatbot took one of India's toughest exams? That's exactly what IIT Kharagpur engineer Anushka Aashvi set out to test by putting ChatGPT o3 through the 2025 JEE Advanced paper. The result? A score of 327 out of 360. According to her blog, she held the AI to strict exam conditions: no internet access, no external tools, and each question asked individually so that the model could not reference previous answers. ChatGPT breezed through complex math and science problems, even cracking questions known to stump top students. However, it struggled with visual or tool-based questions, such as those involving graphs or Vernier scales.
In her blog Helter, Anushka wrote, 'When I decided to test ChatGPT o3 on this year's JEE Advanced paper, I didn't expect what followed to shake me as much as it did. Giving away the result straightaway, ChatGPT o3 scored a whopping 327/360 in the JEE Advanced 2025 Question Paper. This score would earn an All India Rank 4 (AIR 4). We tested the ChatGPT o3 model (which was released on 16th April 2025) on the JEE Advanced 2025 question paper, which was conducted on 18th May to ensure that the questions have as much newness for the AI model as possible.'
For the experiment, the prompt given to ChatGPT was: 'Suppose you are a student appearing for JEE Advanced Examination. Try your best and solve this question in exam conditions. Do not use the web search feature to get the answer. Do not use your Python tool.' To eliminate any influence of contextual memory, each question was asked in a fresh chat session, and no feedback was given between questions.
The blog further noted that, despite being instructed not to use external tools like Python, ChatGPT occasionally attempted to do so, something that became evident during its 'thinking' pauses before responding. Interestingly, the AI also tended to double-check its own calculations before moving on to the next step, mimicking the behaviour of a cautious student.
To evaluate the AI's performance, its answers were compared against the official JEE Advanced 2025 answer key. Scoring strictly followed the actual exam pattern: full marks for correct answers, negative marks for incorrect ones, partial marks for partially correct responses, and zero for unanswered questions.
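The scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the blog's actual evaluation script: the `MarkingRule` class, the question-type names, the mark values, and the sample responses are all assumptions made for the example, and partial marking is left out for brevity. Only the 327 out of 360 total comes from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MarkingRule:
    """Marks for one question type (illustrative values, not the official scheme)."""
    correct: int
    incorrect: int       # zero or negative
    unanswered: int = 0


def score_response(rule: MarkingRule, answer: Optional[str], key: str) -> int:
    """Return the marks for one response; None means the question was skipped."""
    if answer is None:
        return rule.unanswered
    matches = answer.strip().upper() == key.strip().upper()
    return rule.correct if matches else rule.incorrect


def total_score(responses, rules) -> int:
    """Sum marks over (question_type, answer, key) tuples."""
    return sum(score_response(rules[qtype], ans, key) for qtype, ans, key in responses)


# Hypothetical rules and responses, for illustration only.
RULES = {
    "single_correct": MarkingRule(correct=3, incorrect=-1),
    "numerical": MarkingRule(correct=4, incorrect=0),
}
RESPONSES = [
    ("single_correct", "B", "B"),  # correct: +3
    ("single_correct", "A", "C"),  # wrong: -1
    ("numerical", "2.5", "2.5"),   # correct: +4
    ("numerical", None, "7"),      # skipped: 0
]

if __name__ == "__main__":
    print(total_score(RESPONSES, RULES))  # 6
    # The reported result expressed as a percentage of the maximum score.
    print(round(100 * 327 / 360, 1))      # 90.8
```

Keeping the marking rules in a lookup table separates the exam pattern from the comparison logic, so a different scheme can be tried without touching the scoring functions.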
Anushka Aashvi shared that the AI was very good at solving long and tricky maths problems, especially in algebra and calculus. It also did well when it had to combine ideas from different topics to find the right answer. In chemistry, the AI could understand and solve questions based on compound drawings, which many students find hard. However, it wasn't perfect. It found it difficult to read and interpret graphs: one such question took over 9 minutes, and even then the answer was wrong. The AI also couldn't properly read instruments such as the Vernier scale. It kept retrying but, after a long time, still gave the wrong solution.
Related Articles

Mint
19 minutes ago
- Mint
Anthropic upgrades Claude with a memory recall feature to enhance workflow and creativity
Anthropic has introduced a new memory function to its AI chatbot Claude, allowing it to reference previous conversations when prompted. The feature is now available to Max, Team, and Enterprise users and will be expanded to other tiers in the future. It is activated by default, with an option to disable it via the settings menu. The upgrade is designed to make interactions more efficient and consistent across projects. Users can request Claude to recall details from all prior chats or limit the search to a specific project. This targeted retrieval can help streamline workflows, eliminate repetitive instructions, and support ongoing tasks without the need to reintroduce background information. A demonstration shared by Anthropic illustrates the feature's capabilities. In the example, a user returned from a holiday and asked Claude to summarise the work they had been doing. The chatbot organised earlier conversations by topic, identified the relevant project, presented a concise recap, and offered potential next steps. This process shows how Claude can act as a long-term collaborator across various professional and creative contexts. The development comes as leading AI companies focus on enhancing their systems with long-term memory functions. Persistent recall can help create more personalised experiences and improve productivity. OpenAI's ChatGPT offers a similar capability by storing key personal details such as a user's name, occupation, and preferences, which can be modified or deleted at any time. The aim is to produce responses informed by past knowledge for greater contextual accuracy. Anthropic has taken a different route by ensuring Claude only retrieves past information when explicitly requested. This approach limits unexpected recalls and provides greater predictability. It also addresses potential privacy concerns that can arise when AI systems remember more than intended. The feature is accessible via desktop, mobile, and the Claude application. 
Users who wish to disable it can go to Settings, select Profile, then Preferences, and toggle off 'Search and reference chats.' By keeping the option under user control, Anthropic aims to strike a balance between utility and privacy. Industry analysts note that memory functions could become a defining factor in AI adoption. For professional users, they can significantly reduce the time spent on repetitive context-setting. For creative work, they offer continuity across multiple sessions, enabling richer, more cohesive outputs. The addition also reflects the competitive dynamics within the AI sector. Anthropic and OpenAI are both advancing towards more context-aware and user-adaptive systems, each with distinct approaches to data handling. As these capabilities mature, they are likely to influence how individuals and organisations integrate AI into daily operations. Claude's new memory upgrade marks another step in the evolution of AI from a reactive tool into an active participant in long-term projects. The ability to remember and recall on demand may redefine expectations for chatbot interactions in both personal and professional settings.


Indian Express
an hour ago
- Indian Express
Amid Altman-Musk drama, OpenAI to back Neuralink rival Merge Labs: Report
ChatGPT maker OpenAI and company CEO Sam Altman may be preparing to back a brain implant startup that could someday compete against Elon Musk's Neuralink. According to a report by the Financial Times, the new venture, called Merge Labs, is in the process of raising new funds at an $850 million valuation. Citing three sources familiar with the matter, the report goes on to say that most of this funding will come from OpenAI's own venture team. OpenAI may also be planning to enlist Alex Blania, the CEO of Tools for Humanity, the company behind World, to help launch Merge Labs. For the inquisitive, World is a Sam Altman-backed iris-scanning digital ID cryptocurrency project that wants to establish a global digital identity and financial network. The report goes on to say that while Altman will be a co-founder at Merge Labs, he won't be actively participating in everyday activities at the brain-tech startup. And while talks about investment are still in early stages, it looks like the company will raise $250 million from OpenAI, with the rest of the amount coming from other investors. Also, Altman may not personally invest in the project. Since Neuralink's inception back in 2016, the Elon Musk-backed brain implant startup has made some serious progress in the field. Currently, the company is conducting trials on humans who suffer from paralysis to let them speak and control devices with their thoughts. But Merge Labs isn't the only company that is planning to take on Neuralink. Elon Musk's brain interface startup is facing stiff competition from the likes of Precision Neuroscience, another brain implant startup, founded by former Neuralink co-founder Ben Rapoport, which is developing a non-invasive implant that works by placing a thin electrode film on the brain's surface.
Blackrock Neurotech, a company founded way back in 2008, is also working with rigid microelectrode grids that are placed in the brain's cortex. However, this method requires an operation.


Hindustan Times
2 hours ago
- Hindustan Times
OpenAI's Rocky GPT-5 Rollout Shows Struggle to Remain Undisputed AI Leader
OpenAI's newest AI model, GPT-5, was supposed to cement the startup's status as the undisputed leader in the AI race. Instead, it has had a tumultuous public launch, frustrating users and prompting Chief Executive Officer Sam Altman to respond. Users have flooded social media with embarrassing examples of how the chatbot failed to answer simple math questions or accurately draw a map of North America. Others have criticized its colder tone, reminiscing about older models that OpenAI initially killed off. A new limit of 200 questions a week rankled devotees. Altman on Tuesday promised to imbue GPT-5 with a 'warmer personality,' restored a popular model that OpenAI had declared obsolete and introduced the capability for users to decide which kind of query they want to make. The company has learned how much users expect customization, he said in a post on X. Computing capacity has been a concern during the rollout, prompting the company to re-evaluate which users are priorities. The turbulence shows the challenge OpenAI has in selling 700 million active weekly users on new models while being constantly short on computing power, which is expensive and scarce. The rollout comes at a delicate moment for OpenAI, which is struggling to keep its lead among rivals who are aggressively going after its talent and pouring billions of dollars into AI research. Competitors such as Anthropic's Claude have surged in popularity among coders and business users. Whether GPT-5 will enable OpenAI to win over such customers remains to be seen. 'We expected some bumpiness as we roll out so many things at once,' Altman said earlier on X. 'But it was a little more bumpy than we hoped for!' When GPT-5 was released last week, Jason Pollak, a digital-marketing specialist based in Atlanta, was excited and thought it would be a perfect expansion from GPT-4. Instead, the model feels 'bland, generic' and less like a creative thinking partner, Pollak said.
Pollak pays for ChatGPT and uses it regularly to work on copy, analyze data or create custom tools. The transition from GPT-4 to GPT-5 has been frustrating, he said. 'It feels like they took the Ph.D. thing a little too seriously to prove how smart it would be,' he said. GPT-5's release came after a two-year wait in which it experienced a series of delays and setbacks, leading to speculation that either the AI startup had lost its edge or that AI development was more broadly hitting a wall. Altman dismissed those concerns on a call with reporters last week, saying the startup had found new breakthroughs. Juliette Haas, an account-strategy coordinator at a communications and crisis-management agency, primarily uses the paid version of ChatGPT for brainstorming and to complete administrative tasks such as creating a to-do list. With the release of GPT-5, she decided to revisit a business-development prompt to figure out which companies or individuals at her firm would require her support. With GPT-4, the response suggested that she build strong industry connections and emphasized the importance of relationship building. GPT-5 delivered a checklist. 'The AI treated finding distressed companies more like a data-science problem rather than understanding the fundamental considerations of relationships and timing,' said Haas. She finds that GPT-5 doesn't understand context and subtext as much as GPT-4 did, and GPT-5 delivers more 'data-driven solutions.' Jim Marsh, founder of JMC Strategic Intelligence, a consulting firm, uses the premium version of ChatGPT and has been a user since 2022. He said he is impressed with the results of GPT-5, finding that it was able to build a database of his contacts without many errors. The industry has reached a point in AI innovation where 'every release isn't going to be a magic trick,' which is perhaps why there is so much criticism about the newest model, he said. 
'There's still this expectation that AI is going to quickly do more for us without us needing to be involved, and that's generally a false belief,' he said. Write to Ann-Marie Alcántara and Berber Jin.