Latest news with #ChatGPTo3

ChatGPT beats Grok in AI chess final, Gemini finishes third, Elon Musk says…

Hindustan Times

13 hours ago

  • Science
  • Hindustan Times


OpenAI's o3 model defeated Grok 4, built by Elon Musk's xAI, in the final of a Kaggle-hosted tournament that set out to find the strongest chess-playing large language model (LLM). The event, held over three days, pitted general-purpose LLMs from several companies against each other rather than specialised chess engines.

Tournament format and participants

Eight models took part, including entries from OpenAI, xAI, Google, Anthropic and Chinese developers DeepSeek and Moonshot AI. The contest used standard chess rules but tested multi-purpose LLMs, systems that are not specifically optimised for chess play. BBC coverage of the event noted that Google's Gemini finished third after beating another OpenAI entry.

Final and key moments

Grok 4 led early in the competition but faltered in the final match against o3. Commentators and observers highlighted multiple tactical errors by Grok 4, including repeated queen losses, which swung the match in o3's favour. Writer Pedro Pinhata said: 'Up until the semi finals, it seemed like nothing would be able to stop Grok 4,' but added that Grok's play 'collapsed under pressure' on the last day. Grandmaster Hikaru Nakamura, who commentated live, noted: 'Grok made so many mistakes in these games, but OpenAI did not.'

Responses and wider context

Elon Musk downplayed the defeat, saying Grok's earlier strong results were a 'side effect' and that xAI had 'spent almost no effort on chess.' The result adds a public dimension to the rivalry between Musk's xAI and OpenAI, both founded by people who once worked together at OpenAI.

Chess has long been used to measure AI progress. Past milestones in game-playing AI include specialised systems such as DeepMind's AlphaGo, which defeated top human players in the game of Go. This Kaggle tournament differs by testing general LLMs on strategic, sequential tasks rather than using dedicated chess engines.

What it means

The outcome shows variability in how LLMs handle structured, adversarial tasks like chess. While o3's performance suggests some LLMs can sustain strategic play under tournament conditions, Grok 4's collapse illustrates that results may still be inconsistent. Organisers and commentators are likely to continue using chess and similar tasks to probe reasoning, planning and robustness in large language models as the field evolves.

ChatGPT o3-pro is only available on $200+ plans – here's what you're missing

Yahoo

13-06-2025

  • Business
  • Yahoo


OpenAI just released a new ChatGPT model that's better and more reliable than its best reasoning models to date. ChatGPT o3-pro joins the list of AI chatbot options in the app, replacing the o1-pro model. As exciting as the new model is, however, most ChatGPT users don't have access to it… even if they pay for the Plus plan.

If you're on ChatGPT Free or ChatGPT Plus ($20/month), you won't get access to OpenAI's new reasoning AI. ChatGPT o3-pro is coming to the $200/month ChatGPT Pro tier. ChatGPT Team users also have access to o3-pro, and Enterprise and Edu users will get the upgrade soon.

There's some good news for ChatGPT Plus users, too. OpenAI has significantly reduced the costs for o3. As someone who chats with ChatGPT o3 almost exclusively, I definitely appreciate the improved efficiencies. It's not like I worried too often about running out of ChatGPT o3 chats, but it did happen. It's good to see OpenAI bring down costs for its frontier models. That means Plus users are getting better rate limits than before.

OpenAI explained in its release notes that 'like o1-pro, o3-pro is a version of our most intelligent model, o3, designed to think longer and provide the most reliable responses.' o3-pro will excel in the same areas as o1-pro, including math, science, and coding. Like o3, o3-pro has access to various tools available in ChatGPT, including online search, file support, reasoning with visual prompts, coding (Python), and memory. Its feature set is not quite on par with o3, though. ChatGPT o3-pro doesn't have temporary chats for now, and it can't use the 4o image generation tool or the Canvas feature.
What really matters here are the performance improvements, and o3-pro excels in all benchmarks OpenAI conducted. That's not surprising for a new frontier model. OpenAI wouldn't add the 'pro' suffix without ensuring o3-pro outperforms o3. The company also says that o3-pro is routinely favored by reviewers:

'In expert evaluations, reviewers consistently prefer o3-pro over o3 in every tested category and especially in key domains like science, education, programming, business, and writing help. Reviewers also rated o3-pro consistently higher for clarity, comprehensiveness, instruction-following, and accuracy.'

OpenAI's tests show an average win rate of 64% in favor of o3-pro when compared to o3.

As I said before, I'm a ChatGPT Plus user who's quite happy with what I get for that monthly $20 fee. I can't justify going Pro, just as I can't downgrade to ChatGPT Free. I've been using o3 more and more lately, even if I have to fight with the AI sometimes. Naturally, I wondered whether I really needed the slightly better o3-pro performance and the reduced hallucination rate (aka improved accuracy).

I don't think I'll miss much for now, and this o3-pro review that Sam Altman retweeted does a great job explaining where o3-pro shines and why ChatGPT Plus users might not need it. Here's a longer snippet that includes the detail Altman cited:

'But then I took a different approach. My co-founder Alexis and I took the time to assemble a history of all our past planning meetings at Raindrop, all our goals, even recorded voice memos, and then asked o3-pro to come up with a plan. We were blown away. It spit out the exact kind of concrete plan and analysis I've always wanted an LLM to create, complete with target metrics, timelines, priorities, and strict instructions on what to cut. The plan o3 gave us was plausible, reasonable. But the plan o3-pro gave us was specific and grounded enough that it actually changed how we're thinking about our future. This is hard to capture in an eval.'

That sounds amazing, but it's also something I don't need right now. I recommend reading the entire review to see the differences between o3 and o3-pro and decide for yourself.

While I won't get o3-pro anytime soon, I'm glad to hear that operating costs for ChatGPT o3 queries have dropped significantly. Altman said on X that OpenAI has reduced the price of o3 by 80%. OpenAI's Kevin Weil tweeted that the company has doubled the rate limits for o3 in the Plus tier. That might not match the 80% drop in costs, but it's still a big improvement.

The weekly limit for ChatGPT o3 chats remains at 100 messages for ChatGPT Plus, Team, and Enterprise users. Developers will appreciate the price drop the most. ChatGPT o3 input/output is now priced at $2/$8 per 1 million tokens, down from $10/$40.
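To see how the quoted per-token prices relate to the 80% figure, here is a minimal sketch in Python. The prices ($10/$40 before, $2/$8 after, per 1 million tokens) come from the article; the function names and the sample workload of 50,000 input and 10,000 output tokens are assumptions made purely for illustration.

```python
# Prices in US dollars per 1 million tokens, as quoted in the article.
OLD_PRICE = {"input": 10.0, "output": 40.0}
NEW_PRICE = {"input": 2.0, "output": 8.0}


def percent_drop(old: float, new: float) -> float:
    """Return the price reduction as a percentage of the old price."""
    return (old - new) / old * 100


def request_cost(input_tokens: int, output_tokens: int, price: dict) -> float:
    """Cost in dollars of one request at the given per-million-token prices."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload, not taken from the article.
old_cost = request_cost(50_000, 10_000, OLD_PRICE)
new_cost = request_cost(50_000, 10_000, NEW_PRICE)

print(f"input price drop:  {percent_drop(OLD_PRICE['input'], NEW_PRICE['input']):.0f}%")
print(f"output price drop: {percent_drop(OLD_PRICE['output'], NEW_PRICE['output']):.0f}%")
print(f"sample request: ${old_cost:.2f} before, ${new_cost:.2f} after")
```

Both the input and the output price fall by the same proportion, so any mix of input and output tokens ends up costing one fifth of what it did before.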

ChatGPT Attempts JEE Advanced 2025 Mock Test. Here Are The Results

News18

10-06-2025

  • Science
  • News18


ChatGPT handled complex math and science problems with ease, but struggled with questions involving visual elements like graphs and Vernier scales.

What if an AI chatbot took one of India's toughest exams? That's exactly what IIT Kharagpur engineer Anushka Aashvi set out to test by putting ChatGPT o3 through the 2025 JEE Advanced paper. The result? A score of 327 out of 360.

According to her blog, she ensured the AI followed strict exam conditions: no internet, no external tools, and each question was asked individually to prevent it from referencing previous answers. ChatGPT breezed through complex math and science problems, even cracking questions known to stump top students. However, it did struggle with visual or tool-based questions, like those involving graphs or Vernier scales.

In her blog post on Heltar, Anushka wrote, 'When I decided to test ChatGPT o3 on this year's JEE Advanced paper, I didn't expect what followed to shake me as much as it did. Giving away the result straightaway, ChatGPT o3 scored a whopping 327/360 in the JEE Advanced 2025 Question Paper. This score would earn an All India Rank 4 (AIR 4). We tested the ChatGPT o3 model (which was released on 16th April 2025) on the JEE Advanced 2025 question paper, which was conducted on 18th May to ensure that the questions have as much newness for the AI model as possible.'

For the experiment, the prompt given to ChatGPT was: 'Suppose you are a student appearing for JEE Advanced Examination. Try your best and solve this question in exam conditions. Do not use the web search feature to get the answer. Do not use your Python tool.' To eliminate any influence of contextual memory, each question was asked in a fresh chat session. No feedback was given between questions.

The blog further noted that, despite being instructed not to use external tools like Python, ChatGPT occasionally attempted to do so, something that became evident during its 'thinking' pauses before responding.
Interestingly, the AI also tended to double-check its own calculations before moving on to the next step, mimicking the behaviour of a cautious student. To evaluate the AI's performance, its answers were compared against the official JEE Advanced 2025 answer key. Scoring was done strictly according to the actual exam pattern: full marks for correct answers, negative marking for incorrect ones, and partial or zero marks for unanswered or partially correct responses.

Anushka Aashvi shared that the AI was very good at solving long and tricky maths problems, especially in algebra and calculus. It also did well when it had to use ideas from different topics together to find the right answer. In chemistry, the AI could understand and solve questions based on compound drawings, which many students find hard.

However, it wasn't perfect. It found it difficult to read and understand graphs. One such question took over 9 minutes, and even then, the answer was wrong. The AI also couldn't read tools like the Vernier scale properly. It kept trying again and again, but still ended up giving the wrong solution after a long time.
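The scoring rule described above (marks for correct answers, a penalty for incorrect ones, nothing for unanswered questions) can be sketched roughly as follows. This is an illustration only: the mark values, question labels and answers are hypothetical, the real paper uses several section-specific schemes that the article does not list, and partial credit is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkingScheme:
    # Hypothetical values for illustration; not the official scheme.
    correct: int = 4
    incorrect: int = -1
    unanswered: int = 0


def score_paper(responses: dict, answer_key: dict,
                scheme: MarkingScheme = MarkingScheme()) -> int:
    """Compare responses with the answer key and total the marks."""
    total = 0
    for question, correct_answer in answer_key.items():
        given = responses.get(question)  # None means the question was skipped
        if given is None:
            total += scheme.unanswered
        elif given == correct_answer:
            total += scheme.correct
        else:
            total += scheme.incorrect
    return total


# Made-up three-question example: one right, one wrong, one skipped.
answer_key = {"Q1": "A", "Q2": "C", "Q3": "B"}
responses = {"Q1": "A", "Q2": "D"}
sample_score = score_paper(responses, answer_key)

# The reported result expressed as a percentage of the maximum.
reported_percent = 327 / 360 * 100
print(sample_score, f"{reported_percent:.1f}%")
```

On the reported figures, 327 out of 360 works out to just under 91% of the available marks.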

IIT Kharagpur Student Trials ChatGPT o3 On JEE Advanced Mock Test, Stunned By Result

NDTV

09-06-2025

  • Science
  • NDTV


Artificial intelligence (AI) has revolutionised numerous industries, from cutting-edge humanoid robots and self-driving cars to unexpected domains like relationship counselling. Recently, an IIT Kharagpur student conducted an experiment in which she tested ChatGPT o3 on the JEE Advanced 2025 paper. The results were astonishing, with ChatGPT o3 scoring 327 out of 360, which would secure an All India Rank 4 in the actual exam.

To test ChatGPT o3's capabilities, Anushka Aashvi simulated real exam conditions, prompting the model to act like a JEE aspirant and solve questions independently without web searches, coding tools or hints. Each question was presented in a new chat session to prevent memory bias, and no corrections or hints were given during the process, ensuring a fair assessment of the AI's abilities.

"When I decided to test ChatGPT o3 on this year's JEE Advanced paper, I didn't expect what followed to shake me as much as it did. Giving away the result straightaway, ChatGPT o3 scored a whopping 327/360 in JEE Advanced 2025 Question Paper," Ms Aashvi wrote in a blog on Heltar.

"🚨 An IIT Kharagpur student has tested ChatGPT o3, on the JEE Advanced 2025 paper. The AI scored a staggering 327 out of 360, a score that would earn it All India Rank 4 in the real exam. 😉" (Indian Tech & Infra, @IndianTechGuide, June 8, 2025)

Notably, the AI achieved perfect scores of 60 in both Chemistry and Mathematics in the second phase, with minor errors only in Physics and earlier sections. The model excelled in solving complex algebra and calculus problems, demonstrating its ability to integrate concepts from multiple chapters to arrive at accurate solutions. It also showed proficiency in interpreting compounds from skeletal formulae.

However, the model struggled with graphical interpretation, particularly with Vernier scale readings, taking over 9 minutes to arrive at an incorrect answer despite repeated attempts. "It was not able to understand the Vernier Scale readings. It kept reiterating to get to the solution, but took very long and even then gave the wrong answer. But an overall score of 327/360 is truly remarkable," Ms Aashvi added.

The JEE Advanced serves as the gateway to India's esteemed Indian Institutes of Technology (IITs). Out of over 1.5 million JEE Mains aspirants, only the top 250,000 candidates qualify for JEE Advanced. From this pool, merely around 17,000 students secure admission to the IITs, highlighting the exam's highly competitive nature.
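For a sense of scale, the selection funnel quoted above can be turned into rough percentages. The sketch below treats the article's rounded figures ("over 1.5 million", "around 17,000") as exact, which is an assumption, so the results are approximations only; the variable and function names are illustrative.

```python
# Rounded figures from the article, treated as exact for illustration.
mains_aspirants = 1_500_000      # "over 1.5 million"
advanced_qualifiers = 250_000
iit_admissions = 17_000          # "around 17,000"


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return part / whole * 100


qualify_rate = share(advanced_qualifiers, mains_aspirants)
admit_rate_of_qualifiers = share(iit_admissions, advanced_qualifiers)
admit_rate_overall = share(iit_admissions, mains_aspirants)

print(f"qualify for JEE Advanced: {qualify_rate:.1f}% of aspirants")
print(f"admitted to an IIT:       {admit_rate_of_qualifiers:.1f}% of qualifiers")
print(f"admitted to an IIT:       {admit_rate_overall:.1f}% of all aspirants")
```

On these numbers, roughly one aspirant in six reaches JEE Advanced and a little over one in a hundred ends up with an IIT seat.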

‘What just happened?': IIT Kharagpur student tests ChatGPT o3 on JEE Advanced mock test, taken aback by results

Indian Express

09-06-2025

  • Science
  • Indian Express


From humanoid robots to self-driving cars to offering relationship advice, artificial intelligence (AI) has become an integral part of several industries, and OpenAI's ChatGPT has been making waves for quite some time. With multiple new versions, the platform is being continuously refined, impacting professionals across various fields.

Recently, an IIT Kharagpur student tested ChatGPT o3 on the JEE Advanced 2025 paper, and the results were shocking. In a blog post on software platform Heltar, Anushka Aashvi revealed that the model scored an astonishing 327 out of 360, a result that would have secured All India Rank 4 in the real exam.

In the post, titled 'ChatGPT o3 Scores AIR 4 in JEE Advanced 2025. What Just Happened?', Aashvi shared that she went to great lengths to create a credible exam situation. The model was directed to 'act like a JEE aspirant,' solving each question separately with no internet access and no memory from previous answers. Every question was solved in a fresh chat session to prevent any form of carryover learning.

'We tested the ChatGPT o3 model (which was released on 16th April 2025) on the JEE Advanced 2025 question paper which was conducted on 18th May to ensure that the questions have as much newness for the AI model as possible,' Aashvi wrote.

Despite these constraints, ChatGPT o3 impressed at nearly every step. The model achieved perfect scores in Chemistry and Mathematics in the second half of the paper and lost only a few marks in Physics. It showed a clear, step-by-step reasoning process when approaching multi-concept questions, advanced calculus problems, and even skeletal chemical diagrams.

'It easily solved lengthy algebra and calculus problems. The model performed remarkably well at combining concepts from multiple chapters to reach a correct solution. It was even able to interpret compounds correctly from their skeletal formulae and solve them correctly,' the student wrote in the blog.

However, ChatGPT o3 did struggle with certain visual and instrument-based questions. Aashvi shared that it failed to accurately interpret a Vernier scale and took nearly 10 minutes to answer a graphical question, only to get it wrong. 'It was not able to understand the Vernier Scale readings. It kept reiterating to get to the solution but took very long and even then gave the wrong answer,' she wrote.
