Latest news with #SimonWillison


The Star
2 days ago
- Business
- The Star
OpenAI's GPT-5 met with mixed reviews, confusion in first day
For months, OpenAI chief executive officer Sam Altman has been hyping up the capabilities of GPT-5, setting up the launch as a seminal moment for the company. But in the first 24 hours after its release, the new model was met with mixed reviews. In its announcement last Thursday, OpenAI said GPT-5 was better at coding and reasoning through complex problems, and touted it as advanced enough to turn chatbot ChatGPT into a PhD-level expert. Some with early access praised the model, with caveats. "It's my new favourite model," developer Simon Willison wrote in a blog post, calling it "competent" and "occasionally impressive." He added: "It's not a dramatic departure from what we've had before." On various social media platforms, however, ChatGPT users expressed frustration that GPT-5 continued to make up information and trip over simple math and spelling questions. Noah Giansiracusa, an associate professor of mathematics at Bentley University, said he felt the launch was "underwhelming." While there were "some improvements," he said, "they were much more marginal than I would've hoped." At least some of the reaction may come down to confusion over what's happening under the hood. Unlike OpenAI's prior software, GPT-5 automatically switches between models of varying levels of sophistication depending on the query. This approach can help maximise the company's computing resources, but it also means users may not always be engaging with the most powerful version of OpenAI's technology. Asked to identify how many times the letter "b" shows up in "blueberry," for example, GPT-5 initially said "three" in one test. When told to "think harder," however, GPT-5 appeared to engage its more advanced reasoning model and came up with the correct answer. Altman responded to some of the feedback and said there was an issue with the system. "GPT-5 will seem smarter starting today," he said.
"Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber." The stakes are high for the rollout. OpenAI is vying to keep ahead of growing AI competition from rivals in the US and China. The company is also fighting to convince businesses and individual users to pay up for its premium services to help offset the enormous amount it's spending on talent, chips and data centers to support AI development. The San Francisco-based company kicked off the generative AI boom nearly three years ago with the release of ChatGPT, which was originally powered by an earlier model called GPT-3.5. Since then, the company has released a series of increasingly sophisticated systems, including multiple options that mimic the process of human reasoning. As AI systems advance, it's become harder to say definitively how various services stack up. As of midday last Friday, GPT-5 had risen to the top of various categories on LMArena, a popular leaderboard for AI models based on user rankings. But a different benchmark, ARC-AGI-2, puts GPT-5 behind the latest version of Grok from Elon Musk's xAI. In the absence of more definitive assessments, the model wars sometimes come down to vibes. And with nearly 700 million people now using ChatGPT each week, some are bound to disagree over how the model feels. It also takes longer than a day to gauge the value of a new AI system in someone's personal and professional life. Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who frequently experiments with AI models, marveled at GPT-5's ability to do research, come up with clever written responses and make programming simple, even for a novice. "GPT-5 just does stuff, often extraordinary stuff, sometimes weird stuff, sometimes very AI stuff, on its own," he wrote in a blog post. "And that is what makes it so interesting." On Reddit, however, the reactions were very different.
During an "Ask Me Anything" session last Friday on the platform, Altman fielded pushback from users who were frustrated not to have more say and visibility into which model responds to their queries. Altman said OpenAI would take some steps to address these complaints, including making it "more transparent." At one point, Altman responded to a Reddit user's question by noting that OpenAI thinks the "writing quality" in one version of GPT-5 is better than GPT-4.5. Then he asked: "Do you find it to be worse?" One user after another were quick to respond: yes. – Bloomberg


Los Angeles Times
4 days ago
- Business
- Los Angeles Times
OpenAI's GPT-5 met with mixed reviews, confusion in first day
For months, OpenAI Chief Executive Officer Sam Altman has been hyping up the capabilities of GPT-5, setting up the launch as a seminal moment for the company. But in the first 24 hours after its release, the new model was met with mixed reviews. In its announcement Thursday, OpenAI said GPT-5 was better at coding and reasoning through complex problems, and touted it as advanced enough to turn chatbot ChatGPT into a Ph.D.-level expert. Some with early access praised the model, with caveats. 'It's my new favorite model,' developer Simon Willison wrote in a blog post, calling it 'competent' and 'occasionally impressive.' He added: 'It's not a dramatic departure from what we've had before.' On various social media platforms, however, ChatGPT users expressed frustration that GPT-5 continued to make up information and trip over simple math and spelling questions. Noah Giansiracusa, an associate professor of mathematics at Bentley University, said he felt the launch was 'underwhelming.' While there were 'some improvements,' he said, 'they were much more marginal than I would've hoped.' At least some of the reaction may come down to confusion over what's happening under the hood. Unlike OpenAI's prior software, GPT-5 automatically switches between models of varying levels of sophistication depending on the query. This approach can help maximize the company's computing resources, but it also means users may not always be engaging with the most powerful version of OpenAI's technology. Asked to identify how many times the letter 'b' shows up in 'blueberry,' for example, GPT-5 initially said 'three' in one test. When told to 'think harder,' however, GPT-5 appeared to engage its more advanced reasoning model and came up with the correct answer. On Friday, Altman responded to some of the feedback and said there was an issue with the system. 'GPT-5 will seem smarter starting today,' he said. 
'Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber.' The stakes are high for the rollout. OpenAI is vying to keep ahead of growing AI competition from rivals in the US and China. The company is also fighting to convince businesses and individual users to pay up for its premium services to help offset the enormous amount it's spending on talent, chips and data centers to support AI development. The San Francisco-based company kicked off the generative AI boom nearly three years ago with the release of ChatGPT, which was originally powered by an earlier model called GPT-3.5. Since then, the company has released a series of increasingly sophisticated systems, including multiple options that mimic the process of human reasoning. As AI systems advance, it's become harder to say definitively how various services stack up. As of midday Friday, GPT-5 had risen to the top of various categories on LMArena, a popular leaderboard for AI models based on user rankings. But a different benchmark, ARC-AGI-2, puts GPT-5 behind the latest version of Grok from Elon Musk's xAI. In the absence of more definitive assessments, the model wars sometimes come down to vibes. And with nearly 700 million people now using ChatGPT each week, some are bound to disagree over how the model feels. It also takes longer than a day to gauge the value of a new AI system in someone's personal and professional life. Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who frequently experiments with AI models, marveled at GPT-5's ability to do research, come up with clever written responses and make programming simple, even for a novice. 'GPT-5 just does stuff, often extraordinary stuff, sometimes weird stuff, sometimes very AI stuff, on its own,' he wrote in a blog post. 'And that is what makes it so interesting.' On Reddit, however, the reactions were very different. 
During an 'Ask Me Anything' session Friday on the platform, Altman fielded pushback from users who were frustrated not to have more say and visibility into which model responds to their queries. Altman said OpenAI would take some steps to address these complaints, including making it 'more transparent.' At one point, Altman responded to a Reddit user's question by noting that OpenAI thinks the 'writing quality' in one version of GPT-5 is better than GPT-4.5. Then he asked: 'Do you find it to be worse?' One user after another were quick to respond: yes. Forgash writes for Bloomberg.


Mint
4 days ago
- Business
- Mint
OpenAI's GPT-5 Met With Mixed Reviews, Confusion in First Day
(Bloomberg) -- For months, OpenAI Chief Executive Officer Sam Altman has been hyping up the capabilities of GPT-5, setting up the launch as a seminal moment for the company. But in the first 24 hours after its release, the new model was met with mixed reviews. In its announcement Thursday, OpenAI said GPT-5 was better at coding and reasoning through complex problems, and touted it as advanced enough to turn chatbot ChatGPT into a Ph.D.-level expert. Some with early access praised the model, with caveats. 'It's my new favorite model,' developer Simon Willison wrote in a blog post, calling it 'competent' and 'occasionally impressive.' He added: 'It's not a dramatic departure from what we've had before.' On various social media platforms, however, ChatGPT users expressed frustration that GPT-5 continued to make up information and trip over simple math and spelling questions. Noah Giansiracusa, an associate professor of mathematics at Bentley University, said he felt the launch was 'underwhelming.' While there were 'some improvements,' he said, 'they were much more marginal than I would've hoped.' At least some of the reaction may come down to confusion over what's happening under the hood. Unlike OpenAI's prior software, GPT-5 automatically switches between models of varying levels of sophistication depending on the query. This approach can help maximize the company's computing resources, but it also means users may not always be engaging with the most powerful version of OpenAI's technology. Asked to identify how many times the letter 'b' shows up in 'blueberry,' for example, GPT-5 initially said 'three' in one test. When told to 'think harder,' however, GPT-5 appeared to engage its more advanced reasoning model and came up with the correct answer. On Friday, Altman responded to some of the feedback and said there was an issue with the system. 'GPT-5 will seem smarter starting today,' he said. 
'Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber.' The stakes are high for the rollout. OpenAI is vying to keep ahead of growing AI competition from rivals in the US and China. The company is also fighting to convince businesses and individual users to pay up for its premium services to help offset the enormous amount it's spending on talent, chips and data centers to support AI development. The San Francisco-based company kicked off the generative AI boom nearly three years ago with the release of ChatGPT, which was originally powered by an earlier model called GPT-3.5. Since then, the company has released a series of increasingly sophisticated systems, including multiple options that mimic the process of human reasoning. As AI systems advance, it's become harder to say definitively how various services stack up. As of midday Friday, GPT-5 had risen to the top of various categories on LMArena, a popular leaderboard for AI models based on user rankings. But a different benchmark, ARC-AGI-2, puts GPT-5 behind the latest version of Grok from Elon Musk's xAI. In the absence of more definitive assessments, the model wars sometimes come down to vibes. And with nearly 700 million people now using ChatGPT each week, some are bound to disagree over how the model feels. It also takes longer than a day to gauge the value of a new AI system in someone's personal and professional life. Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who frequently experiments with AI models, marveled at GPT-5's ability to do research, come up with clever written responses and make programming simple, even for a novice. 'GPT-5 just does stuff, often extraordinary stuff, sometimes weird stuff, sometimes very AI stuff, on its own,' he wrote in a blog post. 'And that is what makes it so interesting.' On Reddit, however, the reactions were very different. 
During an 'Ask Me Anything' session Friday on the platform, Altman fielded pushback from users who were frustrated not to have more say and visibility into which model responds to their queries. Altman said OpenAI would take some steps to address these complaints, including making it 'more transparent.' At one point, Altman responded to a Reddit user's question by noting that OpenAI thinks the 'writing quality' in one version of GPT-5 is better than GPT-4.5. Then he asked: 'Do you find it to be worse?' One user after another were quick to respond: yes.


Bloomberg
4 days ago
- Business
- Bloomberg
OpenAI's GPT-5 Met With Mixed Reviews, Confusion in First Day
For months, OpenAI Chief Executive Officer Sam Altman has been hyping up the capabilities of GPT-5, setting up the launch as a seminal moment for the company. But in the first 24 hours after its release, the new model was met with mixed reviews. In its announcement Thursday, OpenAI said GPT-5 was better at coding and reasoning through complex problems, and touted it as advanced enough to turn chatbot ChatGPT into a Ph.D.-level expert. Some with early access praised the model, with caveats. 'It's my new favorite model,' developer Simon Willison wrote in a blog post, calling it 'competent' and 'occasionally impressive.' He added: 'It's not a dramatic departure from what we've had before.'


Forbes
5 days ago
- Business
- Forbes
OpenAI's New Open Source Models Are A Very Big Deal: 3 Reasons Why
The announcement this week of new open source models from OpenAI has significant implications for the industry. Executives should pay close attention as they refine their AI technology roadmaps. Until now, OpenAI's AI models have been closed source and expensive, especially compared to open source alternatives, particularly those from China, which offer approximately 90% of the performance at 90% lower cost.

OpenAI Now Offers Open Source AI Models
On August 5, OpenAI released two new open source AI models: gpt-oss-120b (a 120-billion-parameter Mixture-of-Experts model) and gpt-oss-20b (a smaller, 20-billion-parameter model). Both are available under the permissive Apache 2.0 license, enabling free download, customization, and local deployment. These models are designed for reasoning tasks, tool use, and agentic capabilities, with a 128K context window. Notably, they can run on a laptop or smartphone. However, they are text-only and do not support multimodal inputs such as images or video.

Initial Developer Reaction Has Been Positive
Early feedback from developers has been largely positive. Developer Simon Willison highlighted their efficiency and noted that gpt-oss-20b can run on consumer hardware, such as a high-end laptop with a 16GB GPU. I am impressed by the near-parity with OpenAI's proprietary models, such as o4-mini, on reasoning benchmarks. While some developers noted potential performance issues, these new models are remarkably strong initial efforts and are highly competitive with leading Chinese open source models.

This Is a Big Deal for Three Reasons
OpenAI has maintained market leadership since the introduction of ChatGPT 3.5 and continues to set the pace. The launch of these open source models marks a 'game on' moment for on-premise and on-device markets. AI vendors in these segments now face a formidable new competitor.

Implications for AI Leaders
The market for AI models is advancing at lightning speed, and the China-US AI arms race has heated up. Monitoring these developments is critical for your AI strategy.
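
For readers wondering how a 20-billion-parameter model might fit on a 16GB GPU, a rough back-of-the-envelope estimate is sketched below. It is an illustration only: the function name and the bit widths are assumptions chosen for the example, not figures from the article, and the estimate counts weights alone, ignoring activations, the context cache, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts as quoted in the article.
MODELS = {"gpt-oss-20b": 20e9, "gpt-oss-120b": 120e9}

# Hypothetical storage precisions, for illustration only.
for name, params in MODELS.items():
    for bits in (16, 8, 4):
        print(f"{name} at {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

Under these assumptions, the weights of a 20-billion-parameter model only drop below 16GB once they are stored at well under 8 bits per parameter, which is why aggressive quantization matters for laptop-class hardware.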