OpenAI gets caught vibe graphing

The Verge, 2 days ago
During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive — but if you look closely, some graphs were a little bit off.
In one, ironically showing how well GPT-5 does in 'deception evals across models,' the scale is all over the place. For 'coding deception,' for example, GPT-5 apparently gets a 50.0 percent deception rate, but that's compared to the smaller 47.4 percent score for OpenAI's o3, which somehow has a larger bar.
Or this one, where one of GPT-5's scores is lower than o3's but is shown with a bigger bar. In this same chart, o3 and GPT-4o's scores are different but shown with equally sized bars. That chart was bad enough that CEO Sam Altman commented on it, calling it a 'mega chart screwup.' An OpenAI marketing staffer also apologized for the 'unintentional chart crime.'
OpenAI didn't immediately respond to a request for comment. And while it's unclear if OpenAI used GPT-5 to actually make the charts, it's still not a great look for the company on its big launch day — especially when it is touting the 'significant advances in reducing hallucinations' with its new model.
By Jay Peters

Related Articles

Tesla Stumbles, but Elon Musk Gets a Massive Payday

Yahoo

9 minutes ago

Key Points

  • Tesla's sales are struggling in key markets overseas.
  • Consumer backlash from Elon Musk's political stance is real.

Unless you've been purposely hiding from the news -- which would be understandable -- you know that investors in Tesla (NASDAQ: TSLA) have had plenty to digest. From allegations that Tesla isn't paying its bills and is hurting small businesses to consumer backlash from CEO Elon Musk's political tour (and we can't forget the sliding sales and global profits), it's been a full downpour.

Let's consider the recent speed bumps, as well as Musk being rewarded with a hefty $29 billion payday.

The overseas spiral

July figures are seeping in from Europe, and they show that Tesla registrations checked in 41.6% lower compared to the prior year, despite sales of electric vehicles (EVs) surging across the Continent. It's a continuation of the sales spiral the EV maker faced during the first half of 2025. And the problem is that the decline was supposedly due to the new Model Y being in limited supply -- but the issues appear to be deeper than that.

The story is similar in China, another crucial market for Tesla. Its sales of China-made EVs dropped 8.4% in July compared to the prior year. That was a reversal from the small gain Tesla posted in June, which at the time reversed an eight-month losing streak.

The consumer backlash

The consumer displeasure is real, and Musk's political allegiances have pushed some buyers to new and different brands. There's evidence of the effect this is having on Tesla's once-spotless brand image, according to new data from S&P Global Mobility, which tracks sales data across the automotive industry.

The new data, shared with Reuters, showed that Tesla's consumer loyalty took a nosedive in July 2024, correlating with Musk's public commitment to an anti-environmental political campaign. According to Reuters, Tesla's loyalty peaked at 73% in June 2024 before bottoming out in March at 49.9%. No matter how you slice it, that's a quick and dramatic decline in consumer sentiment, literally driving buyers to another brand.

A massive payday

Tesla's board granted Musk 96 million shares, worth roughly $29 billion, in an attempt to keep the billionaire focused on the EV company amid his multiple businesses and ventures. The vote comes after a 2024 Delaware court ruling that voided Musk's 2018 compensation package, which was valued at over $50 billion. The court said the approval process was flawed and unfair to shareholders.

According to Automotive News, the special committee that was formed to consider the new pay package said: "While we recognize Elon's business ventures, interests and other potential demands on his time and attention are extensive and wide-ranging ... we are confident that this award will incentivize Elon to remain at Tesla."

What it all means

Tesla and its investors certainly appear to be at a crossroads. While selling EVs and zero-emission credits keeps the lights on for the young company, it constantly reminds investors that its future may be more in line with artificial intelligence (AI), robotics, and robotaxi services.

Long-term investors should stay the course but should also prepare for a bumpy few quarters as the company works through its upcoming identity crisis, the slow ramp-up of the robotaxi, and an aging lineup.

Daniel Miller has no position in any of the stocks mentioned. The Motley Fool has positions in and recommends Tesla. The Motley Fool has a disclosure policy.

Tesla Stumbles, but Elon Musk Gets a Massive Payday was originally published by The Motley Fool

Flying Cars Aren't Just Science Fiction Anymore. This Company Is Leading the Charge in eVTOLs -- and Yes, It's Publicly Traded.

Yahoo

9 minutes ago

Key Points

  • Joby's electric air taxis promise emissions-free trips over congested cities at speeds up to 200 mph.
  • Partners like Toyota, Delta, and Blade give Joby cash, infrastructure, and market access ahead of launch.
  • With no revenue yet and trading around 20 times book value, the market is betting on impeccable execution.

The idea of hailing a flying car has always belonged to science fiction. But thanks to Joby Aviation (NYSE: JOBY), the idea of catching a flying taxi is slowly edging into reality. The company's electric vertical takeoff and landing (eVTOL) aircraft are designed to carry passengers over congested cities at speeds upwards of 200 mph. And they are quieter than a helicopter and have zero emissions.

This vision isn't theoretical. Joby has already demonstrated eVTOLs in New York and Dubai and is moving through the Federal Aviation Administration's (FAA's) certification process as I write. With major strategic partners, a strong cash position, and an aggressive market expansion plan, Joby could be one of the first to make commercial flying taxis a real business.

The question for investors is whether this growth stock is ready for takeoff today, or whether it might be wiser to wait until the company has a little more grounding.

From blueprints to boarding passes

Joby is trying to solve a problem that most city drivers face every day: traffic. And not just any traffic. Horribly congested traffic, the kind that makes you wish you were anywhere (even at the DMV, with slow internet) but stuck in it.

To get there, however, Joby needs a few things to work in its favor. The first is full FAA type certification, the regulatory green light that will let it fly passengers in its eVTOLs. The second is the infrastructure to make its vision practical, that is, a network of vertiports, charging stations, and terminals in the right locations so customers can board, fly, and land without the experience feeling like more hassle than just staying in the car.

So far, the company has checked off some big early boxes. It already holds FAA Part 135 certification, which means it's cleared to operate as an air carrier with approved aircraft. It's also moving to lock down prime real estate for takeoff and landing, from Manhattan heliports to Dubai's planned aerial taxi hubs.

But that doesn't mean the company is smooth flying -- yet. Its biggest "unknown" is time. Every month that slips by without full FAA certification pushes profitability that much further into the future. Add in the fact that it's burning cash each quarter, and you start to see why patience -- and a deep cash cushion -- are non-negotiable.

Big names, big bets

Speaking of cash, where is Joby getting money for research and development? Well, here's where the story gets interesting.

Although Joby is pre-revenue, it has an impressive ecosystem of backers and partners. Back in 2022, Delta Air Lines (NYSE: DAL) invested about $60 million in Joby, with the expectation that Joby would eventually create a premium service for Delta customers. More recently, Toyota (NYSE: TM) committed $894 million to helping Joby with certification and commercial production of its electric air taxis.

In perhaps its boldest move, Joby plans to acquire Blade Air Mobility (NASDAQ: BLDE), which would help it gain access to central terminals in New York, Southern California, and Europe.

Meanwhile, international expansion is already underway. In Dubai, Joby signed an exclusive six-year agreement with the Roads and Transport Authority (RTA) to launch aerial taxi services there in 2026. Finally, Joby recently announced that it's partnering with L3Harris to develop hybrid eVTOLs for defense applications, with demonstrations planned in 2026.

Things seem to be rolling. But before we get too bullish, let's look at its finances.

The numbers under the hood

Here's where reality checks in. Over the last 12 months, Joby generated just $110,000 in revenue, essentially none, while recording a net loss of about $596 million. Just in the first quarter of 2025, it posted a loss of $82 million, or $0.11 per share, driven largely by spending in research and development. To add fuel to the fire, cash burn was $111 million for the quarter.

The balance sheet, however, is a strength, largely because of partners. Joby holds about $813 million in cash and short-term investments. That net cash position gives the company some runway to fund operations without immediate dilution.

Still, with a market cap near $17 billion, the stock is priced well ahead of fundamentals. Its price-to-book (P/B) ratio -- which measures how richly the market values the company relative to the net assets on its balance sheet -- sits around 20. That's steep compared with the S&P 500's median of about 3. Even Archer Aviation (NYSE: ACHR), Joby's primary competitor, is trading at roughly 5.6 times book value.

That doesn't mean Joby can't grow into its valuation, but it does underscore how much future success is baked into today's price.

Verdict: Should you hitch a ride?

When people talk about Joby's risks, they usually circle the big ones: FAA certification, steady cash burn, competition. But there's a second layer of risks -- call them structural -- that could make investing in Joby choppy in the short term.

Start with the skies themselves. Urban airspace is already a juggling act, and air traffic control in cities like New York and LA runs hot most days. If regulators decide to keep eVTOL traffic on a tight leash, Joby's flight schedule could end up thinner than its business plan assumes.

On the ground, vertiports have to be built, and convincing neighborhoods to welcome them is another story entirely. Costs could also be a problem: insurance, pilot pay, and battery charging might keep fares higher than commuters are willing to swallow.

Still, if you believe Joby can navigate these headwinds and execute on time, the payoff could be big. For long-term investors who can tolerate volatility, this is a speculative bet on a market that doesn't exist yet but could make ground transportation look radically different ten years from now.

Steven Porrello has positions in Archer Aviation. The Motley Fool recommends Delta Air Lines. The Motley Fool has a disclosure policy.

Flying Cars Aren't Just Science Fiction Anymore. This Company Is Leading the Charge in eVTOLs -- and Yes, It's Publicly Traded. was originally published by The Motley Fool

GPT-5 is meant for the casual user, not the advanced user. Here's why that's an upgrade.

Business Insider

12 minutes ago

OpenAI promotes the new and improved capabilities of every iteration of its flagship model.

When OpenAI released GPT-4, it said the model "exhibits human-level performance on various professional and academic benchmarks." GPT-4.5, meanwhile, improved the model's "ability to recognize patterns, draw connections, and generate creative insights without reasoning."

GPT-5, OpenAI said on Thursday, marks "a significant leap in intelligence over all our previous models, featuring state-of-the-art performance across coding, math, writing, health, visual perception, and more."

But one of the features OpenAI executives -- and AI experts -- are most excited about is what OpenAI calls a "real-time router," a system that selects the most appropriate model to handle each user request. This is a leap because it means the average user, who might not understand when and why to use a specific model, no longer needs to worry about it. This makes GPT-5 the most user-friendly release yet.

"Previously, you had to go deal with the model picker in ChatGPT," OpenAI COO Brad Lightcap said in an interview with Big Technology, a tech newsletter, on Friday. "You had to select a model that you wanted to use for a given task. And then you'd run the process of asking a question, getting an answer. Sometimes you choose a thinking model, sometimes you wouldn't. And that was, I think, a confusing experience for users."

The benefit is that "GPT-5 abstracts all of that," he said. "So it makes that decision for you. And it's actually a smarter model. So you're going to get a better answer in all cases, regardless of whether you're using the thinking mode or not."

Removing this confusion could help OpenAI attract more casual users, adding to the 700 million people the company says are already using ChatGPT every week. More users could ultimately mean more revenue.

Wharton professor Ethan Mollick put it simply in a post he wrote on his Substack on Thursday. "The burden of using AI is lessened."

"If you actually look at the way free users have used ChatGPT, most of them have actually not experienced the power of the reasoning models," Lightcap told Big Technology. "They mostly are using GPT 4.0 and they mostly are kind of using it for turn-based -- of like very quick back and forth -- almost search-like ways."

So the automatic decision-making feature marks a turning point.

"It'll be the first time that they're experiencing a model making a decision about how long to think about a problem and how good of an answer to give relative to how hard the question is. And so we expect that the average user will feel dramatically different," he said. "Maybe for the kind of upper echelon of power user, it may not feel as different."
