GPT-5's most useful upgrade isn't speed — it's the multimodal improvements that matter
What is 'Multimodality'?
In the case of an AI, multimodality is the ability to understand and interact with input beyond just text. That means voice, image or video input. A multimodal chatbot can work with multiple types of input and output.
This week's GPT-5 upgrade to ChatGPT dramatically raises the chatbot's speed and performance when it comes to coding, math and response accuracy. But arguably the most useful improvement in the grand scheme of AI development will be its multimodal capabilities.
ChatGPT-5 brings an enhanced voice mode and a better ability to process visual information. While Sam Altman didn't go into details on multimodality specifically in this week's GPT-5 reveal livestream, he previously confirmed to Bill Gates on an episode of the latter's podcast that ChatGPT is moving towards "speech in, speech out. Images. Eventually video."
The improved voice mode courtesy of GPT-5 now works with custom GPTs and will adapt its tone and speech style based on user instruction. For example, you could ask it to slow down if it's going too fast or make the voice style a bit warmer if you feel the tone is too harsh. OpenAI has also confirmed the old Standard Voice Mode across all its models is being phased out over the next 30 days.
Of course, the majority of interaction with ChatGPT, or any of its best alternatives, will be through text. But as AI becomes an increasing part of everyone's digital life, it will need to shift towards predominantly multimodal input.
We've seen this before; social media only really got going when it moved off laptops and desktops and onto smartphones.
Suddenly, users could snap pictures and upload them with the same device. Whether it's your phone or, as Zuckerberg would have you believe, a set of the best smart glasses is beside the point. The most successful AI will be the one that can make sense of the world around it.
Why does this matter?
GPT-5 has been designed to natively handle (and generate) multiple types of data within a single model. Previous iterations used a plugin-style approach, so moving away from that should result in more seamless interactions, whichever type of input you choose.
There are a huge number of benefits to a more robust multimodal AI, including for users who may have hearing or sight impairments. The ability to tailor the chatbot's responses to a user's disability will do wonders for tech accessibility.
The increasing use of voice mode could be what drives the adoption of ChatGPT Plus, since the premium tier has unlimited responses while free users are still limited to a select number of hours.
Meanwhile, improved image understanding means that, for example, the AI will be less prone to hallucinations when analyzing a chart or a picture you give it. That works in tandem with the tool's "Visual Workspace" feature, which lets it interact with charts and diagrams. In turn, this will also train ChatGPT to produce better and more accurate images when prompted.
In an educational context, this is going to be a huge help, especially since GPT-5 can now understand information across much longer stretches of conversation: users can refer back to images from earlier in the conversation and it will remember them.
While everyone knows that AI image generation has a dark side, there's no doubt that effective multimodality is the future of AI models and it'll be interesting to see what Google Gemini's response is to these GPT-5 upgrades.
Follow Tom's Guide on Google News to get our up-to-date news, how-tos, and reviews in your feeds. Make sure to click the Follow button.