
I tested GPT-5 vs GPT-4 with 7 prompts — here's which one gave better answers
So, I pitted the two models against each other using the exact same prompts to see how they differ in the way they think, write and reason. From solving a locked-room mystery to offering emotional support, here's how GPT-4 and GPT-5 compare, and which one came out on top.
Prompt: 'You are a detective solving a mystery. A man was found dead in a locked room with a puddle of water next to him and no windows or doors were broken. Walk me through your thought process to determine how he died.'
GPT-4 over-relied on the "melting icicle" trope without addressing why it was less plausible (e.g., no weapon entry wound mentioned). The model lacked concrete steps for validating hypotheses at the crime scene.
GPT-5 responded like a seasoned detective filing a report. It was methodical, evidence-first and grounded in practical forensics. It prioritized the most probable scenario (ice-block hanging) while systematically eliminating alternatives, which is exactly what the prompt demanded.
Winner: GPT-5 wins for solving the mystery more convincingly by blending logic, realism and investigative rigor.
Prompt: 'Summarize the plot of the movie Inception in three different ways: once like a 5th grader would explain it, once as a New York Times film critic, and once in the form of a haiku.'
GPT-4 fell short in each of the explanations. The 5th-grader version read like an adult simplifying rather than a child's natural phrasing, the NYT critic version lacked depth and the haiku was functional but less lyrical.
GPT-5 showed authenticity in the 5th-grader explanation and sophisticated syntax in the NYT critic-style response, and its haiku offered better imagery.
Winner: GPT-5 wins for better fulfilling the prompt's creative challenge by tailoring tone, depth and language to each audience while injecting originality (e.g., "dream lasagna"). Its responses feel purpose-built, not templated.
Prompt: 'Help me meal plan for a week. I'm gluten-free, on a $75 budget, and I only have a microwave and toaster oven.'
GPT-4 offered generic tips, and its pricing was a bit optimistic. Its protein options were weaker than GPT-5's, and the plan offered limited cross-utilization of ingredients rather than strategic repurposing.
GPT-5 created an actionable plan with microwave hacks and real savings for the user.
Winner: GPT-5 wins by prioritizing real-world constraints: the rotisserie chicken alone demonstrates a deeper understanding of budget and appliance limitations. Its plan is cheaper, more cohesive and actively reduces user effort.
Prompt: 'I just lost my job and I feel like a failure. I can't stop comparing myself to others. Can you talk to me like a friend and help me feel better?'
GPT-4 was a bit vague and slightly formal. It ended with a general emoji rather than a more collaborative response like GPT-5's.
GPT-5 listened, named hidden pains and balanced comfort with agency.
Winner: GPT-5 wins by mirroring how real friends support us.
Prompt: 'Write the opening paragraph of a dystopian novel where people must pay to breathe fresh air. Then, pitch the plot in one sentence.'
GPT-4's response lacked cohesion between the two elements and lacked originality.
GPT-5's opening directly set up the pitch, which was clearer and had higher stakes.
Winner: GPT-5 wins for a response that is tighter, more original and emotionally sharper. Its opening makes you feel the cost of air, while its pitch promises a focused, human-driven rebellion.
Prompt: 'I want to make a simple website that says 'Hello, world!' with a pink background and a fun font. Can you write the HTML and CSS and explain each line to a beginner?'
GPT-4 overcomplicated fonts. The response required beginners to manage Google Fonts (extra setup/failure points) and it offered redundant explanations. It also included unnecessary elements, causing the response to be overly clunky.
GPT-5 created code that works immediately when saved as index.html, with no web connection or extra files needed. Its explanations focus on universal web fundamentals (fallback fonts, CSS comments) rather than niche tools (Google Fonts).
Winner: GPT-5 wins by delivering a truly beginner-friendly, self-contained solution with clear explanations and an open door to expand skills. Its response is the equivalent of a patient teacher.
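To make the "self-contained" point concrete, here is a minimal sketch of the kind of page described: one file, a pink background, and a fallback font stack instead of a downloaded web font. This is not either model's actual response, which isn't reproduced here. The font names, the title text and the helper names build_page and write_page are all assumptions made for illustration. The sketch is written as a short Python script that saves the page as index.html.

```python
from pathlib import Path

# Hypothetical illustration only: this is NOT the output of GPT-4 or GPT-5.
# Everything lives in one file, so no external fonts or stylesheets are fetched.
PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Hello</title>
  <style>
    /* Style the whole page */
    body {
      background-color: pink;
      /* Fallback font stack (assumed choices): the browser uses the first
         font it finds installed, ending with the generic "cursive" family */
      font-family: "Comic Sans MS", "Chalkboard SE", cursive;
      text-align: center;
    }
  </style>
</head>
<body>
  <h1>Hello, world!</h1>
</body>
</html>
"""


def build_page() -> str:
    """Return the full HTML document as a string."""
    return PAGE


def write_page(path: str = "index.html") -> Path:
    """Save the page to disk so it can be opened directly in a browser."""
    target = Path(path)
    target.write_text(build_page(), encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_page()}")
```

Opening the resulting index.html in a browser should work offline, because the styling is inline and the font stack relies only on fonts already installed on the machine.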
Prompt: 'You remember I love sci-fi, hate long emails and have ADHD. Can you write me a to-do list for today that feels motivating, focused and a little funny?'
GPT-4 completely missed the mark with a long list, which is not ideal for someone with ADHD. It over-explained and didn't offer a tangible reward the way GPT-5 did.
GPT-5 delivered a concise and doable list. It addressed my recurring needs and truly accommodated ADHD. It turned motivation into a reusable system.
Winner: GPT-5 wins for nailing the design and theme with an actionable template.
Across every category, GPT-5 proved more adaptable, authentic and grounded in the real world. As I tested, I felt that OpenAI's newest model consistently delivered responses that felt human in the best way.
Seeing the responses side-by-side really showcases the differences, even the subtle ones. GPT-5 also answered significantly faster. The model better anticipated needs, matched tone to context and offered solutions that felt lived-in rather than generated.
