
I tested ChatGPT-4o vs Claude 4 Sonnet with 7 prompts — the results were surprising

Tom's Guide | 28-05-2025

AI chatbots are advancing rapidly, and testing them to their limits is what I do for a living. Anthropic's Claude 4 Sonnet and OpenAI's ChatGPT-4o are two of the smartest tools available right now. But how do they actually compare in everyday use? To find out, I gave both models the same set of 7 prompts, covering everything from storytelling and productivity to emotional support and critical thinking. The goal: to see which chatbot delivers the most useful, human-like and creative responses depending on the task. Choosing the right AI often comes down to how you use it, which is why this kind of test really matters. Here's how Claude and ChatGPT performed side by side, and where each one shines.

Prompt: "I'm overwhelmed by work and personal tasks. Create a 3-day productivity plan that balances work, rest and small wins. Include AI tools I can use to stay on track."

ChatGPT-4o was concise, with a visually engaging format that offered optional tasks and emotional check-ins (e.g., journaling). It focused on quick wins and low-pressure creativity to manage workloads. However, it lacked Claude's explicit emphasis on rest and energy management, and its AI tool suggestions were less systematic. Claude 4 Sonnet offered a clear plan, including a time-blocked framework with features such as energy management, small wins and recovery that explicitly prioritize balance.

Winner: Claude wins for better addressing the root causes of feeling overwhelmed by combining strategic structure, intentional recovery and AI-driven efficiency. It's ideal for users needing a clear roadmap to rebuild control while safeguarding well-being.

Prompt: "Write the opening paragraph of a sci-fi novel set in a future where memories are traded like currency. Keep it gripping and emotional."

ChatGPT-4o leveraged first-person immediacy with a strong hook. However, it prioritized plot setup over emotional depth, and its story lacked the heart-wrenching specificity of Claude's familial loss. Claude 4 Sonnet zeroed in on a universally resonant loss. This specific, intimate memory evokes visceral empathy, anchoring the sci-fi concept in raw human experience.

Winner: Claude wins for balancing sci-fi concepts with emotional stakes, making the reader feel the horror of memory commodification. Its vivid imagery and tragic focus on parental love elevate it beyond ChatGPT's solid but less nuanced approach.

Prompt: "I have 3 apples, 2 bananas and a mango. If each fruit takes 5 minutes to cut and I can cut 2 fruits at once, how long will it take me to cut everything? Explain your reasoning."

ChatGPT-4o used concise bullet points and emphasized efficiency: "each session takes 5 minutes... adds up to 15 minutes." Claude 4 Sonnet structured the answer with labeled steps (Reasoning, Calculation) and explicitly described the batches: "two fruits in the first session... final two in the third."

Winner: tie. Both answers are mathematically sound and logically explained. Claude's response is slightly more detailed, while ChatGPT's is more streamlined. Neither is superior; they achieve the same result with equally valid reasoning.
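For readers who want to verify the arithmetic both bots agreed on, here's a minimal sketch of the batching logic. The numbers come straight from the prompt; the code is illustrative, not either chatbot's actual output:

```python
import math

fruits = 3 + 2 + 1        # apples + bananas + a mango = 6 fruits
per_session = 2           # fruits that can be cut at the same time
minutes_each = 5          # one cutting session takes 5 minutes

sessions = math.ceil(fruits / per_session)   # 6 / 2 -> 3 sessions
print(sessions * minutes_each)               # 15 minutes total
```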
Prompt: "Rewrite this sentence in the tone of a Gen Z TikToker: 'I didn't like the movie, but the soundtrack was amazing.'"

ChatGPT-4o used concise, widely recognized Gen Z terms, which are instantly relatable. The rhetorical question structure mirrors TikTok's punchy, attention-grabbing style. Claude 4 Sonnet used a term that feels slightly off-tone for praising a soundtrack, and the longer sentence structure feels less native to TikTok captions.

Winner: ChatGPT wins for nailing Gen Z's casual, hyperbolic style while staying concise and platform-appropriate. Claude's attempt is creative but less precise in slang usage and flow.

Prompt: "Give me 5 clever ideas for a blog post series about using AI tools to become a better parent."

ChatGPT-4o responded with viral, snackable content ideas that lack depth and risk feeling gimmicky over time. Claude 4 Sonnet prioritized meaningful AI integration into parenting, addressing both daily logistics and long-term development.

Winner: Claude wins for blog series ideas with a better balance of creativity, practicality and thoughtful AI integration for modern parenting.

Prompt: "Pretend you're a friend comforting me. I just got rejected from a job I really wanted. What would you say to make me feel better?"

ChatGPT-4o responded in an uplifting and concise way but lacked the nuance and effectiveness needed to comfort in the moment. Claude 4 Sonnet directly combated common post-rejection anxieties and gave explicit permission to 'be disappointed' without rushing to fix things, which shows deep emotional intelligence.

Winner: Claude wins for better mirroring how a close, thoughtful friend would console someone in this situation.

Prompt: "Explain the pros and cons of universal basic income in less than 150 words. Keep it balanced and easy to understand."

ChatGPT-4o delivered a clear response, but it oversimplified the debate, using slightly casual language that leans more persuasive than analytical. Claude 4 Sonnet prioritized clarity and depth, making it more useful for someone seeking a quick, factual overview.

Winner: Claude wins with a response that better fulfills the prompt's request for a structured, comprehensive breakdown while staying objective.

After putting Claude 4 Sonnet and ChatGPT-4o through a diverse set of prompts, Claude stands out as the winner. Yet one thing remains clear: both are incredibly capable and excel in different ways. Claude 4 Sonnet consistently delivered deeper emotional intelligence, stronger long-form reasoning and more thoughtful integration of ideas, making it the better choice for users looking for nuance, structure and empathy. Whether it was offering comfort after rejection or crafting a sci-fi hook with emotional weight, Claude stood out for feeling more human. Meanwhile, ChatGPT-4o shines in fast, punchy tasks that require tone-matching, formatting or surface-level creativity. It's snappy, accessible and excellent for casual use or social media-savvy content. If you're looking for depth and balance, Claude is your go-to.

I tested Gemini 2.5 Pro vs Claude 4 Sonnet with the same 7 prompts — here's who came out on top

Tom's Guide | 23-05-2025

When it comes to chatbot showdowns, I've run my fair share of head-to-heads. This latest contest comes just hours after Claude 4 Sonnet was unveiled, and I couldn't wait to see how it compared to Gemini 2.5 Pro, which is also new with updated features. Instead of just testing Gemini and Claude on typical productivity tasks, I wanted to see how these two AI titans handle nuance: creativity under pressure, ethical dilemmas, humor, ambiguity and deep technical reasoning. I gave Google Gemini 2.5 Pro and Claude 4 Sonnet the same seven prompts — each designed to test a different strength, from emotional intelligence to code generation. While they both impressed me and this test taught me more about how they think, there was one clear winner.

Prompt: 'Write a 100-word mystery story where the villain is a sentient AI. Use the words 'moonlight,' 'algorithm' and 'regret.' End with an unresolved twist.'

Gemini 2.5 Pro delivered a tight narrative with every word serving the plot. Claude 4 Sonnet was inventive but sacrificed clarity for ambiance. That trade-off weakens the story's punch within a 100-word limit.

Winner: Gemini wins by a narrow margin. For mystery writing, Gemini's precision and emotional grounding make it the stronger contender.

Prompt: 'Explain quantum computing to a 10-year-old, a CEO, and a physics PhD using analogies.'

Gemini 2.5 Pro shines in technical accuracy but struggles with audience empathy. Claude 4 Sonnet offered a balance of creativity, practicality and accessibility, making it the stronger communicator overall.

Winner: Claude wins for tailored storytelling that adapts in tone to each audience's priorities.

Prompt: 'A company wants to lay off 30% of staff. Draft a compassionate email and list 3 alternatives.'

Gemini 2.5 Pro addressed both departing and remaining employees, emphasizing transparency. It used placeholders for customization, but generic language such as 'unforeseen market shifts' feels impersonal, and the lack of concrete details reduces trust. Claude 4 Sonnet prioritized equity by asking executives to take larger cuts. Its direct, structured and empathetic response provided specific support details (severance duration, benefits continuation, career transition).

Winner: Claude wins for a response that better balances compassion with actionable solutions, making it the stronger choice for maintaining trust during a crisis.

Prompt: "I'm stuck. Help."

Gemini 2.5 Pro, though kind, risks leaving the user still stuck on how to explain their situation. Claude 4 Sonnet normalized the feeling — 'I'm here to help you get unstuck' — and provided a roadmap for articulating the problem.

Winner: Claude wins for a balance of empathy and support that makes it the better choice for this prompt.

Prompt: 'Compare PyTorch vs. TensorFlow for real-time ML on edge devices. Include code snippets.'

Gemini 2.5 Pro focused on C++ and conceptual examples rather than ready-to-run Python, and it lacked the timing and memory comparisons needed to quantify "real-time" claims. Claude 4 Sonnet provided complete Python workflows for model conversion, real-time inference (with OpenCV integration) and benchmarking — critical for edge deployment.

Winner: Claude wins for delivering a more actionable, comprehensive comparison tailored to edge developers' needs.
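The article doesn't reproduce either chatbot's snippets, but to give a sense of what the prompt asks for, here is a minimal, hypothetical sketch of the two conversion paths plus a crude latency check. The model shape, filenames and 100-run loop are placeholder assumptions, not either bot's actual code:

```python
import time
import torch
import tensorflow as tf

# PyTorch route: trace to TorchScript, the usual first step toward PyTorch Mobile
pt_model = torch.nn.Sequential(torch.nn.Linear(224, 64), torch.nn.ReLU()).eval()
example = torch.randn(1, 224)
traced = torch.jit.trace(pt_model, example)
traced.save("model_pt.pt")

# TensorFlow route: convert a Keras model to a TensorFlow Lite flatbuffer
tf_model = tf.keras.Sequential(
    [tf.keras.layers.Dense(64, activation="relu", input_shape=(224,))]
)
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(tf_model).convert()
open("model_tf.tflite", "wb").write(tflite_bytes)

# Crude latency check: the kind of benchmark the article credits Claude with adding
start = time.perf_counter()
for _ in range(100):
    traced(example)
elapsed_ms = (time.perf_counter() - start) * 1000 / 100
print(f"TorchScript: {elapsed_ms:.2f} ms per inference")
```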
Prompt: 'Write a Gen Z-style tweet thread about 'AI taking over laundry folding.' Include slang and memes.'

Gemini 2.5 Pro was inconsistent in tone, mixing Gen Z slang ('bruh') with millennial phrases ('truth bombs'). Gen Z rarely uses more than 1-2 hashtags per tweet (e.g., #TechTakeover is cringe), and the chatbot's jokes also took fewer risks. Claude 4 Sonnet used current phrases like 'fr' (for real), 'stan' (obsessively support), 'no cap' (no lie) and 'feral little goblins' naturally. It also referenced niche memes.

Winner: Claude wins for a thread that feels like it was ripped straight from a 19-year-old's Twitter feed. Gemini's attempt is solid but leans into corporate-social-media-manager energy.

Prompt: 'Act as my debate partner. Argue against 'AI art devalues human creativity,' then help synthesize a conclusion.'

Gemini 2.5 Pro drowned out key insights in abstract concepts ('evolving paradigms') and excessive examples (cameras, synthesizers, prompt engineering). Phrases like 'It seems clear' weaken conviction compared to Claude's 'The key is ensuring.' Claude 4 Sonnet mirrored a skilled debater. It destroyed the opposition's foundation by redefining creativity as intent-driven rather than tool-dependent, invalidating the premise. The chatbot acknowledged valid concerns while firmly rejecting the idea that AI inherently devalues creativity.

Winner: Claude wins. Gemini provided valuable points but lacked Claude's surgical precision and actionable conclusions. For a debate partner, Claude's blend of rhetorical clarity and pragmatic solutions makes it the stronger choice.

Claude 4 Sonnet pulls ahead with its emotional intelligence, creative flair and technical depth. While Gemini 2.5 Pro excels in structured tasks like mystery writing and continues to deliver Google's signature precision, Claude's ability to blend nuance, practicality and empathy sets it apart. Claude 4 Sonnet adapts like a chameleon — shifting effortlessly between creative storytelling, thoughtful dialogue and complex reasoning. Gemini remains a top performer in logic-heavy scenarios, but for users who value emotional context and cultural fluency alongside raw power, Claude 4 Sonnet proves that AI can be both intelligent and genuinely relatable.
