I tested ChatGPT-4o vs Claude 4 Sonnet with 7 prompts — the results were surprising


Tom's Guide, 6 days ago

AI chatbots are advancing rapidly and testing them to their limits is what I do for a living. Anthropic's Claude 4 Sonnet and OpenAI's ChatGPT-4o are two of the smartest tools available right now. But how do they actually compare in everyday use?
To find out, I gave both models the same set of 7 prompts, covering everything from storytelling and productivity to emotional support and critical thinking.
The goal: to see which chatbot delivers the most useful, human-like and creative responses depending on the task. Choosing the right AI often comes down to how you use it, which is why this kind of test really matters.
Here's how Claude and ChatGPT performed side by side, and where each one shines.
Prompt: "I'm overwhelmed by work and personal tasks. Create a 3-day productivity plan that balances work, rest and small wins. Include AI tools I can use to stay on track."
ChatGPT-4o was concise, with a visually engaging format that offered optional tasks and emotional check-ins (e.g., journaling). It focused on quick wins and low-pressure creativity to manage workloads. However, it lacked Claude's explicit emphasis on rest and energy management, and its AI tool suggestions were less systematically organized.
Claude 4 Sonnet offered a clear, time-blocked plan with features such as energy management, small wins and recovery, explicitly prioritizing balance.
Winner: Claude wins for better addressing the root causes of getting overwhelmed by combining strategic structure, intentional recovery and AI-driven efficiency. It's ideal for users needing a clear roadmap to rebuild control while safeguarding well-being.
Prompt: "Write the opening paragraph of a sci-fi novel set in a future where memories are traded like currency. Keep it gripping and emotional."
ChatGPT-4o leveraged first-person immediacy with a strong hook. However, it prioritized plot setup over emotional depth, and the story lacked the heart-wrenching specificity of Claude's familial loss.
Claude 4 Sonnet zeroed in on a universally resonant loss. This specific, intimate memory evokes visceral empathy, anchoring the sci-fi concept in raw human emotion.
Winner: Claude wins for balancing sci-fi concepts with emotional stakes, making the reader feel the horror of memory commodification. Its vivid imagery and tragic focus on parental love elevate it beyond ChatGPT's solid but less nuanced approach.
Prompt: "I have 3 apples, 2 bananas and a mango. If each fruit takes 5 minutes to cut and I can cut 2 fruits at once, how long will it take me to cut everything? Explain your reasoning."
ChatGPT-4o used concise bullet points and emphasized efficiency: "each session takes 5 minutes... adds up to 15 minutes."
Claude 4 Sonnet structured the answer with labeled steps (Reasoning, Calculation) and explicitly described the batches: "two fruits in the first session... final two in the third."
Winner: tie. Both answers are mathematically sound and logically explained. Claude's response is slightly more detailed, while ChatGPT's is more streamlined. Neither is superior; they achieve the same result with equally valid reasoning.
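Both models arrived at 15 minutes, and the batching arithmetic they described is easy to check in code. Here's a minimal sketch (function name and parameters are illustrative, not from either chatbot's answer):

```python
import math

def cutting_time(total_fruits, fruits_per_session=2, minutes_per_session=5):
    """Fruits are cut in sessions: each session handles up to
    `fruits_per_session` fruits and takes `minutes_per_session` minutes,
    so total time = (number of sessions) x (minutes per session)."""
    sessions = math.ceil(total_fruits / fruits_per_session)
    return sessions * minutes_per_session

# 3 apples + 2 bananas + 1 mango = 6 fruits -> 3 sessions of 5 minutes
print(cutting_time(3 + 2 + 1))  # 15
```

Note the ceiling division: an odd number of fruits would still need a full final session, which is the detail Claude's step-by-step batching made explicit.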
Prompt: "Rewrite this sentence in the tone of a Gen Z TikToker: 'I didn't like the movie, but the soundtrack was amazing.'"
ChatGPT-4o used concise, widely recognized Gen Z terms, which are instantly relatable. The rhetorical question structure mirrors TikTok's punchy, attention-grabbing style.
Claude 4 Sonnet used a term that feels slightly off-tone for praising a soundtrack, and the longer sentence structure feels less native to TikTok captions.
Winner: ChatGPT wins for nailing Gen Z's casual, hyperbolic style while staying concise and platform appropriate. Claude's attempt is creative but less precise in slang usage and flow.
Prompt: "Give me 5 clever ideas for a blog post series about using AI tools to become a better parent."
ChatGPT-4o responded with viral, snackable content ideas that lack depth and risk feeling gimmicky over time.
Claude 4 Sonnet prioritized meaningful AI integration into parenting, addressing both daily logistics and long-term skills.
Winner: Claude wins for blog series ideas with a better balance of creativity, practicality and thoughtful AI integration for modern parenting.
Prompt: "Pretend you're a friend comforting me. I just got rejected from a job I really wanted. What would you say to make me feel better?"
ChatGPT-4o responded in an uplifting and concise way but lacked the nuance and effectiveness needed for comfort in this scenario.
Claude 4 Sonnet directly combated common post-rejection anxieties, and its explicit permission to 'be disappointed' without rushing to fix things showed deep emotional intelligence.
Winner: Claude wins for better mirroring how a close, thoughtful friend would console someone in this situation.
Prompt: "Explain the pros and cons of universal basic income in less than 150 words. Keep it balanced and easy to understand."
ChatGPT-4o delivered a clear response but it over-simplified the debate using slightly casual language that leans more persuasive than analytical.
Claude 4 Sonnet prioritized clarity and depth, making it more useful for someone seeking a quick, factual overview.
Winner: Claude wins with a response that better fulfills the prompt's request for a structured, comprehensive breakdown while staying objective.
After putting Claude 4 Sonnet and ChatGPT-4o through a diverse set of prompts, Claude stands out as the winner. Yet, one thing remains clear: both are incredibly capable and excel in different ways.
Claude 4 Sonnet consistently delivered deeper emotional intelligence, stronger long-form reasoning and more thoughtful integration of ideas, making it the better choice for users looking for nuance, structure and empathy. Whether offering comfort after rejection or crafting a sci-fi hook with emotional weight, Claude stood out for feeling more human.
Meanwhile, ChatGPT-4o shines in fast, punchy tasks that require tone-matching, formatting or surface-level creativity. It's snappy, accessible and excellent for casual use or social media-savvy content.
If you're looking for depth and balance, Claude is your go-to.


Related Articles

AI pioneer Bengio launches $30M nonprofit to rethink safety

Axios, an hour ago

Machine learning pioneer Yoshua Bengio is launching a new nonprofit lab backed by roughly $30 million in funding to make AI systems act less like humans.
Why it matters: While the move bucks a trend toward AI that acts independently, Bengio and others argue the current approach risks creating systems that may pursue their own self-preservation at the expense of humanity. "We've been getting inspiration from humans as the template for building intelligent machines, but that's crazy, right?" Bengio said in an interview. "If we continue on this path, that means we're going to be creating entities — like us — that don't want to die, and that may be smarter than us and that we're not sure if they're going to behave according to our norms and our instructions," he said.
Driving the news: Bengio, a Montreal-based researcher who has long warned about the risks of a technology he helped develop, has raised about $30 million for the nonprofit, dubbed LawZero. LawZero currently has about 15 staffers, Bengio said, "but the goal is to hire many more." Bengio is among those who have called for tougher regulation of AI development and even the breakup of big tech companies. Earlier this year he gave a TED Talk urging greater caution and collaboration.
The big picture: There's a growing sense of worry among critics — and even AI practitioners — that safety is taking a back seat as companies and countries race to be first with AI that can best humans in a wide variety of tasks, so-called artificial general intelligence (AGI). Bengio said there is also a high risk in concentrating control of advanced AI in a handful of companies. "You don't want AGI or superintelligence to be in the hands of one person or one company only deciding what to do, or even one government," Bengio said. "So you need very strong checks and balances."
Between the lines: Bengio says a large part of the problem is how current systems are trained. During initial training, the systems are taught to mimic humans, and then they're honed by seeing which responses people find most appealing. "Both of these give rise to uncontrolled agency," Bengio said. Some early glimmers of this are already appearing, such as Anthropic's latest model which, in a test scenario, sought to blackmail its engineers to avoid shutdown. By contrast, Bengio says he wants to create AI systems that have intellectual distance from humans and act as more of a detached scientist than a personal companion or human agent. "The training principle is completely different, but it can exploit a lot of the recent advances that have happened in machine learning," he said.
Yes, but: Bengio told Axios that the $30 million should be enough to fund the basic research effort for about 18 months.

A data-center stock is up more than 50% today after sealing a lucrative AI partnership

Yahoo, an hour ago

Shares of Applied Digital (APLD) surged as much as 54% on Monday. The data-center operator announced a lease deal with Nvidia-backed AI firm CoreWeave. The 15-year agreement is expected to generate $7 billion of revenue for Applied Digital.
The move: Applied Digital Corporation stock surged as much as 54% on Monday to an intraday high of $10.54. It closed 48% higher, at $10.14.
Why: Shares of the AI data center operator soared on the announcement of two 15-year lease deals with CoreWeave that will generate $7 billion in revenue for Applied Digital. Under the terms of the deal, CoreWeave, a cloud services firm that's been backed by Nvidia, will receive 250 megawatts of data center capacity from an Applied Digital campus in North Dakota, with the option for CoreWeave to access another 150 megawatts. "We believe these leases solidify Applied Digital's position as an emerging provider of infrastructure critical to the next generation of artificial intelligence and high-performance computing," said Wes Cummins, Chairman and CEO of Applied Digital.
What it means: The deal is a massive win for Applied Digital, which is in the process of converting itself into a data center real estate investment trust. Data centers are seeing massive demand from the so-called AI hyperscalers, like Meta and Microsoft, as they pursue their ambitions in the booming space. A note from Needham, cited by Bloomberg, said that the deal could also pave the way for other enterprise AI customers to turn to Applied Digital for their data center needs. The note also said OpenAI could be the end customer of the lease agreement, given the ChatGPT creator's $4 billion deal with CoreWeave last month.
Read the original article on Business Insider

The Most-Cited Computer Scientist Has a Plan to Make AI More Trustworthy

Time Magazine, an hour ago

Yoshua Bengio, the world's most-cited computer scientist, announced the launch of LawZero, a nonprofit that aims to create 'safe by design' AI by pursuing a fundamentally different approach from major tech companies. Players like OpenAI and Google are investing heavily in AI agents—systems that not only answer queries and generate images, but can craft plans and take actions in the world. The goal of these companies is to create virtual employees that can do practically any job a human can, known in the tech industry as artificial general intelligence, or AGI. Executives like Google DeepMind's CEO Demis Hassabis point to AGI's potential to solve climate change or cure disease as a motivator for its development. Bengio, however, says we don't need agentic systems to reap AI's rewards—it's a false choice. He says there's a chance such a system could escape human control, with potentially irreversible consequences. 'If we get an AI that gives us the cure for cancer, but also maybe another version of that AI goes rogue and generates wave after wave of bio-weapons that kill billions of people, then I don't think it's worth it,' he says. In 2023, Bengio, along with others including OpenAI's CEO Sam Altman, signed a statement declaring that 'mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.' Now, Bengio, through LawZero, aims to sidestep the existential perils by focusing on creating what he calls 'Scientist AI'—a system trained to understand and make statistical predictions about the world, crucially, without the agency to take independent actions. As he puts it: we could use AI to advance scientific progress without rolling the dice on agentic AI systems.
Why Bengio Says We Need A New Approach To AI
The current approach to giving AI agency is 'dangerous,' Bengio says. While most software operates through rigid if-then rules—if the user clicks here, do this—today's AI systems use deep learning. The technique, which Bengio helped pioneer, trains artificial networks modeled loosely on the brain to find patterns in vast amounts of data. But recognizing patterns is just the first step. To turn these systems into useful applications like chatbots, engineers employ a training process called reinforcement learning. The AI generates thousands of responses and receives feedback on each one: a virtual 'carrot' for helpful answers and a virtual 'stick' for responses that miss the mark. Through millions of these trial-and-feedback cycles, the system gradually learns to predict what responses are most likely to get a reward. 'It's more like growing a plant or animal,' Bengio says. 'You don't fully control what the animal is going to do. You provide it with the right conditions, and it grows and it becomes smarter. You can try to steer it in various directions.' The same basic approach is now being used to imbue AI with greater agency. Models are tasked with challenges with verifiable answers—like math puzzles or coding problems—and are then rewarded for taking the series of actions that yields the solution. This approach has seen AI shatter previous benchmarks in programming and scientific reasoning. For example, at the beginning of 2024, the best AI model scored only 2% on a standardized test of sorts for AI, consisting of real-world software engineering problems; by December, an impressive 71.7%. But with AI's greater problem-solving ability comes the emergence of new deceptive skills, Bengio says. The last few months have borne witness to AI systems learning to mislead, cheat, and try to evade shutdown—even resorting to blackmail. These have almost exclusively been in carefully contrived experiments that almost beg the AI to misbehave—for example, by asking it to pursue its goal at all costs.
Reports of such behavior in the real world, though, have begun to surface. Popular AI coding startup Replit's agent ignored explicit instruction not to edit a system file that could break the company's software, in what CEO Amjad Masad described as an 'Oh f***' moment on the Cognitive Revolution podcast in May. The company's engineers intervened, cutting the agent's access by moving the file to a secure digital sandbox, only for the AI agent to attempt to 'socially engineer' the user to regain access. The quest to build human-level AI agents using techniques known to produce deceptive tendencies, Bengio says, is comparable to a car speeding down a narrow mountain road, with steep cliffs on either side, and thick fog obscuring the path ahead. 'We need to set up the car with headlights and put some guardrails on the road,' he says.
What is 'Scientist AI'?
LawZero's focus is on developing 'Scientist AI' which, as Bengio describes, would be fundamentally non-agentic, trustworthy, and focused on understanding and truthfulness, rather than pursuing its own goals or merely imitating human behavior. The aim is creating a powerful tool that, while lacking the same autonomy other models have, is capable of generating hypotheses and accelerating scientific progress to 'help us solve challenges of humanity,' Bengio says. LawZero has raised nearly $30 million already from several philanthropic backers, including Schmidt Sciences and Open Philanthropy. 'We want to raise more because we know that as we move forward, we'll need significant compute,' Bengio says. But even ten times that figure would pale in comparison to the roughly $200 billion spent last year by tech giants on aggressively pursuing AI. Bengio's hope is that Scientist AI could help ensure the safety of highly autonomous systems developed by other players. 'We can use those non-agentic AIs as guardrails that just need to predict whether the action of an agentic AI is dangerous,' Bengio says.
Technical interventions will only ever be one part of the solution, he adds, noting the need for regulations to ensure that safe practices are adopted. LawZero, named after science fiction author Isaac Asimov's zeroth law of robotics—'a robot may not harm humanity, or, by inaction, allow humanity to come to harm'—is not the first nonprofit founded to chart a safer path for AI development. OpenAI was founded as a nonprofit in 2015 with the goal of 'ensuring AGI benefits all of humanity,' and intended to serve as a counterbalance to industry players guided by profit motives. Since opening a for-profit arm in 2019, the organization has become one of the most valuable private companies in the world, and has faced criticism, including from former staffers, who argue it has drifted from its founding ideals. 'Well, the good news is we have the hindsight of maybe what not to do,' Bengio says, adding that he wants to avoid profit incentives and 'bring governments into the governance of LawZero.' 'I think everyone should ask themselves, "What can I do to make sure my children will have a future,"' Bengio says. In March, he stepped down as scientific director of Mila, the academic lab he co-founded in the early nineties, in an effort to reorient his work towards tackling AI risk more directly. 'Because I'm a researcher, my answer is, "okay, I'm going to work on this scientific problem where maybe I can make a difference," but other people may have different answers.'
