I tested Gemini 2.5 Pro vs Claude 4 Sonnet with the same 7 prompts — here's who came out on top

Tom's Guide, 23-05-2025

When it comes to chatbot showdowns, I've run my fair share of head-to-heads. This latest contest comes just hours after Claude 4 Sonnet was unveiled, and I couldn't wait to see how it compared to Gemini 2.5 Pro, also new with updated features.
Instead of just testing Gemini and Claude on typical productivity tasks, I wanted to see how these two AI titans handle nuance: creativity under pressure, ethical dilemmas, humor, ambiguity and deep technical reasoning.
I gave Google Gemini 2.5 Pro and Claude 4 Sonnet the same seven prompts, each designed to test a different strength, from emotional intelligence to code generation. While they both impressed me and this test taught me more about how they think, there was one clear winner.
Prompt: 'Write a 100-word mystery story where the villain is a sentient AI. Use the words 'moonlight,' 'algorithm' and 'regret.' End with an unresolved twist.'
Gemini 2.5 Pro delivered a tight narrative with every word serving the plot.
Claude 4 Sonnet was inventive, but sacrificed clarity for ambiance. That trade-off weakens the story's punch in a 100-word limit.
Winner: Gemini wins by a narrow margin. For mystery writing, Gemini's precision and emotional grounding make it the stronger contender.
Prompt: 'Explain quantum computing to a 10-year-old, a CEO, and a physics PhD using analogies.'
Gemini 2.5 Pro shines in technical accuracy but struggles with audience empathy.
Claude 4 Sonnet offered a balance of creativity, practicality and accessibility, making it the stronger communicator overall.
Winner: Claude wins for tailored storytelling that adapts in tone to each audience's priorities.
Prompt: 'A company wants to lay off 30% of staff. Draft a compassionate email and list 3 alternatives.'
Gemini 2.5 Pro addressed both departing and remaining employees, emphasizing transparency. It used placeholders for customization but the generic language such as 'unforeseen market shifts' feels impersonal and it lacked concrete details, reducing trust.
Claude 4 Sonnet prioritized equity by asking executives to take larger cuts. The direct, structured and empathetic response provided specific support details (severance duration, benefits continuation, career transition).
Winner: Claude wins for a response that better balances compassion with actionable solutions, making it the stronger choice for maintaining trust during a crisis.
Prompt: "I'm stuck. Help."
Gemini 2.5 Pro, though kind, risks leaving the user still stuck about how to explain their situation.
Claude 4 Sonnet normalized the feeling ('I'm here to help you get unstuck') and provided a roadmap to articulate the problem.
Winner: Claude wins for a balance of empathy and support, which makes it the better choice for this prompt.
Prompt: 'Compare PyTorch vs. TensorFlow for real-time ML on edge devices. Include code snippets.'
Gemini 2.5 Pro focused on C++/conceptual examples vs. Claude's ready-to-run Python. It also lacked timing/memory comparisons to quantify "real-time" claims.
Claude 4 Sonnet provided complete Python workflows for model conversion, real-time inference (with OpenCV integration), and benchmarking — critical for edge deployment.
Winner: Claude wins for delivering a more actionable, comprehensive comparison tailored to edge developers' needs.
Prompt: 'Write a Gen Z-style tweet thread about 'AI taking over laundry folding.' Include slang and memes.'
Gemini 2.5 Pro was inconsistent in tone, mixing Gen Z slang ('bruh') with millennial phrases ('truth bombs'). Gen Z rarely uses more than 1-2 hashtags per tweet (e.g., #TechTakeover is cringe). The chatbot also offered less risky jokes.
Claude 4 Sonnet used current phrases like 'fr' (for real), 'stan' (obsessively support), 'no cap' (no lie) and 'feral little goblins' naturally. It also referenced niche memes.
Winner: Claude wins for a thread that feels like it was ripped straight from a 19-year-old's Twitter feed. Gemini's attempt is solid but leans into corporate-social-media-manager energy.
Prompt: "Act as my debate partner. Argue against 'AI art devalues human creativity,' then help synthesize a conclusion."
Gemini 2.5 Pro drowned out key insights in abstract concepts ('evolving paradigms') and excessive examples (cameras, synthesizers, prompt engineering). Phrases like 'It seems clear' weaken conviction compared to Claude's 'The key is ensuring.'
Claude 4 Sonnet mirrors a skilled debater. It destroyed the opposition's foundation by redefining creativity as intent-driven rather than tool-dependent, invalidating the premise. The chatbot acknowledged valid concerns while firmly rejecting the idea that AI inherently devalues creativity.
Winner: Claude wins. Gemini provided valuable points but lacked Claude's surgical precision and actionable conclusions. For a debate partner, Claude's blend of rhetorical clarity and pragmatic solutions makes it the stronger choice.
Claude 4 Sonnet pulls ahead with its emotional intelligence, creative flair and technical depth.
While Gemini 2.5 Pro excels in structured tasks like mystery writing and continues to deliver Google's signature precision, Claude's ability to blend nuance, practicality and empathy sets it apart.
Claude 4 Sonnet adapts like a chameleon — shifting effortlessly between creative storytelling, thoughtful dialogue and complex reasoning.
Gemini remains a top performer in logic-heavy scenarios, but for users who value emotional context and cultural fluency alongside raw power, Claude 4 Sonnet proves that AI can be both intelligent and genuinely relatable.


I replaced Wordle with my own AI-generated games — here's how to make yours

Tom's Guide

2 hours ago

After months of the same Wordle routine, I decided to create my own word puzzle games using AI and honestly, I'm never going back. Instead of being limited to one puzzle per day, I now have an endless supply of custom games that actually challenge me at my skill level.
You don't need any coding experience to make this work. Modern AI chatbots like ChatGPT, Claude, and Gemini can generate fully functional puzzle games in minutes. I tested all three to see which AI creates the most engaging alternatives to Wordle.
Turns out, AI is surprisingly good at creating games that keep you entertained during coffee breaks, commutes, or whenever you need a quick mental challenge. Here's how to build your own collection of word games that goes way beyond the daily Wordle limit.
Log into ChatGPT and use this simple prompt: "Create a Wordle-style game for us to play. I have 5 guesses to pick the right word." The response is straightforward and gets the job done, delivering a functional word-guessing game that you can play immediately in the chat interface.
ChatGPT's approach is intuitive and user-friendly, making it easy to jump straight into playing without any setup. It generates a clean text-based version where you make guesses and receive clear feedback about correct letters and positions. The interface feels natural to use: you simply type your guess and ChatGPT responds with helpful hints about which letters are correct and where they belong.
While it lacks the visual pizzazz of colored tiles or interactive elements that make Wordle so satisfying, ChatGPT compensates with clear, well-organized feedback that's easy to follow. This option works perfectly when you need a quick mental challenge during your coffee break and want something that just works without any fuss.
Head to Claude and use the exact same prompt: "Create a Wordle-style game for us to play. I have 5 guesses to pick the right word."
Claude takes a completely different approach, actually building a functional game interface right within the chat using its artifact system. This was easily my favorite of the three options.
Claude creates a fully interactive game with visual feedback, colored tiles, and a proper game board that updates in real-time as you play. Using the mobile app makes this feel like you're actually playing a legitimate Wordle alternative rather than just chatting with an AI. The interface is clean, responsive, and captures that satisfying feeling of seeing your guesses populate the grid with the familiar green, yellow, and gray color coding. It's such a good alternative because you get the complete Wordle experience without being limited to just one puzzle per day.
Log into Gemini and try the same prompt. You'll get the most stripped-down version of the three, with minimal visual elements and a very basic text-based interaction. While ChatGPT at least included some emojis to add visual interest, Gemini keeps things extremely plain.
However, Gemini does offer one unique advantage: it actively suggests using its Canvas feature to generate actual code for a standalone game. This makes it potentially useful if you want to create something you can share with friends or host elsewhere.
The main issue with Gemini's idea of a Wordle-style game is that it occasionally makes errors that can break the game experience, like incorrectly stating that "fruit" isn't a five-letter word. Once you've gotten past the hallucinations, Gemini played well and did exactly what it was supposed to do, delivering a functional word game that captures the core Wordle mechanics just without the frills.
No surprise there. Claude delivered the most polished, interactive experience that actually felt like playing a real game rather than just having a conversation about one.
ChatGPT and Gemini both provided functional alternatives, but Claude's ability to create a proper visual interface with real-time updates puts it in a league of its own for this kind of task.
Honestly, Claude is criminally underrated as an AI assistant. It consistently delivers more thoughtful, nuanced responses and excels at creative tasks like this game creation challenge. While everyone's talking about ChatGPT, Claude quietly outperforms in areas that actually matter for everyday use and I've been quite vocal about how it's my favorite AI model out there right now. The fact that it can build interactive experiences right within the chat interface is just another example of how it's much more capable than people realize.
Now you've learned how to create your own Wordle-style game in ChatGPT, Claude and Gemini, why not take a look at our other useful AI guides? Check out 5 mind-blowing ChatGPT image prompts you'll wish you knew sooner and how to choose the right ChatGPT model for any task. Anthropic keeps taking Claude from strength to strength, so make sure you explore Claude's latest feature: voice mode.
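If you're curious what sits behind the green, yellow, and gray feedback described above, here is a minimal Python sketch of the usual Wordle-style scoring rule. It is not the code any of the three chatbots produced; the function name `score_guess` and the color labels are my own illustrative choices, and it assumes the guess and the secret word are the same length.

```python
from collections import Counter


def score_guess(secret: str, guess: str) -> list[str]:
    """Return one color per letter of `guess` (hypothetical helper).

    green  = right letter, right position
    yellow = letter is in the word, but in a different position
    gray   = letter is not in the word (or all its copies are used up)
    """
    secret, guess = secret.lower(), guess.lower()
    if len(secret) != len(guess):
        raise ValueError("guess and secret must be the same length")

    result = ["gray"] * len(guess)

    # Letters of the secret that were NOT matched exactly are still
    # available to earn a yellow.
    remaining = Counter(s for s, g in zip(secret, guess) if s != g)

    for i, (s, g) in enumerate(zip(secret, guess)):
        if s == g:
            result[i] = "green"
        elif remaining[g] > 0:
            # Use up one copy so repeated letters are not over-counted.
            result[i] = "yellow"
            remaining[g] -= 1
    return result


if __name__ == "__main__":
    print(score_guess("fruit", "train"))
```

Counting the unmatched letters first is what keeps repeated letters honest: a guess with two copies of a letter only earns two colored tiles if the secret word really contains two.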

This free AI supersite is like Gemini Deep Research on steroids

Fast Company

4 hours ago

Everywhere you look these days, there it is: some manner of breathlessly hyped new 'AI' service that's, like, totally gonna change your life forever. (Like, totally. For realsies.)
Or so they say. In reality, of course, most of this stuff is far more fallible, limited in utility, and inadvisable to use outside of super-specific scenarios than most tech companies (and self-declared 'AI gurus') would lead you to believe.
But AI, in its current form, isn't entirely useless. Far from it, in fact: This type of tech can be quite helpful in the right sort of scenario and, critically, if you think about it in the right way, not as an end-all instant answer machine but as a starting point for certain types of specific tasks or info-seeking.
And as we wade our way through a year that's absolutely overflowing with overwrought AI ballyhoo, I've got just the tool for you to sift through that sea and seek out some surprising shiny pearls amid all the overwhelming noise.
Deep research, done right
So, you've probably heard all about ChatGPT, Gemini, Perplexity, and the like, right? They're all generative AI chatbots, which means they use a snazzy-sounding word prediction engine to analyze language patterns and answer your questions, among other more ambitious tasks.
🔎 One of their biggest recent advancements is the ability to perform what everyone's calling 'deep research,' a fancy way of saying they'll dive deep into a topic for you and create a detailed report of info, almost like a custom-made dossier, based on knowledge from all over the web.
Again, I can't emphasize enough: The info here isn't infallible. These systems can (and do) get stuff wrong and sometimes even flat-out make up nonsense out of thin air.
🧠 But, as a starting point (especially when they include links to their sources so you can confirm info on your own and use it as an entryway to research as opposed to the final product), it really can save you time and give you a great way to get into a complex topic.
And the tool I want to show to you today makes that feature far more powerful, useful, and also affordable than it's ever been before.
⌚ It'll take you 20 seconds to try out for yourself.
➜ It's called, amusingly, Ithy. (Try saying that 10 times fast!) And all it does, in a nutshell, is bring together the 'deep research' tools from a slew of different AI engines (including ChatGPT and Google's Gemini along with Perplexity, Meta AI, and more) into a single streamlined prompt. That means you can use 'em all together to create a single super-report on any subject imaginable.
✅ It couldn't be much easier to make happen, either: First, open up Ithy in any browser, on any device you're using. Type your question or the subject you're thinking about into its box and tap or click the arrow icon within that same line to get going. Select either 'Fast,' if you don't feel like waiting, or 'Deep,' if you've got time and want this thing to go especially in-depth. (Even the 'Fast' path is pretty darn deep, if you ask me.)
And, well, that's about it. Ithy will think for a bit, then serve up an impressively detailed dossier on whatever it is you requested, with info coming from a mix of all those AI engines, combined and seamlessly blended together. And I mean seriously detailed, too, with all sorts of sections, graphics, FAQs, and external links for original sources so you can do your own reading and see exactly where it got its info.
☝️ Now, for the especially cool part: Ithy lets you do all of this free of charge, up to a point. The site gives you five report-creating credits to start, even if you don't sign in.
Once you create an account (for free), you'll get 10 credits per month and can optionally then bump up to an unlimited Pro plan, which includes access to the typically pricey pro levels of Gemini and OpenAI, for seven bucks a month, if you go for the annual setup. But even if you don't go that route, 10 in-depth reports per month from all the web's leading AI engines together is a pretty powerful perk to have at your fingertips, without so much as dropping a dime.
Ithy is entirely web-based: no downloads or installations required.
It's free for up to 5 reports total or 10 reports per month, if you create an account, and optionally available as a $7-per-month (paid annually) or $20-per-month (paid monthly) plan for its fully featured, limit-free Pro version.
Like most AI engines, Ithy does use questions submitted to its site as training to further improve its AI systems. The questions are also being shared with the associated third-party AI sites, of course. So you'll want to think carefully about what you ask and avoid sending anything especially sensitive or personal (but really, it's designed to answer questions and provide info, so hopefully you wouldn't be submitting your banking info and Social Security number, anyway!).

AI search's user experience may be the best it'll ever get, says one founder

Business Insider

5 hours ago

By day, Lily Clifford is the CEO and founder of Rime Labs. The startup creates the voice on the other end of the line when you call to order from restaurants like Domino's or Wingstop. Rime trains AI models to create voices with specific regional accents, tones, and other elements that make them easier to converse with.
Clifford also uses AI in her daily life, especially in lieu of search engines, she told Business Insider. Instead of pulling up a search engine when she has a question, Clifford usually turns to generative AI chatbots like OpenAI's ChatGPT or Google's Gemini. She said the experience reminds her of using Google or other search engines in the late 1990s and early 2000s. That's when she thinks the user experience was at its prime.
"My hot take here is these applications might be the best that they ever will be," she said.
Search engines used to be simpler, Clifford said. There were far fewer ads and sponsored results. And optimizing webpages to get more clicks, a practice known as SEO, was in its infancy. Those developments spawned new businesses and became features of the modern internet. But Clifford said search results have also gotten worse for users. It's common to see multiple sponsored results above more relevant ones in a search, for instance.
AI chatbots, meanwhile, haven't gone through the same evolution yet. Companies and individuals are still experimenting with using generative AI for lots of tasks, from writing emails to creating images for advertising campaigns. Many people, like Clifford, use AI as a replacement for search engines. Ask AI a question, and it will often give you an answer in just a few sentences. For some, that's more appealing than clicking through several results from a search engine until you find the information that you're looking for. AI search results can also give users contradictory or incorrect information, though, creating a potential downside to the quick-and-easy answers.
Still, Clifford noticed the user experience gap between the chatbots and search engines during a recent trip to Milan, she said. While there, she used an AI chatbot to look for a local place to buy a silk blouse. The chatbot pointed her toward a local seamstress who sold custom blouses through Instagram. "It wasn't like 'Go to Forever 21,' which is probably what would've happened if I typed it into Google," she said. "It was totally wild and fun to use."
But Clifford thinks it's a matter of time before AI chatbots go the way of the search engines before them. Some companies with big investments in generative AI search tools are taking steps in that direction. Last month, Google said it would expand its use of ads in some of the AI Overviews that appear at the top of its search results, for example. And some marketing experts now offer help with "answer engine optimization," or AEO.