DeepSeek claims its 'reasoning' model beats OpenAI's o1 on certain benchmarks
R1 is available from the AI dev platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench Verified. AIME draws on problems from the American Invitational Mathematics Examination, a competitive math exam, while MATH-500 is a collection of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.
Being a reasoning model, R1 effectively fact-checks itself, which helps it to avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer — usually seconds to minutes longer — to arrive at solutions compared to a typical nonreasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.
R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model's problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.
Indeed, 671 billion parameters is massive, but DeepSeek also released "distilled" versions of R1 ranging in size from 1.5 billion parameters to 70 billion parameters. The smallest can run on a laptop. As for the full R1, it requires beefier hardware, but it is available through DeepSeek's API at prices 90%-95% cheaper than OpenAI's o1.
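To put those sizes in perspective, here is a back-of-the-envelope sketch of how much memory the model weights alone would occupy. The parameter counts come from DeepSeek's figures above; the storage precisions (16-bit, 8-bit, and 4-bit) and the helper name `weight_memory_gb` are assumptions chosen for illustration, not numbers DeepSeek has published, and the estimate ignores the additional memory a model needs while it is running.

```python
# Rough estimate of the memory needed just to hold a model's weights.
# Assumption: every parameter is stored at the same precision. Real
# deployments also need memory for activations and caches, so treat
# these figures as lower bounds.

# Bytes used per parameter at each (assumed) storage precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts quoted in the article.
MODEL_SIZES = {
    "R1 distilled (smallest)": 1.5e9,
    "R1 distilled (largest)": 70e9,
    "R1 (full)": 671e9,
}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, params in MODEL_SIZES.items():
        estimates = ", ".join(
            f"{precision}: {weight_memory_gb(params, precision):,.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name}: {estimates}")
```

Under these assumptions, the 1.5 billion-parameter model needs roughly 3 GB for its weights at 16-bit precision, which is within reach of an ordinary laptop, while the full 671 billion-parameter model needs well over a terabyte at the same precision.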
Clem Delangue, the CEO of Hugging Face, said in a post on X on Monday that developers on the platform have created more than 500 "derivative" models of R1 that have racked up 2.5 million downloads combined — five times the number of downloads the official R1 has gotten.
https://twitter.com/ClementDelangue/status/1883946119723708764
There is a downside to R1. Being a Chinese model, it's subject to benchmarking by China's internet regulator to ensure that its responses "embody core socialist values." R1 won't answer questions about Tiananmen Square, for example, or Taiwan's autonomy.
Many Chinese AI systems, including other reasoning models, decline to respond to topics that might raise the ire of regulators in the country, such as speculation about the Xi Jinping regime.
R1 arrives days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already prevented from buying advanced AI chips, but if the new rules go into effect as written, companies will be faced with stricter caps on both the semiconductor tech and models needed to bootstrap sophisticated AI systems.
In a policy document last week, OpenAI urged the U.S. government to support the development of U.S. AI, lest Chinese models match or surpass American ones in capability. In an interview with The Information, OpenAI's VP of policy Chris Lehane singled out High-Flyer Capital Management, DeepSeek's corporate parent, as an organization of particular concern.
So far, at least three Chinese labs — DeepSeek, Alibaba, and Kimi, which is owned by Chinese unicorn Moonshot AI — have produced models that they claim rival o1. (Of note, DeepSeek was the first — it announced a preview of R1 in late November.) In a post on X, Dean Ball, an AI researcher at George Mason University, said that the trend suggests Chinese AI labs will continue to be "fast followers."
"The impressive performance of DeepSeek's distilled models [...] means that very capable reasoners will continue to proliferate widely and be runnable on local hardware," Ball wrote, "far from the eyes of any top-down control regime."
This story was originally published on January 20 and was updated on January 27 with more information.