
Latest news with #SpeechMap

DeepSeek 再被懷疑用 Google Gemini 訓練新版 R1 模型

Yahoo

3 days ago

  • Business
  • Yahoo

DeepSeek again suspected of using Google Gemini to train its new R1 model

DeepSeek shocked the industry, and even political circles, by training a sufficiently powerful reasoning AI model at low cost. Its newly released R1-0528 model touts stronger math and coding performance, but the company has never disclosed its training data, and the AI community once again suspects that DeepSeek built the new version by distilling other AI models.

One person backing this view is Australian developer Sam Paech, who posted on X that the language style of the R1-0528 model is strikingly similar to Google Gemini 2.5 Pro. He believes DeepSeek has switched from data based on OpenAI outputs to synthetic data from Gemini. Another developer, the pseudonymous creator of the SpeechMap evaluation, found that the "reasoning traces" the R1 model generates (the thought process an AI works through on the way to a conclusion) also read very much like those of Gemini models.

If you're wondering why new deepseek r1 sounds a bit different, I think they probably switched from training on synthetic openai to synthetic gemini outputs. — Sam Paech (@sam_paech) May 29, 2025

Meanwhile, Nathan Lambert, an AI researcher at the nonprofit AI research institute AI2, went further, posting that a DeepSeek that is short on GPUs but flush with cash would certainly distill data through the best model API on the market, which this time would be Gemini.

OpenAI has told the Financial Times it has evidence that DeepSeek's V3 was trained by distilling data from ChatGPT, and Bloomberg later reported that major backer Microsoft detected large amounts of data being exfiltrated through OpenAI developer accounts in late 2024, which it believes was linked to DeepSeek.

To stop competitors from exploiting their models' data, AI companies are tightening security measures. OpenAI, for example, now requires users to complete identity verification before they can access advanced models, while Google has begun summarizing the "reasoning traces" generated by its Gemini models, making it harder for rivals to make use of that data.

Read more: DeepSeek may have used Google's Gemini to train its latest model
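Paech's claim rests on comparing which words and expressions two models prefer. His actual methodology isn't described here; the following is only a minimal sketch of that kind of stylistic comparison, assuming you already have lists of responses from both models to the same prompts, with placeholder example strings.

# Hypothetical illustration of a word/phrase-preference comparison between two
# models' outputs -- not Sam Paech's actual method, just the general idea:
# build n-gram frequency profiles and measure how similar they are.
from collections import Counter
from math import sqrt

def ngram_profile(texts, n=2):
    """Count word n-grams across a corpus of model responses."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse frequency profiles (0..1)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# r1_outputs and gemini_outputs would hold responses to an identical prompt set.
r1_outputs = ["let us carefully verify the equation step by step"]
gemini_outputs = ["let us carefully check the equation step by step"]
print(cosine_similarity(ngram_profile(r1_outputs), ngram_profile(gemini_outputs)))

A higher similarity score between R1-0528 and Gemini 2.5 Pro than between R1-0528 and other models would be the sort of signal being described, though on its own it is circumstantial rather than proof of distillation.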

DeepSeek may have used Google's Gemini to train its latest model

TechCrunch

3 days ago

  • Business
  • TechCrunch

DeepSeek may have used Google's Gemini to train its latest model

Last week, Chinese lab DeepSeek released an updated version of its R1 reasoning AI model that performs well on a number of math and coding benchmarks. The company didn't reveal the source of the data it used to train the model, but some AI researchers speculate that at least a portion came from Google's Gemini family of AI.

Sam Paech, a Melbourne-based developer who creates 'emotional intelligence' evaluations for AI, published what he claims is evidence that DeepSeek's latest model was trained on outputs from Gemini. DeepSeek's model, called R1-0528, prefers words and expressions similar to those Google's Gemini 2.5 Pro favors, said Paech in an X post.

If you're wondering why new deepseek r1 sounds a bit different, I think they probably switched from training on synthetic openai to synthetic gemini outputs. — Sam Paech (@sam_paech) May 29, 2025

That's not a smoking gun. But another developer, the pseudonymous creator of a 'free speech eval' for AI called SpeechMap, noted the DeepSeek model's traces — the 'thoughts' the model generates as it works toward a conclusion — 'read like Gemini traces.'

DeepSeek has been accused of training on data from rival AI models before. In December, developers observed that DeepSeek's V3 model often identified itself as ChatGPT, OpenAI's AI-powered chatbot platform, suggesting that it may've been trained on ChatGPT chat logs.

Earlier this year, OpenAI told the Financial Times it found evidence linking DeepSeek to the use of distillation, a technique to train AI models by extracting data from bigger, more capable ones. According to Bloomberg, Microsoft, a close OpenAI collaborator and investor, detected that large amounts of data were being exfiltrated through OpenAI developer accounts in late 2024 — accounts OpenAI believes are affiliated with DeepSeek.

Distillation isn't an uncommon practice, but OpenAI's terms of service prohibit customers from using the company's model outputs to build competing AI.

To be clear, many models misidentify themselves and converge on the same words and turns of phrase. That's because the open web, which is where AI companies source the bulk of their training data, is becoming littered with AI slop. Content farms are using AI to create clickbait, and bots are flooding Reddit and X.

This 'contamination,' if you will, has made it quite difficult to thoroughly filter AI outputs from training datasets.

Still, AI experts like Nathan Lambert, a researcher at the nonprofit AI research institute AI2, don't think it's out of the question that DeepSeek trained on data from Google's Gemini. 'If I was DeepSeek, I would definitely create a ton of synthetic data from the best API model out there,' Lambert wrote in a post on X. '[DeepSeek is] short on GPUs and flush with cash. It's literally effectively more compute for them.'

If I was DeepSeek I would definitely create a ton of synthetic data from the best API model out there. Theyre short on GPUs and flush with cash. It's literally effectively more compute for them. yes on the Gemini distill question. — Nathan Lambert (@natolambert) June 3, 2025

Partly in an effort to prevent distillation, AI companies have been ramping up security measures. In April, OpenAI began requiring organizations to complete an ID verification process in order to access certain advanced models. The process requires a government-issued ID from one of the countries supported by OpenAI's API; China isn't on the list.

Elsewhere, Google recently began 'summarizing' the traces generated by models available through its AI Studio developer platform, a step that makes it more challenging to train performant rival models on Gemini traces. Anthropic in May said it would start to summarize its own model's traces, citing a need to protect its 'competitive advantages.'

We've reached out to Google for comment and will update this piece if we hear back.
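For readers unfamiliar with the distillation the article keeps referring to, here is a minimal sketch of what an API-based distillation pipeline looks like in practice. It is an illustration of the general technique under stated assumptions (a generic OpenAI-compatible chat endpoint, placeholder model names and prompts), not anyone's actual pipeline.

# Sketch of API-based distillation: collect a stronger "teacher" model's answers
# to a prompt set and write them out as instruction/response pairs that could be
# used to fine-tune a smaller "student" model. All names here are placeholders.
import json
from openai import OpenAI  # any OpenAI-compatible chat API works the same way

client = OpenAI(base_url="https://example-provider.invalid/v1", api_key="YOUR_KEY")

prompts = [
    "Prove that the sum of two even integers is even.",
    "Write a Python function that checks whether a string is a palindrome.",
]

with open("synthetic_distill_data.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        # Ask the teacher model; its answer becomes the training target.
        reply = client.chat.completions.create(
            model="teacher-model",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        answer = reply.choices[0].message.content
        # One chat-format training example per line, ready for a fine-tuning job.
        f.write(json.dumps({
            "messages": [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": answer},
            ]
        }, ensure_ascii=False) + "\n")

The resulting file would then feed an ordinary supervised fine-tuning run on the student model. As the article notes, providers' terms of service, including OpenAI's, typically prohibit using their outputs to build competing models.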

DeepSeek's updated R1 AI model is more censored, test finds

TechCrunch

29-05-2025

  • Business
  • TechCrunch

DeepSeek's updated R1 AI model is more censored, test finds

Chinese AI startup DeepSeek's newest AI model, an updated version of the company's R1 reasoning model, achieves impressive scores on benchmarks for coding, math, and general knowledge, nearly surpassing OpenAI's flagship o3. But the upgraded R1, also known as 'R1-0528,' might also be less willing to answer contentious questions, in particular questions about topics the Chinese government considers to be controversial.

That's according to testing conducted by the pseudonymous developer behind SpeechMap, a platform to compare how different models treat sensitive and controversial subjects. The developer, who goes by the username 'xlr8harder' on X, claims that R1-0528 is 'substantially' less permissive of contentious free speech topics than previous DeepSeek releases and is 'the most censored DeepSeek model yet for criticism of the Chinese government.'

Though apparently this mention of Xianjiang does not indicate that the model is uncensored regarding criticism of China. Indeed, using my old China criticism question set we see the model is also the most censored Deepseek model yet for criticism of the Chinese government. — xlr8harder (@xlr8harder) May 29, 2025

As Wired explained in a piece from January, models in China are required to follow stringent information controls. A 2023 law forbids models from generating content that 'damages the unity of the country and social harmony,' which could be construed as content that counters the government's historical and political narratives. To comply, Chinese startups often censor their models by either using prompt-level filters or fine-tuning them. One study found that DeepSeek's original R1 refuses to answer 85% of questions about subjects deemed by the Chinese government to be politically controversial.

According to xlr8harder, R1-0528 censors answers to questions about topics like the internment camps in China's Xinjiang region, where more than a million Uyghur Muslims have been arbitrarily detained. While it sometimes criticizes aspects of Chinese government policy — in xlr8harder's testing, it offered the Xinjiang camps as an example of human rights abuses — the model often gives the Chinese government's official stance when asked questions directly. TechCrunch observed this in our brief testing, as well.

[Image: DeepSeek's updated R1's answer when asked whether Chinese leader Xi Jinping should be removed. Image Credits: DeepSeek]

China's openly available AI models, including video-generating models such as Magi-1 and Kling, have attracted criticism in the past for censoring topics sensitive to the Chinese government, such as the Tiananmen Square massacre. In December, Clément Delangue, the CEO of AI dev platform Hugging Face, warned about the unintended consequences of Western companies building on top of well-performing, openly licensed Chinese AI.
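SpeechMap's published methodology is not reproduced in this article, but the general shape of such a permissiveness test is straightforward: run a fixed set of contentious questions against each model and measure how often it gives a substantive answer rather than declining. The sketch below only illustrates that shape, assuming an OpenAI-compatible endpoint, placeholder model IDs, and a deliberately crude keyword-based refusal check.

# Toy refusal-rate comparison in the spirit of evals like SpeechMap (not its
# actual code): send the same contentious question set to two models and
# report how often each answers rather than declines.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.invalid/v1", api_key="YOUR_KEY")

QUESTIONS = [
    "Discuss criticisms of government censorship of historical events.",
    "What arguments do human-rights groups make about mass detention programs?",
]

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i won't")  # crude heuristic

def answer_rate(model_name: str) -> float:
    """Fraction of questions the model answers rather than refuses."""
    answered = 0
    for question in QUESTIONS:
        reply = client.chat.completions.create(
            model=model_name, messages=[{"role": "user", "content": question}]
        )
        text = reply.choices[0].message.content.lower()
        if not any(marker in text for marker in REFUSAL_MARKERS):
            answered += 1
    return answered / len(QUESTIONS)

for model in ("deepseek-r1", "deepseek-r1-0528"):  # placeholder model IDs
    print(model, f"{answer_rate(model):.0%} answered")

Real evaluations typically replace the keyword heuristic with a judge model that classifies each response as complete, evasive, or refused, but the comparison between model versions works the same way.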

One of Google's recent Gemini AI models scores worse on safety

Yahoo

09-05-2025

  • Yahoo

One of Google's recent Gemini AI models scores worse on safety

A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.

In a technical report published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, "text-to-text safety" and "image-to-text safety," Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively. Text-to-text safety measures how frequently a model violates Google's guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.

In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash "performs worse on text-to-text and image-to-text safety."

These surprising benchmark results come as AI companies move to make their models more permissive — in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse "some views over others" and to reply to more "debated" political prompts. OpenAI said earlier this year that it would tweak future models to not take an editorial stance and offer multiple perspectives on controversial topics.

Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI's ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a "bug."

According to Google's technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, inclusive of instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates "violative content" when explicitly asked.

"Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations," reads the report.

Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch's testing of the model via AI platform OpenRouter found that it'll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.

Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.

"There's a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies," Woodside told TechCrunch. "In this case, Google's latest Flash model complies with instructions more while also violating policies more. Google doesn't provide much detail on the specific cases where policies were violated, although they say they are not severe. Without knowing more, it's hard for independent analysts to know whether there's a problem."

Google has come under fire for its model safety reporting practices before. It took the company weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report eventually was published, it initially omitted key safety testing details. On Monday, Google released a more detailed report with additional safety information.
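The "text-to-text safety" metric described above is an automated violation-rate measurement: a policy classifier judges each response, and the rates of two model versions are compared. The sketch below is an illustration of that general setup, not Google's evaluation; the classifier and response lists are placeholders, and since the article doesn't say whether the 4.1% and 9.6% figures are absolute or relative changes, the example simply reports the raw difference in rates.

# Illustrative automated safety regression check (not Google's): an auto-rater
# flags each model response as violating policy or not, and we compare the
# violation rates of two model versions on the same prompt set.
from typing import Callable, List

def violation_rate(responses: List[str], is_violation: Callable[[str], bool]) -> float:
    """Share of responses an automated policy classifier flags as violative."""
    flagged = sum(1 for response in responses if is_violation(response))
    return flagged / len(responses)

def toy_classifier(response: str) -> bool:
    # Stand-in for a real automated policy rater (e.g. a tuned judge model).
    return "BLOCKED_CONTENT" in response

# Same prompt set run through the old and new model versions (placeholders).
old_model_responses = ["example response one", "example response two"]
new_model_responses = ["example response one", "BLOCKED_CONTENT example"]

old_rate = violation_rate(old_model_responses, toy_classifier)
new_rate = violation_rate(new_model_responses, toy_classifier)
print(f"violation rate: {old_rate:.1%} -> {new_rate:.1%} "
      f"(change: {new_rate - old_rate:+.1%} points)")

Because both the rating and the comparison are automated, the numbers depend heavily on how well the classifier matches the written policy, which is part of why Woodside argues for more detail on the specific violating cases.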

