
Latest news with #NativeSparseAttention

DeepSeek to share some AI model code, doubling down on open source

Zawya

21-02-2025

  • Business
  • Zawya


BEIJING - Chinese startup DeepSeek will make its models' code publicly available, it said on Friday, doubling down on its commitment to open-source artificial intelligence.

The company said in a post on social media platform X that it will open-source five code repositories next week, describing the move as "small but sincere progress" that it will share "with full transparency." "These humble building blocks in our online service have been documented, deployed and battle-tested in production," the post said.

DeepSeek rattled the global AI industry last month when it released its open-source R1 reasoning model, which rivaled Western systems in performance while being developed at a lower cost. The company's commitment to open source has distinguished it from most AI firms in China, which, like their U.S. rivals, lean towards closed-source models.

DeepSeek's low-key founder Liang Wenfeng said in a rare interview with a Chinese media outlet last July that the firm did not prioritize commercializing its AI models and that there was soft power to be gained from open source. "Having others follow your innovation gives a great sense of accomplishment," Liang said. "In fact, open source is more of a cultural behavior than a commercial one, and contributing to it earns us respect," he added.

The newly released open-source code will provide infrastructure to support the AI models that DeepSeek has already publicly shared, building on those existing open-source model frameworks. The announcement came after DeepSeek on Tuesday released a new algorithm called Native Sparse Attention (NSA), designed to make long-context training and inference more efficient.

DeepSeek's user base has exploded since last month. In China, it is the most popular chatbot service, with 22.2 million daily active users as of January 11, surpassing Doubao's 16.95 million users, according to a Chinese website that tracks AI products.
(Reporting by Liam Mo, Eduardo Baptista and Brenda Goh; Editing by Elaine Hardcastle)


DeepSeek founder provides clue on start-up's AI priorities in new technical study

South China Morning Post

21-02-2025

  • Business
  • South China Morning Post


DeepSeek has signalled its next development priorities in a new technical study – with founder and chief executive Liang Wenfeng among its 15 co-authors – that delves into 'native sparse attention' (NSA), a system touted to make artificial intelligence (AI) models more efficient when processing vast amounts of data.

The study, titled 'Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention', was published by DeepSeek on Sunday via arXiv – an online forum for professional members of the scientific community – just a day before Liang, 40, took part in a symposium with tech entrepreneurs hosted by Chinese President Xi Jinping in Beijing.

DeepSeek has sharpened its focus on research as worldwide attention on the Hangzhou-based start-up has increased, and is in no rush to conduct any fundraising or new commercial activities, according to a person with knowledge of the matter who declined to be identified.

The study shows how Liang and DeepSeek's young team of scientists are continuing to push the envelope in their industry, following the start-up's breakthrough development of advanced open-source AI models, V3 and R1, at a fraction of the cost and computing power that major tech companies typically require for large language model (LLM) projects.

'With optimised design for modern [computing] hardware, NSA speeds up inference while reducing pre-training costs – without compromising performance,' the study said.

Inference refers to the situation when an AI model, which has been trained to see patterns in curated data sets, starts to recognise those patterns in data it has never seen before. As a result, the AI model can reason and make predictions that mimic a human's abilities.
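The core idea behind sparse attention is that each token attends to only a subset of the sequence rather than to every other token, cutting the quadratic cost of full attention. The sketch below is a minimal, illustrative NumPy implementation of one simple sparsity pattern (a causal sliding window) – it is not DeepSeek's NSA algorithm, which combines several hardware-aligned attention branches, and all function and variable names here are our own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sliding_window_attention(Q, K, V, window=4):
    """Each query attends only to the `window` most recent keys
    (including itself), instead of all T keys as in full attention."""
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # (T, T) similarity scores
    # Keep only causal positions within the local window; mask the rest.
    idx = np.arange(T)
    keep = (idx[None, :] <= idx[:, None]) & (idx[:, None] - idx[None, :] < window)
    scores = np.where(keep, scores, -np.inf)
    return softmax(scores, axis=-1) @ V  # (T, d) attention output

rng = np.random.default_rng(0)
T, d = 8, 16
Q, K, V = rng.normal(size=(3, T, d))
out = sliding_window_attention(Q, K, V, window=4)
print(out.shape)  # (8, 16)
```

With a fixed window, the number of scored key–query pairs grows linearly in sequence length rather than quadratically, which is the kind of saving the study's long-context training and inference claims rest on.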

