
DeepSeek: A Paradigm Shift, What It Means For Humanity
The release of DeepSeek-R1 immediately cratered the market caps of several hardware and software companies that had been buoyed by what investors took for American exceptionalism. Withholding the latest chips and AI intellectual property from China was thought to be the winning strategy. It was wrong. Such is the stuff that leapfrogging is made of, especially for a manufacturing and design powerhouse such as China. Ironically, the latest models from DeepSeek are free to use; DeepSeek even runs them on its own servers for free.
Development of general-purpose large language models through scaling of parameters and training data led to many breakthroughs. The release of ChatGPT, built on GPT-3.5 and GPT-4, in 2022-23 unleashed the general-purpose potential of AI to the public. This approach also increased costs tremendously, as compute and data demands spurred bigger and better processors. In late 2023, through 2024, and even now, the construction of power-hungry data centers was thought to be the only way to improve model performance. Limiting access to computing and the latest chips was thought to keep China from becoming a source of these powerful models. DeepSeek shifted that paradigm.
Companies like Nvidia, whose stock was heavily affected by the announcement, have since recovered and thrived. The lessons were lost on global markets. The worst may be yet to come, as the companies buoyed by the rise of AI are brought down to earth by a combination of new methods and the shrinking compute needed for both training and inference.
Sunk costs and switching costs, each with its own powerful economic adherents, prevent a longer-term view and lock American AI companies into their current paths. Success breeds complacency and adherence to the model that produced success. In AI, a rapidly developing field, getting stuck on algorithms, processes and practices is deadly. DeepSeek showed that simply piling on computing and data does not make for exponential progress. This is a lesson from many fields, often ignored with the overused but wrong dictum 'This time it is different.' Innovation follows a familiar pattern: slowly, then rapidly.
Efficiency
The costs of training and running DeepSeek are much lower than for other models. A recent presentation put the ratio at $6M for DeepSeek-R1 versus $600M for Llama (the open source model from Meta): one hundredth the cost. The costs for other models, including ChatGPT, are even higher. The savings are a result of implementing DeepSeek's own discoveries in reinforcement learning and of training using distillation. The model is also very efficient at generating Chinese-language text. As of three months ago, a large number of Chinese companies had joined the AI revolution by subscribing to DeepSeek. As the national champion, DeepSeek is supported by the government's industrial policy.
Reinforcement learning (RL) as a training method was pioneered at the University of Massachusetts Amherst: the recipients of the 2024 ACM Turing Award, Andrew Barto and Richard Sutton, were the inventors of the classic reinforcement learning techniques. For LLMs and other large models, the model is refined by feedback, classically from humans, in a process called RLHF (Reinforcement Learning from Human Feedback). Together with supervised fine-tuning, this puts humans in the role of supervisors. The paper released by the creators of DeepSeek-R1 goes into detail on the way they modified RL.
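The core of reward-driven fine-tuning can be sketched with a toy REINFORCE-style update. Everything here is a hypothetical stand-in — a three-response "policy" and a fixed reward table in place of human raters — meant only to show how reward feedback shifts a model's preferences:

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Toy "policy": preference scores (logits) over three candidate responses.
logits = [0.0, 0.0, 0.0]
# Stand-in for human feedback: raters strongly prefer response 2.
reward = {0: 0.0, 1: 0.2, 2: 1.0}

random.seed(0)
lr = 0.5
for _ in range(500):
    probs = softmax(logits)
    a = random.choices(range(3), weights=probs)[0]  # sample a response
    r = reward[a]                                   # the "human" rating
    # REINFORCE: raise the probability of the sampled response
    # in proportion to the reward it received.
    for i in range(3):
        grad = (1.0 if i == a else 0.0) - probs[i]
        logits[i] += lr * r * grad

best = max(range(3), key=lambda i: softmax(logits)[i])
```

After a few hundred updates the policy concentrates on the highest-reward response. Real RLHF replaces the reward table with a learned reward model and the bandit with full sequence generation, but the feedback loop is the same shape.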
Anything that involves humans in the loop at scale costs a lot of money; removing the human from the loop makes training cheaper. One version of the model is used to fine-tune another: one model functions as the supervisor while the other is trained. The arrival of new companies with models such as MiniMax-M1 epitomizes this shift even further. Such techniques will overtake models created using conventional scaling.
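Using one model to supervise another is, at its core, distillation: the student is trained to match the teacher's output distribution instead of matching human labels. A minimal sketch, with a hypothetical fixed teacher distribution standing in for a large model's output on some prompt:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical teacher: a fixed distribution over a 4-token vocabulary,
# standing in for a larger model's next-token prediction.
teacher_probs = [0.7, 0.2, 0.05, 0.05]

# The student starts uninformed (uniform logits) and learns to match it.
student_logits = [0.0, 0.0, 0.0, 0.0]
lr = 0.5
for _ in range(200):
    p = softmax(student_logits)
    # Gradient of the cross-entropy H(teacher, student) with respect to
    # the student's logits is simply (p_i - teacher_i).
    for i in range(4):
        student_logits[i] -= lr * (p[i] - teacher_probs[i])

student_probs = softmax(student_logits)
```

No human appears anywhere in the loop: the teacher's probabilities are the supervision signal, which is what makes this style of training cheap at scale.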
DeepSeek-R1 achieved its efficiency through an evolution that used multiple strategies. A combination of novel methods built on existing techniques made training and inference efficient in time and resources. More details can be found in this article. In short, every aspect of creating and running a large language model was changed, enhanced or reworked for cost and time efficiency.
MiniMax-M1
MiniMax-M1 claims to have cut the training cost of DeepSeek-R1 by 90%: MiniMax trained its model for roughly $500K. Contrast this with the $6M cost for DeepSeek-R1 and $600M for Llama. Doubts have been cast on the numbers publicized by both DeepSeek and MiniMax.
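Taking the publicized figures at face value, the ratios are simple to check:

```python
# Reported training costs in USD, as publicized (and disputed).
llama_cost = 600_000_000
deepseek_cost = 6_000_000
minimax_cost = 500_000

llama_vs_deepseek = llama_cost / deepseek_cost      # 100x: "one hundredth the cost"
deepseek_vs_minimax = deepseek_cost / minimax_cost  # 12x: slightly more than a 90% cut
reduction = 1 - minimax_cost / deepseek_cost        # ~0.92
```

So the MiniMax claim of a 90% reduction is roughly consistent with the $500K figure, if both numbers are taken as stated.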
The efficiencies come from further refining RL and from a linear-attention variant called lightning attention. The gains are most pronounced on deterministic problems, such as mathematical and logical reasoning, and on long-context problems such as coding. MiniMax is also available through Hugging Face, the open source AI host.
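Lightning attention is MiniMax's optimized form of linear attention. The essential idea — replacing the quadratic attention matrix with a running state updated once per token — can be sketched in a few lines. This toy version (with an assumed elu+1 feature map and tiny dimensions) illustrates the O(n) recurrence only; it is not MiniMax's actual kernel:

```python
import math

def feature_map(x):
    # elu(x) + 1 keeps features positive, a common linear-attention choice.
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def linear_attention(queries, keys, values):
    """Causal linear attention in O(n): a running state (S, z) replaces
    the n x n attention matrix of standard softmax attention."""
    d = len(values[0])
    f = len(feature_map(keys[0]))
    S = [[0.0] * d for _ in range(f)]  # running sum of phi(k) v^T
    z = [0.0] * f                      # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_k = feature_map(k)
        for i in range(f):             # constant-time state update per token
            z[i] += phi_k[i]
            for j in range(d):
                S[i][j] += phi_k[i] * v[j]
        phi_q = feature_map(q)
        denom = sum(pq * zi for pq, zi in zip(phi_q, z))
        out = [sum(phi_q[i] * S[i][j] for i in range(f)) / denom
               for j in range(d)]
        outputs.append(out)
    return outputs

# Tiny example: 3 timesteps, 2-dim queries/keys, 2-dim values.
qs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ks = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
vs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outs = linear_attention(qs, ks, vs)
```

Because the state has fixed size, cost grows linearly with sequence length, which is why this family of methods pays off most on long-context workloads such as coding.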
Privacy
There is concern that DeepSeek is harvesting private data for its own use. This phenomenon is rife in the world of AI, and in social media in general. What makes sharing private data with DeepSeek or other private companies worrisome is that the data will be used to refine their models. In the case of DeepSeek and other China-based companies, there is the fear of data reaching the Chinese government. Private AI companies in the United States do the same, except that they will share data with the US government when compelled by law. At this juncture, that scenario is the more disquieting one. The Fourth Amendment will fall by the wayside if the government can search not only our persons and our homes, but our minds, without a warrant.
To read more about the risks of DeepSeek, read this analysis from Hidden Layer. Since Hidden Layer's business model is based on these kinds of analyses, it is best to examine the analysis closely and compare it with their work on other open models.
Open Source AI Models
The Open Source Initiative (OSI) has a definition of Open Source AI, currently at version 1.0 and subject to revision. Like the Open Source Definition for software, it allows users to use, observe, modify and distribute a model without restriction. AI models depend heavily on their training data, and using a model involves inference, which consumes resources; the expense of training is separate from the expense of inference. In the classic definition of open source software, the source code is available for any user to use, observe, modify and distribute. Under a strict interpretation of open-source AI, the "source" should include the data used to train the model. However, this may not be practical, nor is it part of the OSI definition of Open Source AI.
This is drastically different from the OSI guidance for open source software. The other difference is the observability of the model weights and hyperparameters. During the learning phase, model weights are refined; the weights embody the model in its current form, crystallizing all the training the model has undergone. Hyperparameters control the initial configuration of the learning setup. In an open model, both the weights and the hyperparameters are meant to be open.
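The weights/hyperparameters distinction is visible in even the smallest training loop: hyperparameters are fixed before training begins, while weights are what training refines. A toy example (a hypothetical one-variable linear model, not any real LLM):

```python
# Hyperparameters: chosen before training and fixed during it.
learning_rate = 0.1
epochs = 100

# Model weights: initialized, then refined by training.
w, b = 0.0, 0.0

# Tiny dataset generated by y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

for _ in range(epochs):
    for x, y in data:
        pred = w * x + b
        err = pred - y
        # Gradient descent on squared error refines the weights.
        w -= learning_rate * err * x
        b -= learning_rate * err

# (w, b) now approximates (2, 1). Publishing these learned values —
# scaled up to billions of parameters — is what "open weights" means.
```

Releasing the hyperparameters tells you how the training run was configured; releasing the weights gives you the trained model itself, which is why the OSI definition cares about both.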
Open Source AI models are often called open-weight models. Many models from China are open-weight, including Qwen (from Alibaba). This competition has also forced OpenAI to release an open-weight model: gpt-oss, a base model with two variants.
The Future
We have not delved into the technology behind multi-modal prompts and multi-modal generation; by multi-modal we mean not only text but also images, audio and video. MiniMax as well as DeepSeek have these capabilities. It is clear that limiting access to hardware and know-how cannot hold true innovation back. Such constraints instead produce paradigm shifts that make AI cheaper to develop with fewer hardware and power resources, creating a democratized and decentralized future in which we could fine-tune and run models on commodity hardware. These developments give us hope that we will be able to control these capabilities and bend them to help humanity rather than harm ourselves.
UBS Group AG, which had been among the few firms predicting a recovery in China's property sector, now expects a delay following a renewed sales slowdown in the second quarter. John Lam, head of China and Hong Kong property research at the Swiss bank, said in March that home prices in top-tier cities would 'turn stable' by early 2026. He now anticipates that to happen in mid-to-late 2026, unless Beijing introduces additional stimulus measures.