How DeepSeek Achieved More with Less in AI Innovation

As with most news that enters the Artificial Intelligence arena, assessments of DeepSeek's impact on the AI landscape vary widely. When news of a highly capable and cost-efficient Chinese Large Language Model hit the world on January 20th, the whole industry reacted immediately. Most headlines drew a direct connection between the model's release and the largest single-day loss of market capitalization ever recorded on Wall Street, which evaporated $593 billion of Nvidia's market value.

Until now, AI development has been viewed from a resource-centric perspective. Countless discussions in public spheres and research communities concern the water consumption and infrastructure required to build these technologies, and the most prominent resource for cutting-edge models has been the latest, most sophisticated chips, produced almost exclusively by Nvidia. The United States' geopolitical drive to stay on top of the industry materialized as an embargo on exports of the coveted chips to China, which meant that Chinese researchers had to make do with the second-best chips available. This stunted their progress, at least until the researchers at DeepSeek responded by launching a barrage of free, open-source, and efficient LLMs.

Although DeepSeek R1 performs similarly to the industry giant ChatGPT, the Chinese researchers achieved this using a fraction of the resources. This is a direct result of methods that existed before but had never been applied at this scale: the combination of Distillation and Chain of Thought Reasoning delivered these astounding results. Distillation is a technique in which a parent model trains a smaller, more compact model. Using synthetic data, that is, algorithmically generated data, the smaller LLM can learn how the bigger one thinks without having to process the original training data itself. In a nutshell, the student can perform like the teacher at a fraction of the computing power.

Chain of Thought Reasoning is the other arm of this innovative model. The technique urges the model to answer a prompt by taking several intermediate steps along the way, which allows minor corrections and raises both the accuracy and the transparency of the model's reasoning. This means that, while products like ChatGPT seek to produce an answer in a single, costly and opaque pass, DeepSeek stops along the way to try different avenues and then settles on the most efficient one. During training, it weighs its own policy against a new one on the same prompt, keeping statistical guardrails in place to ensure model stability. These are the two techniques the Chinese researchers used to develop one of the most efficient LLMs known to date; the sketches below illustrate the ideas.
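To make distillation concrete, here is a minimal sketch of its classic soft-label (logit-matching) form, following Hinton et al. (2015). This is an illustration rather than DeepSeek's actual training code: their distilled models were reportedly fine-tuned on reasoning data generated by the parent model, but the teacher-student principle is the same. The function name and temperature value are illustrative assumptions.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: train the student to match the
    teacher's softened output distribution instead of hard labels."""
    # A temperature above 1 softens both distributions, so the student
    # learns from the teacher's relative confidence across all tokens.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence from teacher to student, scaled by T^2 so gradient
    # magnitudes stay comparable to a hard-label loss.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```

Chain-of-thought behavior can be shown with nothing more than a prompt. The toy question below is hypothetical, not taken from DeepSeek's documentation; it contrasts a direct query with one that steers the model into explicit steps:

```python
# Direct prompting: the model must jump straight to an answer.
direct_prompt = (
    "Q: A train travels 120 km in 1.5 hours. What is its average speed?\n"
    "A:"
)

# Chain-of-thought prompting: the model is steered into intermediate
# steps, each of which can be inspected and corrected individually.
cot_prompt = (
    "Q: A train travels 120 km in 1.5 hours. What is its average speed?\n"
    "A: Let's think step by step.\n"
    "Step 1: Average speed is distance divided by time.\n"
    "Step 2: 120 km / 1.5 h = 80 km/h.\n"
    "Answer: 80 km/h.\n"
)
```

Finally, the "statistical guardrails" mentioned above correspond to the clipped policy-ratio objective used in PPO-style reinforcement learning; DeepSeek's published method, GRPO, computes advantages from groups of sampled answers to the same prompt, but the stabilizing clip is the same idea. A minimal sketch, with illustrative names and the conventional clipping value of 0.2:

```python
import torch

def clipped_policy_objective(logp_new, logp_old, advantages, eps=0.2):
    """PPO/GRPO-style clipped surrogate objective (to be maximized)."""
    # Ratio of new-policy to old-policy probabilities per sampled answer.
    ratio = torch.exp(logp_new - logp_old)
    # Clipping the ratio to [1 - eps, 1 + eps] is the guardrail that
    # keeps a single update from pushing the model too far from the
    # policy that generated the samples.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Take the pessimistic bound per sample, then average over the batch.
    return torch.min(unclipped, clipped).mean()
```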

In the research paper published alongside DeepSeek R1, the creators shared that open models such as Llama and Qwen served as the bases for distilled versions of the model, and OpenAI even suggested that its own model had been unlawfully used to train DeepSeek, a serious charge given that ChatGPT is not an open-source model. Interestingly, it is not the implementation of novel techniques that caused the schism in the industry; it is the fact that DeepSeek now fields a competitive parent model, built partly on the work of other prominent players, while retaining the open-source ethos. This turns the industry landscape upside down, and it could spur innovation by leveling the playing field, forcing titans and newcomers alike to reinvent their techniques to remain relevant.

As has become apparent, AI has cemented itself as a strategic industry whose scope surpasses the technological implications of its deployment. Artificial Intelligence has become another frontier on which countries compete. Still, it has also become apparent that throwing money at the problem and imposing chip embargoes will no longer ensure better results. Donald Trump's recent effort to secure the United States' dominance over this industry by investing 500 billion dollars over four years rests on the assumption that money and resources are the most critical inputs to AI. But while no one would have questioned that decision two months ago, DeepSeek has made it evident that groundbreaking innovation can grow in more resource-constrained environments.

* Guillermo Alfaro studied International Relations and Political Science at ITAM, where he specialized in research on AI and global tech governance. He was a member of the first cohort of the Cyber Policy Dialog for the Americas, organized in conjunction with Stanford University (2024), and has fostered debate around these topics by organizing academic events in Mexico. Guillermo is deeply committed to positioning the Global South at the forefront of discussions regarding AI and technology.

Source: We Are Innovation