What DeepSeek’s AI Did That Everyone Else’s Didn’t

The Chinese AI company DeepSeek exploded into the news cycle over the weekend after its chatbot replaced OpenAI’s ChatGPT as the most downloaded app on the Apple App Store.

Its commercial success followed the publication of several papers in which DeepSeek announced that its newest R1 models—which cost significantly less for the company to make and for customers to use—are equal to, and in some cases surpass, OpenAI’s best publicly available models.

So what did DeepSeek do that deep-pocketed OpenAI didn’t? It’s hard to say with certainty because OpenAI has been pretty cagey about how it trained its o1 model, the previous leader on a variety of benchmark tests. But there are some clear differences in the companies’ approaches and other areas where DeepSeek appears to have made impressive breakthroughs.

Probably the biggest difference—and certainly the one that sent the stocks of chip makers like NVIDIA tumbling on Monday—is that DeepSeek is creating competitive models much more efficiently than its bigger counterparts.

The company’s latest R1 and R1-Zero “reasoning” models are built on top of DeepSeek’s V3 base model, which the company said was trained for less than $6 million in computing costs using older NVIDIA hardware (which Chinese companies can legally buy, unlike NVIDIA’s state-of-the-art chips). By comparison, OpenAI CEO Sam Altman said that GPT-4 cost more than $100 million to train.

Karl Freund, founder of the industry analysis firm Cambrian AI Research, told Gizmodo that U.S. policies like the recent ban on advanced chip sales to China have forced companies like DeepSeek to improve by optimizing the architecture of their models rather than throwing money at better hardware and Manhattan-sized data centers.

“You can build a model quickly or you can do the hard work to build it efficiently,” Freund said. “The impact on Western companies will be that they’ll be forced to do the hard work that they’ve not been willing to undertake.”

DeepSeek didn’t invent most of the optimization techniques it used. Some, like using data formats that take up less memory, have been proposed by its bigger competitors. The picture that emerges from DeepSeek’s papers—even for readers without a technical background—is of a team that pulled in every tool it could find to make training require less computing memory and that designed its model architecture to be as efficient as possible on the older hardware it was using.
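
One of those tools is storing numbers in lower-precision formats. As a rough illustration only (DeepSeek’s papers describe FP8 mixed-precision training, which NumPy has no native type for, so float16 stands in here), the sketch below shows how halving the precision of a weight matrix halves the memory it occupies:

```python
import numpy as np

# Illustrative sketch only: the same weight matrix stored at two precisions.
# DeepSeek's papers describe FP8 mixed-precision training; NumPy has no FP8
# type, so float16 is used here just to show the memory principle.
weights_fp32 = np.random.randn(4096, 4096).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(f"float32: {weights_fp32.nbytes / 1e6:.1f} MB")  # ~67.1 MB
print(f"float16: {weights_fp16.nbytes / 1e6:.1f} MB")  # ~33.6 MB
```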

OpenAI was the first developer to introduce so-called reasoning models, which use a technique called chain-of-thought that mimics humans’ trial-and-error method of problem solving to complete complex tasks, particularly in math and coding. The company hasn’t said how exactly it did that.
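
As a schematic of the general idea (this is not OpenAI’s or DeepSeek’s method; reasoning models generate the intermediate steps themselves rather than being told to), here is how a chain-of-thought style prompt differs from a direct one:

```python
# Schematic only: contrasting a direct prompt with one that asks the model
# to show its working. Reasoning models like o1 and R1 produce these
# intermediate steps on their own; the prompts below just illustrate the idea.
question = "A train travels 120 km in 1.5 hours. What is its average speed?"

direct_prompt = f"Q: {question}\nA:"

cot_prompt = (
    f"Q: {question}\n"
    "Think step by step, checking each intermediate result, "
    "then give the final answer.\nA:"
)

# The second prompt tends to elicit working such as:
#   "Distance = 120 km, time = 1.5 h, speed = 120 / 1.5 = 80 km/h."
```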

DeepSeek, on the other hand, laid out its process.

In the past, generative AI models have been improved by incorporating what’s known as reinforcement learning from human feedback (RLHF). Humans label the good and bad characteristics of a bunch of AI responses and the model is incentivized to emulate the good characteristics, like accuracy and coherence.
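
A toy sketch of that idea, with a hypothetical hand-written scoring function standing in for the reward model that would normally be trained on human preference labels:

```python
import math

# Toy sketch of the RLHF idea, not DeepSeek's or OpenAI's implementation.
# In practice the reward model is a neural network trained on human
# preference labels; a hand-written scoring function stands in for it here.
def reward_model(response: str) -> float:
    score = 0.0
    if "because" in response:        # crude stand-in for "explains itself"
        score += 1.0
    if len(response.split()) < 60:   # crude stand-in for concision
        score += 0.5
    return score

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry style loss: push the chosen response above the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(score_rejected - score_chosen)))

chosen = "The answer is 80 km/h because speed is distance divided by time."
rejected = "Probably somewhere around 100, give or take."
print(preference_loss(reward_model(chosen), reward_model(rejected)))  # ~0.31
```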

DeepSeek’s big innovation in building its R1 models was to do away with human feedback and design its algorithm to recognize and correct its own mistakes.

“DeepSeek-R1-Zero demonstrates capabilities such as self-verification, reflection, and generating long [chains-of-thought], marking a significant milestone for the research community,” the researchers wrote. “Notably, it is the first open research to validate that reasoning capabilities of [large language models] can be incentivized purely through [reinforcement learning].”
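
In place of a reward model trained on human labels, DeepSeek’s paper describes simple rule-based rewards, such as whether the output follows the expected format and whether the final answer matches a verifiable ground truth. A simplified sketch (the tag names and exact checks here are assumptions, not the paper’s code):

```python
import re

# Simplified sketch of rule-based rewards in the spirit of R1-Zero's training:
# no human labeler scores the output; the reward comes from checks a program
# can run by itself. Tag names and weights are assumptions for illustration.
def format_reward(output: str) -> float:
    """Reward outputs that put reasoning and answer in the expected tags."""
    pattern = r"<think>.*</think>\s*<answer>.*</answer>"
    return 1.0 if re.search(pattern, output, re.DOTALL) else 0.0

def accuracy_reward(output: str, ground_truth: str) -> float:
    """Reward outputs whose final answer matches a verifiable ground truth."""
    match = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    return 1.0 if match and match.group(1).strip() == ground_truth else 0.0

output = "<think>120 km over 1.5 h is 120 / 1.5 = 80.</think> <answer>80 km/h</answer>"
print(format_reward(output) + accuracy_reward(output, "80 km/h"))  # 2.0, reinforced
```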

The results of the pure reinforcement learning approach weren’t perfect. The R1-Zero model’s outputs were sometimes difficult to read and switched between languages.

So DeepSeek created a new training pipeline that incorporates a relatively small amount of labeled data to nudge the model in the preferred direction, combined with several rounds of pure reinforcement learning.
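
At a high level (the function names and number of rounds below are placeholders, not DeepSeek’s actual recipe), the pipeline looks something like this:

```python
# High-level sketch of the multi-stage pipeline the article describes: a small
# supervised pass on labeled examples, then repeated rounds of reinforcement
# learning. Stage names and counts are placeholders, not DeepSeek's code.
def supervised_finetune(model, labeled_examples):
    # Nudge the base model toward readable, well-formatted reasoning.
    print(f"SFT on {len(labeled_examples)} labeled example(s)")
    return model

def reinforcement_learning_round(model, prompts):
    # Sample rollouts, score them with rule-based rewards, update the model.
    print(f"RL round over {len(prompts)} prompt(s)")
    return model

model = "base model (V3 in DeepSeek's case)"
labeled_examples = ["a relatively small, curated set of worked solutions"]
prompts = ["math and coding problems with checkable answers"]

model = supervised_finetune(model, labeled_examples)  # small labeled nudge
for _ in range(2):                                    # several RL rounds
    model = reinforcement_learning_round(model, prompts)
```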

The resulting model, R1, outperformed OpenAI’s o1 model on several math and coding problem sets designed for humans.

Bill Hannas and Huey-Meei Chang, experts on Chinese technology and policy at the Georgetown Center for Security and Emerging Technology, said China closely monitors the technological breakthroughs and practices of Western companies, which has helped its companies find workarounds to U.S. policies like chip embargoes that are designed to give American companies an advantage.

DeepSeek’s success, they said, isn’t a bad thing for the domestic industry, but it is “a wake-up call to U.S. AI companies obsessed with gargantuan (and expensive) solutions. ‘Doing more with less’ underpins the approach taken at several Chinese state-funded labs.”

See more here: Gizmodo.com
