DeepSeek: Did a little known Chinese startup cause a ‘Sputnik moment’ for AI?

Did AI just have a “Sputnik moment”?

That’s what some investors are asking, after the little-known Chinese startup DeepSeek released a chatbot that experts say holds its own against industry leaders like OpenAI and Google, despite being made with less money and computing power.

Buzz around DeepSeek built into a wave of concern that hammered tech stocks on Monday. It wiped almost $600bn from chipmaker Nvidia’s market value.

Not iterative or evolutionary, but pathbreaking

“This is, I think, something that has really shown to some degree how much the U.S. was living in a bubble,” said Antonia Hmaidi, a senior analyst at the Mercator Institute for China Studies in Berlin.

“OpenAI and companies like OpenAI had really bet on scaling being sort of infinite, and needing to buy more and more and more chips for performance to improve.”

What DeepSeek showed, she said, is that there are different paths.

The company says it used a little more than 2,000 Nvidia H800 GPUs to train the bot, and it did so in a matter of weeks for $5.6 million. Others have reportedly deployed 10,000 or more GPUs and spent upwards of $100 million to get similar results.

Marina Zhang, a scholar with University of Technology Sydney, said DeepSeek has also demonstrated a new kind of innovation for China – not iterative or evolutionary, but pathbreaking.

“They’re not really following existing models,” she said. “It’s basically based on algorithm optimization, using software to break through the constraints of not enough computational power.”

Have the U.S. chip export controls failed?

Those constraints were imposed on China by the United States. In 2022, the Biden administration banned the export of cutting-edge microchips to China, arguing that they could be used to enhance the Chinese military.

Zhang said DeepSeek has shown that the chip blockade has not been successful so far. Beijing has been doubling down on a self-reliance drive in tech for several years, pouring money into chip development and other sectors, including AI.

Others argue it’s too early to say the chip export controls have failed.

Gregory Allen, director of the Wadhwani AI Center at the Center for Strategic and International Studies in Washington, said DeepSeek could have acquired all its chips before the effect of the controls started to be felt.

In a widely reported 2023 interview, DeepSeek founder Liang Wenfeng said the company had stockpiled some 10,000 Nvidia A100 GPUs – a variety that was put on the U.S. export control list. Experts think those may have been deployed in earlier versions of DeepSeek’s model.

After the chip blockade started, Nvidia developed a workaround, creating the slightly less powerful H800 GPU, which was legal to sell to China for a time.

“We are currently living through the era of the lagging impact of the Biden administration’s misfire in that first batch of AI export controls,” said Allen.

DeepSeek had a window in which it was able to buy H800s – before the administration eventually banned the sale of them to China, too.

“DeepSeek has discovered some architectural innovations, some algorithmic innovations that sort of increase the number of IQ points, the amount of intelligence, that a given AI model can get from a given quantity of computational resources,” he said.

But AI development requires computing power, and the number of advanced GPUs that DeepSeek, or any other Chinese company, can access is limited by the export controls, he said. That will eventually bite.

Allen says it means the U.S. has an edge: access to advanced chips without restrictions.

“We can copy China’s advantages. They cannot copy our advantages. At least not any time soon,” he said.

As for the hype around DeepSeek developing its near-cutting-edge model on the cheap, Allen said the cost was undoubtedly far north of the reported $5.6 million. He likened it to the development of a drug.

“The cost of developing a new medication is not just the cost of the clinical trial that worked,” he said. “It’s the cost of all the clinical trials that didn’t work. And it’s the same with this AI model training run. DeepSeek has published how much it cost them for that final successful training run.”

It’s not known how much the company spent to get to that point, he said.

Hmaidi says DeepSeek is a “very legitimate triumph of Chinese engineering”. But she says it’s not yet the threat that many are making it out to be.

“I currently don’t see how you get a significantly better model with their current pipeline – without more compute,” she said.

“Personally, I don’t think it’s a threat to America’s AI prowess at this point.”

 
