Does DeepSeek Impact the Future of AI Data Centers?

China’s DeepSeek has driven down the cost of AI with innovations such as mixture of experts (MoE) and fine-grained expert segmentation, which significantly improve the efficiency of large language models. The DeepSeek model activates only about 37 billion of its 600+ billion total parameters during inference, compared to dense models like Llama that activate all parameters. This dramatically reduces compute costs for both training and inference.

Others have used mixture of experts (MoE) before, but DeepSeek aggressively scaled up the number of experts within the model.
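The sparse-activation idea behind MoE can be sketched in a few lines. This is a toy illustration with made-up sizes (8 experts, hidden size 16, top-2 routing), not DeepSeek’s actual architecture: a router scores the experts for each token, and only the top-scoring experts actually run, so most parameters stay idle on any given token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration (not DeepSeek's real config):
# 8 experts, hidden size 16, route each token to its top 2 experts.
n_experts, d_model, top_k = 8, 16, 2

# Router: a linear layer scoring every expert for every token.
router_w = rng.standard_normal((d_model, n_experts))

def moe_forward(x, expert_fns):
    """Sparse mixture-of-experts: only top_k experts run per token."""
    scores = x @ router_w                           # (tokens, n_experts)
    top = np.argsort(scores, axis=-1)[:, -top_k:]   # indices of top_k experts
    # Softmax over the selected scores only (gate weights).
    sel = np.take_along_axis(scores, top, axis=-1)
    gate = np.exp(sel - sel.max(-1, keepdims=True))
    gate /= gate.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(top_k):
            # Only the chosen experts execute for this token.
            out[t] += gate[t, j] * expert_fns[top[t, j]](x[t])
    return out

# Toy experts: each is a small dense layer.
ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
experts = [(lambda v, w=w: v @ w) for w in ws]

tokens = rng.standard_normal((4, d_model))
y = moe_forward(tokens, experts)
print(y.shape)  # (4, 16): same output shape, but only 2 of 8 experts ran per token
```

With 2 of 8 experts active, per-token compute in the expert layers is roughly a quarter of a dense model of the same total size, which is the same lever DeepSeek pulls at a much larger scale.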

Other key efficiency improvements in DeepSeek’s architecture include:
Enhanced attention mechanisms with sliding window patterns, optimized key-value caching and multi-head attention.

Advanced position encoding innovations, including rotary position embeddings and dynamic calibration.

A novel routing mechanism that replaces the traditional auxiliary loss with a dynamic bias approach, improving expert utilization and stability.

These innovations have led to a 15-20% improvement in computational efficiency compared to traditional transformer implementations.
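The dynamic-bias routing idea can be sketched as follows. This is a hypothetical toy (the expert count, batch sizes, and the update constant `gamma` are all assumptions): a per-expert bias is added to the router scores only when choosing which experts to use, and after each batch the bias is nudged so that overloaded experts become less likely to be picked, balancing load without an auxiliary loss term.

```python
import numpy as np

rng = np.random.default_rng(1)
n_experts, top_k, tokens, d = 8, 2, 64, 16   # toy sizes, not DeepSeek's

router_w = rng.standard_normal((d, n_experts))
bias = np.zeros(n_experts)   # per-expert routing bias
gamma = 0.1                  # bias update speed (hypothetical value)

def route(x):
    scores = x @ router_w
    # The bias affects WHICH experts are chosen, not the gate weights,
    # so it steers load without distorting the model's output mixture.
    chosen = np.argsort(scores + bias, axis=-1)[:, -top_k:]
    return scores, chosen

for step in range(50):
    x = rng.standard_normal((tokens, d))
    scores, chosen = route(x)
    # Count how many tokens each expert received this batch.
    load = np.bincount(chosen.ravel(), minlength=n_experts)
    target = tokens * top_k / n_experts
    # Nudge bias up for underloaded experts, down for overloaded ones.
    bias -= gamma * np.sign(load - target)

print(load)  # per-expert loads drift toward ~16 tokens each
```

The design choice worth noting is that an auxiliary balancing loss pulls on the same gradients that train the model, whereas this bias is a separate control signal updated outside backpropagation.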
Amazon, Microsoft, Google and Meta are still proceeding with large data center buildouts for several reasons:

The surge in reasoning models and AI agents requires more compute, and the increased efficiency enables more value to be delivered per dollar. Jevons paradox, from economics, occurs when an advancement makes a resource more efficient to use, but overall demand increases so much that total consumption rises. This was seen with personal computers: as they got cheaper, demand increased roughly 100-fold, from tens of millions to billions of units. The top four companies plan to spend $310 billion on AI infrastructure and research.
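The Jevons-paradox arithmetic is easy to make concrete. The numbers below are invented purely for illustration: a 20x drop in per-token price paired with a 50x jump in usage still raises total spending.

```python
# Hypothetical numbers for illustration only.
old_cost_per_m_tokens = 10.0   # $ per million tokens
new_cost_per_m_tokens = 0.50   # 20x cheaper after efficiency gains

# If the lower price unlocks 50x more usage (highly elastic demand)...
old_usage_m_tokens = 1_000
new_usage_m_tokens = 50_000

old_spend = old_cost_per_m_tokens * old_usage_m_tokens   # $10,000
new_spend = new_cost_per_m_tokens * new_usage_m_tokens   # $25,000

# Jevons paradox: the per-unit cost fell 20x, yet total spending rose 2.5x.
print(old_spend, new_spend)  # 10000.0 25000.0
```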

DeepSeek launched at prices per million tokens far cheaper than OpenAI’s, but OpenAI and Google Gemini now have competitive and in some cases better pricing.

AI inference prices have been falling steadily; the surprise from DeepSeek is that this latest push came from neither OpenAI nor Meta.

Google Gemini Flash 2.0 costs less per million tokens and gives faster answers than DeepSeek.
OpenAI o3-mini has competitive pricing: its input price is higher, and its output is about twice as expensive.

Those who are building AI data centers and training models know that AI will continue to get much better and cheaper. The expectation is that demand for really good AI will increase despite the cost improvements. DeepSeek has also highlighted energy efficiency and design choices: it optimized its code by programming Nvidia GPUs at a level below the usual software stack, directly accessing the hardware. Many companies are also exploring encoding logic directly in FPGA hardware.

There is scaling of pre-training, post-training and test-time compute. There is also intense competition on hardware efficiency and on optimization of every layer of the hardware and software stacks.

There will be specialized AI models and agent systems that keep only the specialized knowledge needed for particular use cases.

Energy and cost efficiency will be another area of competition beyond improving the base models.

There are increases in performance from test-time compute. OpenAI’s o3 model used 30,000 H100 GPU hours to answer the toughest math and reasoning problems. This type of AI inference will only be used when there is great value in pushing the boundaries of AI capability for a vastly superior and urgently needed answer. Some form of question pre-analysis or routing is needed to estimate how much effort is worthwhile.
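Such pre-analysis could be as simple as a difficulty-scoring router. The sketch below is purely hypothetical (the keyword signals, weights, and budget tiers are made up) and only illustrates the idea of estimating effort before committing expensive test-time compute.

```python
# Hypothetical pre-analysis router: estimate question difficulty and pick
# an inference budget tier before spending expensive test-time compute.
# The signal words and thresholds below are illustrative assumptions.

HARD_SIGNALS = ("prove", "optimize", "multi-step", "derive", "research")

def choose_budget(question: str) -> str:
    q = question.lower()
    # Crude difficulty score: hard-signal keywords plus question length.
    score = sum(sig in q for sig in HARD_SIGNALS) + len(q.split()) / 50
    if score >= 2:
        return "deep"      # long chain-of-thought, many samples
    if score >= 1:
        return "standard"  # normal reasoning pass
    return "fast"          # cheap single pass

print(choose_budget("What is 2+2?"))  # fast
print(choose_budget("Prove the bound and derive the optimal multi-step plan"))  # deep
```

A production system would presumably use a small learned classifier rather than keywords, but the economics are the same: route cheap questions away from 30,000-GPU-hour inference.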

DeepSeek shows that AI continues to improve rapidly and that we are getting better results for less energy and lower cost. AI will remain profitable even as answers become cheaper and cheaper, and eventually virtually free with ever more capable local models. The world will change, and more AI data centers will be needed to deliver the best answers and agent actions for the most valuable and challenging needs.

3 thoughts on “Does DeepSeek Impact the Future of AI Data Centers?”

  1. Well, the algorithmic improvements certainly aren’t over.

    New methods are found nearly every week, to make low parameter count open source/weights models perform feats previously only larger for-pay models could, or none at all.

    Like scaling up test-time compute with latent reasoning: https://arxiv.org/abs/2502.05171

    At this pace, cellphone runnable models will have the abilities of GPT4 and better soon.

    The cloud behemoths OTOH, could get nearly divine powers or probably face a plateau, but if there is such a wall, we can’t yet see it.

  2. I have to honestly say, the best AI, will always have a person in the loop. Why? Because “that person” can always provide that intangible quality, both unanticipated and yet needed in solving a problem. It’s insight a person brings to a problem, that no amount of data can solve.

  3. DeepSeek might not have negative impact on NVIDIA but it already choked software companies like OpenAI, Meta, Google… OpenAI drastically reduced its token price after DeepSeek R1 released. OpenAI will be hit hardest because it’s very difficult to raise huge capital as risk/reward seems very high now. Other deep pocket companies like Meta, Google, Microsoft have a great chance to catch up OpenAI now.
