Meta Is Losing and Restructuring Its AI Team While Its AI Chief Disses LLMs

Meta is ranked 23rd on the LLM leaderboards and trails several Chinese open-source LLMs. Meta Chief AI Scientist Yann LeCun explains why bigger models and more data alone can’t cross the gap to true intelligence, and what will move the field forward. Meta’s Yann LeCun says AGI cannot be reached by scaling …

Read more

DeepSeek’s New DeepSeek-R1 Model Is Competitive With OpenAI o3 and Gemini 2.5 Pro

The latest version of DeepSeek is DeepSeek-R1-0528. DeepSeek-R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training. The model has demonstrated outstanding performance across benchmark evaluations in mathematics, programming, and general logic. Its overall performance is now approaching that of …

Read more

Qwen 2.5 Coder and Qwen 3 Lead Open-Source LLMs Over DeepSeek and Meta

Qwen 2.5 Coder/Max is currently the top open-source model for coding, with the highest HumanEval (~70–72%), LiveCodeBench (70.7), and Elo (2056) scores among open models. DeepSeek V3/Coder V2 remains strong, especially in reasoning and math, but is slightly behind Qwen in code generation and competitive-programming Elo. Meta Llama 4 Maverick offers unmatched context length (up to …

Read more

Google Gemini 2.5, Claude 3.7 and DeepSeek 3.1 Compete in Coding

Gemini 2.5 and DeepSeek 3.1 both do very well on coding challenges and tests, but those who have tested them alongside Claude 3.7 Sonnet still prefer Claude for coding. 🚀 DeepSeek V3-0324 is now available in Cline. This model offers significant improvements for coding tasks while being up to 53x cheaper than Claude 3.7 Sonnet with …

Read more

Does DeepSeek Impact the Future of AI Data Centers?

China’s DeepSeek has driven down the cost of AI with innovations like mixture of experts (MoE) and fine-grained expert segmentation, which significantly improve efficiency in large language models. The DeepSeek model activates only about 37 billion of its 600+ billion total parameters during inference, compared to models like Llama that activate all …
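The sparse-activation idea behind MoE can be illustrated with a toy top-k router. This is a minimal sketch with made-up sizes (8 experts, top-2 routing), not DeepSeek's actual architecture; the point is only that a fraction of the parameters run per token:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# One tiny feed-forward "expert" per slot, plus a gating (router) matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route a single token vector to its top-k experts only."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]  # indices of the k highest-scoring experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the selected experts
    # Only the selected experts compute; the rest stay idle for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top)), top

token = rng.standard_normal(d_model)
out, used = moe_forward(token)

active = top_k * d_model * d_model   # parameters actually used for this token
total = n_experts * d_model * d_model
print(f"experts used: {sorted(used.tolist())}")
print(f"active parameters: {active} of {total} ({active/total:.0%})")
# → active parameters: 512 of 2048 (25%)
```

With these toy numbers only 25% of the expert parameters run per token; DeepSeek's reported ~37B active of 600+B total is the same principle at a much larger scale and finer expert granularity.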

Read more

Deep DeepSeek History and Impact on the Future of AI

Some believe DeepSeek is so efficient that we no longer need more compute and that the model changes have created massive overcapacity. The Jevons Paradox is closer to reality: increased demand has already pushed up H100 and H200 pricing. DeepSeek and High-Flyer have a mix of 50,000 H20s, H800s, A100s, and H100 GPUs. DeepSeek has …

Read more

What Are the Financial Trades Around the DeepSeek AI Earthquake?

There are many big questions about the impact of the efficiency gains from the improved DeepSeek methods. DeepSeek improved reinforcement learning. DeepSeek also accessed the Nvidia chips directly: they did NOT use Nvidia CUDA, because CUDA was preventing them from doing what they needed to do with the chips. The DeepSeek research paper gave a …

Read more