Baidu Releases ERNIE-4.5-21B-A3B-Thinking: A Compact MoE Model for Deep Reasoning

Baidu's AI Research team has released ERNIE-4.5-21B-A3B-Thinking, a new reasoning-focused large language model designed around efficiency, long-context reasoning, and tool integration. As part of the ERNIE-4.5 family, the model uses a Mixture-of-Experts (MoE) architecture with 21B total parameters but only 3B active parameters per token, making it computationally efficient while maintaining competitive reasoning capability…
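To see why total and active parameter counts diverge in an MoE model, here is a minimal, generic sketch of top-k expert routing in PyTorch. It is not Baidu's ERNIE-4.5 code; the expert count, hidden sizes, and top-k value are illustrative placeholders chosen only to show that each token exercises a small fraction of the layer's weights.

```python
# Generic sparse MoE layer: many experts exist, but each token is routed to
# only the top-k of them, so "active" parameters per token are far fewer than
# the total parameter count. Illustrative dimensions, not ERNIE-4.5's config.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.router(x)                          # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # renormalize over the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out


layer = SparseMoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512]); each token used only 2 of the 8 experts
```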

Read More
Moonshot AI Releases Kimi K2: A Trillion-Parameter MoE Model Focused on Long Context, Code, Reasoning, and Agentic Behavior

Kimi K2, launched by Moonshot AI in July 2025, is a purpose-built, open-source Mixture-of-Experts (MoE) model with 1 trillion total parameters and 32 billion active parameters per token. It was trained with the custom MuonClip optimizer on 15.5 trillion tokens, remaining stable at a scale where ultra-large models typically suffer training instabilities. Unlike traditional chatbots, K2 is architected…
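Given K2's emphasis on agentic behavior and tool use, a typical way to exercise it is through an OpenAI-compatible chat client with tool definitions. The sketch below assumes such an endpoint; the base URL, model identifier, and the `search_code` tool are illustrative assumptions, not values taken from Moonshot AI's documentation.

```python
# Minimal sketch of a tool-calling request to Kimi K2 via the OpenAI Python
# client. Endpoint, model name, and the tool itself are hypothetical examples.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.moonshot.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_MOONSHOT_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "search_code",  # hypothetical tool for illustration
        "description": "Search a repository for a symbol or string.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="kimi-k2",  # assumed model identifier; check the provider's docs
    messages=[{"role": "user", "content": "Where is the MoE router defined in this repo?"}],
    tools=tools,
)
print(response.choices[0].message)
```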

Read More