Huawei’s new AI CloudMatrix cluster beats Nvidia’s GB200 by brute force, uses 4X the power




Unable to use leading-edge process technologies to produce its high-end AI processors, Huawei has to rely on brute force: installing more processors than its competitors to achieve comparable AI performance.

To do this, Huawei adopted a multifaceted strategy that includes the dual-chiplet HiSilicon Ascend 910C processor, optical interconnects, and the Huawei AI CloudMatrix 384 rack-scale solution, which relies on proprietary software, reports SemiAnalysis. The whole system delivers 2.3X lower performance per watt than Nvidia’s GB200 NVL72, but it still enables Chinese companies to train advanced AI models.
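Taken together, the headline's 4X power figure and the 2.3X efficiency gap imply a rough system-level performance ratio. The sketch below is only a back-of-the-envelope check: it assumes both rounded figures describe the same workload, and the constant and function names are illustrative rather than taken from SemiAnalysis.

```python
# Rounded ratios quoted in the article (CloudMatrix 384 relative to GB200 NVL72).
# Both are approximate, so the result is only a ballpark estimate.
POWER_RATIO = 4.0             # CloudMatrix 384 draws about 4X the power
PERF_PER_WATT_PENALTY = 2.3   # and delivers 2.3X lower performance per watt


def implied_performance_ratio(power_ratio: float, perf_per_watt_penalty: float) -> float:
    """Performance = power * (performance per watt), so the two ratios multiply."""
    return power_ratio * (1.0 / perf_per_watt_penalty)


ratio = implied_performance_ratio(POWER_RATIO, PERF_PER_WATT_PENALTY)
print(f"Implied system-level performance ratio: about {ratio:.2f}X")
```

With these rounded inputs the estimate comes out a little above 1.7X, which is consistent with the claim that the larger cluster wins on total throughput while losing on efficiency.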

At a glance

Huawei’s CloudMatrix 384 is a rack-scale AI system composed of 384 Ascend 910C processors arranged in a fully optical, all-to-all mesh network. The system spans 16 racks: 12 compute racks housing 32 accelerators each, and four networking racks that provide high-bandwidth interconnects using 6,912 800G LPO optical transceivers.
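The rack and transceiver counts above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the article; the variable names are illustrative, and the per-accelerator value is a plain average, since the article does not say how the transceivers are split between the accelerator side and the switch side.

```python
# Figures quoted in the article.
COMPUTE_RACKS = 12
NETWORKING_RACKS = 4
ACCELERATORS_PER_COMPUTE_RACK = 32
TRANSCEIVERS = 6_912

total_racks = COMPUTE_RACKS + NETWORKING_RACKS
total_accelerators = COMPUTE_RACKS * ACCELERATORS_PER_COMPUTE_RACK

# Average only: this does not describe the actual port layout.
transceivers_per_accelerator = TRANSCEIVERS / total_accelerators

print(f"Racks: {total_racks}")
print(f"Accelerators: {total_accelerators}")
print(f"Transceivers per accelerator (average): {transceivers_per_accelerator:g}")
```

The totals match the article's 16 racks and 384 accelerators, and the transceiver count works out to an average of 18 per accelerator.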


