Tesla Has 4+ Exaflops of Dojo or 3% Dojo Versus Nvidia AI Chips

Tesla just presented a highly technical talk about Dojo's networking protocol at the Hot Chips conference. The Tesla networking protocol has roughly half the latency of NVLink, the interconnect used in Nvidia supercomputers. Tesla says Dojo is cost competitive with Nvidia.

Tesla is nearing the end of Dojo Version 1, is working on Dojo Version 2, and has plans for Dojo Version 3.

Tesla has about 40,000 Nvidia H100 chips, each with 2 petaflops of FP16 compute and 4 petaflops of FP8 compute. That works out to 80 exaflops of FP16 and 160 exaflops of FP8 compute. With 4 exaflops of Dojo, roughly 3% of Tesla's AI training compute is Dojo and the rest is Nvidia. Tesla is expanding Dojo with version 2 and a $500 million facility. Dojo version 2 will likely build out to 20% or more of total Tesla AI training compute.
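A back-of-envelope check of these figures (a sketch using the article's numbers; the exact Dojo percentage depends on whether you compare against FP16 or FP8 capacity):

```python
# Compute-share arithmetic from the figures above (all numbers from the article).
h100_count = 40_000
fp16_per_h100 = 2   # petaflops per H100, FP16
fp8_per_h100 = 4    # petaflops per H100, FP8

nvidia_fp16_exaflops = h100_count * fp16_per_h100 / 1000  # 80.0
nvidia_fp8_exaflops = h100_count * fp8_per_h100 / 1000    # 160.0

dojo_exaflops = 4
# Dojo's share of combined FP16 training compute.
dojo_share_fp16 = dojo_exaflops / (dojo_exaflops + nvidia_fp16_exaflops)
# Against FP8 capacity the share is smaller.
dojo_share_fp8 = dojo_exaflops / (dojo_exaflops + nvidia_fp8_exaflops)

print(f"{dojo_share_fp16:.1%}")  # 4.8%
print(f"{dojo_share_fp8:.1%}")   # 2.4%
```

The share lands between roughly 2.4% and 4.8% depending on precision, consistent with the article's ~3% figure.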

Dojo was architected specifically to ingest and handle the huge bandwidth requirements of video, as opposed to, for example, Large Language Models. The individual training unit for an LLM is a token of roughly 10 bytes; for video, it can be a 1.7 gigabyte video file.
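The scale gap between those two training units is worth making explicit (a quick sketch using the sizes quoted above):

```python
# Per-training-unit data size, using the figures from the text.
token_bytes = 10                 # ~10-byte LLM token
video_bytes = 1.7 * 10**9        # 1.7 GB video file

ratio = video_bytes / token_bytes
print(f"{ratio:.1e}")  # 1.7e+08
```

A single video training unit carries on the order of 170 million times more data than a single token, which is why ingest bandwidth dominates Dojo's design.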

With Dojo, Tesla created their own networking protocol (Tesla Transport Protocol over Ethernet, TTPoE) to replace TCP/IP and alternatives like Nvidia's NVLink, reducing latency by orders of magnitude versus TCP/IP. Latencies:
TCP/IP: 0.53 ms
NVLink: 0.0023 ms
TTPoE: 0.0013 ms
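Those figures imply TTPoE is roughly 400x faster than TCP/IP and a bit under 2x faster than NVLink (a quick sketch of the ratios):

```python
# Latency figures from the Hot Chips talk, in milliseconds.
latency_ms = {"TCP/IP": 0.53, "NVLink": 0.0023, "TTPoE": 0.0013}

ttpoe = latency_ms["TTPoE"]
for name, ms in latency_ms.items():
    # Show each latency in microseconds and its ratio to TTPoE.
    print(f"{name}: {ms * 1000:.1f} us ({ms / ttpoe:.1f}x TTPoE)")
```

The NVLink ratio (about 1.8x) is where the "twice as fast" comparison comes from.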