Tesla Dojo, Supercomputers and AI

The more compute you put into a neural network, the better the results you can get. There does not seem to be a limit on how much additional compute neural nets can usefully absorb to produce better and faster results.

Tesla is motivated to develop bigger, faster computers that are precisely suited to their needs.

The Google TPU architecture has not evolved much over the last five years. The TPU chip is designed for the problems that Google runs; it is not optimized for the kind of AI training Tesla needs.

Tesla has rethought the problem of AI training and designed the Dojo AI supercomputer to optimally solve their problems.

If Tesla commercializes the AI supercomputer, greater economies of scale will help drive costs down and performance up.

One of the reasons TSMC overtook Intel was that TSMC was making most of the ARM chips for cellphones. That higher volume let TSMC learn faster, drive down costs, and accelerate its technology.

About 99% of what neural network nodes do is 8-by-8 matrix multiplication; the remaining 1% is more like general-purpose computing. Tesla created a superscalar processor optimized for this compute load, as sketched below.
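To make that concrete, here is a minimal NumPy sketch (my illustration, not Tesla's actual kernel) of how a large matrix multiplication breaks down into many small 8-by-8 multiplies, the inner operation such a chip is built to run fast:

```python
import numpy as np

TILE = 8  # the small dense block the hardware is optimized for

def tiled_matmul(A, B):
    """Multiply A @ B by decomposing the work into 8x8 tile multiplies.

    For simplicity, all dimensions are assumed to be multiples of TILE.
    """
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    for i in range(0, n, TILE):
        for j in range(0, m, TILE):
            for p in range(0, k, TILE):
                # This 8x8 multiply-accumulate is the "99%" operation;
                # the loop bookkeeping around it is the general-purpose "1%".
                C[i:i+TILE, j:j+TILE] += A[i:i+TILE, p:p+TILE] @ B[p:p+TILE, j:j+TILE]
    return C

A = np.random.rand(64, 64)
B = np.random.rand(64, 64)
assert np.allclose(tiled_matmul(A, B), A @ B)
```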

SOURCES- David Lee Investing
Written by Brian Wang, Nextbigfuture.com (Brian owns shares of Tesla)

16 thoughts on “Tesla Dojo, Supercomputers and AI”

  1. Again, I try my best not to make personal attacks. As I said, Kimhi could have been right; it's not my area of expertise. I just wanted him to provide some `evidence` I could assess and possibly learn from.

    When he does not, but continues to `assert`, my reply is not really for him but for other readers.

    Have a nice day.

  2. I personally like your standards. I was trying to word my caution to you in a way that wouldn't raise the aggravation level higher among any of the participants.

    Because we (i.e., the NBF forums) have certainly had things break down into personal attacks in previous discussions, and I'm feeling relaxed at this juncture and don't want to get aggravated again.

  3. Fair point, but initially I am not `debating`.
    I am stating what an AI expert says and then giving Kimhi a chance to point me at another expert who backs his assertion, to avoid `confirmation bias` in my own belief system by looking at alternative viewpoints and perhaps, just perhaps, learning something, since it's possible Kimhi was correct.
    I accept that people have different standards. Personally, in fields outside my expertise, I like evidence that is verifiable, or an expert defending a hypothesis.

  4. I have found on this site (as with much of life in general) that different people, from different backgrounds, have different "standards of evidence", as it were.

    Debating such people and insisting that your standards are correct is not usually a productive use of one's time.

  5. I'd think that every hardware node (core) would be programmed to model neurons and their connections as described by a "virtual neuron configuration" program, which would specify which virtual neurons, digital inputs, or outputs on other nodes to connect to. Once all the nodes sending/receiving instructions report completion of configuration, a signal is sent to all nodes to commence virtual neuron activity (see the sketch below).
    No single program would handle the ongoing activities of the many threads.
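    A minimal Python sketch of this configure-then-commence idea (purely illustrative; the node count, wiring format, and phases are my assumptions, not Dojo's actual protocol):

    ```python
    import threading

    NUM_NODES = 4  # hypothetical node count
    # The barrier trips only once every node has reported configuration complete.
    start_barrier = threading.Barrier(NUM_NODES)

    def node(node_id, wiring):
        # Phase 1: configure this node's virtual neurons from the wiring spec.
        connections = wiring[node_id]
        print(f"node {node_id}: configured connections to {connections}")
        # Phase 2: wait until all nodes report completion of configuration.
        start_barrier.wait()
        # Phase 3: every node commences virtual neuron activity together.
        print(f"node {node_id}: commencing virtual neuron activity")

    # Toy "virtual neuron configuration": which nodes each node connects to.
    wiring = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    threads = [threading.Thread(target=node, args=(i, wiring)) for i in range(NUM_NODES)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    ```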

  6. It might be a good bet to attempt to develop quantum computing algorithms/programs now, so at least some of the work will already be done when/if cost-effective hardware becomes available.

  7. The video is correct. As I said, the bigger computer doesn’t make the final, trained AI smarter. It just gets to that trained AI faster. 

    So it’s all about making the training process better (less time required). Which makes the economics better. From 11:45-12:08 he explains:

    ———
    “Tesla’s going to be interested in a machine that’s going to get the answers to a sizable network training things relatively quickly. Like, it’s not the case that they care so much about economics that they don’t care how long the answer comes. They don’t want to wait 3 months to train the next rev of FSD. They want it in an hour, ideally. And certainly within a week.”
    ———

    That’s why they need this more powerful computer now. And they’ll need even more power in the future, when they have even more data, which would make training even slower. So they will want even faster computers in the future, to train quickly on even bigger data sets (which allow bigger networks).

    You could get exactly the same end result by training the entire thing on your laptop. But the training wouldn’t finish in your lifetime. So they’re doubling down on powerful hardware.
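    As a back-of-envelope sketch of that trade-off (all figures made up for scale, not real Dojo or laptop specs):

    ```python
    SECONDS_PER_DAY = 86_400

    training_flops = 1e21   # hypothetical total compute for one training run
    laptop_flops_s = 1e12   # roughly 1 TFLOP/s, a generous laptop figure
    dojo_flops_s = 1e18     # an exaflop-class machine, purely illustrative

    years = training_flops / laptop_flops_s / SECONDS_PER_DAY / 365
    hours = training_flops / dojo_flops_s / 3600
    print(f"laptop: about {years:.0f} years")      # not in your lifetime
    print(f"dojo-class: about {hours:.1f} hours")  # the 'answer in an hour' regime
    ```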

  8. So you do just `assert`: you provide no links or experts; you just state what you `believe`. You produce no evidence of personal skills in the relevant field, nor do you provide any link to such evidence. Even though the video contains an A.I. expert directly saying the exact opposite of your assertion, you just double down.

    Forgive me, but as such your assertions are valueless.

  9. In practice, it seems that neural network power doesn't lead to one-to-one scaling of autonomy, at least not in Tesla's case, as they keep missing their targets and expected results. Other companies have moved away from the idea of using neural networks only.

  10. I am extremely well read in A.I. and have worked as a professional psychologist with postgraduate qualifications, so I have a good grasp of human NNs. I am aware of DeepMind's work on Go and chess and the methods they used to get to AlphaZero; I am aware of Tesla's data advantage; and I am aware of how the simulation computer works and of automatic data labeling.

    I understand that this data is labeled and the training is thus supervised, but I have listened to hours of Douma speaking on this, and he is 100% aware of this fact, and yet he STILL says (not me), in the embedded video at 9:40, that it WILL scale.
    So forgive me, but I have Douma saying one thing (though I may be missing a subtlety, as I am not an expert and may misinterpret something said or written) and you apparently saying another. Can you give me a link to a page that will expand my knowledge? Although I must say I am also open to the idea that you are wrong on this; as you know, comment sections are full of expert advice.

  11. It depends on the problem. For reinforcement learning (such as playing Go or chess), better hardware leads to a smarter AI. But for supervised learning (such as most of autonomous driving), better hardware doesn’t give a smarter AI. It just gets to being smart faster.

    For supervised learning, the biggest limit is how much training data you have. And this is where Tesla has an enormous advantage. They are collecting huge amounts of data from their customers’ cars. They have more data than their competitors. And their lead continues to grow.
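    A toy NumPy sketch of that data limit (an illustrative curve-fitting problem, not an autonomy workload): with the model and training procedure held fixed, validation error falls as the training set grows, and faster hardware alone cannot buy that improvement:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def fit_and_score(n_train):
        # Toy supervised task: recover y = sin(x) from noisy labeled samples
        # with a fixed-capacity model (a degree-5 polynomial).
        x = rng.uniform(-3, 3, n_train)
        y = np.sin(x) + rng.normal(0, 0.1, n_train)
        coeffs = np.polyfit(x, y, deg=5)
        x_val = np.linspace(-3, 3, 1000)
        return np.mean((np.polyval(coeffs, x_val) - np.sin(x_val)) ** 2)

    for n in [10, 100, 1_000, 10_000]:
        print(f"{n:>6} training samples -> validation MSE {fit_and_score(n):.5f}")
    ```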

  12. It's an internal team, in line with Tesla's theme of vertical integration, headed by Andrej Karpathy (born October 23, 1986), the director of artificial intelligence and Autopilot Vision at Tesla. He specializes in deep learning and computer vision.
    Karpathy was born in Slovakia and moved with his family to Toronto when he was 15. He completed his bachelor's degree in Computer Science and Physics at the University of Toronto in 2009 and his master's degree at the University of British Columbia in 2011, where he worked on physically simulated figures. He graduated with a PhD from Stanford University in 2015 under the supervision of Fei-Fei Li, focusing on the intersection of natural language processing and computer vision, and on deep learning models suited for this task.
    He joined the artificial intelligence group OpenAI as a research scientist in September 2016 and became Tesla's director of artificial intelligence in June 2017.

    Karpathy was named one of MIT Technology Review's Innovators Under 35 for 2020.

  13. Well, James Douma, the interviewed AI expert, directly contradicts what you just said. In the video at 9:40-10:40 he says exactly the opposite: that neural networks do endlessly scale with computing power. `The sky's the limit` with NNs, he says.
    SO
    Maybe you are correct, but I can see his AI credentials, as I have seen him interviewed in depth many times and can see that he is indeed credible… Do you personally have AI credentials (I have read up on NNs and worked as a professional psychologist, but I claim no expertise in this domain)? If so, what are they? Or can you cite or link to an AI/NN expert who backs your assertion?

  14. This is one area of research that is difficult to parse. The software side is just as important as the hardware side. Massive input and massive output require code capable of handling thousands of threads at once.

    Who is writing the code? Is it internal to Tesla? Is it farmed out? If an external company is involved, I haven't been able to track it down. Such a corporation may make a great investment option…

  15. The relationship between a neural network's computing power and the quality of its results is not linear. In fact, past a certain point, additional computing power does not improve results at all (a stylized illustration follows below).
    Getting the next big future right is not for everybody… It is definitely not for people whose understanding of growth is limited to a linear relationship in one parameter at a time, especially when it is done to promote the product of a person selected to constantly receive favorable coverage.
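    One way to picture that claim (a stylized diminishing-returns curve with made-up constants, not measured data):

    ```python
    # Stylized curve: loss falls as a power law in compute and flattens
    # toward an irreducible floor, so each 10x of compute buys less and less.
    floor, scale, exponent = 0.10, 1.0, 0.3
    for compute in [1, 10, 100, 1_000, 10_000, 100_000]:
        loss = floor + scale * compute ** -exponent
        print(f"compute x{compute:>6} -> loss {loss:.3f}")
    ```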
