xAI has released Grok 3, the first model to break a score of 1400 on the LM Arena leaderboard. That makes Grok 3 the top-ranked AI model there.
Grok 3 is still undergoing reinforcement learning to make it even smarter. This will help Grok 3 surpass OpenAI's o3.

xAI's data center has been expanded to 250 megawatts of power and 200,000 GPUs.

xAI is already expanding to 1.2 gigawatts of power and 1 million GPUs. Those will be Nvidia B200 chips and Dojo 2 chips of similar compute power. One million of these more powerful GPUs will provide 40 times the training power.
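As a rough sanity check on those numbers, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted above. It also assumes, since the post does not say so explicitly, that the 40x figure is measured against the current 200,000-GPU cluster. The function and variable names are illustrative only.

```python
# Figures quoted in the post.
CURRENT_GPUS = 200_000
CURRENT_POWER_MW = 250
PLANNED_GPUS = 1_000_000
PLANNED_POWER_MW = 1_200  # 1.2 gigawatts

# Assumption: the 40x training-power claim is relative to the current cluster.
CLAIMED_TRAINING_MULTIPLE = 40


def scale_factors():
    """Return the ratios implied by the quoted cluster figures."""
    gpu_ratio = PLANNED_GPUS / CURRENT_GPUS
    power_ratio = PLANNED_POWER_MW / CURRENT_POWER_MW
    # If total training power grows 40x on 5x the GPUs, each new GPU
    # would need to deliver this multiple of a current GPU's training power.
    implied_per_gpu_uplift = CLAIMED_TRAINING_MULTIPLE / gpu_ratio
    # Facility power divided by GPU count (includes cooling, networking, etc.).
    watts_per_gpu_now = CURRENT_POWER_MW * 1e6 / CURRENT_GPUS
    watts_per_gpu_planned = PLANNED_POWER_MW * 1e6 / PLANNED_GPUS
    return {
        "gpu_ratio": gpu_ratio,
        "power_ratio": power_ratio,
        "implied_per_gpu_uplift": implied_per_gpu_uplift,
        "watts_per_gpu_now": watts_per_gpu_now,
        "watts_per_gpu_planned": watts_per_gpu_planned,
    }


if __name__ == "__main__":
    for name, value in scale_factors().items():
        print(f"{name}: {value:g}")
```

Under that assumption, the GPU count grows 5x and the power budget grows 4.8x, so the 40x claim implies roughly 8x more training power per GPU at about the same facility power per GPU.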
The big economic impact for Tesla will be having a great conversational AI voice system out in a week. They will put it into Tesla cars in 1-2 months. The voice system will also be in Teslabots.
Talking to xAI's Grok 3+ in Tesla cars will be a huge convenience and very useful for drivers.
It will also get deeply integrated with X and X payments.
Agent Grok will come out in a few months to perform useful tasks and work based on voice commands.
AI expert Andrej Karpathy's summary of Grok 3:
"As far as a quick vibe check over ~2 hours this morning, Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI’s strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago, this timescale to state of the art territory is unprecedented. Do also keep in mind the caveats – the models are stochastic and may give slightly different answers each time, and it is very early, so we’ll have to wait for a lot more evaluations over a period of the next few days/weeks. The early LM arena results look quite encouraging indeed. For now, big congrats to the xAI team, they clearly have huge velocity and momentum and I am excited to add Grok 3 to my “LLM council” and hear what it thinks going forward."
BREAKING: @xAI early version of Grok-3 (codename "chocolate") is now #1 in Arena! 🏆
Grok-3 is:
– First-ever model to break 1400 score!
– #1 across all categories, a milestone that keeps getting harder to achieve
Huge congratulations to @xAI on this milestone! View thread 🧵… https://t.co/p8z8lccNd5 pic.twitter.com/hShGy8ZN1o
— lmarena.ai (formerly lmsys.org) (@lmarena_ai) February 18, 2025
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.
Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan… pic.twitter.com/qIrUAN1IfD
— Andrej Karpathy (@karpathy) February 18, 2025

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.
Brian, from where do you get that there will be a Grok 3 in Teslas within weeks? Is HW4 even suitable for running LLMs?