xAI’s Colossus 2 (MACROHARD) now has 7 models in training at once.
Grok Imagine V2
2 variants of 1-trillion-parameter models
2 variants of 1.5-trillion-parameter models
1 variant of a 6-trillion-parameter model
1 variant of a 10-trillion-parameter model
SpaceXAI Colossus 2 now has 7 models in training:
– Imagine V2
– 2 variants of 1T
– 2 variants of 1.5T
– 6T
– 10T
Some catching up to do.
— Elon Musk (@elonmusk) April 8, 2026
On announced parameter counts alone, xAI would be leading. No other lab has publicly confirmed that it is training a 10T or even a 6T model right now. The 6T model alone is roughly double the rumored size of Grok 4 and far larger than most current estimates for GPT-5 or Claude 4.6.
Parameter count is only part of the story.
AI models are judged more on:
Active parameters per token (MoE efficiency).
Training data quality and “intelligence density” (xAI claims higher density per gigabyte).
Inference-time compute (reasoning modes, multi-agent orchestration).
Real-world benchmarks (coding, agentic tasks, multimodality).

Chips Needed & Costs for Pre-Training Runs
Exact per-model costs are not public (models are still training), but here are the best analyses and estimates.
Colossus 2 hardware: ~550,000 NVIDIA GPUs (mostly GB200/GB300 Blackwell variants) at roughly $18 billion for the GPUs alone, which works out to about $32.7k per GPU, the low end of the commonly cited ~$32k–$40k range. This hardware supports the full parallel training lineup.
Total CapEx for Colossus 2 is in the tens of billions of dollars once land, power infrastructure, cooling, and networking are included. That total covers on-site gas turbines and Megapacks for 400+ MW of dedicated power, plus the cost of the rapid buildout.
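The GPU hardware figure quoted above can be sanity-checked with simple arithmetic. The sketch below uses only the article's numbers (550,000 GPUs, $32k to $40k per GPU, ~$18 billion total); the function name is illustrative, and the inputs are estimates rather than confirmed prices.

```python
def gpu_hardware_cost(num_gpus: int, price_per_gpu: int) -> int:
    """Total GPU spend in dollars for a given unit price."""
    return num_gpus * price_per_gpu


NUM_GPUS = 550_000            # article's estimate for Colossus 2
QUOTED_TOTAL = 18_000_000_000  # article's ~$18 billion hardware figure

# Bounds implied by the quoted per-GPU price range.
low_total = gpu_hardware_cost(NUM_GPUS, 32_000)
high_total = gpu_hardware_cost(NUM_GPUS, 40_000)

# Average unit price implied by the quoted total.
implied_price = QUOTED_TOTAL / NUM_GPUS

print(f"Low estimate:  ${low_total / 1e9:.1f}B")
print(f"High estimate: ${high_total / 1e9:.1f}B")
print(f"Implied average price per GPU: ${implied_price:,.0f}")
```

The quoted total sits near the bottom of the range; at $40k per GPU the same cluster would cost about $22 billion.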
Per-model rough estimates (community/analyst extrapolations):
10T model: ~$1.5 billion+ in compute, according to one early analyst call; the figure scales with FLOPs and run duration. The initial pre-training phase is expected to take ~2 months on Colossus 2.
6T model: a similar order of magnitude but lower; it benefits from shared cluster efficiency.
Smaller 1T/1.5T runs: significantly cheaper and faster, since they can run in parallel on a fraction of the cluster.
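One way to read the 10T estimate is to back out the GPU-hour price it implies. The sketch below is a rough check, not an analyst model: the $1.5 billion and ~2 month figures come from the estimate above, while the share of the 550,000-GPU cluster devoted to the run is unknown, so the shares tried here are purely hypothetical.

```python
def implied_gpu_hour_rate(total_cost_usd: float, num_gpus: int, days: float) -> float:
    """Dollars per GPU-hour implied by a total cost, GPU count, and run length."""
    gpu_hours = num_gpus * days * 24
    return total_cost_usd / gpu_hours


CLUSTER_GPUS = 550_000   # article's estimate for Colossus 2
RUN_DAYS = 60            # "~2 months" of initial pre-training
RUN_COST_USD = 1.5e9     # one early analyst call for the 10T model

# Hypothetical shares of the cluster used by the 10T run (not from the source).
for share in (1.0, 0.5, 0.25):
    gpus = int(CLUSTER_GPUS * share)
    rate = implied_gpu_hour_rate(RUN_COST_USD, gpus, RUN_DAYS)
    print(f"{share:.0%} of cluster ({gpus:,} GPUs): ${rate:.2f} per GPU-hour")
```

Because seven models share the cluster, the 10T run presumably uses well under 100% of the GPUs, so the implied hourly rate rises as the assumed share falls.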

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.
I hope they make it, because Grok's usefulness has taken a nosedive.
It has no concept of time, even if you prompt it to pay attention to the current date and not use old, stale data. Every now and then it goes back to reusing old training data or stale session context. This is really dangerous if you are trying to monitor real-world events.
Grok Imagine is a mess due to moderation forced upon xAI by the EU and others.
They now have moderation filters acting on the output, which means they waste compute rendering content that they end up throwing away because of their own moderation. Often, if a human is depicted in the video output, half of the videos are moderated.