XAI Releases Grok 4.1 and It Tops the LMArena Leaderboard

On LMArena, Grok 4.1 (Thinking) and Grok 4.1 hold the top two spots. In earlier benchmark tests, Grok 4.1 (Thinking) ranked first with a score of 1510. It is still first, now with a score of 1483, and Grok 4.1 is second.

Hallucination is massively reduced. The rate drops from 12% to about 4%.

This version scores more than 40 points higher than Grok 4 Fast, which was released two months earlier.

It was silently rolled out to a subset of users from November 1–14, 2025, and it was preferred over prior models in 64.78% of blind pairwise evaluations on live traffic.
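To put the 64.78% preference figure next to the leaderboard scores, here is a rough sketch that converts a pairwise win rate into a rating gap. It assumes the standard Elo logistic model with a 400-point scale and no ties. It is not known whether xAI's live-traffic evaluation or LMArena's scoring maps onto this model exactly, so treat the result as an illustration only. The function names are made up for this example.

```python
import math


def elo_gap_from_win_rate(win_rate: float) -> float:
    """Rating gap implied by a head-to-head win rate.

    Assumption: standard Elo logistic curve (base 10, scale 400), ties ignored.
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def win_rate_from_elo_gap(gap: float) -> float:
    """Inverse mapping: expected win rate for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


if __name__ == "__main__":
    # 64.78% is the preference rate quoted in the article.
    gap = elo_gap_from_win_rate(0.6478)
    print(f"Implied rating gap: {gap:.1f} points")
```

Under these assumptions, a 64.78% preference rate corresponds to a gap of a bit over 100 rating points against the prior models it was compared with.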

It is the top released model for creative writing.

OpenAI's GPT 5.1, which has not been released yet, could have a slightly higher creative writing score.

Google will soon release Gemini 3. It is expected to be very good.

XAI's Grok 5 will be released in Q1 2026. It will have double the parameters and will be a major step up.

Grok 4.1 Specifications and Benchmarks

Grok 4.1 is positioned as a frontier model emphasizing conversational intelligence, emotional understanding, real-world helpfulness, and reduced errors. It’s available immediately to all users (including free tier) on grok.com, x.com (via login), and the Grok iOS/Android apps.

Model Variants

Grok 4.1 Thinking (code name quasarflux). It uses thinking tokens for step-by-step reasoning in responses.
Grok 4.1 Non-Reasoning (code name tensor). It responds directly, without thinking tokens, for speed.
Grok 4.1 Fast (Non-Reasoning). It includes integrated search tools for quick, fact-checked answers.

2 thoughts on “XAI Releases Grok 4.1 and It Tops the LMArena Leaderboard”

  1. Grok 4.1 is crap, it is a downgrade for free users. Free users without supergrok can’t access the reasoning models anymore, and there is literally no way to turn thinking back on.

    Grok 4.1 and “expert” both don’t think, and mess up on things Grok 4 Fast got right without issue.

  2. At this point in time, the entire GROK ecosystem seems to be down.
    No responses to prompts, all chat history lost.

    Hopefully just a temporary overload and not GROK 4.1 going full Terminator on us.
