On LMArena, Grok 4.1 (Thinking) and Grok 4.1 rank first and second. In the earlier benchmark tests, Grok 4.1 (Thinking) ranked first with a score of 1510. It is currently still first, but with a score of 1483. Grok 4.1 is second.

Hallucinations are reduced sharply. The rate drops from 12% to about 4%, roughly a two-thirds reduction.

This version scored more than 40 points higher than Grok 4 Fast, which was released two months earlier.

It was silently rolled out to a subset of users from November 1–14, 2025, and it was preferred over prior models in 64.78% of blind pairwise evaluations on live traffic.
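
To put the 64.78% preference rate in perspective, a head-to-head win rate can be converted into an approximate rating gap. The sketch below assumes the standard Elo logistic formula (base 10, scale 400). Whether xAI's internal evaluation or LMArena uses exactly this scale is an assumption, and the function names are illustrative, so the result is a rough guide rather than a number comparable to the leaderboard scores.

```python
import math


def elo_gap_from_win_rate(p: float) -> float:
    """Rating gap (Elo scale) implied by a head-to-head win probability p.

    Assumes the standard Elo logistic model: p = 1 / (1 + 10 ** (-gap / 400)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


def win_rate_from_elo_gap(gap: float) -> float:
    """Inverse mapping: expected win probability for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


# Preference rate reported for the silent rollout (from the article).
gap = elo_gap_from_win_rate(0.6478)
print(f"Implied rating gap: {gap:.1f} points")

# Going the other way: a 40-point lead under the same model.
print(f"Expected win rate at +40 points: {win_rate_from_elo_gap(40.0):.3f}")
```

Under these assumptions, a 64.78% preference rate corresponds to a gap of a little over 100 points, while a 40-point lead corresponds to winning about 56% of pairwise comparisons.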

It is the top released model for creative writing.

OpenAI's GPT 5.1, which has not been released yet, could have a slightly higher creative writing score.

Google will soon release Gemini 3. It is expected to be very good.

xAI's Grok 5 will be released in Q1 2026. It will have double the parameters and will be a major step up.

Grok 4.1 Specifications and Benchmarks

Grok 4.1 is positioned as a frontier model emphasizing conversational intelligence, emotional understanding, real-world helpfulness, and reduced errors. It’s available immediately to all users (including free tier) on grok.com, x.com (via login), and the Grok iOS/Android apps.

Model Variants

Grok 4.1 Thinking (code name quasarflux). It uses thinking tokens for step-by-step reasoning in responses.

Grok 4.1 Non-Reasoning (code name tensor). It gives direct responses without thinking tokens, for speed.

Grok 4.1 Fast (Non-Reasoning). It includes integrated search tools for quick, fact-checked answers.