Poweruser Developer Using Grok 4

Developer and entrepreneur David Ondrej demonstrates Grok 4, xAI’s latest AI model. According to the results he presents, it outperforms models from OpenAI, Anthropic, Meta, and Google on benchmarks such as GPQA (graduate-level, “Google-proof” questions), advanced math, and coding.

He highlights three variants: Grok 4 Base (no tools), Grok 4 Base with tools, and Grok 4 Heavy ($300/month), a multi-agent system where four parallel agents collaborate to solve problems, achieving PhD-level performance across fields.

Key features include native multimodality (text and image processing), a 256k context window, function calling, and structured outputs.
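To make the "four parallel agents" idea concrete, here is a minimal sketch in Python. It is not how Grok 4 Heavy is implemented (the video does not describe the internals). The stub agents, the `solve_in_parallel` helper, and the majority-vote aggregation are all assumptions chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_agent(agent_id):
    """Build a hypothetical agent; a real one would call a model."""
    def agent(question):
        # Stub behavior: agent 3 disagrees, so the vote is not unanimous.
        return "41" if agent_id == 3 else "42"
    return agent


def solve_in_parallel(question, agents):
    """Run every agent on the same question at once, then aggregate."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda a: a(question), agents))
    # Assumption: combine answers by simple majority vote.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes, answers


agents = [make_agent(i) for i in range(4)]
winner, votes, answers = solve_in_parallel("What is 6 * 7?", agents)
print(winner, votes)  # prints "42 3"
```

Majority voting is only one way to combine parallel results; a system where agents compare notes, as the video describes, would need a richer exchange step than this sketch shows.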

Ondrej speculates that the unrestricted behavior of an earlier Grok 3 version (“MechaHitler”) was a marketing ploy to generate hype.

He urges viewers to shift from consuming AI content to building startups or agents. According to the video, xAI is running out of tests for its models because they are advancing so quickly. Ondrej predicts that xAI and Google DeepMind will lead the AGI race, based on talent, data, compute, and execution. He asserts Grok 4 can run simple businesses (e.g., vending machines) and shares early API access results showing top performance on relevant benchmarks. Upcoming releases include a specialized coding model (next month), a multimodal video agent (September), and a video generation model (October).

The bulk of the video is a live demo using Grok 4 (via Cursor IDE and a repo prompt tool) to make UI tweaks to Vectal, Ondrej’s AI-powered task management startup (used by 55,000+ people). He integrates Grok 4 into Vectal on its $20/month Pro plan, avoiding the $300/month Heavy cost. Steps include:

  1. Prompting Grok 4 to analyze the codebase and identify relevant files for Kanban board changes.
  2. Using Grok 4 Heavy (via grok.com) for complex tasks like repositioning elements (e.g., priority badges, task names, due dates) to create a minimal, clean design.
  3. Testing simpler tweaks in Cursor’s agent mode, emphasizing precise prompting and “do not change anything else” to avoid over-edits.
  4. Handling Git operations (branching, committing, pushing, creating PRs) to demonstrate tool calling and reasoning.
  5. Iterating on issues like padding removal and conditional rendering (e.g., hiding project names in specific views).
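The Git portion of the demo (branch, commit, push, open a PR) can be written down as a short command plan. The sketch below only builds and prints the commands rather than running them. The branch name, commit message, and the use of the GitHub CLI (`gh`) are assumptions for illustration; the video summary does not say which commands the agent actually issued.

```python
import shlex


def pr_workflow(branch, message, title):
    """Return the commands for a branch -> commit -> push -> PR flow."""
    return [
        ["git", "switch", "-c", branch],           # create and check out the branch
        ["git", "add", "--all"],                   # stage every change
        ["git", "commit", "-m", message],          # record the change
        ["git", "push", "--set-upstream", "origin", branch],
        # Assumption: the GitHub CLI is installed and authenticated.
        ["gh", "pr", "create", "--title", title, "--body", message],
    ]


# Hypothetical names, not taken from the video.
commands = pr_workflow(
    branch="kanban-minimal-cards",
    message="Simplify Kanban card layout",
    title="Minimal Kanban cards",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments, rather than one string, means it could later be passed to `subprocess.run` without shell-quoting problems.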

He highlights Grok 4’s built-in tool calling (trained via reinforcement learning) and its multi-agent collaboration, contrasting the latter with the independent variations produced by tools like Codex. He notes minor integration glitches (e.g., in Cursor, possibly due to the recent release) but expects fixes soon.
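For readers unfamiliar with tool calling, the general pattern is that the model emits a structured request naming a function and its arguments, and the host application runs it. The sketch below shows that dispatch step in plain Python. The `get_due_date` tool, its data, and the JSON shape are hypothetical and are not xAI's actual API format.

```python
import json


def get_due_date(task_id):
    """Hypothetical tool: look up a task's due date from made-up data."""
    fake_tasks = {"t1": "2025-08-01"}
    return fake_tasks.get(task_id, "unknown")


# Registry mapping tool names to the Python functions that implement them.
TOOLS = {"get_due_date": get_due_date}


def dispatch(tool_call_json):
    """Parse a structured tool call and run the matching function."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])


# Assumed shape of a model-issued tool call, for illustration only.
result = dispatch('{"name": "get_due_date", "arguments": {"task_id": "t1"}}')
print(result)  # prints "2025-08-01"
```

In a real integration the result would be sent back to the model so it can continue reasoning with the tool's output.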

Although Grok 4 Heavy excels at advanced problem-solving, he recommends the base version for most users and plans to use it as his default in Cursor and Vectal, alongside Claude for agentic tasks. He promotes his own Vectal tool as a superior alternative to tools like Todoist, ClickUp, or Trello, offering AI agents, custom prompts per project, team plans, and access to top models (e.g., Grok 4, Gemini 2.5 Pro, Claude 3 Opus). He offers personal onboarding for teams and stresses that switching brings productivity gains. He also plugs his “New Society” community for AI tutorials, startup building (e.g., growing Vectal to $10K+ MRR), and cutting-edge updates.

1 thought on “Poweruser Developer Using Grok 4”

  1. On the point of “PhD-level performance”. By definition, “PhD” is a recognition of the fact of a contribution made to the body of knowledge. Not the fact of having, compiling, applying or deriving from existing knowledge. New and useful knowledge. This definition was forgotten recently, with most PhDs being compilations or derivations presented as new knowledge – Grok can easily be at this level, however you want to call it, but it is not a “PhD-level performance”. Grok itself is an example of a “PhD-level performance”.
