Grok Build CLI

Grok Build CLI Feels Different to Claude Code CLI…

May 27, 2026

For now I’m using Claude Code CLI and Grok Build CLI side-by-side to get an anecdotal feel of what works for me between the two…

These Are Agent Harnesses That Actually Live Where We Work

Most AI coding agents still feel like they were designed by people who visit the terminal occasionally rather than people who live in it.

They treat the command line as a place you occasionally escape to, not the primary environment where serious engineering happens.

Grok Build might be the first one that feels like it was built by someone who understands that the terminal is not a constraint but the highest-leverage interface we have…

This is not another chat interface with some file editing bolted on.

It is something more interesting: an attempt to build the missing Agent Harness layers directly inside the environment where the highest-signal work already occurs.

I say this because for the last few months I have been living inside Claude Code CLI…not just for writing code, but also as a universal interface to my MacBook.

the terminal is the product

The Model

grok-build is not the general-purpose Grok 4.3 that people use on grok.com or the mobile apps.

It’s a specialised variant that xAI tuned specifically for heavy agentic workflows (lots of tool use, long-running tasks).

It is also tuned for subagents, parallel execution, and a planning mode.

The model is also optimised for codebase exploration & editing…

All geared at optimising the full Grok Build TUI experience.

It does feel different from a normal chat model, because the main focus is the kind of work the model will be doing…

Reading code, running commands, editing files, spawning specialist subagents etc.

The CLI Renaissance Is Real!

In a previous piece I wrote about how the command line is quietly becoming the most powerful interface for AI agents.

Not despite its age, but because of it.

The same arguments apply here.

The CLI is:

Already installed everywhere that matters
Deeply embedded in the training data of every frontier model
Self-documenting
Infinitely composable
Low latency, zero UI chrome

When Jensen Huang talks about moving from pre-recorded software to real-time processing, the CLI is the natural delivery mechanism for that shift.

The agent does not need a pre-built integration layer. It can just reason and act.
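To make the "self-documenting" and "composable" points concrete, here is a minimal Python sketch of how an agent loop could learn a tool from its own help text and then chain two commands. It is an illustration only, not a description of how Grok Build works internally. `discover_usage` and `run_tool` are hypothetical helper names, and the Python interpreter stands in for any command-line tool.

```python
import subprocess
import sys


def discover_usage(command: list[str]) -> str:
    """Ask a CLI to describe itself by running it with --help."""
    result = subprocess.run(
        command + ["--help"], capture_output=True, text=True, check=False
    )
    # Some tools print help to stderr, so fall back to it when stdout is empty.
    return result.stdout or result.stderr


def run_tool(command: list[str], stdin_text: str = "") -> str:
    """Run a CLI, feed it text on stdin, and return what it printed."""
    result = subprocess.run(
        command, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout


# Assumption: the Python interpreter is used only because it is guaranteed to
# exist wherever this sketch runs; any other CLI would be handled the same way.
usage = discover_usage([sys.executable])

# Compose two steps: the output of one command becomes the input of the next.
listing = run_tool([sys.executable, "-c", "print('b.py\\na.py')"])
sorted_listing = run_tool(
    [sys.executable, "-c", "import sys; print(''.join(sorted(sys.stdin)), end='')"],
    stdin_text=listing,
)
```

No integration layer was written for either step; the help text and plain text streams were enough.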

Most agent tools have not internalised this. They still build beautiful web UIs, or they bolt a chat window into VS Code, or they give you a thin wrapper that streams text while you mentally context-switch between five different surfaces.

Does Grok Build start from a different premise where the terminal is the product?

What a Real Agent Harness Looks Like

In my recent content on AI Harness Engineering, I have described the components that sit between a raw model and reliable outcomes:

Context Engine
Planner
Memory Manager
Verifier
Tool Registry
Harness Config

Most tools implement one or two of these well.

Grok Build implements more of them, and it implements them in the place where they are most useful.
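One way to picture how the six components listed above fit around a model is the toy Python sketch below. Everything in it is an assumption made for illustration: the `Harness` class, its field names, and the one-tool-name-per-line plan format are hypothetical and do not describe Grok Build's actual design.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Harness:
    """Hypothetical wiring of the six components around a model call."""

    model: Callable[[str], str]                      # the raw model
    tools: dict[str, Callable[[str], str]]           # Tool Registry
    verify: Callable[[str], bool]                    # Verifier
    max_steps: int = 3                               # Harness Config
    memory: list[str] = field(default_factory=list)  # Memory Manager

    def build_context(self, task: str) -> str:
        """Context Engine: combine the task with what is remembered so far."""
        return "\n".join([task, *self.memory])

    def plan(self, task: str) -> list[str]:
        """Planner: in this toy version, one tool name per line of model output."""
        return self.model(self.build_context(task)).splitlines()[: self.max_steps]

    def run(self, task: str) -> bool:
        """Execute the plan step by step, stopping when verification fails."""
        for step in self.plan(task):
            tool = self.tools.get(step)
            if tool is None:
                self.memory.append(f"skipped unknown tool: {step}")
                continue
            output = tool(task)
            self.memory.append(f"{step}: {output}")
            if not self.verify(output):
                return False
        return True


# Stub model and tools, so the sketch runs without any external service.
harness = Harness(
    model=lambda context: "read\ntest",
    tools={
        "read": lambda task: "ok: read files",
        "test": lambda task: "ok: tests pass",
    },
    verify=lambda output: output.startswith("ok"),
)
succeeded = harness.run("fix the failing build")
```

The point of the sketch is the separation of concerns: the model only proposes, while the harness decides what runs, what is remembered, and what counts as success.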

Let me show you what this looks like in practice…

The Main Interface

Here is what you actually see when you live inside it:

When working with Grok Build, there is no forced summarisation that loses critical detail.

No “here is a summary of what I did” that you then have to expand. The full trace is navigable, foldable, and copyable with single keystrokes.

This is the Context Engine made visible and controllable.

Plan Mode

When you give Grok Build a genuinely ambiguous task, it does something most agents don’t do.

It proposes entering Plan Mode.

In Plan Mode the agent can read, search, and explore the codebase freely.

What it cannot do is edit files or run destructive commands. The only thing it is allowed to write is a structured `plan.md` inside the session.

Only when you approve the plan does execution begin.

The moment before implementation begins. The agent has explored the codebase and can only write to the plan file until you approve.

One can think of this as the planning layer of the harness made explicit.

There is a danger when an agent is over-eager…it starts coding before it has understood the existing patterns, failure modes, second-order consequences, etc.

Plan Mode forces a type of senior engineering discipline: think, document the approach, get alignment, and only then implement.
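The Plan Mode rules described above (read freely, write nothing except the plan file, execute only after approval) amount to a small permission gate. Here is a minimal sketch of such a gate in Python. The `Session` class and the tool names are hypothetical, chosen for illustration, and say nothing about how Grok Build enforces the rules.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical tool names; a real harness would define its own.
READ_ONLY_TOOLS = {"read_file", "search", "list_dir"}
PLAN_FILE = PurePosixPath("plan.md")


@dataclass
class Session:
    plan_mode: bool = True  # start in Plan Mode; nothing has been approved yet

    def approve_plan(self) -> None:
        """The user signs off on the plan, which unlocks execution."""
        self.plan_mode = False

    def is_allowed(self, tool: str, target: str = "") -> bool:
        """Decide whether a tool call may run in the current mode."""
        if not self.plan_mode:
            return True
        if tool in READ_ONLY_TOOLS:
            return True
        # The single write permitted while planning is the plan file itself.
        return tool == "write_file" and PurePosixPath(target) == PLAN_FILE


session = Session()
checks_before = [
    session.is_allowed("read_file", "src/app.py"),   # exploring is fine
    session.is_allowed("write_file", "plan.md"),     # writing the plan is fine
    session.is_allowed("write_file", "src/app.py"),  # editing code is blocked
    session.is_allowed("run_shell", "rm -rf build"), # shell commands are blocked
]
session.approve_plan()
allowed_after = session.is_allowed("write_file", "src/app.py")
```

The gate denies by default while planning, so any tool not explicitly marked as read-only is blocked until the plan is approved.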

Skills as Captured Intelligence

One of the hardest problems in agentic systems is turning one-off success into repeatable capability.

You can try to tell the model “always follow our commit conventions” in every prompt.

Or you can manually capture the actual workflow once and make it invocable.

Grok Build has a command for this called `/skillify`.

Think of it as a tool for capturing workflows so you (and others) can reuse them without having to explain the whole process every time.
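To show the general idea of capturing a workflow once and invoking it later, here is a small Python sketch. It is not what `/skillify` produces: the `Skill` structure, the JSON file layout, and the helper names are all assumptions I am making for illustration.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str
    description: str
    steps: list[str]


def save_skill(skill: Skill, directory: Path) -> Path:
    """Persist a captured workflow so it can be reused in later sessions."""
    path = directory / f"{skill.name}.json"
    path.write_text(json.dumps(asdict(skill), indent=2), encoding="utf-8")
    return path


def load_skill(name: str, directory: Path) -> Skill:
    """Read a previously captured workflow back from disk."""
    data = json.loads((directory / f"{name}.json").read_text(encoding="utf-8"))
    return Skill(**data)


def render_prompt(skill: Skill) -> str:
    """Turn the stored steps into instructions an agent can follow."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(skill.steps, 1))
    return f"Follow the '{skill.name}' workflow:\n{numbered}"


# A temporary directory stands in for wherever skills would really be stored.
with tempfile.TemporaryDirectory() as tmp:
    skills_dir = Path(tmp)
    commit_skill = Skill(
        name="commit",
        description="Team commit conventions",
        steps=["Run the test suite", "Write a conventional commit message"],
    )
    save_skill(commit_skill, skills_dir)
    reloaded = load_skill("commit", skills_dir)
    prompt = render_prompt(reloaded)
```

The workflow is written down once and then referenced by name, instead of being re-explained in every prompt.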

The final piece of the harness that most tools miss is parallel decomposition.

Parallel Decomposition

Grok Build lets the main agent spawn independent child agents, subagents, each with their own context window and a defined persona.

You can have a reviewer subagent reading the diffs you just made while an implementer subagent is writing the next piece.

Or a researcher (read-only) doing deep investigation while you continue working in the parent session.

This is not simulated parallelism; these are real, separate sessions.

They consume their own context. They can be given restricted capability modes.

They can even run in isolated git worktrees so their changes do not collide with yours until you choose to merge them.
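The shape of this setup (separate context per subagent, a capability mode, an isolated worktree) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Subagent` class, the branch and path naming, and the use of threads in place of real separate sessions are all hypothetical. The only real interface referenced is `git worktree add -b <branch> <path>`, and the sketch builds that command without running it.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Subagent:
    persona: str
    read_only: bool
    context: list[str] = field(default_factory=list)  # private to this agent

    def worktree_command(self) -> list[str]:
        """Build (but do not run) the git command for an isolated checkout."""
        branch = f"agent/{self.persona}"         # hypothetical naming scheme
        path = f"../worktrees/{self.persona}"    # hypothetical location
        return ["git", "worktree", "add", "-b", branch, path]

    def run(self, task: str) -> str:
        # A real subagent would call a model here; this stub only records the task.
        self.context.append(task)
        mode = "read-only" if self.read_only else "read-write"
        return f"{self.persona} ({mode}) finished: {task}"


agents = [
    Subagent("reviewer", read_only=True),
    Subagent("implementer", read_only=False),
]
tasks = ["review the latest diff", "write the next module"]

with ThreadPoolExecutor(max_workers=len(agents)) as pool:
    # map() preserves input order, so results line up with the agents list.
    results = list(pool.map(lambda pair: pair[0].run(pair[1]), zip(agents, tasks)))
```

Each agent only ever sees its own `context` list, which is the property that matters: the reviewer's reading does not consume the implementer's window, and the parent merges results when it chooses to.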

Lastly

It seems like Grok Build is trying to make the terminal the single surface where the harness does the coordination work for you (planning, memory, verification, parallel decomposition) while you stay in the highest-bandwidth environment developers have ever had.

It is not perfect.

The TUI is still evolving. Some of the MCP integrations are more mature than others. But the direction is correct in a way I have not seen elsewhere.

As models keep getting better, the harness is what turns better models into reliably better outcomes.

Chief AI Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.

COBUS GREYLING
Where AI Meets Language | Language Models, AI Agents, Agentic Applications, Development Frameworks & Data-Centric…www.cobusgreyling.com

Introducing Grok Build
Now in early beta for all SuperGrok and X Premium Plus subscribers - Grok Build is a new coding agent that runs right…x.ai

