What is Haft, and does it generate code or just plan decisions?

Haft does not generate code itself. It is a governance and decision-tracking layer that sits between your intentions and agent execution. You frame problems, explore options, record decisions with invariants and evidence, then pass those decisions to a code-generation agent (Claude Code or Codex) via `haft run`. It enforces specification discipline and evidence decay, not code synthesis.

Is Haft production-ready, or are there stability warnings?

The MCP plugin mode (`haft serve`) and CLI harness are stable and production-proven per the README. The TUI (`haft agent`) and Desktop app are explicitly alpha and not recommended for production. Cursor, Gemini CLI, JetBrains Air, and generic MCP clients remain experimental. Choose Claude Code or Codex if you want the supported path.

How much setup work does Haft require before I can use it?

Initialization is fast (`haft init --codex` or `haft init`), but onboarding is intentionally heavyweight. Running `/h-onboard` is designed to build a parseable target-system spec, enabling-system spec, term map, and spec coverage graph before broad harness execution. Broad execution is blocked by default for projects marked `needs_onboard`. This is a feature, not a bug, but it delays first execution.

How does Haft compare to the official Anthropic skills library?

Haft is not a general-purpose skills library. It is a decision-governance framework bundled with MCP tools (`haft_note`, `haft_problem`, `haft_decision`, etc.) and CLI harness commands. Anthropic's official library contains utility skills for document parsing, web search, and data transformation. Haft is orthogonal: you use it to decide *how* and *why* to use other skills, and to track evidence decay.

Can I install Haft from the npm registry or does it require the install script?

Haft is installed via a bash script (`curl -fsSL https://raw.githubusercontent.com/m0n0x41d/haft/main/install.sh | bash`). There is no npm package or marketplace distribution. The script downloads the binary and places it in your PATH. The README notes the install URL still points to the historical 'quint-code' repository path, though the installed binary is `haft`.

Haft: Engineering governance for Claude Code

Haft is a decision-recording and governance framework for AI coding agents, primarily Claude Code and Codex. It does not generate code. Instead, it forces you to frame problems, explore alternatives fairly, record decisions as falsifiable contracts with invariants and evidence baselines, and know when your assumptions go stale. At 1,333 stars with a last commit in May 2026, it is a focused, opinionated tool aimed at teams who believe specification discipline prevents agent sprawl.

The core premise is sound: agents write fast, but most repositories lack the harness engineering to distinguish “we shipped fast” from “we shipped right.” Haft’s designers argue that you must specify, think, run, and govern in that order, and that evidence decay (assumptions becoming stale) is a first-class problem. Whether that philosophy matches your workflow is the real question.

What Haft actually ships

Haft has two production surfaces: an MCP plugin mode embedded in Claude Code and Codex, and a CLI harness for operators. Both compile into the same artifact graph, so there is no dual-source-of-truth problem. The MCP mode gives agents seven tools to reason about problems, explore solutions, and record decisions. The CLI harness lets you prepare commissions, run them via Open-Sleigh, check status, and apply results.

| Component | Status | Purpose | When to use |
| --- | --- | --- | --- |
| MCP plugin (`haft serve`) | Stable, production-proven | Agent reasoning, decision recording, commission creation | Claude Code, Codex workflows |
| CLI harness (`haft harness`) | Stable, production-proven | Prepare, run, status, apply bounded execution | Operator/runtime boundary, CI/CD integration |
| TUI (`haft agent`) | Alpha | Terminal UI for reasoning and decision flow | Not recommended for production |
| Desktop app | Alpha | GUI for decision management | Not recommended for production |

The README is unusually transparent about stability. TUI and Desktop are explicitly marked alpha and excluded from the v7 production envelope. Support is intentionally narrowed: Claude Code and Codex are stable, while Cursor, Gemini CLI, JetBrains Air, and OpenCode are experimental, with init flags that may exist before their documentation has converged. This narrowing is pragmatic but limits the tool’s addressable market.

Installation and host integration

Haft uses a bash install script rather than npm, pip, or a plugin marketplace. This is a friction point. The script downloads the binary; you then run a host-specific init command:

haft init               # Claude Code (default)
haft init --codex      # Codex CLI / Codex App
haft init --cursor     # Cursor (experimental)
haft init --gemini     # Gemini CLI (experimental)

Each host gets different MCP config paths and skill installation locations. Claude Code uses `.mcp.json` at the project root, with skills in `~/.claude/commands/` (or `.claude/commands/` when you pass `--local`). Codex uses `.codex/config.toml` and `~/.codex/prompts/`. The README includes a reference table, but the multiplicity of paths is error-prone. A package manager or one-click marketplace would reduce misconfigurations.
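As a quick sanity check after init, the project-scoped config paths named above can be probed with a few lines of Python. This is a minimal sketch, not part of Haft: the function name `detect_host_configs` is hypothetical, and it assumes `.codex/config.toml` is resolved relative to the project root, which the README table should confirm.

```python
from pathlib import Path

# Project-scoped config paths named in the text above. Treating
# .codex/config.toml as project-relative is an assumption of this sketch.
HOST_CONFIGS = {
    "Claude Code": Path(".mcp.json"),
    "Codex": Path(".codex/config.toml"),
}


def detect_host_configs(project_root: Path) -> dict[str, bool]:
    """Report which host MCP config files exist under project_root."""
    return {
        host: (project_root / rel).is_file()
        for host, rel in HOST_CONFIGS.items()
    }


if __name__ == "__main__":
    for host, present in detect_host_configs(Path.cwd()).items():
        print(f"{host}: {'found' if present else 'missing'}")
```

Run from the repository root, it prints one line per host, which makes a missing or misplaced config obvious before you start debugging the agent itself.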

One note: Cursor requires you to enable the MCP server manually after init. The README explicitly states “open Cursor Settings → MCP → find haft → enable the toggle. Cursor adds MCP servers as disabled by default.” This easily missed step has probably caught some users out.

Project-scoped MCP configs can be committed to shared repositories safely, using portable project-root values instead of absolute checkout paths. That is a design win.

The decision-governance model

The core workflow is `/h-reason`: you describe a problem, and the agent frames it, explores alternatives, compares them fairly, and records a decision. Alternatively, you can drive each step manually with `/h-frame`, `/h-char`, `/h-explore`, `/h-compare`, and `/h-decide`. The MCP tools include `haft_problem`, `haft_solution`, `haft_decision`, and `haft_commission`.

Once a decision is recorded, `haft run dec-20260414-001` reads the decision’s invariants, claims, affected files, and governing invariants from the knowledge graph, then spawns an agent with full reasoning context. A baseline snapshot is taken automatically after execution. This is a sound design: it separates decision from implementation and keeps evidence tied to both.
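To make the idea of a decision record with decaying evidence concrete, here is a minimal sketch in Python. It is not Haft's schema: the class names, field names, and the fixed validity window used to model decay are all assumptions made for illustration, and the invariant, file path, and claims are invented sample data. Only the decision ID format comes from the example above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Evidence:
    claim: str
    recorded_on: date
    valid_for: timedelta  # hypothetical: how long the claim is trusted

    def is_stale(self, today: date) -> bool:
        """A claim is stale once its validity window has passed."""
        return today > self.recorded_on + self.valid_for


@dataclass
class DecisionRecord:
    decision_id: str
    invariants: list[str]
    affected_files: list[str]
    evidence: list[Evidence] = field(default_factory=list)

    def stale_claims(self, today: date) -> list[str]:
        """Return the claims whose evidence should be re-checked."""
        return [e.claim for e in self.evidence if e.is_stale(today)]


# Invented sample data for illustration only.
record = DecisionRecord(
    decision_id="dec-20260414-001",
    invariants=["public API responses stay backward compatible"],
    affected_files=["api/handlers.py"],
    evidence=[
        Evidence("load test passed at 500 rps", date(2026, 4, 14), timedelta(days=30)),
        Evidence("dependency audit clean", date(2026, 4, 14), timedelta(days=90)),
    ],
)

print(record.stale_claims(date(2026, 6, 1)))
# → ['load test passed at 500 rps']
```

The point of the sketch is the shape of the data: invariants and affected files travel with the decision, and each piece of evidence carries enough metadata to say when it stops being trustworthy.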

The WorkCommission harness is the bounded execution authority between a decision and runtime. In CLI mode, you can prepare commissions without starting execution, then run them through Open-Sleigh later. The README notes that broad harness execution is blocked by default for projects marked `needs_onboard`, unless you pass `--force-skip-specs`. This is intentional friction: onboarding is supposed to build parseable specs before broad execution.
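The gate described above reduces to a small predicate. The sketch below is an illustration only: the `needs_onboard` status and the `--force-skip-specs` override come from the text, while the function name and the `"onboarded"` status value are hypothetical and say nothing about how Haft implements the check.

```python
def may_run_broad(project_status: str, force_skip_specs: bool = False) -> bool:
    """Refuse broad execution for needs_onboard projects unless overridden."""
    if project_status == "needs_onboard":
        return force_skip_specs
    return True


print(may_run_broad("needs_onboard"))                         # False
print(may_run_broad("needs_onboard", force_skip_specs=True))  # True
print(may_run_broad("onboarded"))                             # True
```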

Onboarding cost and spec checking

Running `/h-onboard` after init is pitched as building “a parseable target-system spec, enabling-system spec, term map, and spec coverage graph.” This is not codebase summarization. It is heavyweight specification work. The README does not quantify how long this takes for a typical project, which is a red flag for adoption timelines.

Spec checking is deterministic and shallow: `haft spec check` parses fenced yaml spec-section blocks, checks structural fields, validates carrier shapes, and verifies term-map entries. It does not make semantic judgments, perform LLM review, or prove correctness. This honesty is refreshing, but it also means specs can be formally valid and still useless.
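The gap between “formally valid” and “useful” is easy to see in a toy structural checker. The sketch below is not Haft's implementation: the `yaml spec-section` fence label is inferred from the description above, the required field names are hypothetical, and a flat `key: value` parser stands in for a real YAML library. It covers only the structural-field step, not carrier shapes or term-map entries.

```python
import re

FENCE = "`" * 3  # built programmatically so the example nests cleanly
BLOCK_RE = re.compile(
    re.escape(FENCE) + r"yaml spec-section\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)
REQUIRED_FIELDS = {"id", "statement", "carrier"}  # hypothetical field names


def parse_flat_yaml(body: str) -> dict[str, str]:
    """Parse 'key: value' lines only; a stand-in for a real YAML parser."""
    fields = {}
    for line in body.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def check_spec(markdown: str) -> list[str]:
    """Return structural problems; an empty list means formally valid."""
    problems = []
    for index, match in enumerate(BLOCK_RE.finditer(markdown), start=1):
        fields = parse_flat_yaml(match.group(1))
        for name in sorted(REQUIRED_FIELDS - set(fields)):
            problems.append(f"block {index}: missing field '{name}'")
    return problems


# A spec that says nothing testable, yet has every required field.
vague_spec = "\n".join([
    FENCE + "yaml spec-section",
    "id: S1",
    "statement: the system should be good",
    "carrier: src/",
    FENCE,
])

print(check_spec(vague_spec))  # → []
```

The vague spec passes because the checker only asks whether the fields exist, never whether the statement is falsifiable. Any deterministic check of this kind shares that limit, which is why spec quality still depends on review.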

License, freshness, and maintenance signals

The README does not state the license. Check the GitHub repository header before adopting: MIT would be the common choice, but this review has not confirmed it. The last update was May 25, 2026, making the project current. The repository has 1,333 stars, which is respectable for a specialized tool but not a strong popularity signal.

The README is well-written and unusually explicit about stability boundaries and host support, which suggests the maintainer cares about clarity. Its note that “the install URL still points at the historical quint-code repository path” shows the project was renamed from quint-code to haft. This is not inherently concerning, but it does suggest the tool has iterated on its identity.

Comparison to Anthropic’s official skills and other approaches

Haft is not a general-purpose skills library. The official Anthropic skills library contains utility skills for document parsing, web search, knowledge base retrieval, and code execution. Haft is orthogonal: it is a meta-layer for deciding how and why to invoke other capabilities, and for tracking evidence decay.

A comparison is not quite apples-to-apples because Haft is governance-first, not capability-first. But a developer choosing between Haft and lighter-weight alternatives might ask:

| Aspect | Haft | Lighter approach (inline decisions) | Official Anthropic library |
| --- | --- | --- | --- |
| Specification discipline | Enforced via artifact graph | Implicit, depends on discipline | Not applicable (utilities only) |
| Evidence and baseline tracking | First-class, with decay | Manual or not tracked | Not applicable |
| Multi-step reasoning with invariants | Yes, decision records with claims | Delegated to agent context | Not applicable |
| Install complexity | Script + host-specific init | None | Package manager (pip, npm) |
| Onboarding overhead | Heavyweight (target/enabling specs) | None | None |
| Stability guarantees | MCP and CLI stable; TUI/Desktop alpha | N/A | N/A |

Haft’s closest competitor is not a single tool but a pattern: teams that use decision documents, git commits with rationale, and code review as governance. Haft automates and formalizes that pattern, but it costs onboarding time upfront.

Failure modes and adoption risks

Haft has several risks worth naming:

- Onboarding friction: The spec-building process is heavyweight and timeline-opaque. A four-person team working in short sprints will feel the cost.
- Host fragmentation: Stable support is limited to Claude Code and Codex. Cursor, Gemini, JetBrains Air, and OpenCode are experimental. If your team uses multiple hosts, you will hit unsupported paths.
- Alpha surface risk: TUI and Desktop are alpha. If you want a GUI for decision management, you are on the bleeding edge. The README is explicit about this, but some users will still learn it the hard way.
- Onboarding as a gate: Broad harness execution is blocked until onboarding is complete. This is a feature, but it can feel like a wall.
- No marketplace distribution: No npm package, no Claude Code marketplace, no Codex plugin store. You install via bash script. This is low-friction for one-off use but adds operational complexity in a team context.
- Spec carrier validation is shallow: `haft spec check` is deterministic L0/L1 only. You can have formally valid specs that are meaningless. The README is honest about this, but it sets a low bar for spec quality.

Who should use Haft

Haft is for teams that:

- Value specification discipline over speed-first iteration.
- Have capacity for upfront onboarding (building target/enabling/term specs).
- Use Claude Code or Codex as their primary agent.
- Can tolerate CLI-first workflows and bash script installation.
- Need to track evidence decay and decision invariants over time.

Haft is not for:

- Teams racing against a deadline and short on specification time.
- Organizations using Cursor, Gemini, or JetBrains Air as primary agents.
- Projects that prefer lightweight, implicit decision-making.
- Teams without prior experience thinking in specs.

Takeaways

Haft’s governance model is sound and its transparency about stability boundaries is rare. The MCP plugin and CLI harness are production-ready, but the tool demands specification discipline upfront and narrows host support intentionally. Installation is friction-heavy (bash script, host-specific init), onboarding overhead is real (target/enabling/term specs), and TUI/Desktop are alpha. The project is maintained and thoughtful, but it is not a drop-in utility. Use it if your team believes specification prevents agent chaos; skip it if you need quick results or multi-host support.
