agent-callable code review

A warden for
every diff.

diffwarden is a small CLI that your coding agent calls to review uncommitted changes, a branch, or a single commit. It hands back findings as Markdown or structured JSON. Many reviewers, one contract, read-only by default.

  • never writes to your repo
  • 9 reviewer engines
  • stable JSON / NDJSON
diffwarden — base:main
$ diffwarden --target base:main --reviewer-set 2
preflight claude · cursor · pi ok
reviewing 3 reviewers · 14 files · base:main
claude 2 findings
cursor 1 finding
pi 3 findings
aggregated · 4 findings after dedupe
P1 src/auth/session.ts:48
token compared with == — use a timing-safe equal
P2 src/api/users.ts:131
unvalidated limit flows into the query
exit 1 · --fail-on-findings P2

One CLI, the reviewers you already run

Claude Codex Cursor Gemini Pi Droid Grok OpenCode Antigravity

The pipeline

One command in. A reconciled review out.

The CLI owns every step between your agent's call and the result. Adapters only run their engine and hand back text or structured output.

  1. Resolve a target

    Point diffwarden at uncommitted changes, a base branch, a commit, or custom instructions. It collects the diff and the changed-line ranges.

  2. Fan out to reviewers

    Your selected reviewers run concurrently behind adapters, each with read-only tools and the shared, Codex-derived review rubric.

  3. Reconcile findings

    Results are parsed, schema-validated, checked against changed lines, then deduplicated and attributed across reviewers.

  4. Deliver & gate

    Render Markdown, emit JSON or NDJSON, optionally fail CI on severity, and append a history report, all from the final artifact.
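
The reconcile step can be pictured with a minimal sketch. It is illustrative only: the `Finding` shape, the "same file and line" dedupe key, and the assumption that a lower P-number means higher severity are all hypothetical, since this page does not show diffwarden's actual schema or matching rules.

```typescript
// Hypothetical finding shape; diffwarden's real schema is not shown on this page.
interface Finding {
  reviewer: string;
  path: string;
  line: number;
  severity: string; // e.g. "P1", "P2"
  message: string;
}

interface MergedFinding {
  path: string;
  line: number;
  severity: string;
  message: string;
  reviewers: string[]; // every reviewer that raised this finding
}

// Changed-line ranges per file, inclusive on both ends.
type ChangedRanges = Map<string, Array<[number, number]>>;

// Assumption: "P1" outranks "P2", so a lower number is more severe.
function rank(severity: string): number {
  return Number(severity.slice(1));
}

function onChangedLine(f: Finding, ranges: ChangedRanges): boolean {
  const fileRanges = ranges.get(f.path) ?? [];
  return fileRanges.some(([start, end]) => f.line >= start && f.line <= end);
}

function reconcile(findings: Finding[], ranges: ChangedRanges): MergedFinding[] {
  const merged = new Map<string, MergedFinding>();
  for (const f of findings) {
    if (!onChangedLine(f, ranges)) continue; // drop findings outside the diff
    const key = `${f.path}:${f.line}`; // assumed dedupe key: same file and line
    const existing = merged.get(key);
    if (existing === undefined) {
      merged.set(key, {
        path: f.path,
        line: f.line,
        severity: f.severity,
        message: f.message,
        reviewers: [f.reviewer],
      });
      continue;
    }
    if (!existing.reviewers.includes(f.reviewer)) existing.reviewers.push(f.reviewer);
    if (rank(f.severity) < rank(existing.severity)) {
      // keep the most severe rating and its message
      existing.severity = f.severity;
      existing.message = f.message;
    }
  }
  return [...merged.values()];
}
```

In this sketch, two reviewers flagging the same changed line collapse into one entry that lists both reviewers, and a finding on a line the diff did not touch is discarded.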

Capabilities

Everything a review needs, nothing it shouldn't touch.

diffwarden owns target resolution, prompting, parsing, validation, and rendering. Reviewer SDKs and CLIs stay behind adapters — so the surface your agent calls stays small and the same across every engine.

Many reviewers, one verdict

Run several reviewers in a single pass. Findings are schema-validated, path-checked, deduplicated, and attributed back to the reviewer that raised them.

Read-only by default

Reviewers get read-oriented tools only. diffwarden never writes to your tree and never posts comments to a remote — write paths are out of scope by design.

Contracts your agent can parse

markdown for humans, json for one stable artifact, and ndjson for a versioned event stream you can consume as the run progresses.

Review any target

uncommitted changes, base:<branch>, commit:<sha>, or custom:<text> for repository-scoped review instructions with no diff to collect.

Gate your pipeline

--fail-on-findings P2 exits non-zero on anything at or above your threshold, with the Markdown or JSON output left unchanged.
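
As a rough illustration of the threshold rule, the sketch below maps a list of finding severities to an exit code. The function names are hypothetical, and it assumes severities are written `P<number>` with lower numbers being more severe; only `P1` and `P2` appear on this page, so the rest of the scale is a guess.

```typescript
// Assumption: severities look like "P1", "P2", ... and a lower number is more severe.
function severityRank(severity: string): number {
  const match = /^P(\d+)$/.exec(severity);
  if (match === null) throw new Error(`unrecognized severity: ${severity}`);
  return Number(match[1]);
}

// Returns 1 when any finding is at or above the threshold, else 0.
function exitCodeFor(severities: string[], threshold: string): number {
  const limit = severityRank(threshold);
  return severities.some((s) => severityRank(s) <= limit) ? 1 : 0;
}
```

Under these assumptions, a run with one P1 and one P2 finding gated at P2 would fail, while a run containing only lower-severity findings would pass.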

Reviewer sets & profiles

Name a set in config for one-flag runs, or compose engine:profile specs ad hoc on the command line for a one-off mix.

Preflight with doctor

Check runtime, auth, model, and effort for a reviewer before you spend a single review — no diff required.

Opt-in history

--report appends a durable JSON record — provenance, timings, and finding counts — and is never written unless you ask for it.

A skill for your agent

Install the bundled skill and your coding agent can call diffwarden from any repository it is working in.

Reviewers

Pick your engines.
Mix them freely.

Every adapter preserves the same review rubric, parsing, validation, and output contract. The read-only badge tells you how strongly each path is held to read-only behavior — diffwarden is precise about what it can prove.

enforced: native read-only / sandbox / spec mode
tool-restricted: limited to read-oriented tools
prompt-only: asked to stay read-only; not yet hard-proven
Full capability matrix
engine        transport           read-only
codex         CLI · app-server    enforced
droid         SDK · CLI           enforced
claude        SDK · CLI           tool-restricted
pi            SDK · CLI           tool-restricted
gemini        CLI                 tool-restricted
cursor        SDK · CLI           prompt-only
opencode      CLI                 prompt-only
grok          CLI                 prompt-only
antigravity   CLI                 prompt-only

+ fake — built-in reviewer for credential-free dev

Output

Readable for you. Parseable for your agent.

Only stdout carries the stable contracts. Pick the shape your consumer needs — the final artifact is identical whichever you choose.

--format markdown (default)

One rendered report after every reviewer finishes. Readable, not a parsing contract.

--format json (stable contract)

One final ReviewArtifact object. The authoritative, machine-stable result.

--format ndjson (event stream)

Newline-delimited, versioned review events as work progresses. Built for agents and CI.

--format ndjson
{"type":"run_started","target":{…},"reviewers":[…]}
{"type":"preflight_finished","reviewer_id":"pi","ok":true}
{"type":"reviewer_result","provisional":true,"artifact":{…}}
{"type":"final_result","artifact":{…}}
# exactly one terminal frame: final_result or error
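
A consumer of this stream might look like the sketch below. It only relies on what the sample shows: one JSON object per line, a `type` field, and exactly one terminal frame of type `final_result` or `error`. The `ReviewEvent` interface and both function names are hypothetical, and the contents of each event beyond `type` are left untyped because the page elides them.

```typescript
// Hypothetical event shape: only the "type" field is taken from the sample above.
interface ReviewEvent {
  type: string;
  [key: string]: unknown;
}

// Parse captured NDJSON output: one JSON object per non-empty line.
function parseNdjson(text: string): ReviewEvent[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ReviewEvent);
}

// Return the single terminal frame, or throw if the stream has none or several.
function terminalFrame(events: ReviewEvent[]): ReviewEvent {
  const terminal = events.filter((e) => e.type === "final_result" || e.type === "error");
  const [only] = terminal;
  if (only === undefined || terminal.length !== 1) {
    throw new Error(`expected exactly one terminal frame, got ${terminal.length}`);
  }
  return only;
}
```

Treating a missing or duplicated terminal frame as an error lets an agent distinguish a truncated stream from a completed run.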

Quickstart

Up and reviewing in a minute.

Install from the GitHub source release, link the binary, and your agent can call it from any repo.

  • Requires Node ≥ 22.19
  • Not on npm yet — v0.2.9 source release
  • diffwarden init writes a starter config
install from source
$ git clone https://github.com/aurokin/diffwarden.git
$ cd diffwarden && git checkout v0.2.9
$ pnpm install && pnpm build
$ pnpm link --global
$ diffwarden --version
give it to your agent
$ npx skills add aurokin/diffwarden --global --skill diffwarden
gate a pull request · .github/workflows
- name: diffwarden review
  run: >
    diffwarden --target base:${{ github.base_ref }}
    --reviewer-set ci --format json --fail-on-findings P2

Put a warden on your diffs.

One command your agent calls. Many reviewers, one reconciled result, read-only by default.