kaijutsu

Open arts for AI agents

MIT-licensed registry of skills + a multi-model review CLI. Compose pipelines once; run them on Claude, Codex, Antigravity, DeepSeek — or any provider you add to agents.yaml.

curl -fsSL https://kaijutsu.dev/install.sh | sh

One agent says yes. Three say no.

Single-agent review is sycophancy by default. Same prompt, same model, run twice — you usually get the same "LGTM." Adversarial review needs structural disagreement, not rerolls.

Single-agent — Claude only
“LGTM. Clean rename, well-scoped. Tests look comprehensive.”
kaijutsu swarm — claude + antigravity + deepseek
deepseek (paranoid-security): validateUserPresetEntry drops Mode/Personas/ConfidenceFloor at parse. User configs silently lose fields.

 func validateUserPresetEntry(e UserPreset) (Preset, error) {
     return Preset{
         Name: e.Name,
         Base: e.Base,
-        // Mode/Personas/ConfidenceFloor dropped
+        Mode:            e.Mode,
+        Personas:        e.Personas,
+        ConfidenceFloor: e.ConfidenceFloor,
     }, nil
 }

Full disagreement-table run → commit 83476ef

kaijutsu fans the same diff across multiple models, shows where they agree, and flags exactly where they don't. The disagreement table is the artifact you read.
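To make the idea of a disagreement table concrete, here is a minimal sketch in Go. It assumes each agent reports findings keyed by a `file:line` location; the `Finding` and `Row` types and the `Disagreements` function are hypothetical names for illustration, not kaijutsu's actual schema or API.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical shape for one agent's report.
// The real kaijutsu schema is not shown on this page.
type Finding struct {
	Agent string
	Key   string // location of the finding, e.g. "auth.go:42"
}

// Row records which agents flagged a given location.
type Row struct {
	Key      string
	Flagged  map[string]bool
	Disputed bool // true when some, but not all, agents flagged it
}

// Disagreements groups findings by location and marks the rows
// where the agents split.
func Disagreements(agents []string, findings []Finding) []Row {
	byKey := map[string]map[string]bool{}
	for _, f := range findings {
		if byKey[f.Key] == nil {
			byKey[f.Key] = map[string]bool{}
		}
		byKey[f.Key][f.Agent] = true
	}
	rows := make([]Row, 0, len(byKey))
	for key, flagged := range byKey {
		rows = append(rows, Row{
			Key:      key,
			Flagged:  flagged,
			Disputed: len(flagged) < len(agents),
		})
	}
	// Sort for stable output; map iteration order is random in Go.
	sort.Slice(rows, func(i, j int) bool { return rows[i].Key < rows[j].Key })
	return rows
}

func main() {
	agents := []string{"claude", "antigravity", "deepseek"}
	rows := Disagreements(agents, []Finding{
		{"claude", "auth.go:42"},
		{"antigravity", "auth.go:42"},
		{"deepseek", "auth.go:42"},
		{"deepseek", "schema.sql:9"},
	})
	for _, r := range rows {
		fmt.Printf("%-14s disputed=%v flagged_by=%d\n", r.Key, r.Disputed, len(r.Flagged))
	}
}
```

In this sketch a row is "disputed" whenever fewer than all agents flagged it, which is the simplest possible rule; a real synthesizer would presumably also compare severity and wording.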

Quickstart — agent-first

kaijutsu is agent-first: jutsu is meant to be invoked BY your AI coding agent on your behalf. The first time, point your agent at kaijutsu.dev — after that, jutsu init writes a fragment into AGENTS.md / CLAUDE.md so future agent sessions already know what the tool is.

1. One-time setup (cold start)

Paste the kaijutsu URL to your coding agent (Claude Code / Codex / Antigravity CLI), or run the install yourself. Either way, jutsu init drops the persistent context fragment.

You say to your agent:
“Set up kaijutsu from https://kaijutsu.dev in this repo. Run jutsu init. Then install the pr-review skill. Tell me which API keys I still need to export.”

Agent runs:

curl -fsSL https://kaijutsu.dev/install.sh | sh
jutsu init                              # detects active agents, writes .kaijutsu/ + AGENTS.md fragment
jutsu install pr-review                 # add the pr-review skill
jutsu agent doctor                      # verify provider env vars + reachability

2. After setup

Future agent sessions read the AGENTS.md fragment and already know what jutsu does — no URL re-paste needed.

You say to your agent:
“Run a multi-agent pr-review on my current branch and post as a PR comment.”

Agent runs:

jutsu swarm pr-review --diff-from-branch main --post-comment

What you get back:

claude      │ ✓ │ ✗ │ ✓
antigravity │ ✓ │ ✓ │ ✗
deepseek    │ ✗ │ ✓ │ ✓

            severity   file:line                summary
            issue      auth.go:42               token expiry uses < not <=
            issue      handlers/user.go:118     missing rate-limit on /verify
            minor      schema.sql:9             index name shadows reserved word

Prefer to type it yourself? Every command above works in a normal terminal — jutsu doesn't care who's typing.

Full quickstart guide →

One config. Every vendor.

Write a swarm.yaml once. jutsu dispatches it across Claude Code, Codex CLI, Antigravity CLI, DeepSeek-via-HTTP, and whatever vendor ships next. v0.6's driver abstraction handles the wire format per vendor; v0.13's Preset SDK lets you compose your own pipelines. Vendor lock-in stays a vendor problem.

# .kaijutsu/swarm.yaml — composable, portable, version-controlled
version: 1
presets:
  - name: tight-pr-review
    base: pr-review
    personas:
      - default-claude
      - paranoid-security-claude
      - default-antigravity
    mode: full
    confidence_floor: 0.7

jutsu swarm tight-pr-review --diff-from-branch main

Why this matters

We expect vendor IDEs to ship native multi-agent review by 2027. Cross-vendor portability is the part that won't be commoditized. Your pipeline is yours, regardless of which vendor wins.

Cost per bug found

Multi-agent review costs more per call than a single-agent run. The framing that matters is cost per bug actually caught and confirmed real, not cost per call.

Receipt from a real run. Reproducible: same prompt + same providers + cold cache should land within ±20% (vendor pricing drifts; rate limits / context refresh affect timing).

Metric                       Value
Run                          doc-review on the v0.14 landing-page content spec
Personas                     claim-auditor-deepseek + architecture-purist-antigravity
Providers                    deepseek-chat (HTTP) + antigravity-2.5-pro (CLI)
Cache state                  cold (run-id 20260509T155131Z)
Total API cost               $0.05
Wall time                    4m31s
Findings                     10 (4 issue, 4 minor, 2 info)
Cost per real issue caught   ~$0.008 ($0.03 attributable / 4 issues)

Isn't this ~3x a single-agent run?
Yes — that's what you're buying. Three models with different priors disagree about different things. The cost difference is the disagreement. If your codebase doesn't need disagreement, run jutsu swarm pr-review with --personas default-claude only.
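The cost-per-issue figure in the receipt above is a plain division of attributable spend by confirmed issues. The sketch below reproduces that arithmetic; `CostPerIssue` is a hypothetical helper for illustration, not part of the jutsu CLI.

```go
package main

import "fmt"

// CostPerIssue divides the attributable spend by the number of confirmed
// issues. It returns false when there are no confirmed issues, because the
// ratio is undefined in that case.
func CostPerIssue(attributableUSD float64, confirmed int) (float64, bool) {
	if confirmed <= 0 {
		return 0, false
	}
	return attributableUSD / float64(confirmed), true
}

func main() {
	// Figures from the receipt: $0.03 attributable, 4 findings at "issue" severity.
	if c, ok := CostPerIssue(0.03, 4); ok {
		fmt.Printf("$%.4f per confirmed issue\n", c)
	}
}
```

With the receipt's numbers this gives $0.0075, which rounds to the ~$0.008 quoted.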

Trust model

kaijutsu's trust surface has five parts:

MIT-licensed, end to end

kaijutsu CLI, every core skill, the registry index — all MIT. No CLA, no rug-pull license clause, no "open core" tier. Fork it. The LICENSE file in every skill is part of the integrity check.

Skill provenance

Core skills (skills/core/*) are sigstore-signed at release. jutsu install verifies the signature before unpacking. Community skills are pinned to a tagged commit + SHA + sha256 integrity in kaijutsu.lock.json; tampering breaks the lockfile.
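The sha256 integrity pin can be pictured as a simple digest comparison. The following is a minimal sketch using Go's standard library; `VerifyIntegrity` is a hypothetical name, and the exact layout of kaijutsu.lock.json and the bytes that get hashed are assumptions not described here. It does not cover the sigstore signature check.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// VerifyIntegrity reports whether content hashes to the expected
// hex-encoded SHA-256 digest. The comparison ignores hex letter case.
func VerifyIntegrity(content []byte, wantHex string) bool {
	sum := sha256.Sum256(content)
	return strings.EqualFold(hex.EncodeToString(sum[:]), wantHex)
}

func main() {
	skill := []byte("example skill archive bytes")
	sum := sha256.Sum256(skill)
	pinned := hex.EncodeToString(sum[:]) // stands in for the digest stored in the lockfile

	// Copy before modifying so the original bytes stay untouched.
	tampered := append([]byte{}, skill...)
	tampered = append(tampered, '!')

	fmt.Println(VerifyIntegrity(skill, pinned))    // true
	fmt.Println(VerifyIntegrity(tampered, pinned)) // false
}
```

Any change to the content, even a single byte, produces a different digest, which is why a tampered skill no longer matches its lockfile entry.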

Permission gates + supply-chain

Skills can request bash, network, fs-write. jutsu install parses each skill's permissions: manifest before write and prompts on first sensitive use. Skills are arbitrary code that runs in your agent loop — pin to SHA, audit the diff, do not auto-update.

Local data stays local

~/.kaijutsu/findings.db is mode 0600, never leaves disk, never syncs. The code in cli/internal/cli/finding.go is barred from importing net, net/http, or net/url, and a build-time test enforces the restriction.
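One common way to enforce an import restriction like this is to parse the file's import block with Go's `go/parser` package and fail if a banned path appears. The sketch below shows that technique under stated assumptions: `ForbiddenImports` is a hypothetical helper, and the source is passed as a string rather than read from disk. It is not kaijutsu's actual test.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// forbidden lists the import paths named in the text above.
var forbidden = map[string]bool{"net": true, "net/http": true, "net/url": true}

// ForbiddenImports parses only the import declarations of src and returns
// any banned paths it finds. filename is used for error positions only.
func ForbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		// Path.Value is a quoted string literal such as "\"net/http\"".
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if forbidden[path] {
			bad = append(bad, path)
		}
	}
	return bad, nil
}

func main() {
	src := "package cli\n\nimport (\n\t\"database/sql\"\n\t\"net/http\"\n)\n"
	bad, err := ForbiddenImports("finding.go", src)
	fmt.Println(bad, err) // [net/http] <nil>
}
```

In a repository this check would typically live in a `_test.go` file and run under `go test`, so a forbidden import fails CI. Note that it only inspects direct imports; transitive dependencies would need a separate check.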

Receipts, not rhetoric

Every kaijutsu release links to the actual jutsu swarm pr-review that reviewed it. Self-review is necessary, not sufficient — but it's a falsifiable receipt, not a marketing badge.

Isn't this just a wrapper around OpenAI/Anthropic APIs?

The honest answer: it has wrapper parts and non-wrapper parts.

Wrapper:
Per-vendor driver shims (cli/http/cli-compat/mcp). Intentionally thin — vendor APIs change, and we want the change surface small.
Not wrapper:
  • Cross-vendor portability layer — same swarm.yaml runs across vendors with different wire formats and auth.
  • findings.db quality fingerprinting — per-(provider, persona, preset, codebase) precision math derived from your accept/dismiss history. The synthesizer weights each agent's vote by observed precision. No vendor offers this.
  • Preset SDK — compose your own swarm shapes; project + home overlay; built-ins shadowed at registration.
  • Dream lens-blindspot warnings — jutsu swarm dream --mode full --lenses all runs the 8-lens matrix and flags where lens consensus may be RLHF-convergence rather than actual signal.
  • agents.yaml driver abstraction — provider catalog with overlay + override semantics; vendor switch is a config edit.
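As an illustration of weighting votes by observed precision, here is a small sketch. The page does not give kaijutsu's formula, so everything here is an assumption: the `History` type, the add-one smoothing, and the normalization in `WeightedScore` are illustrative choices only.

```go
package main

import "fmt"

// History is a hypothetical per-agent tally of accepted and dismissed findings.
type History struct {
	Accepted  int
	Dismissed int
}

// Precision returns accepted / (accepted + dismissed) with add-one (Laplace)
// smoothing, so an agent with no history starts at 0.5 instead of dividing by zero.
func (h History) Precision() float64 {
	return float64(h.Accepted+1) / float64(h.Accepted+h.Dismissed+2)
}

// WeightedScore sums the precision of every agent that flagged a finding and
// divides by the total precision across all known agents, giving a value in [0, 1].
func WeightedScore(flaggedBy []string, hist map[string]History) float64 {
	var total, flagged float64
	for _, h := range hist {
		total += h.Precision()
	}
	if total == 0 {
		return 0 // no known agents
	}
	for _, agent := range flaggedBy {
		if h, ok := hist[agent]; ok {
			flagged += h.Precision()
		}
	}
	return flagged / total
}

func main() {
	hist := map[string]History{
		"claude":   {Accepted: 9, Dismissed: 1},
		"deepseek": {Accepted: 1, Dismissed: 9},
	}
	// A finding flagged only by the historically precise agent scores high.
	fmt.Printf("%.2f\n", WeightedScore([]string{"claude"}, hist))
	// The same finding flagged only by the noisier agent scores low.
	fmt.Printf("%.2f\n", WeightedScore([]string{"deepseek"}, hist))
}
```

The point of the sketch is that one vote from an agent whose findings you usually accept can outweigh a vote from an agent whose findings you usually dismiss.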

Skill catalog

Browseable list of every skill kaijutsu ships, plus vendored third-party catalogs. Filter by source, search by name or tag.

Showing all 47 skills.

api-and-interface-design

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install api-and-interface-design`.

claude · codex · antigravity
jutsu install api-and-interface-design

autopilot 2.0.0

CORE

Intent-to-PR pipeline. Multi-agent adversarial review at every artifact stage; cost-capped; reverse-drift gate. Replaces the v1 autopilot skill.

claude · codex · antigravity
jutsu install autopilot

blunder-hunt 0.2.0

CORE

Multi-pass adversarial critique. Run a skeptic review N times in fresh contexts (optionally in parallel via dispatch-parallel); deduplicate; surface only issues that survive.

claude · codex · antigravity
jutsu install blunder-hunt

brainstorm 0.1.0

CORE

Multi-agent ideation against a free-form prompt. Runs claude / codex / antigravity in parallel via `jutsu swarm brainstorm "<prompt>"`, each with a different angle (claude=long-horizon framing, codex=code-pattern grounding, antigravity=cross-domain analogy), then synthesizes into a ranked list of approaches with tradeoffs. Designed to feed into `decide` for ADR capture or into `refactor-plan` for execution. Use when the user says "brainstorm", "ideas for", "how should we approach", "options for", or invokes /brainstorm.

claude · codex · antigravity
jutsu install brainstorm

browser-testing-with-devtools

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install browser-testing-with-devtools`.

claude · codex · antigravity
jutsu install browser-testing-with-devtools

ci-cd-and-automation

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install ci-cd-and-automation`.

claude · codex · antigravity
jutsu install ci-cd-and-automation

code-review-and-quality

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install code-review-and-quality`.

claude · codex · antigravity
jutsu install code-review-and-quality

code-simplification 0.1.0

COMMUNITY

Reduce code without changing behavior. Apply Chesterton's Fence + Rule of 500. Find dead code, duplicate logic, unused abstractions, leaky helpers, premature factoring. Composes polish (review-fix loop) and blunder-hunt (regression risk).

claude · codex · antigravity
jutsu install code-simplification

context-engineering

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install context-engineering`.

claude · codex · antigravity
jutsu install context-engineering

convergence-detect 0.2.0

CORE

Detect when an iterative loop has stopped producing new signal. Quantitative scoring (token deltas, jaccard similarity, new-item ratio) plus optional persistence of round-by-round signals to .claude/convergence-log/ for post-hoc tuning.

claude · codex · antigravity
jutsu install convergence-detect
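Two of the signals named above, Jaccard similarity and new-item ratio, are easy to state in code. The sketch below compares the items found in two consecutive rounds; the function names and the thresholds passed to `Converged` are hypothetical, and the skill's real scoring is not shown on this page.

```go
package main

import "fmt"

// toSet collapses a slice into a set of distinct items.
func toSet(items []string) map[string]bool {
	s := make(map[string]bool, len(items))
	for _, it := range items {
		s[it] = true
	}
	return s
}

// Jaccard returns |A ∩ B| / |A ∪ B|. Two empty sets count as identical (1.0).
func Jaccard(a, b []string) float64 {
	sa, sb := toSet(a), toSet(b)
	if len(sa) == 0 && len(sb) == 0 {
		return 1
	}
	inter := 0
	for k := range sa {
		if sb[k] {
			inter++
		}
	}
	union := len(sa) + len(sb) - inter
	return float64(inter) / float64(union)
}

// NewItemRatio returns the share of distinct items in curr that were absent from prev.
func NewItemRatio(prev, curr []string) float64 {
	sp, sc := toSet(prev), toSet(curr)
	if len(sc) == 0 {
		return 0
	}
	fresh := 0
	for k := range sc {
		if !sp[k] {
			fresh++
		}
	}
	return float64(fresh) / float64(len(sc))
}

// Converged applies two illustrative thresholds: rounds must overlap heavily
// and the latest round must add few new items.
func Converged(prev, curr []string, minJaccard, maxNew float64) bool {
	return Jaccard(prev, curr) >= minJaccard && NewItemRatio(prev, curr) <= maxNew
}

func main() {
	round1 := []string{"a", "b", "c"}
	round2 := []string{"a", "b", "c", "d"}
	fmt.Println(Jaccard(round1, round2))                // 0.75
	fmt.Println(NewItemRatio(round1, round2))           // 0.25
	fmt.Println(Converged(round1, round2, 0.9, 0.1))    // false
}
```

A loop would stop once `Converged` returns true for consecutive rounds; the right thresholds depend on the task and would need tuning.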

dcg 0.1.0

COMMUNITY

Destructive command guard. Pre-tool-use hook that blocks rm -rf /, git reset --hard, git clean -fd, rm of .env / credentials, and other unrecoverable shell commands before the agent executes them.

claude · codex · antigravity HOOKS
jutsu install dcg

debugging-and-error-recovery

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install debugging-and-error-recovery`.

claude · codex · antigravity
jutsu install debugging-and-error-recovery

deprecation-and-migration

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install deprecation-and-migration`.

claude · codex · antigravity
jutsu install deprecation-and-migration

deslop 0.2.0

CORE

Strip AI-generated tells from a draft. Two modes: --strip edits in place; --detect reports flagged passages without editing. Reads project style guides from CLAUDE.md/AGENTS.md for project-specific term preservation.

claude · codex · antigravity
jutsu install deslop

dispatch-parallel 0.2.0

CORE

Fan out N independent tasks to parallel subagents, collect, synthesize. Primitive for skills that need parallelism (multi-lens review, multi-file edits, multi-source data gathering).

claude · codex · antigravity
jutsu install dispatch-parallel

doc-review 0.1.0

CORE

Universal QA gate for markdown artifacts. Runs claude / codex / antigravity in parallel via `jutsu swarm doc-review <path>`, each with a tailored lens (claude=completeness, codex=implementability, antigravity=consistency), then synthesizes into a single review with a disagreement table. Designed to be invoked at the end of any artifact-producing skill (spec-driven-development, planning-and-task-breakdown). Use when the user says "review this spec", "review this plan", "check this RFC", or invokes /doc-review.

claude · codex · antigravity
jutsu install doc-review

documentation-and-adrs

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install documentation-and-adrs`.

claude · codex · antigravity
jutsu install documentation-and-adrs

dream 0.1.0

CORE

Pre-implementation interrogation primitive. Walks any topic through 8 cognitive lenses (4 base: honest/fit/gaps/wild + 4 opt-in extras: adversary/inverse/status-quo/time). Anti-sycophancy gates baked into prompts. Standalone /dream OR multi-agent jutsu swarm dream (matrix of agents × lenses).

claude · codex · antigravity
jutsu install dream

editorial-review 0.1.0

COMMUNITY

Multi-pass editorial review for long-form essays + articles. Structural / line-level / AI-tells / resonance passes. Voice-configurable (literary-intelligent-general | terse-technical | conversational | journalistic). Composes jutsu swarm doc-review for multi-agent prose review on top.

claude · codex · antigravity
jutsu install editorial-review

executing-plans

THIRD-PARTY

Third-party skill from github.com/obra/superpowers; install via `jutsu install executing-plans`.

claude · codex · antigravity
jutsu install executing-plans

frontend-design

THIRD-PARTY

Third-party skill from github.com/anthropics/skills; install via `jutsu install frontend-design`.

claude · codex · antigravity
jutsu install frontend-design

frontend-ui-engineering

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install frontend-ui-engineering`.

claude · codex · antigravity
jutsu install frontend-ui-engineering

git-workflow-and-versioning

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install git-workflow-and-versioning`.

claude · codex · antigravity
jutsu install git-workflow-and-versioning

idea-refine

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install idea-refine`.

claude · codex · antigravity
jutsu install idea-refine

incremental-implementation 0.2.0

CORE

Implement features as thin vertical slices behind feature flags, each shippable on green CI. Per-slice two-stage review chain: fresh-context spec-compliance subagent, then code-quality via polish (with verification-before-completion gate). Continuous execution between independent slices — no pause-to-check-in. Composes scope-check + polish.

claude · codex · antigravity
jutsu install incremental-implementation

journal 0.2.0

COMMUNITY

Periodic development journal — read git/PRs/decisions/memory; produce a structured narrative; optional multi-model synthesis for the narrative voice. Themed cadences (weekly/sprint/release/monthly) and sentiment+velocity tracking.

claude · codex · antigravity
jutsu install journal

mcp-builder

THIRD-PARTY

Third-party skill from github.com/anthropics/skills; install via `jutsu install mcp-builder`.

claude · codex · antigravity
jutsu install mcp-builder

performance-optimization

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install performance-optimization`.

claude · codex · antigravity
jutsu install performance-optimization

planning-and-task-breakdown 0.3.0

CORE

Decompose a spec into a verifiable task list — header (goal + architecture + tech-stack), file-structure-up-front, vertical slices with explicit dependencies and acceptance criteria, AND bite-sized 2-5-minute checkbox steps inside each slice for direct subagent-driven execution. Final review via jutsu swarm doc-review; cost-tier classification via scope-check.

claude · codex · antigravity
jutsu install planning-and-task-breakdown

polish 0.3.0

CORE

Automated post-implementation review-and-fix loop. Up to 4 passes; each pass alternates random-code-exploration and cross-agent-review lenses; convergence-detect declares the stop; verification-before-completion runs a fresh test/build gate AFTER convergence; deslop runs at the end on any user-facing prose.

claude · codex · antigravity
jutsu install polish

pr-review 1.0.1

CORE

Adversarial multi-agent PR review. Runs claude / codex / antigravity in parallel via `jutsu swarm pr-review`, each with a tailored prompt, then synthesizes into a single markdown review with a disagreement table. --full mode adds a Pass-2 round-robin debate. Posts the result as a PR comment when --post-comment is set.

claude · codex · antigravity
jutsu install pr-review

readme-update 0.2.0

COMMUNITY

Detect README staleness, identify out-of-date sections, propose targeted diffs. Semantic section matching (Quickstart ≈ Get Started); works on any markdown doc (CONTRIBUTING, ARCHITECTURE, CHANGELOG); auto-generates usage examples from new CLI commands; checks badge staleness.

claude · codex · antigravity
jutsu install readme-update

refactor-plan 0.1.0

CORE

Multi-agent refactor planning. Runs claude / codex / antigravity in parallel via `jutsu swarm refactor-plan <path>... --goal "<goal>"`. Each agent contributes from a different angle (claude=architectural decomposition, codex=stepwise risk, antigravity=pattern consistency); synthesizer assembles into an ordered step plan with risk per step. Designed to feed into `planning-and-task-breakdown` for execution. Use when the user says "plan this refactor", "how do I restructure", "step plan to extract X", or invokes /refactor-plan.

claude · codex · antigravity
jutsu install refactor-plan

scope-check 0.2.0

CORE

Mid-task scope mirror with cost-tier classification (plan/bead/code = 1x/5x/25x). Surfaces when work has drifted from cheap planning into expensive code-space rework, and identifies natural split-points where a PR could become two.

claude · codex · antigravity
jutsu install scope-check

security-and-hardening 0.1.0

COMMUNITY

OWASP-Top-10-aware security review. Sweep the diff (or a target file set) for injection, auth gaps, deserialization risks, broken access control, sensitive data exposure, security misconfig, and supply-chain trust. Composes blunder-hunt's hostile-input lens.

claude · codex · antigravity
jutsu install security-and-hardening

security-audit 0.1.0

CORE

Multi-agent security audit. Runs claude / codex / antigravity in parallel via `jutsu swarm security-audit --pr <n>` (PR mode) or `jutsu swarm security-audit <path>...` (files mode). Three lenses: claude=auth+data flow, codex=injection+priv-esc, antigravity=dep+supply-chain. CVSS-aligned severity (critical|high|medium|low|informational). --full mode defaults ON. Output: threat model + prioritized recs. Use when the user says "audit this for security", "threat model", "check for vulns", "review the auth flow", or invokes /security-audit.

claude · codex · antigravity
jutsu install security-audit

session-retro 0.3.0

COMMUNITY

End-of-session retrospective. Mines the current conversation for learnings; cross-session pattern detection across recent retros; writes to project-memory; optional auto-apply mode.

claude · codex · antigravity
jutsu install session-retro

shipping-and-launch

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install shipping-and-launch`.

claude · codex · antigravity
jutsu install shipping-and-launch

skill-creator

THIRD-PARTY

Third-party skill from github.com/anthropics/skills; install via `jutsu install skill-creator`.

claude · codex · antigravity
jutsu install skill-creator

source-driven-development

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install source-driven-development`.

claude · codex · antigravity
jutsu install source-driven-development

spec-driven-development 0.2.0

CORE

Author a PRD-style spec before writing implementation. Produce a written, reviewable artifact (problem statement, scope, non-goals, acceptance criteria, open questions) that survives the agent session and gets reviewed by `jutsu swarm doc-review` before unblocking implementation.

claude · codex · antigravity
jutsu install spec-driven-development

subagent-driven-development

THIRD-PARTY

Third-party skill from github.com/obra/superpowers; install via `jutsu install subagent-driven-development`.

claude · codex · antigravity
jutsu install subagent-driven-development

systematic-debugging

THIRD-PARTY

Third-party skill from github.com/obra/superpowers; install via `jutsu install systematic-debugging`.

claude · codex · antigravity
jutsu install systematic-debugging

test-driven-development

THIRD-PARTY

Third-party skill from github.com/addyosmani/agent-skills; install via `jutsu install test-driven-development`.

claude · codex · antigravity
jutsu install test-driven-development

unstuck 0.3.0

CORE

Guided problem articulation when you're stuck — agent-side evidence gathering BEFORE the human-side articulation questions, then structured pivots, blunder-hunt escalation, and walk-away if needed.

claude · codex · antigravity
jutsu install unstuck

using-git-worktrees

THIRD-PARTY

Third-party skill from github.com/obra/superpowers; install via `jutsu install using-git-worktrees`.

claude · codex · antigravity
jutsu install using-git-worktrees

verification-before-completion 0.1.0

CORE

Hard gate before claiming work is complete: run the verification command fresh, capture output, only then claim done. Composed by polish, incremental-implementation, and pr-review to prevent silent-fail tail (convergence ≠ correctness). Adapted from obra/superpowers under MIT.

claude · codex · antigravity
jutsu install verification-before-completion

Contributing

Skills + tools live at github.com/momentmaker/kaijutsu. Read CONTRIBUTING.md for skill authoring + PR conventions.

Want to add a skill? jutsu skill new <name> scaffolds a flat-layout skill directory. jutsu lint <path> validates schema + license + permission manifest. Run jutsu publish to open a PR against skills/community/.