Appearance
Agent-team review — role-isolated development in ~10 minutes
Run one task through four isolated roles — implementer, tester, reviewer, lead — so review is adversarial instead of self-confirming. A single session that writes, tests, and reviews its own code green-lights its own mistakes; splitting the jobs across separate teammates means the producer never approves its own output.
This is the setup recipe for the /review-team preset, which sits on Claude Code's native Agent Teams feature. For the role contract and the lead playbook, read the preset; this page is the wiring.
When to use it
- A change important enough that a self-review isn't enough — security-sensitive code, a tricky algorithm, anything where a confident-but-wrong implementation is expensive.
- You want depth (N roles on one task), not breadth. For fanning one instruction across N independent branches, use
cdforkinstead.
Prerequisites
- A BoB-provisioned repo with
team-implementer,team-tester, andcode-reviewerprovisioned (cdprov add agents/team-implementer agents/team-tester agents/code-reviewer, or add them via/provision;code-revieweris a default agent so it's usually already present). - Claude Code with the experimental Agent Teams feature (it is off by default).
Setup (the ~10 minutes)
1. Enable the experimental feature (one-time, ~1 min)
jsonc
// ~/.claude/settings.json (or the project .claude/settings.json)
{ "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }Or, for a single session: claude --teammate-mode auto. Restart Claude Code so it picks up the env change.
2. Decide file isolation up front (~3 min)
The native feature does not auto-isolate teammate file writes — two teammates editing the same file overwrite each other. Pick one of:
Disjoint file ownership (simplest). Scope the implementer's task to a specific set of files; the tester only writes under
tests/(or the project's test dir); the reviewer writes nothing. For most single-task work this is enough — the implementer is the only writer of source.Worktree isolation (for risk or parallel writers). Run the implementer in its own git worktree so its writes can't touch your working tree until you merge. BoB already does this for parallel work — reuse the same mechanism:
bash# a throwaway worktree + branch for the implementer's changes git worktree add ../wt-review-team -b feature/review-team-task # point the team's implementation task at ../wt-review-team; review the diff, then: git worktree remove ../wt-review-team # when doneSee cdfork for the fuller worktree-per-agent pattern.
3. (Optional) Enforce stop-on-failure (~3 min)
The team-tester role is instructed to stop on failure, but you can make it a hard gate with a TaskCompleted hook: exit non-zero when the test command fails, and the task cannot be marked complete (the teammate gets the feedback and the loop returns to the implementer via the lead).
bash
# .claude/hooks/team-tester-gate.sh (wire as a TaskCompleted hook)
INPUT=$(cat)
# only gate the tester's tasks; let everything else through
echo "$INPUT" | jq -e '.task.assignee == "team-tester"' >/dev/null 2>&1 || exit 0
if ! npm test >/dev/null 2>&1; then
echo "Tests are red — task cannot complete until green." >&2
exit 2 # exit 2 blocks completion and sends feedback to the teammate
fi
exit 0(Swap npm test for the project's test command. The exit-2 convention is the native TaskCompleted block-and-feed-back mechanism.)
4. Run the team (~3 min to kick off)
From your main session (you are the lead), run /review-team <task> and follow the lead playbook: frame the task as a single spec, spawn the implementer, then spawn the tester and reviewer on the resulting diff, arbitrate, and decide merge-readiness only when the tester is green and the reviewer has no blocking finding.
Why the review is genuinely independent (context isolation)
Each teammate runs in its own context window, and the lead's conversation history does not carry over. So when the lead spawns the tester and reviewer, they load the project context (CLAUDE.md, skills) and the spawn prompt — but not the implementer's reasoning about why the code is correct. They judge the artifact, not the argument for it. Your only job as lead is to not defeat this: hand them the diff + spec, never paste in the implementer's rationale. That isolation is what makes the reviewer's "looks correct" mean something.
Worked example — the reviewer catches a bug the implementer missed
Task (spec from the lead): "Add applyDiscount(price, percent) that returns the price after a percentage discount. percent is 0–100. Reject invalid input."
Implementer (team-implementer) produces:
js
function applyDiscount(price, percent) {
if (percent < 0 || percent > 100) throw new Error("percent out of range");
return price - price * (percent / 100);
}The implementer's report to the lead: "Done — validates the 0–100 range and applies the discount." It reads correct, and the implementer believes its own logic.
Reviewer (code-reviewer, read-only) — reviewing the diff + spec, not the report — flags:
Bug —
priceis never validated. The spec says reject invalid input, but onlypercentis checked.applyDiscount(-50, 10)returns-45(a negative price), andapplyDiscount("10", 10)returns"10" - NaN→NaNsilently. Blocking: add apricevalidation (finite,>= 0) and a test for negative/non-numeric price. I can't edit — returning to the implementer via the lead.
Tester (team-tester), independently: adds applyDiscount(-50, 10) and applyDiscount("10", 10) cases — both fail against the current implementation, confirming the reviewer's finding with red tests.
Lead arbitrates: routes a fix task back to the implementer (not the reviewer, who can't edit; not the tester, who must not patch). The implementer adds the price guard; the tester re-runs green; the reviewer clears the block. Now the lead calls it merge-ready.
The single-session version would have shipped the negative-price bug, because the same agent that wrote the percent-only check also believed it was complete. Role isolation is what surfaced it.
Limitations
- The feature is experimental and may change; it is gated behind the env flag above.
- File isolation is your responsibility (step 2) — the feature won't prevent overwrites for you.
- This is heavier than a normal change. Use it where independent review pays for the extra coordination, not for routine edits.