Parallel Agent Orchestration — Research & Decision Path
Status: Research parked, resume after vacation
Written: 2026-04-10
Decision owner: Paul
Context: Considering whether to adopt Conductor.build, a similar tool, or extend BoB/warp-drive with parallel-agent capabilities.
⚠️ Staleness warning: This space moves weekly. Before acting on anything below, re-check each tool’s current state. Claude Code’s native Agent Teams in particular may have shipped stable — if so, most of this document becomes moot.
TL;DR decision
🔑 Critical constraint discovered after the original draft: Paul’s actual workflow is `ssh farm-01` → tmux → `git pull` → warp-drive. Code lives on the remote VM, not the Mac. This eliminates Conductor and Crystal entirely (both are local desktop apps that can’t see remote code) and reshapes the decision around remote-first orchestrators.
- Skip Conductor and Crystal. Wrong shape — they operate on local Mac worktrees and can’t see code on farm-01.
- Build `cdfork` regardless of what Anthropic ships. Nanawall alone justifies it — the paying-client project has 18 independent custom modules, ongoing mechanical cleanup sweeps, and a headless migration that screams contract-driven parallel work. This is no longer speculative.
- Check Anthropic’s Agent Teams in parallel — if it’s stable and in-process, use it in addition to `cdfork` for in-repo work that doesn’t need worktree isolation. They’re complementary, not competing: `cdfork` for worktree fan-out, Agent Teams for intra-session delegation.
- Steal the autonomous CI-fix + merge-conflict idea from Composio regardless — it’s valuable for single-track warp-drive too.
Guiding principle: The orchestrator must run where the code lives. For Paul that’s a remote Linux VM accessed via SSH/tmux, which means desktop apps are categorically wrong and CLI/in-process tools are categorically right.
The problem being solved
Conductor-class tools supervise N parallel Claude Code / Codex agents, each in an isolated git worktree, with a dashboard for glanceable monitoring and diff-based review/merge. The distinctive value is fan-out + human referee.
BoB/warp-drive is orthogonal: a declarative harness + single-track autonomous loop driven by GitHub Issues. Depth within one agent’s lifecycle, not breadth across agents.
The two are complementary, not competing. Warp-drive can plausibly run inside a Conductor workspace unchanged.
What Conductor has that BoB doesn’t
- Parallel fan-out across agents/models
- Native macOS GUI with live multi-session dashboard
- Transparent worktree-per-task management
- Multi-model (Claude + Codex side-by-side on the same task)
- Visual diff comparison and “pick a winner” review UX
- Zero-config onboarding
What BoB has that Conductor doesn’t
- Declarative provisioning (manifests, registry, symlink distribution)
- GitHub-Issue-driven work discovery (vision → capability → requirement)
- Full autonomy loop (discover → code → test → commit → report → PR)
- Dev environment IaC (`dev.json`, seeded test users, health checks)
- Hooks system (deterministic guardrails)
- Shared skills, commands, agents, memory
- RDB / remote control
- PM scaffolding (retros, standups, journals, temporal guardrails)
The actual gap to close
Three features, in priority order:
- Worktree-per-task orchestration — warp-drive is already worktree-aware (see “Worktree status” below); what’s missing is the outer orchestrator that spawns N worktrees and launches a warp-drive session in each
- Multi-agent dashboard in `cds` showing N live sessions
- Bake-off mode — run the same requirement through two models, diff, pick (see the sketch below)
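The review half of a bake-off needs nothing beyond git. A minimal sketch, with hypothetical branch names for the two competing agents:

```bash
# Hypothetical bake-off review: two agents worked the same requirement on
# separate branches. Compare both against main, then keep the winner.
git diff --stat main...bake/agent-a   # change-set size for agent A
git diff --stat main...bake/agent-b   # change-set size for agent B
git diff bake/agent-a bake/agent-b    # direct head-to-head diff

# From main: fast-forward onto the winner, drop the loser.
git merge --ff-only bake/agent-a
git branch -D bake/agent-b
```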
Worktree status in warp-drive (as of 2026-04-10)
Verified against commands/warp-drive.md:41-51 and :178-182:
- Warp-drive detects if it’s running inside a worktree (`git rev-parse --git-dir` vs `--git-common-dir`)
- Warns the user, records `"worktree": true` in the state file, and pushes to remote after every commit as a safety net against worktree-cleanup data loss
- Disaster recovery CLI can cherry-pick commit hashes from remote if a worktree is lost
- Warp-drive does not create worktrees itself — it assumes you (or `/start-work`) set that up
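The detection trick is standard git behavior. A minimal sketch of the same check, assuming nothing beyond git itself:

```bash
# In a linked worktree, --git-dir resolves to .git/worktrees/<name> while
# --git-common-dir resolves to the main repo's .git; in the primary
# checkout both resolve to the same path.
git_dir=$(git rev-parse --git-dir)
common_dir=$(git rev-parse --git-common-dir)
if [ "$git_dir" != "$common_dir" ]; then
  echo "running inside a worktree"
fi
```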
Implication: building `cdfork` (spawn worktrees + launch warp-drive in each) is closer to a ~50-line bash script than a deep refactor. The safety rails already exist.
Subagents ≠ parallelism
When you observe “multiple subagents” during a warp-drive run, those are Claude Code’s normal intra-chunk delegation via the Agent tool (Explore, Plan, code-reviewer, etc.) — not parallel work on multiple requirements. Same behavior as any regular Claude Code session. Parallelism across requirements is still the unsolved piece.
Landscape as of 2026-04-10
Tier 1 — GUI supervisors (Conductor-alikes)
| Tool | Shape | Why consider |
|---|---|---|
| Conductor.build | Native macOS, closed source | Most polished; used by Linear/Vercel/Notion |
| Crystal | Electron, open source | Forkable, same model as Conductor — was the “start here” pick before the remote-workflow constraint ruled out desktop apps |
Tier 2 — CLI/PR orchestrators
| Tool | Shape | Why consider |
|---|---|---|
| Composio Agent Orchestrator | CLI | Closest in ambition to warp-drive; handles CI fixes + merge conflicts autonomously |
| code-conductor (ryanmac) | CLI | Hackable, GitHub-native |
Tier 3 — In-process agent frameworks
| Tool | Shape | Why consider |
|---|---|---|
| Claude Code Agent Teams | Anthropic built-in | The one to watch — if stable, obsoletes most third-party orchestrators |
| Ruflo (ruvnet) | Framework | Swarm-style, heavy |
| wshobson/agents | Collection | Specialist agent patterns |
| barkain/claude-code-workflow-orchestration | Plugin | Task decomposition + plan mode integration |
| oh-my-claudecode | Layer | Claims 3-5x speedup, 30-50% token savings |
Action plan when you resume
Phase 0 — Re-baseline (30 min)
- Re-check Anthropic’s Agent Teams status: stable? recommended patterns? Does it use worktrees?
- Check if Conductor, Crystal, and Composio have shipped major changes
- Skim 1–2 recent blog posts on the multi-agent landscape for anything new
Phase 1 — Pick the orchestrator shape (~1 hour)
Decision tree, in order:
1. Is Anthropic Agent Teams stable?
   - Check code.claude.com/docs/en/agent-teams for status
   - If yes → that’s the answer. Skip to Phase 2A.
   - If no → continue
2. Build `cdfork` v0. ~50 lines of bash on farm-01:
   - `git worktree add` per branch
   - `tmux new-window` per worktree, launching warp-drive
   - Print session/window summary
   - No state file, no daemon — tmux is the state
Phase 2 — Use it on real work (1 week)
A. If using Agent Teams:
- Pick a capability with 2–3 approved requirements
- Configure a team-lead session that delegates each requirement to a teammate
- Note: where does Agent Teams put each teammate’s filesystem? If it’s an in-process context split (no worktrees), tests and builds may collide. Find out and document.
B. If using cdfork:
- Start with nanawall. Pick a cluster of 3–4 open issues across different custom modules: `cdfork fix/a feat/b cleanup/c`
- Walk between tmux windows to spot-check
- Note frictions: cleanup ergonomics, branch naming, tmux navigation, warp-drive cross-session conflicts
- Then try `cdfork-pair` on one CAP-XX headless migration chunk (one Drupal + one SvelteKit worktree, contract-driven)
Phase 3 — Decide what (if anything) to formalize
Three outcomes:
A. Parallelism sticks with Agent Teams → wire warp-drive’s GitHub-Issue work-discovery into the team-lead prompt. Warp-drive becomes the “team lead” that picks N requirements and dispatches them. This is the highest-leverage outcome — minimum new code, maximum integration with existing PM hierarchy.
B. Parallelism sticks with cdfork → harden the script:
- Auto-pick branch names from open GitHub Issues (filter out `serial-only`-labeled ones)
- Add `cdfork status` (just `tmux ls` + `git worktree list`, formatted — see the sketch after this list)
- Add `cdfork merge <branch>` and `cdfork drop <branch>` for cleanup
- Add `cdfork-pair` for cross-repo contract-driven work (headless migration pattern)
- Pinch-point awareness: honor the `serial-only` label for config/schema/dep/deploy issues; refuse to fan them out
- Optional: extend the `cds` dashboard to show tmux windows for the project
- Target: still <300 lines total, still bash
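The status subcommand really is just formatting over existing state. A sketch under those assumptions (the function name is hypothetical; `cdfork` itself doesn’t exist yet):

```bash
# Hypothetical `cdfork status`: combine the two existing sources of truth.
cdfork_status() {
  echo "== tmux windows =="
  tmux list-windows -a -F '#{session_name}:#{window_name} (#{pane_current_path})' \
    2>/dev/null || echo "(no tmux server running)"
  echo "== git worktrees =="
  git worktree list
}
```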
C. Parallelism doesn’t stick → keep single-track warp-drive. Steal Composio’s autonomous CI-fix + merge-conflict handling idea regardless — valuable for single-track too. Document why parallelism didn’t fit (likely: context-switching cost > throughput gain for a solo dev).
Do not build
- Native GUI of any kind (Conductor/Crystal already exist if you ever decide you want one — and they don’t fit the workflow anyway)
- PTY streaming, visual diff UX, multi-model subscription plumbing
- Per-suffix Farm sandboxes (no longer blocking — sandbox abstraction isn’t part of the daily inner loop)
- A daemon, a state file, or a database for `cdfork`. Tmux + git are the state.
Nanawall — the anchor use case
Nanawall alone justifies building cdfork. It’s the paying-client project and the single highest-leverage target for parallelism in the entire portfolio. Everything else is a bonus.
Structural facts (verified 2026-04-10)
Two repos at `~/Sites/nanawall/`:
- `nanawalld8/` — Drupal 11.3.2 on Acquia Cloud (not D8; the directory name is legacy). Web root `docroot/`.
- `nanawall-web/` — SvelteKit headless frontend under active construction. Will consume Drupal as JSON:API. Migration in progress, not yet replacing the Drupal-rendered site.
Inside nanawalld8:
- 18 custom modules: nanaresources, nanagallery_order, nanaimage, nanamedia, nanatwig, nanafunctions, nanaheaders, nanaview, repfinder, sentry_fullstory, wistiamods, bytes_format, custom_error_handler, imagesource, inline_to_media, mylocation, nanadrush, r2drupal, remove_empty_paragraphs
- 2 custom themes: nanaclaro, nanawall22
- Full test stack: PHPUnit + Jest + Playwright e2e
- Acquia pipeline: local → dev → test → prod
- Active sweeps: `cleanup-progress.md`, `PARAGRAPH-CLEANUP-MANIFEST.md`
Structural prerequisites for parallelism already exist
Most projects that could benefit from parallelism don’t have the scaffolding to make it safe. Nanawall does:
- Capability/requirement decomposition (CAP-01..11 + RSF system) — pre-sized, traceable chunks
- Branch/issue decomposition habit — `/start-work` produces clean, pre-sized branches ready to spawn into worktrees. Note: worktrees themselves are NOT yet part of the nanawall workflow (`/start-work` does not support them, despite earlier doc drift suggesting otherwise; corrected 2026-04-11). `cdfork` will introduce the worktree pattern to nanawall — fine, since git handles it cleanly and there’s no existing convention to preserve.
- Multi-suite test coverage — the safety net that makes “set 4 agents loose” tolerable
- Acquia staging gate — pre-prod verification catches anything the agents miss
- Journal + decision tracking — paper trail for every agent’s output
Where parallelism wins on nanawall
- Module work fan-out — most weeks have 3–4 independent tickets across different custom modules. Textbook fan-out target.
- Cleanup sweeps — `PARAGRAPH-CLEANUP-MANIFEST.md` is almost certainly a list of N mechanical tasks. One agent per entry is the highest-leverage use of multi-agent in the whole portfolio.
- Headless migration — the biggest long-term win. Contract-driven cross-repo pattern:
  - Define the JSON:API contract up front
  - Agent A extends the Drupal side (endpoint, fields, permissions, tests)
  - Agent B builds the SvelteKit consumer (against a mock matching the contract)
  - Integrate at the end with real endpoint + e2e
  - Mirrors how a human backend/frontend team would split the work
  - 11 capabilities × 2 agents per cap, contract-isolated
- Test farm — when PHP, JS, and e2e are all red for unrelated reasons, debug all three in parallel.
Where parallelism breaks — Drupal pinch points
Nanawall has single-source-of-truth pinch points that cannot be parallelized:
| Pinch point | Why |
|---|---|
| `config/sync/` | Drupal config import/export. Two agents = merge hell. |
| Database updates / `update.php` | Sequential by definition. |
| `composer.json` / `composer.lock` | Dep changes serialize (lockfile conflicts). |
| Acquia deploy to staging/prod | Single pipeline. |
| Compiled theme assets | Build artifacts collide unless isolated per worktree. |
Rule: parallelism works for module code, frontend code, tests, cleanup. Stop at config sync, schema, deps, deploy. Any orchestrator needs a “serialize this” gate for those four.
Two new features nanawall adds to the build list
`cdfork-pair` — contract-driven cross-repo mode for the headless migration:

```bash
cdfork-pair \
  --contract docs/contracts/cap-04-product-listing.md \
  --backend ../nanawalld8 \
  --frontend ../nanawall-web \
  --branch feat/cap-04-product-listing
```

Spawns two worktrees (one per repo) on the same branch name, both warp-drive sessions starting from the same contract doc. They build their respective sides, integrate at the end.
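A rough sketch of what `cdfork-pair` could do internally. How warp-drive would actually receive the contract path is an open design question, so the `WARP_CONTRACT` environment variable below is a placeholder assumption, as are the hard-coded paths:

```bash
# Hypothetical cdfork-pair core: same branch in two repos, one tmux window
# per side, both sessions pointed at the same contract doc.
# Run from ~/Sites/nanawall/.
contract="$PWD/docs/contracts/cap-04-product-listing.md"   # illustrative
branch="feat/cap-04-product-listing"
safe=${branch//\//-}                                       # feat/x -> feat-x

for repo in "$PWD/nanawalld8" "$PWD/nanawall-web"; do
  name=$(basename "$repo")
  worktree="$PWD/${name}-${safe}"
  git -C "$repo" worktree add "$worktree" -b "$branch"
  # WARP_CONTRACT is a placeholder; warp-drive's real input mechanism is TBD.
  tmux new-window -n "$name" "cd '$worktree' && WARP_CONTRACT='$contract' warp-drive"
done
```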
Pinch-point awareness — orchestrator needs to know which issues cannot fan out:
- Tag GitHub Issues with a `serial-only` label for config/schema/dep/deploy work
- `cdfork` refuses to fan out `serial-only` issues and runs them sequentially
- Easiest implementation: filter the issue list before worktree spawn (see the sketch below)
- For nanawall specifically: any issue touching `config/sync/`, `composer.json`, `update.php`, or theme SCSS builds should carry the label
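The filter step is close to a one-liner, assuming the `gh` CLI and `jq` are available on farm-01:

```bash
# List open issues that are safe to fan out, i.e. not labeled serial-only.
gh issue list --state open --json number,title,labels \
  | jq -r '.[]
           | select((.labels | map(.name) | index("serial-only")) | not)
           | "#\(.number)\t\(.title)"'
```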
Concrete cdfork invocation for a typical nanawall week
```bash
# On farm-01, in nanawall/nanawalld8:
cdfork \
  fix/nanaresources-rendering \
  feat/nanaimage-cropping \
  cleanup/nanafunctions-unused \
  fix/sentry-fullstory-init
```

Four worktrees, four tmux windows, four warp-drive sessions. Paul walks between windows, spot-checks, intervenes when needed. Each agent finds its matching GitHub Issue, runs the loop, commits, pushes.
The actual workflow (corrected)
Paul does not edit code on his Mac. Day-to-day:
```bash
ssh farm-01    # Hetzner VM
tmux a         # or a new tmux session
cd <project>
git pull
warp-drive     # runs on the VM
```
The Mac is a thin SSH client. The ~/Sites/thefarm repo and farm CLI exist as infrastructure but aren’t part of the daily inner loop. Cloudflare sandboxes exist but Paul mostly sidesteps them by working directly on the VM.
Implications that reshape everything:
- Desktop GUI orchestrators are categorically wrong. Conductor, Crystal, anything Electron-on-Mac — they cannot see code on farm-01. Strike them from the candidate list.
- The orchestrator must run on the VM. Whatever it is, `ssh farm-01 && start-orchestrator` has to work. That means: bash + tmux, or in-process inside Claude Code.
- Sandbox isolation is mostly moot. Since Paul doesn’t use the `farm` sandbox abstraction day-to-day, the per-suffix sandbox work I previously flagged as Phase 3 item #0 is no longer a prerequisite. It’s still nice-to-have if Farm sandboxes ever become part of the inner loop, but it’s not blocking parallel agents.
- Parallelism on the VM is just more tmux windows. The mental model is dead simple: each parallel agent = one worktree + one tmux window + one warp-drive session. No GUI, no IPC, no dashboard required for v1 — `tmux ls` is the dashboard.
Viable orchestrator shapes for this workflow
| Shape | Fits? | Notes |
|---|---|---|
| Conductor / Crystal (desktop GUI) | ❌ | Wrong host. Can’t see remote code. |
| `cdfork` (bash on VM) | ✅ | `git worktree add` + `tmux new-window` + warp-drive. ~50 lines. |
| Composio Agent Orchestrator (CLI) | ✅ | Runs anywhere with git + node. Worth evaluating on the VM. |
| code-conductor (CLI) | ✅ | Same — CLI, runs on VM. |
| Anthropic Agent Teams (in-process) | ✅✅ | Best fit. Runs inside the existing Claude Code session in the existing tmux window. Zero new infrastructure. |
| Ruflo / wshobson (frameworks) | ⚠️ | Possible but heavy. Evaluate only if Agent Teams isn’t ready. |
What cdfork looks like for this workflow
```bash
# On farm-01, in a project directory:
cdfork feat-x feat-y feat-z

# Does:
# 1. For each branch: git worktree add ../<project>-<branch> <branch>
# 2. For each worktree: tmux new-window -n <branch> "cd <worktree> && warp-drive"
# 3. Print: tmux session name + window list
```

That’s it. No daemon, no state, no GUI. `tmux ls` shows what’s running, `tmux a -t <session>` inspects any agent, `git worktree remove` + `tmux kill-window` cleans up. Recovery is whatever tmux + warp-drive’s existing state machine already gives you.
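Spelled out, a v0 sketch under those assumptions (illustrative, not a shipped tool; the slash-to-dash name mangling and the existing-branch fallback are assumptions):

```bash
#!/usr/bin/env bash
# cdfork v0 sketch: one worktree + one tmux window + one warp-drive session
# per branch argument. Run from inside the project repo, inside tmux.
set -euo pipefail

project=$(basename "$PWD")
root=$(dirname "$PWD")

for branch in "$@"; do
  safe=${branch//\//-}                  # feat/x -> feat-x for paths and names
  worktree="$root/${project}-${safe}"
  # Reuse the branch if it already exists, otherwise create it from HEAD.
  if git show-ref --verify --quiet "refs/heads/$branch"; then
    git worktree add "$worktree" "$branch"
  else
    git worktree add "$worktree" -b "$branch"
  fi
  tmux new-window -n "$safe" "cd '$worktree' && warp-drive"
done

tmux list-windows   # the "dashboard"
```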
Open questions to resolve on return
- Does Anthropic’s Agent Teams use worktrees or in-process context splits? (Architecture matters — worktrees integrate with warp-drive, in-process doesn’t.)
- Is there a way to run Conductor/Crystal headless so warp-drive can spawn workspaces programmatically?
- For the bake-off use case, is Codex actually worth the plumbing vs. just running two Claude sessions with different prompts?
- Would RDB mode work inside a Conductor workspace, or does the GUI intercept notifications?
Reference: the original conversation
- Started from “how does conductor.build compare to BoB/warp-drive?”
- Established: Conductor = breadth across agents; BoB = depth within one agent’s lifecycle
- User instinct: “I like having control” → flagged as good for agent layer, not window management
- Agreed path: try off-the-shelf before building, revisit after vacation
- This doc written 2026-04-10, resume after a few weeks