Parallel Agent Orchestration — Research & Decision Path
Status: Research parked, resume after vacation
Written: 2026-04-10
Decision owner: Paul
Context: Considering whether to adopt Conductor.build, a similar tool, or extend BoB/warp-drive with parallel-agent capabilities.
⚠️ Staleness warning: This space moves weekly. Before acting on anything below, re-check each tool’s current state. Claude Code’s native Agent Teams in particular may have shipped stable — if so, most of this document becomes moot.
TL;DR decision
🔑 Critical constraint discovered after the original draft: Paul’s actual workflow is `ssh farm-01` → tmux → `git pull` → warp-drive. Code lives on the remote VM, not the Mac. This eliminates Conductor and Crystal entirely (both are local desktop apps that can’t see remote code) and reshapes the decision around remote-first orchestrators.
- Skip Conductor and Crystal. Wrong shape — they operate on local Mac worktrees and can’t see code on farm-01.
- Build `cdfork` regardless of what Anthropic ships. Nanawall alone justifies it — the paying-client project has 18 independent custom modules, ongoing mechanical cleanup sweeps, and a headless migration that screams contract-driven parallel work. This is no longer speculative.
- Check Anthropic’s Agent Teams in parallel — if it’s stable and in-process, use it in addition to `cdfork` for in-repo work that doesn’t need worktree isolation. They’re complementary, not competing: `cdfork` for worktree fan-out, Agent Teams for intra-session delegation.
- Steal the autonomous CI-fix + merge-conflict idea from Composio regardless — it’s valuable for single-track warp-drive too.
Guiding principle: The orchestrator must run where the code lives. For Paul that’s a remote Linux VM accessed via SSH/tmux, which means desktop apps are categorically wrong and CLI/in-process tools are categorically right.
The problem being solved
Conductor-class tools supervise N parallel Claude Code / Codex agents, each in an isolated git worktree, with a dashboard for glanceable monitoring and diff-based review/merge. The distinctive value is fan-out + human referee.
BoB/warp-drive is orthogonal: a declarative harness + single-track autonomous loop driven by GitHub Issues. Depth within one agent’s lifecycle, not breadth across agents.
The two are complementary, not competing. Warp-drive can plausibly run inside a Conductor workspace unchanged.
What Conductor has that BoB doesn’t
- Parallel fan-out across agents/models
- Native macOS GUI with live multi-session dashboard
- Transparent worktree-per-task management
- Multi-model (Claude + Codex side-by-side on the same task)
- Visual diff comparison and “pick a winner” review UX
- Zero-config onboarding
What BoB has that Conductor doesn’t
- Declarative provisioning (manifests, registry, symlink distribution)
- GitHub-Issue-driven work discovery (vision → capability → requirement)
- Full autonomy loop (discover → code → test → commit → report → PR)
- Dev environment IaC (`dev.json`, seeded test users, health checks)
- Hooks system (deterministic guardrails)
- Shared skills, commands, agents, memory
- RDB / remote control
- PM scaffolding (retros, standups, journals, temporal guardrails)
The actual gap to close
Three features, in priority order:
- Worktree-per-task orchestration — warp-drive is already worktree-aware (see “Worktree status” below); what’s missing is the outer orchestrator that spawns N worktrees and launches a warp-drive session in each
- Multi-agent dashboard in `cds` showing N live sessions
- Bake-off mode — run the same requirement through two models, diff, pick (see the sketch below)
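The review half of a bake-off needs nothing beyond git. A minimal sketch, with hypothetical branch names for the two competing agents:

```bash
# Hypothetical bake-off review: two agents worked the same requirement on
# separate branches. Compare both against main, then keep the winner.
git diff --stat main...bake/agent-a   # change-set size for agent A
git diff --stat main...bake/agent-b   # change-set size for agent B
git diff bake/agent-a bake/agent-b    # direct head-to-head diff

# From main: fast-forward onto the winner, drop the loser.
git merge --ff-only bake/agent-a
git branch -D bake/agent-b
```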
Worktree status in warp-drive (as of 2026-04-10)
Verified against commands/warp-drive.md:41-51 and :178-182:
- Warp-drive detects if it’s running inside a worktree (`git rev-parse --git-dir` vs `--git-common-dir`)
- Warns the user, records `"worktree": true` in the state file, and pushes to remote after every commit as a safety net against worktree-cleanup data loss
- Disaster recovery CLI can cherry-pick commit hashes from remote if a worktree is lost
- Warp-drive does not create worktrees itself — it assumes you (or `/start-work`) set that up
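The detection trick is standard git behavior. A minimal sketch of the same check, assuming nothing beyond git itself:

```bash
# In a linked worktree, --git-dir resolves to .git/worktrees/<name> while
# --git-common-dir resolves to the main repo's .git; in the primary
# checkout both resolve to the same path.
git_dir=$(git rev-parse --git-dir)
common_dir=$(git rev-parse --git-common-dir)
if [ "$git_dir" != "$common_dir" ]; then
  echo "running inside a worktree"
fi
```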
Implication: building `cdfork` (spawn worktrees + launch warp-drive in each) is closer to a ~50-line bash script than a deep refactor. The safety rails already exist.
Subagents ≠ parallelism
When you observe “multiple subagents” during a warp-drive run, those are Claude Code’s normal intra-chunk delegation via the Agent tool (Explore, Plan, code-reviewer, etc.) — not parallel work on multiple requirements. Same behavior as any regular Claude Code session. Parallelism across requirements is still the unsolved piece.
Landscape as of 2026-04-10
Tier 1 — GUI supervisors (Conductor-alikes)
| Tool | Shape | Why consider |
|---|---|---|
| Conductor.build | Native macOS, closed source | Most polished; used by Linear/Vercel/Notion |
| Crystal | Electron, open source | Forkable, same model as Conductor — was the “start here” pick before the remote-workflow constraint ruled out desktop apps |
Tier 2 — CLI/PR orchestrators
| Tool | Shape | Why consider |
|---|---|---|
| Composio Agent Orchestrator | CLI | Closest in ambition to warp-drive; handles CI fixes + merge conflicts autonomously |
| code-conductor (ryanmac) | CLI | Hackable, GitHub-native |
Tier 3 — In-process agent frameworks
| Tool | Shape | Why consider |
|---|---|---|
| Claude Code Agent Teams | Anthropic built-in | The one to watch — if stable, obsoletes most third-party orchestrators |
| Ruflo (ruvnet) | Framework | Swarm-style, heavy |
| wshobson/agents | Collection | Specialist agent patterns |
| barkain/claude-code-workflow-orchestration | Plugin | Task decomposition + plan mode integration |
| oh-my-claudecode | Layer | Claims 3-5x speedup, 30-50% token savings |
Action plan when you resume
Phase 0 — Re-baseline (30 min)
- Re-check Anthropic’s Agent Teams status: stable? recommended patterns? Does it use worktrees?
- Check if Conductor, Crystal, and Composio have shipped major changes
- Skim 1–2 recent blog posts on the multi-agent landscape for anything new
Phase 1 — Pick the orchestrator shape (~1 hour)
Decision tree, in order:
1. Is Anthropic Agent Teams stable?
   - Check code.claude.com/docs/en/agent-teams for status
   - If yes → that’s the answer. Skip to Phase 2A.
   - If no → continue
2. Build `cdfork` v0. ~50 lines of bash on farm-01:
   - `git worktree add` per branch
   - `tmux new-window` per worktree, launching warp-drive
   - Print session/window summary
   - No state file, no daemon — tmux is the state
Phase 2 — Use it on real work (1 week)
A. If using Agent Teams:
- Pick a capability with 2–3 approved requirements
- Configure a team-lead session that delegates each requirement to a teammate
- Note: where does Agent Teams put each teammate’s filesystem? If it’s an in-process context split (no worktrees), tests and builds may collide. Find out and document.
B. If using cdfork:
- Start with nanawall. Pick a cluster of 3–4 open issues across different custom modules: `cdfork fix/a feat/b cleanup/c`
- Walk between tmux windows to spot-check
- Note frictions: cleanup ergonomics, branch naming, tmux navigation, warp-drive cross-session conflicts
- Then try `cdfork-pair` on one CAP-XX headless migration chunk (one Drupal + one SvelteKit worktree, contract-driven)
Phase 3 — Decide what (if anything) to formalize
Three outcomes:
A. Parallelism sticks with Agent Teams → wire warp-drive’s GitHub-Issue work-discovery into the team-lead prompt. Warp-drive becomes the “team lead” that picks N requirements and dispatches them. This is the highest-leverage outcome — minimum new code, maximum integration with existing PM hierarchy.
B. Parallelism sticks with cdfork → harden the script:
- Auto-pick branch names from open GitHub Issues (filter out `serial-only`-labeled ones)
- Add `cdfork status` (just `tmux ls` + `git worktree list`, formatted — see the sketch after this list)
- Add `cdfork merge <branch>` and `cdfork drop <branch>` for cleanup
- Add `cdfork-pair` for cross-repo contract-driven work (headless migration pattern)
- Pinch-point awareness: honor the `serial-only` label for config/schema/dep/deploy issues; refuse to fan them out
- Optional: extend the `cds` dashboard to show tmux windows for the project
- Target: still <300 lines total, still bash
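The status subcommand really is just formatting over existing state. A sketch under those assumptions (the function name is hypothetical; `cdfork` itself doesn’t exist yet):

```bash
# Hypothetical `cdfork status`: combine the two existing sources of truth.
cdfork_status() {
  echo "== tmux windows =="
  tmux list-windows -a -F '#{session_name}:#{window_name} (#{pane_current_path})' \
    2>/dev/null || echo "(no tmux server running)"
  echo "== git worktrees =="
  git worktree list
}
```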
C. Parallelism doesn’t stick → keep single-track warp-drive. Steal Composio’s autonomous CI-fix + merge-conflict handling idea regardless — valuable for single-track too. Document why parallelism didn’t fit (likely: context-switching cost > throughput gain for a solo dev).
Do not build
- Native GUI of any kind (Conductor/Crystal already exist if you ever decide you want one — and they don’t fit the workflow anyway)
- PTY streaming, visual diff UX, multi-model subscription plumbing
- Per-suffix Farm sandboxes (no longer blocking — sandbox abstraction isn’t part of the daily inner loop)
- A daemon, a state file, or a database for `cdfork`. Tmux + git are the state.
Nanawall — the anchor use case
Nanawall alone justifies building cdfork. It’s the paying-client project and the single highest-leverage target for parallelism in the entire portfolio. Everything else is a bonus.
Structural facts (verified 2026-04-10)
Two repos at `~/Sites/nanawall/`:
- `nanawalld8/` — Drupal 11.3.2 on Acquia Cloud (not D8; the directory name is legacy). Web root `docroot/`.
- `nanawall-web/` — SvelteKit headless frontend under active construction. Will consume Drupal as JSON:API. Migration in progress, not yet replacing the Drupal-rendered site.
Inside nanawalld8:
- 18 custom modules: nanaresources, nanagallery_order, nanaimage, nanamedia, nanatwig, nanafunctions, nanaheaders, nanaview, repfinder, sentry_fullstory, wistiamods, bytes_format, custom_error_handler, imagesource, inline_to_media, mylocation, nanadrush, r2drupal, remove_empty_paragraphs
- 2 custom themes: nanaclaro, nanawall22
- Full test stack: PHPUnit + Jest + Playwright e2e
- Acquia pipeline: local → dev → test → prod
- Active sweeps: `cleanup-progress.md`, `PARAGRAPH-CLEANUP-MANIFEST.md`
Structural prerequisites for parallelism already exist
Most projects that could benefit from parallelism don’t have the scaffolding to make it safe. Nanawall does:
- Capability/requirement decomposition (CAP-01..11 + RSF system) — pre-sized, traceable chunks
- Branch/issue decomposition habit — `/start-work` produces clean, pre-sized branches ready to spawn into worktrees. Note: worktrees themselves are NOT yet part of the nanawall workflow (`/start-work` does not support them, despite earlier doc drift suggesting otherwise; corrected 2026-04-11). `cdfork` will introduce the worktree pattern to nanawall — fine, since git handles it cleanly and there’s no existing convention to preserve.
- Multi-suite test coverage — the safety net that makes “set 4 agents loose” tolerable
- Acquia staging gate — pre-prod verification catches anything the agents miss
- Journal + decision tracking — paper trail for every agent’s output
Where parallelism wins on nanawall
- Module work fan-out — most weeks have 3–4 independent tickets across different custom modules. Textbook fan-out target.
- Cleanup sweeps — `PARAGRAPH-CLEANUP-MANIFEST.md` is almost certainly a list of N mechanical tasks. One agent per entry is the highest-leverage use of multi-agent in the whole portfolio.
- Headless migration — the biggest long-term win. Contract-driven cross-repo pattern:
  - Define the JSON:API contract up front
  - Agent A extends the Drupal side (endpoint, fields, permissions, tests)
  - Agent B builds the SvelteKit consumer (against a mock matching the contract)
  - Integrate at the end with real endpoint + e2e
  - Mirrors how a human backend/frontend team would split the work
  - 11 capabilities × 2 agents per cap, contract-isolated
- Test farm — when PHP, JS, and e2e are all red for unrelated reasons, debug all three in parallel.
Where parallelism breaks — Drupal pinch points
Nanawall has single-source-of-truth pinch points that cannot be parallelized:
| Pinch point | Why |
|---|---|
| `config/sync/` | Drupal config import/export. Two agents = merge hell. |
| Database updates / `update.php` | Sequential by definition. |
| `composer.json` / `composer.lock` | Dep changes serialize (lockfile conflicts). |
| Acquia deploy to staging/prod | Single pipeline. |
| Compiled theme assets | Build artifacts collide unless isolated per worktree. |
Rule: parallelism works for module code, frontend code, tests, cleanup. Stop at config sync, schema, deps, deploy. Any orchestrator needs a “serialize this” gate for those four.
Two new features nanawall adds to the build list
`cdfork-pair` — contract-driven cross-repo mode for the headless migration:

```bash
cdfork-pair \
  --contract docs/contracts/cap-04-product-listing.md \
  --backend ../nanawalld8 \
  --frontend ../nanawall-web \
  --branch feat/cap-04-product-listing
```

Spawns two worktrees (one per repo) on the same branch name, both warp-drive sessions starting from the same contract doc. They build their respective sides, integrate at the end.
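A rough sketch of what `cdfork-pair` could do internally. How warp-drive would actually receive the contract path is an open design question, so the `WARP_CONTRACT` environment variable below is a placeholder assumption, as are the hard-coded paths:

```bash
# Hypothetical cdfork-pair core: same branch in two repos, one tmux window
# per side, both sessions pointed at the same contract doc.
# Run from ~/Sites/nanawall/.
contract="$PWD/docs/contracts/cap-04-product-listing.md"   # illustrative
branch="feat/cap-04-product-listing"
safe=${branch//\//-}                                       # feat/x -> feat-x

for repo in "$PWD/nanawalld8" "$PWD/nanawall-web"; do
  name=$(basename "$repo")
  worktree="$PWD/${name}-${safe}"
  git -C "$repo" worktree add "$worktree" -b "$branch"
  # WARP_CONTRACT is a placeholder; warp-drive's real input mechanism is TBD.
  tmux new-window -n "$name" "cd '$worktree' && WARP_CONTRACT='$contract' warp-drive"
done
```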
Pinch-point awareness — orchestrator needs to know which issues cannot fan out:
- Tag GitHub Issues with a `serial-only` label for config/schema/dep/deploy work
- `cdfork` refuses to fan out `serial-only` issues and runs them sequentially
- Easiest implementation: filter the issue list before worktree spawn (see the sketch below)
- For nanawall specifically: any issue touching `config/sync/`, `composer.json`, `update.php`, or theme SCSS builds should carry the label
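The filter step is close to a one-liner, assuming the `gh` CLI and `jq` are available on farm-01:

```bash
# List open issues that are safe to fan out, i.e. not labeled serial-only.
gh issue list --state open --json number,title,labels \
  | jq -r '.[]
           | select((.labels | map(.name) | index("serial-only")) | not)
           | "#\(.number)\t\(.title)"'
```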
Concrete cdfork invocation for a typical nanawall week
```bash
# On farm-01, in nanawall/nanawalld8:
cdfork \
  fix/nanaresources-rendering \
  feat/nanaimage-cropping \
  cleanup/nanafunctions-unused \
  fix/sentry-fullstory-init
```

Four worktrees, four tmux windows, four warp-drive sessions. Paul walks between windows, spot-checks, intervenes when needed. Each agent finds its matching GitHub Issue, runs the loop, commits, pushes.
The actual workflow (corrected)
Paul does not edit code on his Mac. Day-to-day:
```bash
ssh farm-01    # Hetzner VM
tmux a         # or a new tmux session
cd <project>
git pull
warp-drive     # runs on the VM
```
The Mac is a thin SSH client. The ~/Sites/thefarm repo and farm CLI exist as infrastructure but aren’t part of the daily inner loop. Cloudflare sandboxes exist but Paul mostly sidesteps them by working directly on the VM.
Implications that reshape everything:
- Desktop GUI orchestrators are categorically wrong. Conductor, Crystal, anything Electron-on-Mac — they cannot see code on farm-01. Strike them from the candidate list.
- The orchestrator must run on the VM. Whatever it is, `ssh farm-01 && start-orchestrator` has to work. That means: bash + tmux, or in-process inside Claude Code.
- Sandbox isolation is mostly moot. Since Paul doesn’t use the `farm` sandbox abstraction day-to-day, the per-suffix sandbox work I previously flagged as Phase 3 item #0 is no longer a prerequisite. It’s still nice-to-have if Farm sandboxes ever become part of the inner loop, but it’s not blocking parallel agents.
- Parallelism on the VM is just more tmux windows. The mental model is dead simple: each parallel agent = one worktree + one tmux window + one warp-drive session. No GUI, no IPC, no dashboard required for v1 — `tmux ls` is the dashboard.
Viable orchestrator shapes for this workflow
| Shape | Fits? | Notes |
|---|---|---|
| Conductor / Crystal (desktop GUI) | ❌ | Wrong host. Can’t see remote code. |
| `cdfork` (bash on VM) | ✅ | `git worktree add` + `tmux new-window` + warp-drive. ~50 lines. |
| Composio Agent Orchestrator (CLI) | ✅ | Runs anywhere with git + node. Worth evaluating on the VM. |
| code-conductor (CLI) | ✅ | Same — CLI, runs on VM. |
| Anthropic Agent Teams (in-process) | ✅✅ | Best fit. Runs inside the existing Claude Code session in the existing tmux window. Zero new infrastructure. |
| Ruflo / wshobson (frameworks) | ⚠️ | Possible but heavy. Evaluate only if Agent Teams isn’t ready. |
What cdfork looks like for this workflow
```bash
# On farm-01, in a project directory:
cdfork feat-x feat-y feat-z

# Does:
# 1. For each branch: git worktree add ../<project>-<branch> <branch>
# 2. For each worktree: tmux new-window -n <branch> "cd <worktree> && warp-drive"
# 3. Print: tmux session name + window list
```

That’s it. No daemon, no state, no GUI. `tmux ls` shows what’s running, `tmux a -t <session>` inspects any agent, `git worktree remove` + `tmux kill-window` cleans up. Recovery is whatever tmux + warp-drive’s existing state machine already gives you.
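Spelled out, a v0 sketch under those assumptions (illustrative, not a shipped tool; the slash-to-dash name mangling and the existing-branch fallback are assumptions):

```bash
#!/usr/bin/env bash
# cdfork v0 sketch: one worktree + one tmux window + one warp-drive session
# per branch argument. Run from inside the project repo, inside tmux.
set -euo pipefail

project=$(basename "$PWD")
root=$(dirname "$PWD")

for branch in "$@"; do
  safe=${branch//\//-}                  # feat/x -> feat-x for paths and names
  worktree="$root/${project}-${safe}"
  # Reuse the branch if it already exists, otherwise create it from HEAD.
  if git show-ref --verify --quiet "refs/heads/$branch"; then
    git worktree add "$worktree" "$branch"
  else
    git worktree add "$worktree" -b "$branch"
  fi
  tmux new-window -n "$safe" "cd '$worktree' && warp-drive"
done

tmux list-windows   # the "dashboard"
```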
Open questions to resolve on return
- Does Anthropic’s Agent Teams use worktrees or in-process context splits? (Architecture matters — worktrees integrate with warp-drive, in-process doesn’t.)
- Is there a way to run Conductor/Crystal headless so warp-drive can spawn workspaces programmatically?
- For the bake-off use case, is Codex actually worth the plumbing vs. just running two Claude sessions with different prompts?
- Would RDB mode work inside a Conductor workspace, or does the GUI intercept notifications?
Reference: the original conversation
- Started from “how does conductor.build compare to BoB/warp-drive?”
- Established: Conductor = breadth across agents; BoB = depth within one agent’s lifecycle
- User instinct: “I like having control” → flagged as good for agent layer, not window management
- Agreed path: try off-the-shelf before building, revisit after vacation
- This doc written 2026-04-10, resume after a few weeks