OpenAI Codex Worktrees and Parallel Agents Playbook

Worktrees are one of the most important Codex desktop concepts because they turn agent parallelism from chaos into reviewable isolation. Without worktrees, multiple agents editing the same checkout can collide with each other and with the developer’s own changes. With worktrees, each task can run in its own Git-backed copy while the main working tree stays usable.

The point is not to run the maximum number of agents. The point is to run independent work in isolation, then compare the results with normal engineering discipline.
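Codex manages its worktrees itself, so nothing below is needed to use the app. For readers who want to see the underlying Git mechanism, here is a minimal sketch that builds a plain `git worktree add` command for one isolated task checkout. The `agent/` branch prefix, the sibling `-worktrees` directory layout, and the function name are assumptions for illustration, not Codex behavior.

```python
from pathlib import PurePosixPath


def worktree_add_command(repo: str, task_slug: str, base: str = "HEAD") -> list[str]:
    """Build a `git worktree add` command for one isolated task checkout.

    Hypothetical layout: the worktree lives next to the repo at
    <repo>-worktrees/<task_slug>, on a new branch agent/<task_slug>.
    """
    repo_path = PurePosixPath(repo)
    worktree_path = repo_path.parent / f"{repo_path.name}-worktrees" / task_slug
    return [
        "git", "-C", str(repo_path),   # run Git as if started inside the repo
        "worktree", "add",
        "-b", f"agent/{task_slug}",    # create a new branch for this task
        str(worktree_path),            # directory for the separate checkout
        base,                          # commit-ish the new branch starts from
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(worktree_add_command("/src/shop", "csv-export")))
```

The function only returns the argument list. Passing it to `subprocess.run` would create the checkout, leaving the main working tree untouched.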

Use Codex worktrees when a task is independent, likely to modify files, or long enough that you do not want it touching your active checkout. Keep the first rollout conservative: two to four parallel agents are usually enough until the team knows its review capacity. More agents create more diffs, more tests, more follow-up decisions, and more cleanup.

Use a worktree for:

  • multi-file features;
  • refactors;
  • dependency migrations;
  • UI changes that may need repeated attempts;
  • bug fixes where you want to preserve your local state;
  • alternative implementation experiments;
  • scheduled automations that might edit files;
  • subagent exploration where each agent owns a different question.

Do not use a worktree just because it sounds sophisticated. For a one-line comment or a small local test change, the local checkout may be simpler.

A good worktree task has:

| Element | Example |
| --- | --- |
| Objective | "Add CSV export for billing reports" |
| Boundary | "Stay within reports, billing, and tests unless blocked" |
| Verification | "Run npm test -- billing and npm run typecheck" |
| Evidence | "Report commands, failures, files changed, and migration risk" |
| Review path | "Do not commit; leave a clean diff for review" |

The worktree does not remove the need for scope. It just gives the agent a safer place to work.
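One way to keep every task brief complete is to treat the five elements as required fields. The sketch below is an assumption about how a team might template its prompts. The `WorktreeTask` class and its `to_prompt` method are hypothetical names, not part of Codex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorktreeTask:
    """The five elements of a worktree task brief (hypothetical helper)."""
    objective: str
    boundary: str
    verification: str
    evidence: str
    review_path: str

    def to_prompt(self) -> str:
        """Render the brief as labeled lines, refusing to emit an incomplete one."""
        fields = [
            ("Objective", self.objective),
            ("Boundary", self.boundary),
            ("Verification", self.verification),
            ("Evidence", self.evidence),
            ("Review path", self.review_path),
        ]
        missing = [label for label, value in fields if not value.strip()]
        if missing:
            raise ValueError(f"task brief is missing: {', '.join(missing)}")
        return "\n".join(f"{label}: {value}" for label, value in fields)


if __name__ == "__main__":
    task = WorktreeTask(
        objective="Add CSV export for billing reports",
        boundary="Stay within reports, billing, and tests unless blocked",
        verification="Run npm test -- billing and npm run typecheck",
        evidence="Report commands, failures, files changed, and migration risk",
        review_path="Do not commit; leave a clean diff for review",
    )
    print(task.to_prompt())
```

Raising on a blank field is a deliberate choice: a brief without a boundary or a verification step is the kind of under-scoped task the worktree cannot fix on its own.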

Use one Codex thread to implement and a second read-only or review-focused thread to inspect the diff.

Good for:

  • production fixes;
  • complex refactors;
  • security-sensitive changes;
  • migrations where regressions are easy to miss.

Prompt shape:

Agent A: implement the smallest fix for the failing auth refresh test.
Agent B: independently review Agent A's eventual diff for behavior changes,
security issues, missing tests, and migration risk. Do not edit files.

When the design choice is unclear, run two agents that each implement a different approach, then compare the results.

Good for:

  • UI architecture choices;
  • data model migrations;
  • package replacement decisions;
  • performance fixes with competing tradeoffs.

The reviewer should compare:

  • diff size;
  • test coverage;
  • migration risk;
  • maintainability;
  • performance evidence;
  • rollback complexity.

Do not merge both. The value is in choosing one stronger path.
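Of the comparison criteria above, diff size is the easiest to measure mechanically. The following sketch summarizes the text that `git diff --numstat` prints, for example from `git diff --numstat main...candidate-a`, where the branch names are hypothetical. It covers only that one criterion; test coverage, migration risk, and the rest still need a human reviewer.

```python
def diff_size(numstat: str) -> dict[str, int]:
    """Summarize `git diff --numstat` output as files touched and lines added/deleted."""
    files = added = deleted = 0
    for line in numstat.splitlines():
        if not line.strip():
            continue
        # Each line is "<added>\t<deleted>\t<path>".
        add_count, del_count, _path = line.split("\t", 2)
        files += 1
        # Binary files are reported as "-\t-\t<path>": count the file, not lines.
        if add_count != "-":
            added += int(add_count)
        if del_count != "-":
            deleted += int(del_count)
    return {"files": files, "added": added, "deleted": deleted}


if __name__ == "__main__":
    # Illustrative input, not real project data.
    sample = "12\t3\tsrc/cache.py\n40\t0\ttests/test_cache.py\n-\t-\tdocs/diagram.png\n"
    print(diff_size(sample))
```

Running the same summary on both candidates gives the reviewer a like-for-like number to set beside the qualitative criteria.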

Use several agents for read-heavy investigation when the problem has independent dimensions:

  • one agent maps the frontend path;
  • one maps the API path;
  • one maps tests and fixtures;
  • one looks for prior incidents or TODOs.

Then consolidate before any implementation begins. This reduces context pollution and helps the main agent work from sharper evidence.

Parallelism is bounded by human review, not model throughput.

Use this simple capacity rule:

| Team state | Recommended active write agents per repo |
| --- | --- |
| New to Codex | 1 |
| Comfortable with reviewable diffs | 2 |
| Strong tests and named reviewers | 3 to 4 |
| Dedicated developer productivity workflow | More, only with queue policy |

If reviewers are already behind, more agents make the system worse. The bottleneck moves from typing code to deciding whether code is safe.
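The capacity rule can be written down as a small gate that a team checks before starting another write agent. This is a sketch under assumptions: the state labels, the function name, and the choice to block all new agents while reviewers are behind are one interpretation of the rule, not a Codex feature.

```python
from typing import Optional

# Upper bounds from the capacity rule above; the keys are hypothetical labels.
MAX_WRITE_AGENTS = {
    "new_to_codex": 1,
    "reviewable_diffs": 2,
    "strong_tests_named_reviewers": 4,
}


def can_start_write_agent(
    team_state: str,
    active: int,
    reviewers_behind: bool,
    queue_policy_limit: Optional[int] = None,
) -> bool:
    """Return True if one more write agent fits the capacity rule."""
    if reviewers_behind:
        return False  # more diffs would only deepen the review backlog
    if team_state == "dedicated_workflow":
        # "More" is allowed only when a queue policy supplies an explicit limit.
        if queue_policy_limit is None:
            return False
        return active < queue_policy_limit
    return active < MAX_WRITE_AGENTS[team_state]


if __name__ == "__main__":
    print(can_start_write_agent("reviewable_diffs", active=1, reviewers_behind=False))
```

Keeping `reviewers_behind` as the first check reflects the point of the rule: review capacity, not the number of available agents, decides whether new work starts.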

Worktrees consume disk space because each one can have repo files, dependencies, build output, and caches. The official Codex worktrees docs note that Codex manages worktrees and keeps a limited number by default, but teams should still treat cleanup as an operating habit.

Cleanup policy:

  1. Archive threads that are no longer useful.
  2. Pin only worktrees with unresolved value.
  3. Delete failed explorations after extracting learning.
  4. Avoid leaving dependency caches in many stale worktrees.
  5. Write a short decision note before deleting an alternative path.

The goal is not perfect cleanliness. The goal is to avoid an invisible backlog of half-decisions.
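For worktrees created with plain Git, the output of `git worktree list --porcelain` is enough to build a cleanup review list. The sketch below parses that output and returns linked worktrees that are neither locked nor pinned. The `pinned` set is a hypothetical team convention for "unresolved value", and Codex-managed worktrees may follow their own retention rules.

```python
def parse_worktrees(porcelain: str) -> list[dict[str, str]]:
    """Parse `git worktree list --porcelain` output into one dict per worktree."""
    entries = []
    # Records are separated by blank lines; each line is "<key> <value>" or a bare flag.
    for block in porcelain.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, _, value = line.partition(" ")
            entry[key] = value  # flags such as "detached" or "locked" get ""
        if entry:
            entries.append(entry)
    return entries


def cleanup_candidates(entries: list[dict[str, str]], pinned: set[str]) -> list[str]:
    """Return linked worktree paths that are not locked and not pinned."""
    linked = entries[1:]  # Git lists the main worktree first; never propose it
    return [
        e["worktree"] for e in linked
        if e["worktree"] not in pinned and "locked" not in e
    ]


if __name__ == "__main__":
    # Illustrative output with made-up paths and hashes.
    sample = (
        "worktree /src/shop\nHEAD 1111111\nbranch refs/heads/main\n\n"
        "worktree /src/wt/csv-export\nHEAD 2222222\nbranch refs/heads/agent/csv-export\n\n"
        "worktree /src/wt/cache-alt\nHEAD 3333333\nbranch refs/heads/agent/cache-alt\nlocked\n\n"
        "worktree /src/wt/spike\nHEAD 4444444\ndetached\n"
    )
    print(cleanup_candidates(parse_worktrees(sample), pinned={"/src/wt/csv-export"}))
```

The function only proposes paths. Writing the decision note and running `git worktree remove <path>` stay with a person.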

Codex can move threads between local and worktree modes. Use this intentionally:

  • start in a worktree when the agent may explore;
  • hand back to local only after the direction is accepted;
  • avoid handoff when your local checkout has uncommitted changes that conflict with the agent’s work;
  • rerun verification after handoff because environment state can differ.

Treat handoff as a merge step, not a magic teleport.
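Before a handoff, it helps to check whether the local checkout has uncommitted changes in files the agent also touched. The sketch below compares the text printed by `git status --porcelain` in the local checkout with a list of files the agent changed, which could come from `git diff --name-only` in the worktree. Both function names are hypothetical, and paths that Git quotes because of special characters are not handled.

```python
def dirty_paths(status_porcelain: str) -> set[str]:
    """Extract paths from `git status --porcelain` (v1) output."""
    paths = set()
    for line in status_porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]  # skip the two status characters and the space
        if " -> " in path:
            # Renames look like "old -> new"; keep only the new path.
            path = path.split(" -> ", 1)[1]
        paths.add(path)
    return paths


def handoff_conflicts(status_porcelain: str, agent_changed: list[str]) -> list[str]:
    """Return files that are dirty locally and also changed by the agent."""
    return sorted(dirty_paths(status_porcelain) & set(agent_changed))


if __name__ == "__main__":
    # Illustrative status output with made-up paths.
    local_status = " M src/auth/refresh.ts\n?? notes.txt\nR  old.ts -> src/new.ts\n"
    agent_files = ["src/auth/refresh.ts", "src/auth/token.ts"]
    print(handoff_conflicts(local_status, agent_files))
```

An empty result does not prove the handoff is safe. It only rules out direct file overlap, so verification still needs to be rerun afterward.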

| Risk | Control |
| --- | --- |
| Two agents change the same file differently | Assign file ownership in the prompt |
| Reviewer cannot understand diff intent | Require plan and summary before final review |
| Worktree uses stale dependencies | Run setup or local environment scripts explicitly |
| Automation edits active work | Use background worktrees for recurring changes |
| Too many stale worktrees | Archive or delete after decision |
| Subagents spend tokens without value | Use subagents only for independent questions |
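File ownership assigned in a prompt can also be checked after the fact. The sketch below flags any changed file that falls outside the path prefixes its agent was given. The agent names, the prefix-based ownership format, and the function name are all assumptions for illustration.

```python
def ownership_violations(
    owners: dict[str, list[str]],
    changed: dict[str, list[str]],
) -> list[tuple[str, str]]:
    """Return (agent, path) pairs where an agent changed a file it does not own."""
    violations = []
    for agent, files in changed.items():
        # An agent with no assigned prefixes owns nothing, so every change is flagged.
        prefixes = tuple(owners.get(agent, []))
        for path in files:
            if not path.startswith(prefixes):
                violations.append((agent, path))
    return sorted(violations)


if __name__ == "__main__":
    # Hypothetical ownership map; trailing slashes keep "src/api/" from matching "src/api2/".
    owners = {"agent-a": ["src/api/", "tests/api/"], "agent-b": ["src/ui/"]}
    changed = {
        "agent-a": ["src/api/export.py", "src/ui/button.tsx"],
        "agent-b": ["src/ui/table.tsx"],
    }
    print(ownership_violations(owners, changed))
```

A flagged file is a prompt for review, not an automatic rejection, since a task brief may allow leaving the boundary when blocked.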

This playbook is based on OpenAI’s Codex worktrees documentation, Codex app features, and subagents documentation.