OpenAI Codex Desktop Prompt Playbook

Codex desktop prompts should not sound like generic chatbot prompts. The best prompts behave like small work orders: they define the job, boundary, verification method, and reporting format. Codex can do more when given tools and context, but the prompt still determines whether the result is reviewable.

Use these templates as starting points. Replace file paths, commands, and constraints with the real repository context.
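
One way to keep the four parts of a work order from going missing is to assemble prompts from a small structure instead of free text. The sketch below is illustrative only: `WorkOrder` and its field names are hypothetical and are not part of Codex or any OpenAI API. It renders plain text in the same list style as the templates on this page.

```python
from dataclasses import dataclass, field


@dataclass
class WorkOrder:
    """Hypothetical container for the four parts of a work-order prompt."""

    job: str
    boundaries: list[str] = field(default_factory=list)
    verification: list[str] = field(default_factory=list)
    report: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the work order as plain prompt text, refusing empty sections."""
        sections = [
            ("Constraints", self.boundaries),
            ("Verification", self.verification),
            ("Final response", self.report),
        ]
        lines = [self.job]
        for title, items in sections:
            if not items:
                raise ValueError(f"work order is missing its {title!r} section")
            lines.append("")
            lines.append(f"{title}:")
            # Items end with ';' except the last one, which ends with '.'.
            for i, item in enumerate(items):
                end = "." if i == len(items) - 1 else ";"
                lines.append(f"- {item}{end}")
        return "\n".join(lines)


order = WorkOrder(
    job="Fix [bug or failing test].",
    boundaries=["stay within [directories/files]", "do not update dependencies"],
    verification=["run [exact test command]"],
    report=["root cause", "files changed", "test command and result"],
)
print(order.render())
```

Raising on an empty section is a design choice for this sketch: a prompt with no verification or report section fails early instead of reaching Codex half-specified.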

1. Read-only codebase exploration

Use this before allowing writes in an unfamiliar repo.

Explore this repository without editing files.

Goal:
- explain how [feature or flow] works from entry point to persistence layer.

Scope:
- inspect only the files needed to answer;
- do not run network commands;
- do not change files.

Return:
- main files involved;
- request or data flow;
- important abstractions;
- tests that cover the behavior;
- risks or unclear areas;
- a short follow-up plan if we later modify this feature.

Why it works: it builds context and shows whether Codex understands the system before any diff exists to review.

2. Plan before patch

Use this when the implementation may touch several files.

Before editing, make a short implementation plan for [task].

Include:
- files you expect to inspect;
- files you expect to change;
- assumptions that need verification;
- tests or commands you will run;
- risks that might make the task larger than expected.

Wait for approval before editing.

Why it works: it separates reasoning from file modification and gives the human a checkpoint.

3. Small bug fix with evidence

Fix [bug or failing test].

Constraints:
- make the smallest change that solves the issue and leaves unrelated behavior unchanged;
- stay within [directories/files] unless you find a direct dependency;
- do not update dependencies;
- do not weaken tests.

Verification:
- run [exact test command];
- if the command fails for an unrelated reason, report the failure and stop.

Final response:
- root cause;
- files changed;
- test command and result;
- remaining risk.

Why it works: it blocks the common failure mode where an agent expands scope to make progress.
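
The verification rule in this template amounts to: run the exact command, record what happened, and stop instead of improvising when it fails. A minimal Python sketch of that gate follows. `run_verification` and the shape of its result are hypothetical, and the two commands at the bottom are stand-ins for a real test command.

```python
import subprocess
import sys


def run_verification(command: list[str], timeout_s: int = 600) -> dict:
    """Run one verification command and return evidence for the final report."""
    result = {"command": command, "status": "not-run", "exit_code": None, "tail": ""}
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except FileNotFoundError:
        # The command itself is missing: an environment problem, not a test failure.
        return result
    except subprocess.TimeoutExpired:
        result["status"] = "timeout"
        return result
    output = (proc.stdout + proc.stderr).strip().splitlines()
    result["status"] = "passed" if proc.returncode == 0 else "failed"
    result["exit_code"] = proc.returncode
    # Keep only the last lines so the report stays short but still shows evidence.
    result["tail"] = "\n".join(output[-20:])
    return result


# Stand-in commands so the sketch runs anywhere; replace with [exact test command].
ok = run_verification([sys.executable, "-c", "print('1 passed')"])
bad = run_verification([sys.executable, "-c", "raise SystemExit(3)"])
print(ok["status"], bad["status"], bad["exit_code"])
```

Separating "not-run" from "failed" mirrors the template: a command that cannot start is an unrelated failure to report, not a reason to widen the fix.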

4. Refactor without behavior change

Refactor [module/file] to [desired structure] without changing behavior.

Rules:
- preserve public API and existing behavior;
- avoid opportunistic cleanup outside the target area;
- split the work into logical steps that can each be reviewed on their own;
- add or update tests only if the refactor exposes missing coverage.

Verification:
- run [unit test command];
- run [typecheck/lint command].

Report:
- what changed structurally;
- why behavior should be unchanged;
- commands run;
- any behavior risk reviewers should inspect.

Why it works: it tells Codex that “cleaner” is not permission to redesign.

5. Frontend visual QA loop

Open the local app preview and inspect [page/component].

Check:
- desktop viewport [size];
- mobile viewport [size];
- horizontal overflow;
- CTA visibility;
- spacing and typography;
- interactive states if relevant.

If an issue is visible, make the smallest code change, then re-check.

Return:
- viewports tested;
- issue observed;
- files changed;
- before/after evidence;
- limitations of the check.

Why it works: visual tasks need evidence and repeatability, not comments based on taste alone.

6. PR review with risk categories

Review the current diff as if you are a senior reviewer.

Focus on:
- correctness bugs;
- security and permission changes;
- missing tests;
- behavior regressions;
- performance or reliability risk;
- maintainability problems.

Do not rewrite the code unless asked.

Return findings first, ordered by severity, with file references and a short explanation.
If there are no findings, say that explicitly and list residual testing gaps.

Why it works: it forces a review mindset and prevents the agent from drifting into unsolicited implementation.

7. Subagent exploration

Use only when questions are independent.

Spawn separate agents to investigate these independent questions.
Do not edit files.

1. Map the frontend flow for [feature].
2. Map the backend/API flow for [feature].
3. Find relevant tests and fixtures.
4. Identify migration or compatibility risks.

Wait for all agents, then consolidate:
- strongest evidence;
- conflicting findings;
- recommended implementation boundary;
- files likely to change.

Why it works: subagents are useful for parallel exploration, but expensive for tasks that are not truly separable.
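
The shape of this prompt is a fan-out followed by a single consolidation step. The Python sketch below shows that pattern only; it is not how Codex implements subagents. `investigate`, `consolidate`, and the canned findings are hypothetical stand-ins for real read-only agents.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned findings; a real agent would inspect the repository.
CANNED = {
    "frontend flow": ["ui/checkout_form", "api/orders"],
    "backend flow": ["api/orders", "db/order_model"],
    "tests and fixtures": ["tests/test_orders"],
}


def investigate(question: str) -> dict:
    """Stand-in for one read-only agent answering one independent question."""
    return {"question": question, "files": CANNED[question]}


def consolidate(findings: list[dict]) -> dict:
    """Merge findings and flag files that more than one agent pointed at."""
    mentions: dict[str, list[str]] = {}
    for finding in findings:
        for path in finding["files"]:
            mentions.setdefault(path, []).append(finding["question"])
    return {
        "files_likely_to_change": sorted(mentions),
        "corroborated": sorted(p for p, qs in mentions.items() if len(qs) > 1),
    }


with ThreadPoolExecutor(max_workers=3) as pool:
    # list(...) blocks until every question has been answered.
    findings = list(pool.map(investigate, CANNED))

summary = consolidate(findings)
print(summary["corroborated"])
```

The pattern only pays off because no `investigate` call needs another call's result. If one question depends on another's answer, the work is sequential and a single agent is the cheaper choice.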

8. Worktree implementation

Start this task in a Codex-managed worktree so it does not touch my local checkout.

Task:
- implement [specific change].

Boundaries:
- allowed files/directories: [list];
- forbidden changes: [list];
- do not commit or merge.

Verification:
- run [commands].

Final response:
- branch/worktree context;
- files changed;
- commands and results;
- whether the change is ready for human review.

Why it works: it makes isolation explicit and preserves local developer state.
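
For readers who have not used worktrees, the isolation can be reproduced by hand with plain Git. The sketch below assumes a Codex-managed worktree behaves like an ordinary `git worktree add` checkout, which this page does not confirm. The function names, branch name, and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def worktree_add_command(
    repo: Path, branch: str, dest: Path, base: str = "HEAD"
) -> list[str]:
    """Build the git command that creates a new branch in a separate worktree."""
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(dest), base]


def create_worktree(repo: Path, branch: str, dest: Path, base: str = "HEAD") -> Path:
    """Create the worktree; the original checkout at `repo` is left untouched."""
    subprocess.run(worktree_add_command(repo, branch, dest, base), check=True)
    return dest


# Only build and show the command here; call create_worktree() inside a real repo.
cmd = worktree_add_command(Path("."), "codex/example-task", Path("example-task-wt"))
print(" ".join(cmd))
```

Building the command separately from running it keeps the exact invocation available for the "branch/worktree context" line of the final response.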

9. Turn a repeated workflow into a skill

Create a Codex skill for this repeated workflow: [workflow].

The skill should include:
- a focused description;
- when to use it;
- required inputs;
- step-by-step instructions;
- verification commands;
- output format;
- failure and escalation rules.

Do not overgeneralize. Keep the skill focused on one job.
After drafting it, explain how I should test whether Codex triggers it appropriately.

Why it works: it turns reliable prompting into reusable operations without jumping straight to plugin complexity.

10. Draft an automation safely

Draft a Codex automation for [recurring task].

Before creating it, propose:
- schedule;
- whether it should be a thread or standalone automation;
- whether it should use local mode or a worktree;
- allowed actions;
- forbidden actions;
- evidence required in each run;
- stop conditions;
- first-run review plan.

Do not create the automation until the proposal is clear.

Why it works: unattended work needs more policy than interactive work.

11. Debug a failed Codex run

Analyze why the previous Codex attempt failed.

Separate possible causes:
- unclear task scope;
- missing repository context;
- wrong environment or shell;
- sandbox or approval limitation;
- missing dependency;
- bad test command;
- model/tool mistake;
- prompt instruction conflict.

Return:
- most likely cause;
- evidence;
- smaller retry prompt;
- what I should change in repo instructions or setup.

Why it works: it prevents users from blaming the model when the real failure is environment or task design.

12. Non-code knowledge work using Codex

Use Codex as a structured work agent for [report, spreadsheet, document, slide deck, inbox, or research task].

Inputs:
- [files, folders, connected app, or source list].

Rules:
- preserve source evidence;
- do not invent facts;
- identify uncertainty;
- create a reviewable artifact;
- do not send, publish, or modify external systems without approval.

Return:
- artifact created or updated;
- sources inspected;
- assumptions;
- next actions requiring human judgment.

Why it works: Codex desktop can support knowledge work, but the same engineering principles apply: source, boundary, artifact, review.

Prompt quality checklist

Before sending a Codex prompt, check:

- Is the task small enough to review?
- Does Codex know which files or systems are in scope?
- Are forbidden actions explicit?
- Is there a verification command or evidence requirement?
- Should Codex plan before editing?
- Should the task run in a worktree?
- Is a plugin, skill, MCP server, browser, or computer use actually needed?
- What should Codex report at the end?

If the prompt lacks a verification path, it is usually not ready for write-enabled work.
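
Part of this checklist can be checked mechanically before a prompt is sent. The sketch below is a rough heuristic, assuming prompts use section labels like the templates on this page. `CHECKS` and `missing_parts` are hypothetical names, and a clean result does not mean the prompt is good, only that no expected section is missing.

```python
import re

# Each check is a regular expression that should match somewhere in the prompt.
CHECKS = {
    "scope": r"^(scope|boundaries|constraints|rules):",
    "forbidden actions": r"\bdo not\b",
    "verification": r"^(verification|check):",
    "report": r"^(return|report|final response):",
}


def missing_parts(prompt: str) -> list[str]:
    """Return the names of checklist parts the prompt does not appear to cover."""
    flags = re.IGNORECASE | re.MULTILINE
    return [name for name, pattern in CHECKS.items()
            if not re.search(pattern, prompt, flags)]


draft = """Fix [bug or failing test].

Constraints:
- do not update dependencies.

Final response:
- root cause.
"""
print(missing_parts(draft))
```

For this draft the only gap reported is the verification section, which is exactly the case the rule above warns about for write-enabled work.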

Related paths

- Codex desktop for engineering teams: use these prompts inside a larger operating model for real repositories.
- Codex automations playbook: automations need durable prompts with stop conditions and review policy.
- Codex skills, plugins, and MCP: turn repeated prompts into skills only after the workflow is proven.

Source notes

This page is based on OpenAI’s Codex prompting concept documentation, Codex app features, Codex automations documentation, Codex skills documentation, and Codex subagents documentation.
