Codex Agent Orchestration SOP

Scope: Personal operator playbook for using Codex to run reusable, bounded, multi-role workflows safely and consistently.

Evidence standard: This SOP is based on March 2026 official Codex documentation and guidance. Where a capability is documented only in the changelog or marked experimental, that status is called out explicitly.


1. What Codex officially supports today

1.1 Core customization layers

Codex documents a layered customization model built around:

This is the core architecture surface for orchestration as of March 2026.

1.2 Interactive invocation

The Codex app supports slash commands from the thread composer. It also supports explicit skill invocation by typing $ in the composer. Enabled skills also appear in the slash command list.

Available slash commands (March 2026):

Deeplinks: The Codex app supports the codex:// URL scheme for direct navigation to settings, skills, automations, and threads, and for creating a new thread with optional prompt parameters.

Operational meaning:

1.3 Skills

Skills are Codex’s reusable capability layer. Official docs describe them as bundles of instructions, references, and optional scripts, with progressive disclosure so Codex only loads the full skill content when needed.

Skills follow the open Agent Skills standard (agentskills.io). This is an open format originally developed by Anthropic and adopted across multiple platforms including Codex, Claude Code, Cursor, GitHub Copilot, Windsurf, Gemini CLI, and others. Authoring skills to the open standard ensures portability.

Codex-specific extension: Skills can optionally include agents/openai.yaml for Codex-specific UI metadata (interface), invocation policy (policy.allow_implicit_invocation), and tool dependencies (dependencies.tools). This file is ignored by non-Codex platforms.

Operational meaning:

1.4 AGENTS.md

AGENTS.md is the documented place for standing project instructions. Codex reads it before doing work.

Discovery model: Codex builds an instruction chain on startup by scanning from global (~/.codex/) through repo root to current directory. Later files override earlier ones. AGENTS.override.md takes precedence over AGENTS.md at the same level.
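The discovery order described above can be sketched in Python. This is a minimal illustration of the rule as stated in this SOP, not Codex's actual implementation; the function name instruction_chain and its parameters are hypothetical.

```python
from pathlib import Path


def instruction_chain(global_dir: Path, repo_root: Path, cwd: Path) -> list[Path]:
    """Return instruction files in load order: global, repo root, ..., cwd.

    Hypothetical helper that mirrors the discovery rule described in this SOP.
    """
    repo_root = repo_root.resolve()
    relative = cwd.resolve().relative_to(repo_root)

    # Levels to visit: the global dir, then each directory from the repo root
    # down to the current directory.
    levels = [global_dir, repo_root]
    current = repo_root
    for part in relative.parts:
        current = current / part
        levels.append(current)

    chain = []
    for level in levels:
        override = level / "AGENTS.override.md"
        default = level / "AGENTS.md"
        # At the same level, the override file takes precedence.
        if override.is_file():
            chain.append(override)
        elif default.is_file():
            chain.append(default)
    return chain
```

Files later in the returned list sit closer to the working directory, so under the rule above their instructions win when they conflict with earlier ones.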

Size limit: project_doc_max_bytes defaults to 32 KiB. Beyond that, content is truncated.
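A quick pre-flight check against the size limit might look like the sketch below. It assumes the limit is measured in UTF-8 bytes and applied to whatever text is passed in; how Codex applies the limit across multiple files is not covered here. The function name overflow_bytes is hypothetical.

```python
PROJECT_DOC_MAX_BYTES = 32 * 1024  # default of project_doc_max_bytes per this SOP


def overflow_bytes(text: str, limit: int = PROJECT_DOC_MAX_BYTES) -> int:
    """Return how many bytes of `text` exceed the limit (0 if it fits).

    Assumption: the limit counts UTF-8 encoded bytes, not characters.
    """
    return max(0, len(text.encode("utf-8")) - limit)
```

Any non-zero result means the tail of the instructions would be cut, so trim or split the file before relying on it.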

Operational meaning:

1.5 Multi-agent and subagents

Subagents are officially documented: Codex can run specialized agents in parallel, wait for their results, and return a consolidated response.

Official docs recommend subagents for highly parallel work such as codebase exploration or multi-part review. They also warn about context pollution and coordination issues, especially for write-heavy tasks.

Platform support (March 2026):

Configuration: Custom agents are defined in TOML files under .codex/agents/*.toml (repo) or ~/.codex/agents/*.toml (global), including name, description, and developer_instructions.

Concurrency limits: agents.max_threads defaults to 6; agents.max_depth defaults to 1 (prevents deep nesting). agents.job_max_runtime_seconds defaults to 1800s for CSV batch jobs.
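To make the effect of these limits concrete, here is a rough Python analogy: a worker pool capped at six threads with a depth guard. It is not how Codex schedules subagents; run_lanes is a hypothetical helper, and counting the first level of delegation as depth 1 is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 6  # agents.max_threads default per this SOP
MAX_DEPTH = 1    # agents.max_depth default per this SOP


def run_lanes(tasks, depth=1):
    """Run zero-argument callables in parallel, capped at MAX_THREADS workers.

    Assumption: the first level of delegation counts as depth 1, so any
    nested fan-out (depth 2 or more) is rejected.
    """
    if depth > MAX_DEPTH:
        raise RuntimeError(f"depth {depth} exceeds max_depth {MAX_DEPTH}")
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Results come back in submission order, like a consolidated response.
        return [future.result() for future in futures]
```

With ten lanes submitted, at most six run at once and the rest queue, which is the practical consequence to plan around when splitting work.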

Operational meaning:

1.6 Automations

Codex automations are officially documented as scheduled recurring background tasks in the Codex app. The app must be running and the selected project must be available on disk. In Git repos, automations can run either in the local project or in a dedicated worktree.

Automations can invoke skills by including $skill-name in the automation prompt.
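Before saving an automation prompt, it can help to check that every $skill-name it mentions is actually installed. The sketch below assumes mentions follow the same lowercase-and-hyphen naming rule as skill names; the helper names and the example skill nightly-triage are hypothetical.

```python
import re

# Assumption: a mention is "$" followed by a name made of lowercase letters,
# digits, and single hyphens, matching the skill naming constraints.
SKILL_MENTION = re.compile(r"\$([a-z0-9]+(?:-[a-z0-9]+)*)")


def mentioned_skills(prompt: str) -> list[str]:
    """Return skill names referenced as $skill-name, in order of appearance."""
    return SKILL_MENTION.findall(prompt)


def unknown_skills(prompt: str, installed: set[str]) -> list[str]:
    """Return mentioned skills that are not in the installed set."""
    return [name for name in mentioned_skills(prompt) if name not in installed]
```

For example, checking the prompt "Run $nightly-triage, then $red-team on the findings." against an installed set containing only red-team flags nightly-triage.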

Sandbox modes: Read-only, workspace-write, or full access. Admin-enforced requirements may restrict modes.

Operational meaning:

1.7 Hooks status

Codex’s official top-level docs for app orchestration center on scheduled automations, not general lifecycle hooks. The Codex changelog for CLI 0.114.0 mentions an experimental hooks engine with SessionStart and Stop events.

Operational meaning:


2. Skill authoring standard

2.1 Dual-compliance requirement

All skills in this repo must comply with both:

  1. The open Agent Skills standard (agentskills.io/specification) — ensures portability to Claude Code, Cursor, Copilot, Windsurf, Gemini CLI, Perplexity Computer, and any future skills-compatible platform.
  2. Codex skill conventions (developers.openai.com/codex/skills/) — ensures optimal behavior within the primary execution environment.

2.2 Portable skill structure (agentskills.io)

skill-name/
├── SKILL.md              # Required: frontmatter + instructions
├── scripts/              # Optional: executable code
├── references/           # Optional: documentation
├── assets/               # Optional: templates, resources
└── agents/
    └── openai.yaml       # Optional: Codex-specific UI/policy/dependencies

2.3 SKILL.md frontmatter (agentskills.io spec)

Field | Required | Constraints
name | Yes | Max 64 chars. Lowercase letters, numbers, hyphens only. Must match parent directory name. No leading/trailing/consecutive hyphens.
description | Yes | Max 1024 chars. Describe what the skill does AND when to use it. Include keywords for auto-selection.
license | No | License name or reference to bundled file.
compatibility | No | Max 500 chars. Environment requirements if any.
metadata | No | Arbitrary key-value map (e.g., author, version).
allowed-tools | No | Space-delimited tool list. Experimental.

Example:

---
name: red-team
description: >
  Apply red teaming / devil's advocate reasoning to challenge assumptions,
  find flaws, and stress-test plans. Use when evaluating architecture decisions,
  editorial content, or any proposal that needs adversarial review.
license: MIT
metadata:
  author: codewithshabib
  version: "1.0"
  one-piece-name: "Mihawk the Rival"
---
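The name and description constraints from 2.3 are mechanical enough to check in code. The following is a minimal sketch of such a check, not an official validator; frontmatter_problems is a hypothetical helper and it takes already-parsed values rather than reading SKILL.md itself.

```python
import re

# Lowercase letters and digits, separated by single hyphens; this rules out
# leading, trailing, and consecutive hyphens in one pattern.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def frontmatter_problems(name: str, description: str, parent_dir: str) -> list[str]:
    """Return human-readable violations of the name/description constraints."""
    problems = []
    if len(name) > 64:
        problems.append("name exceeds 64 characters")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must use lowercase letters, numbers, and single hyphens")
    if name != parent_dir:
        problems.append("name must match the parent directory name")
    if not description.strip():
        problems.append("description is required")
    if len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    return problems
```

Run against the example above (name red-team in a directory named red-team), the check returns an empty list.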

2.4 Codex-specific extension (agents/openai.yaml)

Only add when needed. This file is ignored by non-Codex platforms.

interface:
  display_name: "Red Team"
  short_description: "Devil's advocate reasoning"
  icon_small: "./assets/icon.svg"
  brand_color: "#DC2626"

policy:
  allow_implicit_invocation: true

dependencies:
  tools: []

2.5 Progressive disclosure budget

Layer | Token budget | Loaded when
Metadata (name + description) | ~100 tokens | Startup, all skills
Full SKILL.md body | < 5000 tokens (~500 lines) | Skill activated
references/, scripts/, assets/ | As needed | Explicitly referenced
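A rough budget check for a SKILL.md body can be sketched as follows. The four-characters-per-token estimate is a common rule of thumb and an assumption here, not a real tokenizer; body_budget_report is a hypothetical helper.

```python
def body_budget_report(body: str, max_tokens: int = 5000, max_lines: int = 500) -> dict:
    """Estimate whether a SKILL.md body fits the progressive disclosure budget.

    Assumption: roughly 4 characters per token. Use a real tokenizer if the
    body is close to the limit.
    """
    lines = len(body.splitlines())
    estimated_tokens = len(body) // 4
    return {
        "lines": lines,
        "estimated_tokens": estimated_tokens,
        "within_budget": estimated_tokens < max_tokens and lines <= max_lines,
    }
```

When the report says the body is over budget, the usual move is to push detail into references/ so it loads only when explicitly referenced.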

2.6 Portability rules


3. Architecture pattern to use

Use this layered model:

  1. AGENTS.md holds standing project rules.
  2. Slash command or $skill invocation starts the workflow.
  3. Top-level orchestration skill defines the reusable workflow contract.
  4. Specialized skills handle bounded expert work.
  5. Multi-agent parallelizes only the lanes that truly benefit from parallelism.
  6. Deterministic validation gates run before packaging or integration.
  7. Human review remains the final judgment step.
  8. Automations schedule recurring runs.
  9. External orchestration handles downstream side effects when reliability matters.

Key rule: Interactive invocation, reusable capability, parallel execution, validation, scheduling, and side effects should not all live in one prompt.


4. Decision framework

Use AGENTS.md when

Examples:

Use a skill when

Examples:

Use slash commands or $skill when

Examples:

Use multi-agent when

Examples:

Use automations when

Examples:

Use external orchestration when

Examples:


5. Personal operating procedure

5.1 Before starting

  1. Start from a clean tree.
  2. Rebase or update main first.
  3. Confirm the task belongs to one bounded workflow.
  4. Confirm AGENTS.md reflects current project rules.
  5. Confirm the necessary skills exist and are still valid.
  6. Decide whether the task is interactive, scheduled, or both.
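Step 1 of the checklist above can be automated. The sketch below relies on git status --porcelain printing nothing when the working tree has no changes or untracked files; the helper names are hypothetical.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """A tree is clean when `git status --porcelain` prints nothing."""
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in `repo_dir` and report whether the working tree is clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return is_clean(result.stdout)
```

Calling working_tree_is_clean() from the repo root before launching a workflow turns the first checklist item into a hard stop rather than a habit.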

5.2 Starting the workflow

Use one of these entry modes:

Rule: the workflow should always have one clear entry point.

5.3 Orchestration

The top-level orchestration skill should do the following:

Rule: the orchestration skill coordinates; it should not become a giant catch-all skill that owns everything.

5.4 Parallel work

Use multi-agent only when justified.

Preferred first use cases:

Avoid or minimize parallelism when:

5.5 Validation

Before packaging or integration, run deterministic checks.

Examples:

Rule: “looks good” is not a gate.
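One way to keep gates deterministic is to express each as a named function that returns pass or fail, and to block packaging unless all pass. This is a minimal sketch; the gate names in the usage note are hypothetical placeholders for real checks such as linters or test runs.

```python
from typing import Callable, NamedTuple


class GateResult(NamedTuple):
    name: str
    passed: bool


def run_gates(gates: dict[str, Callable[[], bool]]) -> list[GateResult]:
    """Run every gate, even after a failure, so the report is complete."""
    return [GateResult(name, bool(check())) for name, check in gates.items()]


def all_passed(results: list[GateResult]) -> bool:
    """Packaging may proceed only when every gate passed."""
    return all(result.passed for result in results)
```

For instance, run_gates({"frontmatter-valid": lambda: True, "tests-green": lambda: False}) yields one failing result, so all_passed returns False and the workflow stops before packaging.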

5.6 Human review

Human review remains the final control point.

Check:

5.7 Recurring execution

When the workflow should run on a schedule:

  1. Create a Codex automation.
  2. Prefer worktree mode for Git repos when isolation matters.
  3. Include the orchestration skill explicitly with $skill-name.
  4. Review findings in the automation inbox.
  5. Adjust model or reasoning settings only if needed.

5.8 Downstream actions

For publish, PR, notify, or chain-next-step behavior:


6. Workflow shape


7. What not to do


8. Capability summary table

Capability | Official status in March 2026 | Best use
AGENTS.md | Documented, stable | Standing project rules
Skills (agentskills.io spec) | Documented, stable, open standard | Reusable portable workflows and capability bundles
agents/openai.yaml | Codex-specific extension | UI metadata, invocation policy, MCP dependencies
Slash commands | Documented in app | Interactive operator control
$skill invocation | Documented in app/CLI | Explicit skill launch
Subagents | Documented | Parallel bounded tasks
Automations | Documented in app | Scheduled recurring background runs
Worktree automations | Documented | Isolated recurring runs in Git repos
Hooks engine | Mentioned in CLI changelog, experimental | Emerging; do not rely on broadly
Deeplinks (codex://) | Documented in app | Direct navigation to app features

9. Source list


Appendix: SOP change log

Version | Date | Changes
1.0 | 2026-03-15 | Initial SOP
2.0 | 2026-03-15 | Added Agent Skills open standard (agentskills.io) as dual-compliance requirement. Added Section 2 (Skill authoring standard) with portable structure, frontmatter spec, Codex extension, progressive disclosure budget, and portability rules. Added slash command reference list. Added deeplinks. Added AGENTS.md discovery model and size limit. Added multi-agent platform support status (CLI-only). Added concurrency limits. Added automation sandbox modes. Updated capability table with stable/experimental status clarification. Added portability anti-pattern to Section 7.
2.1 | 2026-03-17 | Updated subagent/platform guidance: CLI and Codex App visibility, IDE extension visibility pending, custom-agent TOML location under .codex/agents/*.toml, explicit-operator delegation rule, and source link moved from multi-agent to subagents docs.