CodeWithShabib Agentic Workflow Master Plan

Last updated: 2026-03-15
Document type: Unified Master Plan (PRD + TRD)
Version: 3.3 (validated against March 2026 official documentation)


Document Location and Versioning

This Master Plan is stored in two locations:

  1. Primary (versioned): docs/master-plan.md in the shabib87/shabib87.github.io repository. Changes to this file are tracked by git and follow the same PR workflow as all repo changes. Major revisions should reference an ADR.
  2. Reference copy in Linear: Attach the current version as a Linear document or paste into the project description of [ORCHESTRATION] Agentic Workflow Design. This is a read reference only — the repo copy is authoritative.

The plan is NOT stored only in Linear because:


Purpose

This document is the single source of truth for the agentic operating model for the shabib87/shabib87.github.io repository. It consolidates the current-state audit, constraints, target architecture, Linear operating model, Codex orchestration design, editorial quality system, CI strategy, ADR strategy, Slack setup, and the implementation backlog discussed across the planning sessions.

The goal is to run this repository like a small specialist team:


North Star

Desired experience

A new idea should flow like this:

  1. Start a chat from Perplexity (iOS, Android, Mac, or web) or ChatGPT (iOS, Android, Mac, or web).
  2. Create a Linear issue from the conversation (Perplexity via Linear connector; ChatGPT via conversation).
  3. The Linear issue body becomes the execution brief.
  4. If needed, the parent issue is decomposed into smaller sub-issues.
  5. A GitHub workflow_dispatch is triggered (from ChatGPT web/Mac via Codex sidebar, Perplexity via GitHub connector, or gh CLI). The dispatch workflow prepares a branch, fetches the Linear brief, and writes docs/tasks/CWS-NNN.md.
  6. Shabib starts Codex (CLI, Mac app, IDE extension, or Codex Cloud via ChatGPT web/iOS sidebar) with the correct orchestrator prompt.
  7. Codex agents perform the scoped work and open a draft PR.
  8. Local hooks catch most failures before push.
  9. CI validates the PR with the correct pipeline.
  10. Graphite manages stacks when work is split across multiple PRs.
  11. Linear tracks progress, ownership, and review state.
  12. GitHub Pages auto-deploys when merged to main.

Human-in-the-loop: exactly two touchpoints

The only manual steps in the steady-state workflow are:

  1. Idea creation — drafting a blog post idea, PRD, or task description in a Perplexity or ChatGPT conversation (iOS, Android, Mac, or web), then creating the Linear issue.
  2. Final review before merge — reviewing the draft PR, running manual judgment checks if editorial, and approving/merging.

Everything between those two steps is automated, except starting Codex (see the note that follows): dispatch, branch prep, task file creation, Codex notification, agent execution, PR creation, CI validation, and Linear status sync.

Note: Codex execution itself is currently a manual trigger (Shabib opens Codex with the task). This is the one intermediate step that cannot be fully automated yet — there is no public API to programmatically start a Codex task as of March 2026. The dispatch workflow should post a Slack notification and Linear comment so Shabib knows a task is ready for Codex pickup.

Operating philosophy


Definition of Ready

A ticket must meet ALL of the following criteria before an agent can pick it up.

Definition of Ready for agent-pickable tickets:

  1. Linear issue exists with status = Todo and label = agent-task
  2. Title follows convention: [DEV], [EDITORIAL-NEW], or [EDITORIAL-UPDATE] prefix
  3. Description is complete: Contains the structured execution brief (goal, acceptance criteria, scope boundaries, files expected to change)
  4. Acceptance criteria are testable: Each criterion maps to a deterministic check (test passes, lint passes, build succeeds) or an explicit human judgment check (marked as [HUMAN-REVIEW])
  5. Scope is bounded: Single concern, estimated ≤ 1 Codex session. If larger, decompose into sub-issues first.
  6. Branch is prepared: docs/tasks/CWS-NNN.md exists on a prepared branch (created by dispatch workflow)
  7. Dependencies are resolved: No blockers from other issues. If the issue depends on another, that issue must be Done.
  8. Required skills exist: If the task needs a specific Codex skill, that skill must already be committed to the repo.
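Several of the criteria above can be checked mechanically before a ticket is handed to an agent. The following is a minimal sketch, not an agreed implementation: the issue dictionary shape (`status`, `labels`, `title`, `description`, `blocked_by`) and the section keywords are assumptions made for illustration, not fields taken from the Linear API.

```python
# Title prefixes taken from criterion 2 of the Definition of Ready.
TITLE_PREFIXES = ("[DEV]", "[EDITORIAL-NEW]", "[EDITORIAL-UPDATE]")

# Hypothetical keywords for the structured brief; the plan lists the
# required content but does not fix the exact headings.
REQUIRED_SECTIONS = ("goal", "acceptance criteria", "scope", "files")


def dor_failures(issue: dict) -> list:
    """Return human-readable reasons the issue is not ready (empty = ready)."""
    failures = []
    if issue.get("status") != "Todo":
        failures.append("status must be Todo")
    if "agent-task" not in issue.get("labels", []):
        failures.append("missing agent-task label")
    if not issue.get("title", "").startswith(TITLE_PREFIXES):
        failures.append("title lacks a workflow prefix")
    description = issue.get("description", "").lower()
    for section in REQUIRED_SECTIONS:
        if section not in description:
            failures.append(f"description missing '{section}'")
    # Criterion 7: every dependency must already be Done.
    for dep in issue.get("blocked_by", []):
        if dep.get("status") != "Done":
            failures.append(f"blocked by {dep.get('id')}")
    return failures
```

Criteria that need judgment (bounded scope, testable acceptance criteria) are deliberately left out of this sketch; they remain a human or orchestrator check.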

Who enforces DoR:


Verified Current State

Repository reality

The repository already has meaningful automation and orchestration foundations:

Existing content posture

The blog has a mix of older iOS-era posts and stronger recent principal-level posts. This means the repo needs two editorial tracks:

Existing CI posture

There are already GitHub Actions workflows in place, but they are too coarse and not yet split cleanly into development vs editorial responsibilities.

Existing Linear posture

The Linear workspace already has a team created and the team key is now CWS. There are default onboarding issues in the team that should be archived before the real backlog is created.


Hard Constraints

These constraints are mandatory and drive every design choice.

Cost constraints

Shabib’s device profile

| Device | OS | Role in workflow |
| --- | --- | --- |
| Mac (desktop/laptop) | macOS | Primary development, Codex (CLI, App, IDE extension), GitHub, Linear, Graphite CLI, local Jekyll builds |
| iPhone | iOS | Perplexity, ChatGPT (+ Codex Cloud sidebar), Linear, Slack, GitHub: idea capture and task triage on the go |
| Android phone | Android | Perplexity, ChatGPT, Linear, Slack: secondary mobile surface, same capabilities as iOS |
| GitHub-hosted runners | Linux | CI only: runs GitHub Actions workflows. Not a user-facing surface. |

No Windows machines. All tooling, scripts, Makefiles, and documentation must assume macOS for local development and Linux for CI. No PowerShell, no .bat/.cmd files, no Windows path separators.

Scripting language policy

The repo currently has:

Policy:

Platform constraints

Workflow constraints


Current Gaps and Problems

1. _drafts/ is gitignored

The _drafts/ folder is currently in .gitignore, so agents cannot create or collaborate on draft posts in branches. Fix: remove _drafts/ from .gitignore and commit the folder. Jekyll does not publish drafts unless the build is run with --drafts or show_drafts is enabled in the site configuration.

2. CI is not yet intentionally split

Current workflows are useful but not optimized for cost or clarity. The system needs separate pipelines for:

3. Local validation is not yet the primary gate

The correct cost-aware pattern is:

4. TDD is not yet first-class everywhere

Scripts exist, but not every script has complete test coverage. Tests need to become mandatory for all automation, including editorial validation scripts.

5. Editorial quality is only partially codified

The desired editorial quality system includes:

Only some of that is currently formalized.

6. Prompt and agent flow need stronger contracts

The repo already has prompts and agents, but they need a more explicit contract model tied to Linear issues, per-task docs/tasks/CWS-NNN.md input, and test-first behaviour.

7. Repo organization can be clearer

The repo is powerful but dense. It needs better top-level organization and a formal ADR structure so architectural choices are recorded and discoverable.

8. Slack visibility is not yet wired

Slack should be used for free visibility, but only with integrations available on free plans. The 10-integration limit must be budgeted carefully.

9. Task file is global, not per-issue

The current CODEX_TASK.md is a single file at a fixed location. It should be per-task under docs/tasks/ with the Linear issue ID in the filename, persisted as a historical record after merge.


Trigger Surfaces — Validated March 2026

Perplexity (iOS, Android, Mac, Web)

Available surfaces: iOS app, Android app, Mac app (Perplexity Computer), web app (perplexity.ai).

Connected integrations (already active):

Use for:

ChatGPT (iOS, Android, Mac, Web)

Available surfaces: iOS app, Android app, Mac desktop app, web app (chatgpt.com).

Capabilities:

Use for:

Important limitation: ChatGPT’s GitHub integration is read-only. To trigger workflow_dispatch, use one of:

  1. Perplexity (via GitHub connector)
  2. gh workflow run from CLI
  3. GitHub REST API call
  4. GitHub Actions UI

Codex (CLI, Mac App, IDE Extension, Cloud)

Available surfaces as of March 2026:

| Surface | Platform | Status |
| --- | --- | --- |
| Codex CLI | macOS (user), Linux (CI-only) | Stable, open-source, Rust-based |
| Codex Mac App | macOS (Apple Silicon) | Stable since Feb 2026 |
| Codex IDE Extension | VS Code, Cursor, Windsurf, JetBrains (macOS) | Stable |
| Codex Cloud | Web (chatgpt.com/codex), ChatGPT iOS/Android sidebar | Stable |

Key capabilities:

Trigger model: Manual. Shabib starts Codex after the dispatch workflow prepares the branch and task file. No programmatic trigger API exists.

Use only after the issue exists and the branch has been prepared with docs/tasks/CWS-NNN.md.

Codex is not the planner of record. Linear is.


Canonical Flow

Flow A: Development workflow

  1. User discusses a repo/process/site improvement on Perplexity or ChatGPT (iOS, Android, Mac, or web).
  2. Assistant creates a [DEV] Linear issue using the structured template (Perplexity via Linear connector; ChatGPT via conversation then manual creation).
  3. If the task crosses more than two concern layers, it is decomposed into sub-issues.
  4. Workflow dispatch is triggered via one of: Perplexity GitHub connector, gh workflow run, GitHub API, or GitHub UI.
  5. GitHub dispatch workflow fetches the Linear issue body and writes docs/tasks/CWS-NNN.md into a new branch. Posts a Slack notification and Linear comment that the task is ready for Codex.
  6. Shabib opens Codex (CLI, app, IDE extension, or Cloud) with the dev orchestrator prompt, pointing to the prepared branch.
  7. Codex reads docs/tasks/CWS-NNN.md, verifies a failing test exists or writes it first.
  8. Codex implements the task and opens a draft PR.
  9. Local hooks and CI validate.
  10. Shabib reviews, approves, and merges.

Flow B: Editorial-new workflow

  1. User has a new blog idea on Perplexity or ChatGPT (iOS, Android, Mac, or web).
  2. Assistant creates a [EDITORIAL-NEW] Linear issue using the structured template.
  3. If needed, work is decomposed into sub-issues or a short stack.
  4. Dispatch is triggered (same mechanisms as Flow A).
  5. GitHub writes docs/tasks/CWS-NNN.md to a prepared branch.
  6. Shabib opens Codex with the editorial-new orchestrator prompt.
  7. Agents create content in _drafts/ first, then move through research, drafting, editing, fact-check framing, and publishing prep. When ready, the post is moved from _drafts/ to _posts/ with proper front matter and date.
  8. Automated editorial checks run locally and in CI.
  9. Manual judgment checks are run by Shabib via Codex prompts.
  10. Shabib merges when satisfied.

Flow C: Editorial-update workflow

  1. User identifies an old post that needs SEO/UX refresh on Perplexity or ChatGPT (any surface).
  2. Assistant creates an [EDITORIAL-UPDATE] Linear issue.
  3. Dispatch writes docs/tasks/CWS-NNN.md.
  4. Shabib runs Codex with the editorial-update orchestrator prompt.
  5. The historical-post-editor applies metadata/UX/SEO-safe changes only.
  6. Automated checks run.
  7. SEO review and final manual sign-off occur.
  8. Shabib merges.

Per-Task File Strategy

Why per-task files

A single CODEX_TASK.md creates conflicts when multiple tasks are in flight and loses history after each run. Per-task files solve both problems.

Location and naming

docs/tasks/CWS-NNN.md

Where NNN is the Linear issue number (e.g., docs/tasks/CWS-42.md).
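The naming rule can be enforced in one place so that scripts never build task paths by hand. A minimal sketch, assuming issue IDs always have the form `CWS-` followed by a positive number; the function name `task_file_path` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Linear identifiers for this team look like CWS-42 (team key plus number).
ISSUE_ID = re.compile(r"CWS-[1-9][0-9]*")


def task_file_path(issue_id: str) -> PurePosixPath:
    """Map a Linear issue ID to its per-task file under docs/tasks/."""
    if not ISSUE_ID.fullmatch(issue_id):
        raise ValueError(f"not a CWS issue ID: {issue_id!r}")
    return PurePosixPath("docs/tasks") / f"{issue_id}.md"
```

Rejecting malformed IDs up front keeps stray input such as `../secrets` from ever reaching the filesystem.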

Lifecycle

| Phase | State |
| --- | --- |
| Dispatch workflow runs | File created at docs/tasks/CWS-NNN.md with Linear issue content |
| Codex picks up task | Agent reads from docs/tasks/CWS-NNN.md |
| Task complete, PR merged | File persists as historical record |

File structure

# CWS-NNN: [Issue Title]

## Linear Issue

- **ID:** CWS-NNN
- **URL:** https://linear.app/codewithshabib/issue/CWS-NNN
- **Workflow:** [Dev | Editorial-New | Editorial-Update]
- **Executor:** [Agent | Human | Hybrid]
- **Created:** YYYY-MM-DD

## Brief

[Full issue description from Linear]

## Acceptance Criteria

[Extracted from issue body]

## Labels

[Labels from Linear]

## Decomposition

[Sub-issues if any, with their CWS IDs]
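The template above could be filled in by a small renderer once the issue fields have been fetched. This is a sketch under stated assumptions: the function name and parameters are hypothetical, acceptance criteria and labels are assumed to arrive as lists of strings, and empty lists are rendered as `None`.

```python
from datetime import date


def _bullets(items):
    """Render a list as markdown bullets, or 'None' when the list is empty."""
    return [f"- {item}" for item in items] or ["None"]


def render_task_file(issue_id, title, workflow, executor, brief,
                     acceptance, labels, sub_issues=(), created=None):
    """Render the per-task markdown file from already-fetched issue fields."""
    created = created or date.today()
    lines = [
        f"# {issue_id}: {title}", "",
        "## Linear Issue", "",
        f"- **ID:** {issue_id}",
        f"- **URL:** https://linear.app/codewithshabib/issue/{issue_id}",
        f"- **Workflow:** {workflow}",
        f"- **Executor:** {executor}",
        f"- **Created:** {created.isoformat()}", "",
        "## Brief", "", brief, "",
        "## Acceptance Criteria", "", *_bullets(acceptance), "",
        "## Labels", "", *_bullets(labels), "",
        "## Decomposition", "", *_bullets(sub_issues),
    ]
    return "\n".join(lines) + "\n"
```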

Git tracking

The docs/tasks/ directory is tracked by git. Task files are committed on the prepared branch by the dispatch workflow and persist through merge to main.

Agent contract

Every orchestrator prompt must reference the task file by its per-task path:

Read the execution brief from docs/tasks/CWS-NNN.md (the specific file path is provided when Codex is started).


Linear Operating Model

Team

Clean-up step

Archive the default Linear onboarding issues before creating the real backlog:

These are onboarding artifacts, not project work.

Statuses

Use the existing team workflow states:

Label taxonomy

Executor

Exactly one of:

Bottleneck

Exactly one of:

Workflow

Exactly one of:

Stage

Exactly one of:

Type / focus tags

Reusable labels:

Ownership model

Human means

A task requires Shabib to:

Agent means

A task is safe for Codex to execute with the right prompt and constraints.

Hybrid means

An agent can do most of the implementation, but the task includes a human checkpoint or approval step.

Linear ↔ GitHub integration

Linear’s native GitHub integration (free, included) provides:

Configure per-team workflow automations in Linear settings.


Branching and PR Conventions

Canonical branch names

The CWS-NNN in the branch name triggers Linear’s GitHub integration to auto-link the PR.
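A local hook or CI step can verify that a branch name carries an issue ID before the PR is opened. The sketch below only approximates the matching for validation purposes; it is not Linear's own implementation, and the case-insensitive match is an assumption.

```python
import re
from typing import Optional

# The ID must not be glued to a preceding letter or digit (e.g. "XCWS-1").
BRANCH_ISSUE = re.compile(r"(?<![A-Za-z0-9])CWS-(\d+)", re.IGNORECASE)


def issue_id_from_branch(branch: str) -> Optional[str]:
    """Return the normalized CWS-NNN found in a branch name, or None."""
    match = BRANCH_ISSUE.search(branch)
    return f"CWS-{int(match.group(1))}" if match else None
```

A pre-push hook could refuse to push when this returns `None` for any branch other than main.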

PR body requirement

Every PR must include:

Stacked PR strategy

Rule of thumb

Concern layers

Examples of layers:

Examples

Single PR examples:

Stack examples:

Graphite free tier scope

On the Hobby (free) tier, Graphite provides:

Not available on free tier:

Stack merge on free tier uses standard GitHub merge — Graphite CLI handles rebase ordering.


Repo Structure Strategy

Existing important directories

Target organization improvements

Keep

Add / improve


TDD Policy

Core rule

No automation or validation code ships without tests.

Applies to

Test layout

Required behaviour

Agent contract update

The developer agent instructions must explicitly say:

Before writing any implementation, create a failing test that covers the acceptance criteria from the Linear issue. Do not open a PR without test coverage for every new or modified script.


Local Hooks Strategy

pre-commit

Purpose: catch fast editorial issues before a commit exists.

Run on staged _posts/** and _drafts/** files only
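Restricting the hook to editorial files is a simple path filter. A minimal sketch, assuming the staged paths come from something like `git diff --cached --name-only`; the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Only these top-level directories count as editorial content.
EDITORIAL_DIRS = ("_posts", "_drafts")


def editorial_files(staged_paths):
    """Keep only staged paths that live under _posts/ or _drafts/."""
    selected = []
    for path in staged_paths:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in EDITORIAL_DIRS:
            selected.append(path)
    return selected
```

When the filtered list is empty, the hook can exit immediately, which keeps non-editorial commits fast.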

pre-commit behavior

pre-push

Purpose: catch expensive failures before remote CI.

Run before every push

pre-push behavior

Installation


CI Architecture

Principles

dev-pipeline.yml

Trigger paths:

dev-pipeline job order

  1. test
  2. lint
  3. security
  4. governance
  5. jekyll-build

dev-pipeline notes

editorial-pipeline.yml

Trigger paths:

editorial-pipeline job order

  1. test
  2. spell-check
  3. grammar-and-style
  4. markdown-lint
  5. seo-audit

editorial-pipeline notes

Code security

Security scanning strategy:

Layer 1: Semgrep CE (local + CI)

Layer 2: CodeQL (CI only)

Layer 3: Dependabot (already available)

Security = first-class guardrail rule:

codex-dispatch.yml

Trigger: workflow_dispatch with inputs issue_id and workflow_type.

See Dispatch Workflow Design section for full spec.

notify-merge.yml

A lightweight merge notification workflow posts to Slack when commits land on main.

Optional: codex-pr-review.yml

Use openai/codex-action@v1 for automated PR review comments. This is the only CI workflow that should use an OpenAI API key. Scoped to pull_request events only. Runs Codex in a sandboxed read-only mode to post review feedback, not execute tasks.


Editorial Quality System

Automated checks (deterministic, free, CI-friendly)

These are the baseline editorial tests:

  1. Front matter schema
  2. Markdown lint
  3. Spelling
  4. Grammar
  5. Storytelling structure
  6. Tone and writing style
  7. SEO audit

Note

The original discussion grouped these into six automated checks. The implemented system should treat storytelling, tone and style, and SEO as distinct validation concerns, even where they share the same Vale engine.

Manual judgment checks (agent-assisted, human-triggered)

These are required before editorial merge:

  1. Audience alignment check
  2. Fact check
  3. Authority check
  4. Final editorial sign-off

Why manual?

Because these require judgment, not just rule matching. They should be run as manual Codex prompts and reviewed by Shabib before merge.

Editorial draft workflow

New posts follow the Jekyll _drafts/ convention:

  1. Agent creates post in _drafts/ (no date prefix required per Jekyll convention).
  2. Automated and manual checks run against the draft.
  3. When approved, agent moves the file to _posts/ with the proper YYYY-MM-DD- date prefix.
  4. PR is ready for final merge review.
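Step 3 amounts to a path rename with a date prefix. A minimal sketch, assuming the draft filename is already the final slug; the function name is hypothetical and the actual move (for example via `git mv`) is left to the caller.

```python
from datetime import date
from pathlib import PurePosixPath


def published_path(draft_path: str, publish_date: date) -> str:
    """Compute the _posts/ path for a draft, adding the YYYY-MM-DD- prefix."""
    draft = PurePosixPath(draft_path)
    if draft.parts[:1] != ("_drafts",):
        raise ValueError(f"not a draft path: {draft_path!r}")
    return str(PurePosixPath("_posts") / f"{publish_date.isoformat()}-{draft.name}")
```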

Editorial Standards to Encode

Voice profile

A new file is required:

It should define:

Authority rubric

A new file is required:

It should score:

  1. Non-obvious perspective — does this say more than a search result summary?
  2. Concrete opinion — does the piece take a position?
  3. Experience signal — does it reflect real trade-offs, not abstract restatement?
  4. Principal-level depth — does it go beyond mechanics into reasoning and consequences?

Vale custom rules

Create a style pack in:

Rules to include:


Codex Multi-Agent Design

Roles already present in the repo

The repo already has agent definitions for:

Agent Identity System

Each agent has a One Piece character identity that maps to their role. These identities are used in Codex config.toml agent definitions, skill descriptions, prompt headers, and Slack notifications for personality and quick identification.

| Agent Name | One Piece Character | Role | Rationale |
| --- | --- | --- | --- |
| Luffy the Captain | Monkey D. Luffy | team-lead / orchestrator | Captain who sets direction and delegates to the crew |
| Zoro the Swordsman | Roronoa Zoro | developer | First mate: cuts through problems with raw skill and discipline |
| Nami the Navigator | Nami | researcher | Charts the course: finds information, maps the territory |
| Sanji the Cook | Sanji | writer | Crafts something nourishing from raw ingredients: turns research into prose |
| Chopper the Doctor | Tony Tony Chopper | editor | Diagnoses problems and heals the content: fixes what’s broken |
| Robin the Scholar | Nico Robin | fact-checker | Archaeologist who deciphers truth from history and sources |
| Franky the Shipwright | Franky | publisher-release | Builds and maintains the ship: packaging, deployment, CI |
| Brook the Musician | Brook | seo-expert | Brings life and rhythm to content: SEO, discoverability, audience reach |
| Jinbe the Helmsman | Jinbe | historical-post-editor | Steady hand that steers legacy content through safe waters without capsizing |

Agent names follow the format <Name> the <Role>.

The naming convention is extensible. New agents (including reasoning agents) follow the same pattern using One Piece characters whose traits match the role.

Agent Safety Rules

Loop prevention

If an agent attempts the same action 3 times with the same inputs and gets the same result, it MUST stop immediately. The agent MUST:

  1. Summarize what was attempted and what failed.
  2. Post the summary as a comment on the Linear issue.
  3. Set the issue status to Blocked.
  4. Do NOT retry, do NOT attempt a workaround.
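The three-strikes rule can be made concrete with a small counter keyed on the action, its inputs, and its result. This is an illustrative sketch only: `LoopGuard` is a hypothetical helper, and how an agent would serialize its inputs and results into strings is left open.

```python
from collections import Counter

# The rule above: stop on the third identical attempt.
MAX_IDENTICAL_ATTEMPTS = 3


class LoopGuard:
    """Counts identical (action, inputs, result) attempts within one session."""

    def __init__(self):
        self._seen = Counter()

    def record(self, action: str, inputs: str, result: str) -> bool:
        """Record an attempt; return True when the agent must stop."""
        key = (action, inputs, result)
        self._seen[key] += 1
        return self._seen[key] >= MAX_IDENTICAL_ATTEMPTS
```

A changed input or a different result starts a new count, so genuine progress is never mistaken for a loop.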

For scripts: All retry-capable scripts in scripts/ MUST include a retry counter (max 2 retries). After exhausting retries, the script MUST exit with a non-zero code and a human-readable error message. Silent infinite retries are prohibited.
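The script-level policy translates into a bounded retry wrapper. A minimal sketch in Python, assuming "max 2 retries" means one initial attempt plus at most two more; the helper name `run_with_retries` is hypothetical, and the same shape applies to shell scripts.

```python
import sys

MAX_RETRIES = 2  # one initial attempt plus at most two retries


def run_with_retries(operation, max_retries=MAX_RETRIES):
    """Call operation(); retry on failure, then exit non-zero with a message."""
    last_error = None
    for _attempt in range(1 + max_retries):
        try:
            return operation()
        except Exception as error:  # a sketch; real scripts should narrow this
            last_error = error
    print(f"error: gave up after {max_retries} retries: {last_error}",
          file=sys.stderr)
    raise SystemExit(1)
```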

Stuck-agent escalation

If an agent cannot make progress for any reason (missing file, permission error, ambiguous requirement), it MUST:

  1. Stop work immediately.
  2. Write a comment on the Linear issue describing the blocker.
  3. Post a notification to #codex-runs via Slack webhook.
  4. Do NOT guess or improvise around the blocker.

Concurrency limits

Set explicit limits in config.toml:

[agents]
max_threads = 3
max_depth = 1

The orchestrator skill MUST include:

CONCURRENCY LIMIT: Never delegate more than 3 parallel agents.
If the plan has more than 3 parallel lanes, batch them:
- Turn 1: Launch lanes 1-3, wait for results.
- Turn 2: Launch remaining lanes.
- Final: Synthesize all results.
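The batching instruction generalizes to any number of lanes by chunking the plan into groups of at most three. A minimal sketch; the function name is hypothetical and lanes are represented as plain strings.

```python
MAX_PARALLEL_AGENTS = 3  # mirrors max_threads in config.toml


def batch_lanes(lanes, limit=MAX_PARALLEL_AGENTS):
    """Split parallel lanes into turns of at most `limit` agents each."""
    return [lanes[i:i + limit] for i in range(0, len(lanes), limit)]
```

With more than six lanes this produces a third turn rather than letting the last turn exceed the limit.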

Agent Configuration Structure

Each agent is defined by three layers, separating identity, behavior, and function:

.codex/agents/<agent-name>/
├── config.toml          # Functional: tools, permissions, model assignment
├── soul.md              # Behavioral: identity, values, guardrails, personality
└── instructions.md      # Operational: what to do, how to do it, boundaries

soul.md defines who the agent IS — personality, values, and behavioral guardrails. Inspired by DeerFlow’s SOUL.md pattern. This is separate from functional instructions.

Example soul.md for Zoro the Swordsman (developer):

# Zoro the Swordsman — Developer

You are Zoro the Swordsman — disciplined, direct, and relentless.

## Values

- Precision over speed. Measure twice, cut once.
- Tests come before code. Always.
- Your scope is your boundary. Stay in your lane.

## Guardrails

- You MUST NOT start coding until tests are written.
- You MUST NOT touch files outside your assigned scope.
- You MUST NOT merge or approve PRs — only create them.
- You MUST NOT modify editorial content (_posts/, _drafts/, _pages/).
- If you're stuck, you report the obstacle — you MUST NOT hack around it.

instructions.md defines the operational boundaries using MUST NOT language:

## Boundaries

- MUST NOT modify: _posts/, _drafts/, _pages/, docs/editorial/
- MUST NOT run: make publish-draft, make qa-publish, make finalize-merge
- MUST NOT merge, approve, or close PRs
- MUST NOT modify AGENTS.md, config.toml, or other agent configs
- MUST report and STOP if a task requires changes outside these boundaries

Each agent role gets its own boundary set. See the full boundary matrix below.

Agent Boundary Matrix

| Agent | MUST NOT modify | MUST NOT run | Special restrictions |
| --- | --- | --- | --- |
| Zoro (developer) | _posts/, _drafts/, _pages/, docs/editorial/ | make publish-draft, make qa-publish | MUST NOT touch editorial content |
| Sanji (writer) | scripts/, .github/, Makefile, config files | make setup, make ci-setup | MUST NOT touch infrastructure |
| Chopper (editor) | scripts/, .github/, Makefile, config files | make setup, make ci-setup | MUST NOT touch infrastructure |
| Robin (fact-checker) | ALL files (read-only) | Any write commands | MUST NOT modify any file; only report findings |
| Nami (researcher) | ALL files (read-only) | Any write commands | MUST NOT modify any file; only report findings |
| Franky (publisher) | _posts/ content, _drafts/ content | N/A | MUST NOT change post body prose; only packaging/CI |
| Brook (SEO) | scripts/, .github/, Makefile | make setup | MUST NOT touch infrastructure |
| Jinbe (historical editor) | scripts/, .github/, Makefile, config files | make setup, make ci-setup | MUST NOT change original publish dates |
| Luffy (orchestrator) | Direct file modifications | Direct implementation commands | MUST delegate, MUST NOT execute directly |
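Part of the matrix can be enforced as a path check on a PR's changed files. The sketch below covers only a subset of rows and is an assumption-laden illustration: the lowercase agent keys and the function name are hypothetical, and the vague "config files" entry is omitted because the plan does not enumerate it.

```python
from pathlib import PurePosixPath

# Subset of the "MUST NOT modify" column, keyed by a hypothetical short name.
FORBIDDEN_PREFIXES = {
    "zoro": ("_posts", "_drafts", "_pages", "docs/editorial"),
    "sanji": ("scripts", ".github", "Makefile"),
    "brook": ("scripts", ".github", "Makefile"),
}
# Robin and Nami may only report findings, never write.
READ_ONLY_AGENTS = {"robin", "nami"}


def may_modify(agent: str, path: str) -> bool:
    """Return True if the agent is allowed to modify the given repo path."""
    if agent in READ_ONLY_AGENTS:
        return False
    if agent not in FORBIDDEN_PREFIXES:
        raise KeyError(f"no boundary defined for agent {agent!r}")
    target = PurePosixPath(path)
    for prefix in FORBIDDEN_PREFIXES[agent]:
        root = PurePosixPath(prefix)
        if target == root or root in target.parents:
            return False
    return True
```

Unknown agents raise instead of defaulting to "allowed", so a missing row fails loudly.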

Required prompt set

Dev

Editorial new

Editorial update

Manual judgment prompts

Create:

Orchestrator Design — Two-Phase Execution

The orchestrator skill (Luffy the Captain) separates planning from execution. This is inspired by DeerFlow’s Coordinator/Planner/Executor separation.

Phase 1: Plan

  1. Read the task file (docs/tasks/CWS-NNN.md).
  2. Read docs/agent-context.md for project context.
  3. Produce a numbered execution plan:
    • What agents are needed (by name)
    • What each agent will do (bounded scope)
    • Execution order: parallel vs sequential
    • Validation gates between steps
    • Expected deliverables
  4. Post the plan as a Linear issue comment (permanent record).
  5. Post a short notification to #codex-runs via Slack webhook: “Plan ready for CWS-NNN: [Linear link]”.
  6. STOP and wait for Shabib’s approval.

This is the Option C plan review gate: Linear is the record, Slack is the notification. Shabib reviews the plan in Linear (any device), replies with approval or edits.

Phase 2: Execute

Once the plan is approved (Shabib re-invokes Codex with “Plan approved, proceed” or provides edits):

  1. Execute each step by delegating to the appropriate agent.
  2. Track progress against the numbered plan.
  3. If any step fails, STOP and report — do NOT improvise a workaround.
  4. After all steps complete, run the self-audit checklist (see below).
  5. Create the PR.

Prompt requirements

Every orchestrator prompt MUST:

Clarification Protocol

Before starting work, the orchestrator MUST verify:

  1. SCOPE: Is the task boundary clear? If not, ask: “Should this include X or is X out of scope?”
  2. APPROACH: Is there more than one reasonable approach? If so, ask: “Option A vs Option B — which do you prefer?”
  3. RISK: Does the change touch published content, CI config, or AGENTS.md? If so, ask: “This changes [X] which affects [Y]. Proceed?”
  4. MISSING: Is any required input missing (file path, date, slug)? If so, ask for it.

Ask ONE question at a time. Wait for the answer before proceeding. Do NOT ask clarifying questions for routine tasks fully covered by existing AGENTS.md rules.

Special constraint for editorial-new

New posts MUST be created in _drafts/ first and only moved to _posts/ after all automated checks pass.

Special constraint for editorial-update

The agent MUST NOT:

Self-Audit Checklist

Before creating the PR, the orchestrator MUST verify all of the following. If any item fails, the agent MUST fix it or report the failure — MUST NOT create the PR with known failures.


Agent Context System

Agent context purpose

Codex agents have no cross-session memory. Each task starts from scratch. To provide project continuity, agents read and update a persistent context file.

File: docs/agent-context.md

Format: Markdown (not JSON/JSONL). Rationale:

Structure

# Agent Context — CodeWithShabib

_Last updated: YYYY-MM-DD by [agent-name] during CWS-NNN_

## Current Focus

- [3-5 bullet points of active workstreams]

## Recent Decisions

- [ADR references, recent architectural choices with brief rationale]

## Known Constraints

- [Free tier limits, tooling quirks, unresolved issues, blockers]

## Editorial State

- Posts in draft: [list]
- Recently published: [list]
- Scheduled: [list]

## Open Questions

- [Anything unresolved that the next agent should be aware of]

Rules


Reasoning Agent System

Reasoning agent purpose

Reasoning agents apply structured mental models to improve decision quality across all workflows. They can be invoked:

Mental Model Skills

| Skill Name | Mental Model | When to Use | One Piece Name |
| --- | --- | --- | --- |
| $first-principles | First Principles Thinking | Decompose a problem to its fundamental truths before building up | Vegapunk the Scientist |
| $second-order | Second-Order Thinking | Evaluate consequences of consequences before committing | Shanks the Strategist |
| $socratic | Socratic Questioning | Challenge assumptions through systematic questioning | Rayleigh the Mentor |
| $red-team | Red Teaming / Devil’s Advocate | Actively find flaws, attack the plan, stress-test assumptions | Mihawk the Rival |
| $inversion | Inversion | Work backward from failure: what would make this fail? | Crocodile the Schemer |
| $pareto | 80/20 (Pareto Principle) | Identify the 20% of effort that delivers 80% of value | Doflamingo the Puppeteer |
| $opportunity-cost | Opportunity Cost | What are we giving up by choosing this path? | Garp the Veteran |
| $circle-of-competence | Circle of Competence | Stay within what we know; identify when we’re outside it | Whitebeard the Elder |
| $margin-of-safety | Margin of Safety | Build buffers against what can go wrong | Kuma the Protector |
| $feedback-loop | Feedback Loops | Identify reinforcing and balancing loops in the system | Katakuri the Predictor |
| $bayesian | Bayesian Updating | Update beliefs based on new evidence; avoid anchoring | Dragon the Revolutionary |

Invocation modes

  1. Explicit by operator: $red-team in Codex thread composer, or paste the portable prompt template into Perplexity/ChatGPT
  2. Explicit by operator — combo: “Run $first-principles then $red-team on this architecture decision”
  3. Automatic by orchestrator: The orchestrator skill can invoke reasoning skills based on task context:
    • Architecture decisions → $first-principles + $second-order + $red-team
    • Editorial content → $socratic + $red-team + $circle-of-competence
    • Priority/backlog grooming → $pareto + $opportunity-cost
    • Risk assessment → $inversion + $margin-of-safety
    • Post-incident / retrospective → $feedback-loop + $bayesian
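The automatic mode is effectively a lookup from task context to an ordered list of skills. A minimal sketch of that table; the context keys (`architecture`, `editorial`, and so on) are hypothetical labels chosen for illustration, while the skill names come from the list above.

```python
# Task context -> reasoning skills, in the order they should run.
REASONING_SKILLS = {
    "architecture": ["$first-principles", "$second-order", "$red-team"],
    "editorial": ["$socratic", "$red-team", "$circle-of-competence"],
    "backlog": ["$pareto", "$opportunity-cost"],
    "risk": ["$inversion", "$margin-of-safety"],
    "retrospective": ["$feedback-loop", "$bayesian"],
}


def skills_for(context: str) -> list:
    """Return the skills to invoke for a context; none for unknown contexts."""
    return list(REASONING_SKILLS.get(context, []))
```

Returning an empty list for unknown contexts keeps routine tasks free of reasoning overhead unless the operator asks for it explicitly.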

Skill file structure

Each reasoning skill lives at .agents/skills/<skill-name>/SKILL.md with:

Portable prompt templates

Each mental model also has a companion prompt template at .codex/docs/reasoning-prompts/<model-name>.md that can be copy-pasted into Perplexity or ChatGPT for use outside the repo.

Files to create

Under .agents/skills/:

Under .codex/docs/reasoning-prompts/:


Dispatch Workflow Design

Why dispatch exists

The dispatch workflow is a cheap prep step, not an execution engine. It automates everything between “Linear issue exists” and “Codex is ready to pick up the task.”

codex-dispatch.yml responsibilities

  1. Accept issue_id (Linear issue identifier, e.g., CWS-42) and workflow_type (dev, editorial-new, editorial-update) as workflow_dispatch inputs.
  2. Call scripts/fetch-linear-issue.sh to fetch issue content from Linear API.
  3. Write docs/tasks/CWS-NNN.md with the structured task file template.
  4. Create and push a prepared branch following the naming convention (feature/CWS-NNN-slug, editorial/CWS-NNN-slug, etc.).
  5. Post a Slack notification to the appropriate channel (#codex-runs) that a task is ready.
  6. Post a Linear comment on the issue with a link to the branch and instructions for Codex pickup.

Triggering the dispatch

| Surface | Method |
| --- | --- |
| Perplexity (iOS/Android/Mac/web) | GitHub connector → trigger workflow_dispatch |
| ChatGPT (iOS/Android/Mac/web) | Not directly supported for workflow_dispatch; use Codex Cloud to create a task that calls gh workflow run, or ask Shabib to trigger via CLI |
| CLI (terminal) | gh workflow run codex-dispatch.yml -f issue_id=CWS-42 -f workflow_type=dev |
| GitHub UI | Actions tab → codex-dispatch → Run workflow |
| GitHub REST API | POST /repos/{owner}/{repo}/actions/workflows/codex-dispatch.yml/dispatches |
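For the REST API route, the request body carries the git ref and the two workflow inputs. The sketch below only builds the request and does not send it; the function name, the default `main` ref, and the token handling are assumptions for illustration.

```python
import json
import urllib.request


def dispatch_request(owner, repo, token, issue_id, workflow_type, ref="main"):
    """Build (but do not send) the workflow_dispatch request for codex-dispatch.yml."""
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           "/actions/workflows/codex-dispatch.yml/dispatches")
    body = {
        "ref": ref,  # branch or tag whose workflow file should run
        "inputs": {"issue_id": issue_id, "workflow_type": workflow_type},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the token needs permission to run Actions workflows on the repository.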

March 2026 update: The GitHub workflow_dispatch API now returns run_id in the response when return_run_details=true is passed, making it possible to track the dispatch run programmatically.

scripts/fetch-linear-issue.sh responsibilities

Secret policy

Document required secrets in:

Required secrets:

No OpenAI secrets are used for task execution in CI.

Why LINEAR_API_KEY is needed despite native integrations: Linear’s GitHub integration only syncs issues ↔ PRs via branch naming and magic words — it does NOT provide an API to fetch issue content from within a GitHub Actions workflow. Linear’s Perplexity/ChatGPT connectors work in conversation context only and cannot be called from CI. The dispatch workflow (codex-dispatch.yml) runs in GitHub Actions and needs to fetch the Linear issue body to write docs/tasks/CWS-NNN.md. The only way to do this from CI is Linear’s GraphQL API, which requires a LINEAR_API_KEY. The key is a personal API key (free, generated from Linear Settings → API → Personal API Keys). No paid plan required. The native integrations (GitHub ↔ Linear, Perplexity ↔ Linear, ChatGPT ↔ Linear) handle everything else.
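The fetch itself is a single GraphQL POST. The following sketch builds that request without sending it, and rests on assumptions that should be checked against Linear's API documentation: that the `issue` query accepts the human-readable identifier (for example `CWS-42`) as its `id`, and that the selected fields match the current schema. The function name is hypothetical.

```python
import json
import urllib.request

LINEAR_GRAPHQL_URL = "https://api.linear.app/graphql"

# Assumed field selection; adjust to whatever the task file template needs.
ISSUE_QUERY = """
query Issue($id: String!) {
  issue(id: $id) {
    identifier
    title
    description
    url
    labels { nodes { name } }
  }
}
"""


def linear_issue_request(api_key: str, issue_id: str) -> urllib.request.Request:
    """Build (but do not send) the GraphQL request that fetches one issue."""
    payload = {"query": ISSUE_QUERY, "variables": {"id": issue_id}}
    return urllib.request.Request(
        LINEAR_GRAPHQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Personal API keys are sent as the raw Authorization value.
            "Authorization": api_key,
        },
    )
```

In CI the key would come from the LINEAR_API_KEY secret rather than being passed on a command line.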


Developer Experience (DX) Setup

Three setup targets, all driven through the Makefile:

make setup — New Mac bootstrap

Must handle:

make ci-setup — GitHub Actions runner bootstrap

Runs in CI workflows as a first step. Must handle:

make codex-setup — Codex environment verification

Run at the start of a Codex session to verify the agent has what it needs:


Slack Strategy

Free plan constraints (March 2026)

Integration budget (10 slots)

| # | Integration | Purpose |
| --- | --- | --- |
| 1 | Linear | Issue creation from Slack, status updates, bidirectional comment sync |
| 2 | GitHub | PR notifications, merge alerts, CI status |
| 3 | Incoming Webhooks | Custom notifications from CI workflows (dispatch ready, merge, failures) |
| 4 | Perplexity Computer | AI agent tasks from Slack (already connected) |
| 5-10 | Reserved | Future integrations as needed |

Important: Graphite → Slack integration requires the Starter tier ($20/user/month) and is not available on the free plan. All PR/stack notifications should go through GitHub Actions → Slack webhooks instead.
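
A notification sent from a GitHub Actions step to a Slack incoming webhook is a JSON POST with a `text` field. The Python sketch below shows the shape of that call. The message wording and function names are assumptions, and the webhook URL is expected to come from the SLACK_WEBHOOK_URL secret rather than being hard-coded.

```python
import json
import urllib.request


def dispatch_ready_message(issue_id, branch):
    """Compose the "task is ready" text; the wording here is illustrative only."""
    return f"{issue_id} is ready for Codex pickup on branch {branch}"


def build_slack_request(webhook_url, text):
    """Build (but do not send) the POST for a Slack incoming webhook."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Offline demo with a placeholder URL; nothing is sent.
message = dispatch_ready_message("CWS-42", "feature/CWS-42-demo")
request = build_slack_request("https://example.invalid/webhook", message)
print(request.data.decode("utf-8"))
```

Calling `urllib.request.urlopen(request)` would deliver the message. A CI step should treat a failed notification as a warning so that a Slack outage cannot fail the dispatch itself.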

Channel structure

Channel responsibilities

#ci-dev

#ci-editorial

#linear-updates

#merges

#codex-runs

Supporting docs

Add:


ADR Strategy

Why ADRs matter here

This system is intentionally architectural: agent workflows, CI design, cost constraints, publishing process, branching model, and Linear conventions are all decisions that should be preserved.

Directory

Create:

Helper

Create:

Expose via:

ADR-triggering changes

If a PR touches:

then it should create or reference an ADR.
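
The rule above could be enforced by a small CI check along these lines. This is a sketch in Python under stated assumptions: the trigger patterns, the ADR directory, and the `ADR-NNN` reference format in the demo are hypothetical placeholders, since the actual values are defined elsewhere in this plan.

```python
import fnmatch
import re

# Hypothetical reference format, e.g. "ADR-007" mentioned in the PR description.
ADR_REFERENCE = re.compile(r"\bADR-\d+\b")


def needs_adr(changed_files, trigger_patterns):
    """True if any changed file matches one of the ADR-triggering glob patterns."""
    return any(
        fnmatch.fnmatch(path, pattern)
        for path in changed_files
        for pattern in trigger_patterns
    )


def adr_satisfied(changed_files, pr_body, adr_dir):
    """True if the PR adds or edits a file under adr_dir, or its body cites an ADR."""
    creates = any(path.startswith(adr_dir) for path in changed_files)
    references = bool(ADR_REFERENCE.search(pr_body))
    return creates or references


def check_pr(changed_files, pr_body, trigger_patterns, adr_dir):
    """Pass when no ADR is required, or when one is created or referenced."""
    if not needs_adr(changed_files, trigger_patterns):
        return True
    return adr_satisfied(changed_files, pr_body, adr_dir)


# Demo values are placeholders, not the plan's real trigger list.
patterns = [".github/workflows/*", "Makefile"]
print(check_pr([".github/workflows/ci.yml"], "Refactor CI", patterns, "docs/adr/"))
print(check_pr([".github/workflows/ci.yml"], "See ADR-003", patterns, "docs/adr/"))
```

The first demo call fails the check and the second passes, because only the second PR description cites an ADR.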


Project Portfolio in Linear

Create these projects:

1. [INFRA] Repo Process & Tooling

Purpose:

2. [ORCHESTRATION] Agentic Workflow Design

Purpose:

3. [EDITORIAL] Content Quality System

Purpose:


Linear Task Breakdown Strategy

How to break the Master Plan into Linear work:

Epic structure (3 projects, each with epics):

Project: [INFRA] Repo Process & Tooling

Project: [ORCHESTRATION] Agentic Workflow Design

Project: [EDITORIAL] Content Quality System

Breakdown rules:

  1. Each epic maps to a section of this Master Plan.
  2. Each task under an epic maps to ONE bounded deliverable (a script, a workflow file, a skill, a doc, a Makefile target).
  3. Tasks are labeled agent-task or human-task.
  4. Tasks have explicit acceptance criteria that reference make targets or test commands.
  5. Dependencies between tasks are expressed via Linear’s “blocked by” relation.
  6. Execution order: INFRA → ORCHESTRATION → EDITORIAL (but some tasks can parallel).
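
Rules 5 and 6 together imply a dependency graph: tasks whose blockers are all done can run in parallel, and everything else waits. A minimal Python sketch using the standard library's `graphlib` shows how "blocked by" relations turn into execution waves. The issue IDs are made up for illustration.

```python
from graphlib import TopologicalSorter


def execution_waves(blocked_by):
    """Group tasks into waves; every task in a wave has all of its blockers done.

    blocked_by maps a task ID to the set of task IDs that block it.
    """
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical backlog: CWS-2 and CWS-3 can run in parallel once CWS-1 lands.
blocked_by = {
    "CWS-2": {"CWS-1"},
    "CWS-3": {"CWS-1"},
    "CWS-4": {"CWS-2", "CWS-3"},
}
print(execution_waves(blocked_by))
# [['CWS-1'], ['CWS-2', 'CWS-3'], ['CWS-4']]
```

A cycle in the "blocked by" relations surfaces as an exception, which is a useful early signal that the breakdown needs rework.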

Perplexity Computer can create these Linear issues once the final scope is approved. Each issue will carry a description in the structured execution brief format.


Backlog Summary

This section summarizes the implementation backlog at epic level.

INFRA epics

  1. Remove _drafts/ from .gitignore and track in git
  2. Create label taxonomy for the CWS team
  3. Migrate branch naming to Linear-first convention
  4. Implement local pre-commit and pre-push quality gates
  5. Establish TDD as first-class across the repo
  6. Configure markdownlint, vale, and cspell for Jekyll/editorial quality
  7. Split CI into dev and editorial pipelines
  8. Upgrade the PR template
  9. Establish ADR infrastructure
  10. Wire Slack notifications and workspace/channel setup (budget 10 integrations)
  11. Create supporting docs for tools and secrets

ORCHESTRATION epics

  1. Design structured Linear intake templates for dev, editorial-new, editorial-update
  2. Write the multi-surface trigger guide (Perplexity iOS/Android/Mac/web, ChatGPT iOS/Android/Mac/web, CLI, GitHub UI)
  3. Rewrite all orchestrator prompts to reference per-task files
  4. Build the workflow_dispatch branch-prep flow with per-task docs/tasks/CWS-NNN.md
  5. Implement fetch-linear-issue.sh
  6. Document Codex usage across all surfaces (CLI, Mac app, IDE extension, Cloud)
  7. Define decomposition rules for single PR vs stacked PR
  8. Update team-lead instructions with decomposition logic
  9. Design and implement One Piece agent identity system in config.toml and prompts
  10. Build 11 mental model reasoning skills (.agents/skills/)
  11. Write 11 portable reasoning prompt templates (.codex/docs/reasoning-prompts/)
  12. Create reasoning agent orchestration rules (when to auto-invoke which models)

EDITORIAL epics

  1. Build automated editorial validation suite
  2. Implement front matter validator (covering both _posts/ and _drafts/)
  3. Implement SEO audit
  4. Implement custom Vale storytelling rule
  5. Implement tone/style Vale rules
  6. Wire all automated checks into editorial CI (trigger on _posts/** and _drafts/**)
  7. Build manual judgment prompts
  8. Create audience-check prompt
  9. Create fact-check prompt
  10. Create authority-check prompt
  11. Create final-sign-off prompt
  12. Write editorial voice profile
  13. Write editorial evaluation rubric

Files to Create or Update

Modified files

New docs

New agent config files

For each agent (zoro, sanji, chopper, robin, nami, franky, brook, jinbe, luffy):

New reasoning prompt templates

New prompts

New reasoning skills

New scripts

New tests

New CI workflows

New config files


Human vs Agent Responsibility Matrix

| Area | Human | Agent | Notes |
| --- | --- | --- | --- |
| Linear project setup | Yes | No | Workspace/admin task |
| Label taxonomy | Yes | No | Best created deliberately |
| Slack workspace creation | Yes | No | Human account creation required |
| Slack integration wiring docs | Yes | Yes | Human does setup, agent can document |
| .gitignore fix for _drafts/ | Review | Yes | Agent-friendly, human verifies |
| Prompt writing | Hybrid | Yes | Human sets policy, agent drafts structure |
| Hook implementation | Review | Yes | Agent-friendly |
| Test writing | Review | Yes | Must be test-first |
| Editorial voice policy | Yes | Assist only | Human source of truth |
| Authority rubric | Yes | Assist only | Human judgment framework |
| ADR policy | Yes | Assist only | Human-owned architecture decisions |
| CI YAML implementation | Review | Yes | Agent-friendly |
| Dispatch workflow | Review | Yes | Agent implements, human reviews |
| Per-task file template | Hybrid | Yes | Human defines structure, agent implements |
| Trigger surface guide | Yes | Assist only | Human documents actual usage patterns |
| Final editorial merge decision | Yes | No | Human-only |
| Final architecture merge decision | Yes | No | Human-only |

Immediate Execution Order

  1. Remove _drafts/ from .gitignore and commit.
  2. Create docs/tasks/.gitkeep and commit.
  3. Archive default onboarding issues in Linear.
  4. Create label taxonomy in Linear.
  5. Create the three projects in Linear.
  6. Create the implementation backlog under those projects.
  7. Implement branch naming and PR template updates.
  8. Implement local hooks (pre-commit covers _posts/** and _drafts/**).
  9. Implement TDD restructuring.
  10. Implement markdownlint / vale / cspell.
  11. Split CI (editorial pipeline triggers on _posts/** and _drafts/**).
  12. Add ADR system.
  13. Create Slack workspace and free integrations (budget 10 slots).
  14. Create intake templates.
  15. Write trigger surface guide (all Perplexity/ChatGPT/CLI/GitHub UI surfaces).
  16. Rewrite orchestrator prompts (reference per-task docs/tasks/CWS-NNN.md files).
  17. Implement dispatch workflow and fetch-linear-issue.sh (writes to docs/tasks/).
  18. Implement editorial validators and manual judgment prompts.

Definition of Done

This plan is complete when all of the following are true:


Appendix A: Tool Capabilities Matrix (March 2026)

| Tool | Surfaces | Key Capabilities | Free Tier Limits |
| --- | --- | --- | --- |
| Perplexity | iOS, Android, Mac, Web | Research, Linear connector, GitHub connector, Slack connector | Pro Search limits on free; connectors require Pro/Max |
| ChatGPT | iOS, Android, Mac, Web | Conversation, Codex Cloud sidebar, GitHub read-only app | Codex included with Plus+ |
| Codex CLI | macOS (user), Linux (CI-only) | Local agent, multi-agent (experimental), MCP server, open-source | Included with ChatGPT subscription |
| Codex App | macOS | Multi-agent management, worktrees, automations, skills | Included with ChatGPT subscription |
| Codex IDE Extension | VS Code, Cursor, Windsurf, JetBrains | In-IDE agent, Cloud delegation | Included with ChatGPT subscription |
| Codex Cloud | Web (chatgpt.com/codex), ChatGPT iOS/Android | Remote sandboxed execution, PR creation | Included with ChatGPT subscription |
| Codex GitHub Action | CI | openai/codex-action@v1, PR review, patch application | Requires API key |
| Linear | Web, iOS, Android, Mac | GraphQL API, GitHub integration (free), Slack integration (free) | Free plan: unlimited issues |
| Graphite | CLI, VS Code, Web | Stacked PRs, PR inbox, limited AI reviews | Hobby: personal repos, CLI, no Slack, no merge queue |
| GitHub Actions | CI | workflow_dispatch (API returns run_id since Feb 2026), path filtering | Free for public repos; 2000 min/month private |
| Slack | iOS, Android, Mac, Web | Channels, webhooks, app integrations | Free: 90-day history, 10 integrations, 5 GB storage |
| Jekyll | Build tool | _drafts/ ignored in prod by default, --drafts flag for local preview | N/A |

Appendix B: Change Log

| # | Change | Reason |
| --- | --- | --- |
| 1 | _drafts/ must be tracked by git | Agents need to create/edit drafts in branches; currently gitignored |
| 2 | All trigger surfaces expanded to iOS + Android + Mac + Web | Perplexity and ChatGPT available on all four surfaces |
| 3 | Codex surfaces documented (CLI, Mac App, IDE Extension, Cloud) | All surfaces are available as of March 2026 |
| 4 | CODEX_TASK.md replaced with per-task docs/tasks/CWS-NNN.md | Per-task files avoid conflicts, provide history, map to Linear issues |
| 5 | Human-in-the-loop reduced to two touchpoints | Idea creation and final review; everything else automated |
| 6 | Codex start remains manual (no API) | No programmatic trigger API exists as of March 2026 |
| 7 | Slack 10-integration budget documented | Free plan hard limit; integration slots must be planned |
| 8 | Graphite free tier constraints documented | Slack integration, merge queue, and org repos require paid tiers |
| 9 | ChatGPT GitHub integration clarified as read-only | Cannot trigger workflow_dispatch; Codex or CLI needed for writes |
| 10 | GitHub workflow_dispatch API return_run_details noted | Feb 2026 change enables tracking dispatch runs |
| 11 | Codex GitHub Action (openai/codex-action@v1) added as optional CI tool | Available for PR review without running full agent tasks |
| 12 | Editorial pipeline triggers expanded to include _drafts/** | Drafts need the same validation as published posts |
| 13 | iOS-intake-guide replaced with multi-surface trigger guide | Covers all surfaces, not just iOS |
| 14 | Codex usage doc scoped to macOS surfaces (Linux = CI-only) | CLI, Mac app, IDE extension, Cloud |
| 15 | Tool capabilities matrix added as Appendix A | Quick reference for all tool constraints validated against March 2026 docs |
| 16 | Windows references removed | User does not use Windows; reduces noise |
| 17 | One Piece agent naming system added | Agents get memorable identities matching their roles |
| 18 | Mental model reasoning agents added | 11 structured thinking skills as Codex skills + portable prompts |
| 19 | Master Plan storage location defined | docs/master-plan.md in repo (authoritative) + Linear reference copy |
| 20 | Reasoning agent auto-invocation rules added | Orchestrator can intelligently apply mental models based on task context |
| 21 | Explicit device profile added (v3.1) | Mac + iPhone + Android phone + Linux (CI-only). No Windows. |
| 22 | Android added to all mobile surface references | Perplexity, ChatGPT, Linear, Slack, Codex Cloud sidebar |
| 23 | Codex CLI/IDE Linux clarified as CI-only | User runs Codex CLI on macOS; Linux only appears in GitHub-hosted runners |
| 24 | Trigger surfaces expanded to iOS + Android + Mac + Web | Both mobile OSes Shabib uses are now represented |
| 25 | Scripting language policy documented | Bash default, Ruby for Jekyll/YAML; Makefile is single entry point |
| 26 | One-shot DX setup targets defined | make setup, make ci-setup, make codex-setup |
| 27 | Definition of Ready added | 8-point checklist gating agent task pickup |
| 28 | Linear API key necessity clarified | Needed only for dispatch workflow CI; free personal API key |
| 29 | Semgrep CE + CodeQL added as first-class security guardrails | Free SAST tools, blocking on PRs |
| 30 | Linear task breakdown strategy added | Epic structure, breakdown rules, execution order |
| 31 | Known gaps and risks section added | 9 blind spots documented with mitigations |
| 32 | Definition of Done updated | make security added alongside make check |
| 33 | DeerFlow-inspired agent safety rules added | Loop prevention, stuck-agent escalation, retry limits in scripts |
| 34 | soul.md + instructions.md per agent | Separates identity/personality from operational boundaries |
| 35 | MUST NOT language adopted for agent boundaries | Stronger than MAY NOT: unambiguous prohibition |
| 36 | Agent boundary matrix added | Explicit per-role file and command restrictions |
| 37 | Persistent agent context file (docs/agent-context.md) | Markdown-based cross-session memory for agents |
| 38 | Two-phase orchestrator design | Plan phase (post to Linear, notify Slack) → approval → execute phase |
| 39 | Option C plan review gate | Linear comment = permanent record, Slack notification = doorbell |
| 40 | Structured clarification protocol | SCOPE/APPROACH/RISK/MISSING taxonomy for agent questions |
| 41 | Self-audit checklist before PR creation | 10-point verification before any PR is opened |
| 42 | Concurrency limits set (max_threads=3, max_depth=1) | Prevents coordination chaos in multi-agent execution |
| 43 | Agent context system added to backlog | New epic under INFRA and ORCHESTRATION |

Known Gaps and Risks

Blind spots identified:

  1. No rollback strategy. If a Codex agent produces a bad PR that gets merged, there’s no documented rollback procedure. GitHub Pages auto-deploys on merge — a bad merge means a bad deploy. Add: “Revert PR template” and “rollback Makefile target” to backlog.

  2. No agent output size budget. Codex agents could produce enormous PRs that are hard to review. Add: max diff size guideline per task (e.g., ≤ 500 lines changed). If larger, decompose.

  3. No monitoring for the live site. After deploy, there’s no health check. A bad Jekyll build could produce a broken site. Add: post-deploy smoke test (curl the homepage, check for 200 + expected content).

  4. Secrets rotation. LINEAR_API_KEY and SLACK_WEBHOOK_URL have no rotation schedule documented.

  5. No cost monitoring for GitHub Actions. Starting March 2026, self-hosted runners have a $0.002/min charge. This public repo uses GitHub-hosted runners, which are free for public repos, so nothing is billed today. If the repo ever goes private, usage would count against the 2000 min/month free allowance and could start to incur costs. Document this constraint.

  6. Skill versioning. Skills are in the repo and versioned by git, but there’s no semantic versioning or changelog per skill. When a skill changes behavior, agents using it won’t know. Add: metadata.version in SKILL.md frontmatter + CHANGELOG.md per skill.

  7. No branch cleanup automation. After merging, stale branches from completed tasks may accumulate. Add: branch auto-delete on merge (GitHub repo setting) + periodic cleanup of orphaned docs/tasks/ files.

  8. Draft post collision. If two agents work on drafts simultaneously (unlikely with current manual trigger, but possible with future automation), _drafts/ could have conflicts. The multi-agent “one writer per file” rule covers this but isn’t enforced by CI.

  9. No emergency bypass. If CI or hooks break, there’s no documented way to land an urgent fix. Add: emergency bypass procedure (admin merge with [EMERGENCY] label, post-incident ADR required).

  10. Agent boundary enforcement is instruction-based only. Codex has no programmatic way to restrict file access or command execution per agent. The MUST NOT rules in instructions.md rely on the LLM following instructions. A misbehaving agent could still modify restricted files. Mitigation: the self-audit checklist catches scope violations before PR creation, and PR review is mandatory. Consider adding a CI check that validates diff scope against the agent role specified in the task file.
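
Two of the mitigations above, the diff size budget (item 2) and the post-deploy smoke test (item 3), are simple enough to sketch. The Python below is illustrative only. It assumes the diff is measured from `git diff --numstat` output and that the smoke test looks for a caller-supplied marker string, and all function names are hypothetical.

```python
import urllib.request

MAX_CHANGED_LINES = 500  # guideline from item 2


def changed_lines(numstat_output):
    """Sum added + deleted lines from `git diff --numstat` text output."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        total += int(added) + int(deleted)
    return total


def within_budget(numstat_output, limit=MAX_CHANGED_LINES):
    """True when the diff stays at or under the line budget."""
    return changed_lines(numstat_output) <= limit


def smoke_check(status, body, expected_text):
    """True when the homepage returned 200 and contains the expected marker."""
    return status == 200 and expected_text in body


def fetch_homepage(url, timeout=15):
    """Fetch a page; urlopen raises HTTPError for 4xx/5xx responses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Offline demo with made-up numstat output (tab-separated, as git prints it).
sample = "120\t30\t_posts/a.md\n-\t-\tassets/img.png\n400\t0\tscripts/x.sh\n"
print(changed_lines(sample), within_budget(sample))  # 550 False
print(smoke_check(200, "<title>CodeWithShabib</title>", "CodeWithShabib"))  # True
```

In a workflow, `fetch_homepage` would feed `smoke_check` after the Pages deploy finishes, and a failed budget check would prompt decomposition rather than block the PR outright.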


Final Notes

This system is intentionally designed as a solo-human / multi-agent operating model.

Shabib is not trying to automate away judgment. He is trying to automate away repetitive coordination, setup, validation, and mechanical execution so that his time is spent on:

That is the correct role split.

The two human touchpoints — idea creation and final review — are non-negotiable. Everything between them should require zero manual intervention except the current limitation of manually starting Codex (which may be resolved when OpenAI ships a programmatic trigger API).