Glossary

Key terms in A-Z order. `(Ch N)` marks the chapter where the term is introduced or discussed in depth.

A

AAR (Autonomous Alignment Research): An experiment published by Anthropic in April 2026. Nine Claude Opus 4.6 instances independently ran alignment research over 5 days, achieving PGR 0.97. (Ch 12)
AGENTS.md: An open-standard configuration file for coding agents. Contains project rules, code style, and stack information. Read by Codex, Amp, GitHub Copilot, and other AI tools. Managed by the Linux Foundation's Agentic AI Foundation. (Ch 3, 5)
Agent Teams: Anthropic's Claude Code mechanism for multi-agent collaboration. Composed of `TeamCreate` + `TaskCreate` + `SendMessage` + `addBlockedBy`/`addBlocks` — a dependency-graph-based collaboration system. (Ch 6, 9)
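As a concrete illustration of the `AGENTS.md` entry above, a minimal file might look like this (section names and rules are illustrative, not mandated by the standard):

```markdown
# AGENTS.md

## Stack
- Python 3.12, FastAPI, PostgreSQL

## Code style
- Run the formatter before committing.
- Type-annotate all public functions.

## Rules
- Never commit directly to `main`; open a PR per task.
```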
Approval policy: A setting in Codex `config.toml`. Three options: `untrusted` (prompt for every untrusted command) / `on-request` (prompt when the model asks; recommended default) / `never` (auto-approve). Sets the human-in-the-loop gate. The earlier `on-failure` value is deprecated. (Ch 3, 5)
Autoresearch: Karpathy's 2026 experiment running 700 GPT-2 training optimization experiments in 2 days using an autonomous agent loop. A production example of the AI Scientist pattern. (Ch 9, 12)

B

Branch-per-task: Codex's default work model. Each task runs on an independent git branch and completes as a PR. Naturally supports parallel execution of multiple tasks. (Ch 2, 3)

C

Compaction: Claude Code's context-management technique. When the context window fills up, Claude summarizes older content and reclaims the window for new work. Opus 4.6 achieved 84% on BrowseComp using this technique. (Ch 2, 4)
config.toml: `~/.codex/config.toml`. The Codex CLI's global configuration file. Sets `model`, `model_reasoning_effort`, `sandbox_mode`, `approval_policy`, and other parameters. (Ch 3, 5)
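A sketch of `~/.codex/config.toml` combining the keys named above (values are illustrative, not prescriptions):

```toml
# ~/.codex/config.toml — global Codex CLI settings (illustrative values)
model                  = "gpt-5.5"
model_reasoning_effort = "medium"          # low / medium / high / xhigh
sandbox_mode           = "workspace-write" # read-only / workspace-write / danger-full-access
approval_policy        = "on-request"      # untrusted / on-request / never
```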

E

Effort level: Codex setting that controls the model's reasoning depth. Four levels: `low` / `medium` / `high` / `xhigh`. In GPT-5.5, `xhigh` uses ~240x more reasoning tokens than `none`. (Ch 2, 3, 7)

H

Harness: The code and configuration that determines what context the model receives, what tools it can use, and how knowledge is stored and retrieved. Technically: a `while True` loop + tool execution + context management. (Ch 1, 4)
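The "`while True` loop + tool execution + context management" definition can be sketched in Python. Everything here (the `call_model` callable, the message shapes, the toy model) is a hypothetical illustration of the pattern, not any real API:

```python
def run_harness(call_model, tools, task, max_turns=10):
    """Minimal agent harness: loop, execute tools, manage context."""
    context = [{"role": "user", "content": task}]
    for _ in range(max_turns):  # bounded stand-in for `while True`
        reply = call_model(context)
        context.append({"role": "assistant", "content": reply["content"]})
        if reply.get("tool") is None:       # no tool requested -> done
            return reply["content"], context
        result = tools[reply["tool"]](**reply["args"])   # tool execution
        context.append({"role": "tool", "content": str(result)})  # feed result back
    return None, context

# Toy "model": requests one tool call, then answers from the result.
def toy_model(context):
    if not any(m["role"] == "tool" for m in context):
        return {"tool": "add", "args": {"a": 2, "b": 3}, "content": ""}
    return {"tool": None, "content": "sum is " + context[-1]["content"]}

answer, ctx = run_harness(toy_model, {"add": lambda a, b: a + b}, "add 2 and 3")
print(answer)  # -> sum is 5
```

The loop body is the harness; swapping `toy_model` for a real model client and `tools` for shell/file tools yields the shape Chapter 4 describes.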
Harness engineering: The engineering practice of designing and optimizing a harness. Three core patterns: (P1) use general-purpose tools, (P2) expand autonomy, (P3) set boundaries carefully. (Ch 4)

K

Knowledge externalization: The real cost of Claude→Codex migration. The process of moving project knowledge locked inside model conversations or tool-specific memory into plain-text files like `AGENTS.md`, `HANDOFF.md`, `TASKS.md`. (Ch 1)

L

LLM Wiki: Andrej Karpathy's Obsidian vault-based personal knowledge management pattern. Three-layer structure: raw sources (L1) + LLM-owned wiki (L2) + schema (L3). (Ch 10)

M

Memory folder: Claude Code's memory-persistence pattern. Agents write context to files under `.claude/memory/` and read them back as needed; worth +6.8 percentage points on BrowseComp-Plus. (Ch 2, 4)
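The memory-folder pattern reduces to "append notes to plain files, read them back later." A minimal sketch (the `remember`/`recall` helpers are illustrative; only the `.claude/memory/` directory convention comes from the entry above):

```python
from pathlib import Path
import tempfile

# Use a temp dir so the sketch is self-contained; a real agent would
# use `.claude/memory/` inside the project.
root = Path(tempfile.mkdtemp()) / ".claude" / "memory"
root.mkdir(parents=True)

def remember(topic: str, note: str) -> None:
    """Append a note to the topic's memory file."""
    with (root / f"{topic}.md").open("a") as f:
        f.write(note + "\n")

def recall(topic: str) -> str:
    """Read back everything stored for a topic (empty if none)."""
    path = root / f"{topic}.md"
    return path.read_text() if path.exists() else ""

remember("build", "Tests require `make deps` first.")
print(recall("build"))
```

Because the store is plain text, the same files survive across sessions and across tools, which is what makes the pattern portable.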
Meta-harness: An outer-loop system where agents improve the harness they follow (AGENTS.md / config.toml). First systematized in the Meta-Harness paper (arXiv 2603.28052). (Ch 9)

S

Sandbox mode: Codex's execution environment restriction setting. `read-only` (read-only; most conservative) / `workspace-write` (writes in current workspace; recommended default) / `danger-full-access` (unrestricted system access; use with caution). (Ch 2, 3, 5)
Skills (SKILL.md): Codex's reusable task instruction files. Defined via SKILL.md frontmatter; trigger conditions declared in AGENTS.md. Same format as Claude Code skills. (Ch 5)
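A `SKILL.md` might begin with frontmatter like the following (the specific skill, its steps, and any fields beyond `name` and `description` are illustrative assumptions):

```markdown
---
name: changelog-update
description: Update CHANGELOG.md from merged PRs since the last release tag.
---

1. List merged PRs since the latest `v*` tag.
2. Group entries under Added / Fixed / Changed.
3. Append a dated section to CHANGELOG.md.
```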
Subagent: A specialized agent focused on a single role. Defined as a Markdown file under `.claude/agents/` in Claude Code and a TOML file under `.codex/agents/` in Codex. (Ch 5, 6)

V

Vibe coding: LLM-driven development in which natural-language instructions generate code. Fast, but hard to verify when a single agent both writes and checks the work; separating those roles across multiple agents restores verifiability. (Ch 6)

X

xhigh effort: The maximum reasoning effort setting in Codex / Claude Code. In GPT-5.5, uses ~240x more reasoning tokens than `none`. Used for complex refactoring and design tasks. (Ch 7)