OpenAI just shipped Goals for Codex — persistent objectives that keep a coding agent working toward a defined outcome across multiple turns. Instead of restating your intent after every intermediate result, you set a completion contract and Codex keeps going until the evidence says it is done.
Codex already handles well-scoped tasks: fix a bug, add a test, explain a failure. Goals are for work where the next step depends on what Codex learns along the way — profiling, patching, benchmarking, reproducing a flaky test, or turning a research question into an evidence-backed audit. Those tasks do not need a bigger prompt. They need a persistent objective.
When to Use Goals vs Normal Prompts
Use a Goal when the task has a clear finish line but the path to that finish line is uncertain. A normal prompt remains the right tool for a one-off edit or a simple explanation.
| Use a Goal | Use a Normal Prompt |
|---|---|
| Performance optimization with a target metric | One-line bug fix |
| Flaky test investigation and reproduction | Simple code explanation |
| Dependency migration across multiple files | Short code review |
| Benchmark-driven tuning with constraints | Single function refactor |
| Research tasks requiring a final artifact | Quick question about an API |
The mental model is straightforward. A normal prompt follows the pattern: ask, work, result, wait. A Goal follows: work, check, continue or complete. Goals are strongest when you would otherwise find yourself saying "keep going" or "try the next fix" after every turn.
Getting Started with Goals
Goals are available starting in Codex 0.128.0. Install or update with npm or Homebrew, then set a Goal with the /goal command followed by your desired outcome:
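For example, assuming the commonly published package names (check the official docs for your platform); the Goal text here is an illustrative placeholder:

```bash
# Install or update the Codex CLI
npm install -g @openai/codex
# or, via Homebrew:
brew install codex

# Then, inside a Codex session, set a Goal:
/goal Migrate the service off the deprecated config loader, verified by a green test suite
```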
Once a Goal is active, Codex inspects the code, runs relevant commands, makes changes, tests the result, and continues until it reaches a stopping condition: success, a user-issued pause or clear, an interruption, the budget limit, or a blocker that requires your input.
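Sketched as a type, those stopping conditions look something like this; the names are hypothetical and not taken from the Codex codebase:

```typescript
// Hypothetical enumeration of the stopping conditions above.
// Illustrative names only; not Codex internals.
type StopReason =
  | "success"      // verification evidence confirms the outcome
  | "paused"       // the user paused the Goal
  | "cleared"      // the user cleared the Goal
  | "interrupted"  // the user interrupted a turn
  | "budget_limit" // the compute budget is exhausted
  | "blocked";     // progress requires new user input
```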
Anatomy of a Strong Goal
A good Goal is more than a larger prompt. It is a compact contract that defines how Codex should work, what counts as success, and what should happen when success is not yet reachable. The strongest Goals define six things:
1. Outcome — what should be true when the work is done.
2. Verification surface — the test, benchmark, or artifact that proves it.
3. Constraints — what must not regress while Codex works.
4. Boundaries — which files, tools, or resources Codex may use.
5. Iteration policy — how Codex should decide what to try next.
6. Blocked stop condition — when Codex should stop and report.
Weak vs Strong Goal Examples
A weak Goal gives Codex no reliable completion condition:
/goal Improve performance
A strong Goal names the end state, verification method, and constraints:
/goal Reduce p95 checkout latency below 120 ms, verified by the checkout benchmark, while keeping the correctness suite green. Use only the checkout service, benchmark fixtures, and related tests. Between iterations, record what changed, what the benchmark showed, and the next best experiment to try. If the benchmark cannot run or no valid paths remain, stop with the attempted paths, the evidence gathered, the blocker, and the next input needed.
The strong version covers all six elements: an outcome Codex can measure, a verification method it can run, a constraint it must preserve, explicit boundaries, an iteration policy, and a blocked stop condition. If p95 improves from 180 ms to 135 ms, the Goal is not done. If latency drops below 120 ms but correctness tests fail, the Goal is not done either.
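In evidence-gate terms, completion reduces to a conjunction. A minimal sketch of that check, with hypothetical names and the numbers from the example above:

```typescript
// Illustrative evidence gate for the Goal above; not Codex internals.
// Both conditions must hold: the target metric AND the preserved constraint.
function goalComplete(p95Ms: number, correctnessSuiteGreen: boolean): boolean {
  return p95Ms < 120 && correctnessSuiteGreen;
}

goalComplete(135, true);  // false: metric not yet met
goalComplete(118, false); // false: constraint regressed
goalComplete(118, true);  // true: evidence confirms completion
```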
How Goals Work Under the Hood
Goals are implemented as persisted thread state, not as global memory or project-level instructions. The objective belongs to the thread where the relevant context lives — the files inspected, commands run, diffs produced, and reasoning trail built up.
Continuation is event-driven rather than a simple loop. Codex checks for continuation only at safe boundaries:
- After a turn has finished
- When no other work is pending
- When no user input is queued
- When the thread is idle
The dispatcher is deliberately conservative. Plan-only work does not trigger continuation. Interruptions pause the objective. If a continuation turn makes no tool call, the next automatic continuation is suppressed to prevent spinning.
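To make that lifecycle concrete, here is a minimal sketch of the continuation check, assuming hypothetical state names; the real dispatcher is certainly more involved:

```typescript
// Hypothetical sketch of the event-driven continuation check described
// above. Names are illustrative; this is not the Codex implementation.

interface ThreadState {
  goalActive: boolean;        // a Goal is set on this thread
  turnInProgress: boolean;    // a turn is still running
  workPending: boolean;       // other work is queued
  userInputQueued: boolean;   // the user has typed something
  lastTurnUsedTools: boolean; // the previous turn made a tool call
  budgetRemaining: number;    // compute budget left for this Goal
}

// Checked only at safe boundaries: after a turn finishes and the thread is idle.
function shouldContinue(t: ThreadState): boolean {
  if (!t.goalActive) return false;                      // no persistent objective
  if (t.turnInProgress || t.workPending) return false;  // not a safe boundary
  if (t.userInputQueued) return false;                  // user input takes priority
  if (t.budgetRemaining <= 0) return false;             // budget exhausted: stop and summarize
  if (!t.lastTurnUsedTools) return false;               // no tool call last turn: suppress to avoid spinning
  return true;                                          // idle thread + active Goal: keep going
}
```

The key design property is that continuation is a passive check at idle time rather than a loop that drives the agent, which is what keeps plan-only turns and user interruptions from triggering extra work.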
Budget Handling
Every Goal runs under a compute budget, and the budget limit is one of the stopping conditions above. You control how much compute a Goal can consume; when the budget runs out, Codex stops and summarizes progress so far: what it attempted, what the evidence showed, and what remains.
Goals for Research and Investigation
The same principles apply to research tasks. Define the evidence standard before the work begins — what counts as exact reproduction, what counts as partial reconstruction, and what should be treated as blocked.
A strong research Goal might look like this:
/goal Produce the strongest evidence-backed reproduction of the paper using available materials and local resources. Attempt headline results where feasible, verify outputs where possible, and end with a report that separates confirmed findings, approximate reconstructions, blocked claims, and remaining uncertainty.
This keeps the work moving after blockers appear while keeping the final language honest. A trained replacement model can support a claim, and a close numerical match can raise confidence, but neither should be described as recovering the original experiment exactly.
Practical Patterns for Writing Goals
A useful template for structuring Goals:
/goal [desired end state] verified by [specific evidence] while preserving [constraints]. Use [allowed inputs, tools, or boundaries]. Between iterations, [how to choose the next action]. If blocked, [what to report and what would unlock progress].
When the task is clear but the Goal is not, you can ask Codex to help write it. Describe the work in plain language and ask Codex to turn it into a draft Goal. Review the draft, tighten the success condition and constraints, then activate it.
When Not to Use Goals
- One-line edits, simple explanations, or short code reviews — a normal prompt is faster.
- Vague finish lines — "make this better" gives Codex no reliable completion condition.
- Hiding uncertainty — if data may be unavailable, say so in the Goal rather than hoping Codex figures it out.
What This Means for Developer Workflows
Goals change the operating model of coding agents. They turn a thread from a sequence of isolated prompts into a stateful work loop around a defined outcome. For developers, this means:
1. Less babysitting — no more "keep going" or "run the benchmark again" after every turn.
2. Evidence-based completion — work is not done because the model believes it is probably done. It is done when the evidence confirms it.
3. Honest blockers — when Codex cannot proceed, it reports what it tried, what failed, and what input it needs rather than guessing.
4. Budget awareness — you control how much compute a Goal can consume, and Codex summarizes progress when the budget runs out.
This pattern aligns with the broader shift toward orchestration-era agentic coding, where agents manage multi-step workflows autonomously. Goals formalize what was previously implicit: the completion contract between developer and agent.
If you are already using Codex for Symphony-style orchestration workflows, Goals add the missing persistence layer — the objective survives across turns without you restating it. And for teams exploring how to review LLM-generated code effectively, Goals make the audit trail explicit: every iteration records what changed, what was tested, and what the evidence showed.
The Bottom Line
Goals are the difference between asking an agent to do the next thing and telling it what done looks like. The architecture is intentionally bounded — thread-scoped, lifecycle-controlled, budget-aware, and evidence-gated. That makes Goals most useful for the work where coding agents already shine: debugging, optimization, migration, testing, and research.
You supply the objective. Codex follows the evidence. The Goal keeps both connected until the work is either complete or honestly blocked. For complex tasks, that is the difference between generating an answer and producing an audit.