OpenAI Graduates Codex Goal Mode for Long-Running Coding Tasks

OpenAI | Thursday, May 21, 2026 | 4 min read

OpenAI says Codex’s goal mode is now a persistent workflow for assigning the agent a concrete software milestone and letting it work until the stated completion criteria are met, even over hours or days. The feature, available in the Codex app, IDE extension and CLI, turns a `/goal` prompt into the task definition Codex uses to judge when it is done. OpenAI argues the mode is best suited to work with observable endpoints, while still allowing users to steer, inspect, pause, resume or revise the goal as the run progresses.

Goal mode turns a prompt into a persistent milestone

OpenAI describes goal mode in Codex as a way to give the agent a concrete milestone and let it keep working toward it for long-running tasks, including work that may take hours or days. The functionality is available in the Codex app, the IDE extension, and the CLI.

The operating model is to define what “done” means, let Codex continue until that condition is met, and intervene when the work needs steering, inspection, pausing, or revision. A user starts by typing /goal in the message composer, then writes the objective Codex should pursue. That goal serves two roles: it is the initial instruction that starts the work, and it is the instruction Codex uses to determine whether the task has been completed.

The example shown is a codebase migration:

Migrate this codebase from JavaScript to TypeScript. The app should compile in strict mode without explicit 'any' type definitions.

That example illustrates the kind of target the feature is built around. The goal is not merely “improve the codebase” or “migrate to TypeScript.” It includes observable constraints: the app should compile in strict mode, and it should do so without explicit any type definitions. The guidance is that the best goals make it clear whether Codex has achieved them, either through a measurable target or through test criteria that have to pass.
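To make those two constraints concrete, here is a minimal sketch of what migrated code that satisfies them might look like. The `LineItem` type and the function names are hypothetical and are not taken from OpenAI's example; the point is only that every value has a declared type and that untyped input enters as `unknown` rather than `any`.

```typescript
// Hypothetical example; names are illustrative and not from OpenAI's demo.
// The pre-migration JavaScript might have looked like:
//   function totalPrice(items) {
//     return items.reduce((s, i) => s + i.price * i.qty, 0);
//   }

interface LineItem {
  price: number;
  qty: number;
}

// Explicit parameter and return types, so nothing falls back to `any`.
function totalPrice(items: readonly LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.qty, 0);
}

// Type guard: narrows an `unknown` value to LineItem after runtime checks.
function isLineItem(value: unknown): value is LineItem {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.price === "number" && typeof record.qty === "number";
}

// External data is treated as `unknown` and validated before use.
function parseLineItems(json: string): LineItem[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed) || !parsed.every(isLineItem)) {
    throw new Error("Unexpected payload shape");
  }
  return parsed;
}
```

Code in this style should pass the compiler with `"strict": true` set in `tsconfig.json`, which is the kind of pass/fail signal the goal depends on.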

For software work, goal writing becomes part of task design. The user has to state the outcome in a form that can be checked: compilation, tests, validation rules, migration completeness, or another acceptance criterion. Goal mode is most legible when the work is too large to resolve cleanly in a single turn, but specific enough that progress can be judged against an endpoint.
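One way to picture a checkable goal is as a list of criteria that each evaluate to true or false. The sketch below is an assumption made for illustration: the `Criterion` and `GoalStatus` types, the `evaluateGoal` function, and the `repo` figures are all hypothetical, and the source does not describe how Codex evaluates goals internally.

```typescript
// Illustrative only: a hypothetical model of "done" as checkable criteria.
// This is not Codex's internal representation, which the source does not describe.

interface Criterion {
  description: string;
  // Returns true when the criterion is currently satisfied.
  isMet: () => boolean;
}

interface GoalStatus {
  done: boolean;
  unmet: string[];
}

// A goal is done only when every criterion is met; otherwise report what remains.
function evaluateGoal(criteria: readonly Criterion[]): GoalStatus {
  const unmet = criteria.filter((c) => !c.isMet()).map((c) => c.description);
  return { done: unmet.length === 0, unmet };
}

// Stand-in numbers; in practice these would come from running the compiler
// and a lint rule over the repository.
const repo = { compileErrors: 0, explicitAnyCount: 3 };

const migrationGoal: Criterion[] = [
  { description: "compiles in strict mode", isMet: () => repo.compileErrors === 0 },
  { description: "no explicit any", isMet: () => repo.explicitAnyCount === 0 },
];

const status = evaluateGoal(migrationGoal);
console.log(status);
```

A goal such as "improve the codebase" has no obvious `isMet` function, which is one way to see why it makes a weaker target than the migration example.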

Codex can help turn broad intent into a usable goal

If a user is not ready to write the implementation goal directly, Codex can help shape it first. One route is to use /plan before /goal, then convert the resulting plan into the goal Codex will execute against.

The example shown in the app is a plan request: “Create a plan to migrate our current database to Postgres.” Here, /plan is presented as a way to prepare a goal: it takes a broad intention and makes it precise enough to become the target for a longer implementation run.

Another option is to ask Codex to interview the user before crafting the goal, then have it set the goal itself. The pattern is to clarify the work before delegating persistence. Goal mode does not remove the need for specification; it can shift some of that specification process into Codex, through plan mode or an interview-style exchange.

Human checkpoints stay available while the task keeps running

Once a goal is running, users can continue sending additional messages to steer Codex. In the TypeScript migration example, Codex is shown “pursuing” the migration goal while reading files. The user adds guidance: “Use zod for any type validation of external data.”

That steering message is a correction or refinement while the main task is underway. Side chats are a separate mechanism for inspecting progress. A side conversation can be opened in what the interface describes as “an ephemeral fork.” In the example, the user asks: “How is it going in 1-2 sentences?” The stated purpose is to let the user understand the work completed so far without interrupting the main task.

The distinction matters in a long-running agent workflow: steering changes the direction or constraints of the task; a side chat is for observation and clarification.

Pause, resume, and edit make longer runs manageable

Because some goals may run for hours or days, the workflow includes pause and resume. The example given is practical: a user may be leaving the office with a laptop that is about to lose its internet connection. In that case, the goal can be paused and resumed later.

The interface example shows a goal in progress with code changes accumulated: “15 files changed,” with additions and deletions displayed as “+1064 -421.” The goal is then paused. Pausing is not described as cancellation; it is a way to stop ongoing work and continue later.

Resuming is not the only option. A user can also edit the goal before continuing if the objective has changed or if more concrete guidance is needed. The edit dialog shown modifies the original JavaScript-to-TypeScript migration goal by adding requirements: split the codebase into multiple files for easier maintenance, and translate tests to TypeScript with Vitest.

The initial goal starts the work and defines completion, but the user can later revise it as understanding improves. The workflow keeps human checkpoints in the loop: correction, inspection, pausing, resuming, and revision.

OpenAI says single goals have run for more than a hundred hours

OpenAI’s strongest duration claim is that it has seen many people use Codex to make progress on complicated tasks for “upwards of a hundred hours” on a single goal.

The claim explains why goals are being positioned as more than a convenience for short coding requests. Codex is being framed as an agent that can keep working toward a specified endpoint across extended periods, provided the endpoint is concrete enough to guide the run.