Orply.

Codex Can Now Operate Local Mac Apps Without Taking Over

Romain Huet, Ari Weinstein | OpenAI | Tuesday, May 12, 2026 | 6 min read

OpenAI’s Ari Weinstein argues that computer use turns Codex from a coding agent into a system that can operate local Mac applications by seeing interfaces, clicking, typing, and continuing work in the background. In a demonstration with Romain Huet, Weinstein presents the feature as distinct from a full-desktop takeover: Codex uses a separate cursor, combines screenshots with macOS accessibility data, and requires app-by-app permission before it can see or type into local software.

Codex is being extended from code execution into local app operation

Romain Huet frames computer use as the part of Codex that takes it beyond code, files, and tool integrations into the work people do inside local Mac apps. Ari Weinstein describes the change more concretely: Codex could already run commands and write code, but much of a computer is still organized around graphical interfaces meant for a person looking, moving a mouse, clicking, and typing.

Computer use gives Codex those interaction primitives. Weinstein says it can use “literally any application” on the computer by seeing the interface, moving through it, clicking, and entering text. This is not just another API integration. It is a way for Codex to operate software that was built for human use.

The setup flow makes the control model visible before Codex starts acting. When a user asks Codex to turn on dark mode in System Settings, Codex first asks to enable computer use. The onboarding screen says Codex Computer Use needs permissions “to use apps on your Mac” and that those permissions are “only used when you ask Codex to perform tasks.” It requests macOS Accessibility permission, described as allowing Codex to access app interfaces, and Screenshots permission, described as letting Codex know where to click.

Those initial macOS permissions are distinct from the app-by-app approval model Weinstein discusses later. The first step lets Codex use the operating-system capabilities required for computer use at all; the later safety mechanism limits which individual applications Codex may see and type into when a task targets them.

Weinstein emphasizes that the setup is designed to keep the user oriented. After the user presses allow, the panel animates into the macOS Settings window to show where to look next, and macOS still requires authorization because system settings are being changed. Once enabled, Codex clicks through the requested setting and dark mode is turned on.

A separate cursor keeps the Mac usable while Codex works

Codex’s background-work model rests on not monopolizing the user’s desktop. Weinstein contrasts Codex with other computer-use systems by pointing to the cursor: when Codex opens UTM to create a Mac virtual machine, the cursor entering the app is not his own. Codex can click around without interrupting the person using the Mac.

“A lot of computer use implementations, in fact, every computer use implementation I’ve ever seen, takes over your entire computer.”
Ari Weinstein

The UTM task is a concrete case for that design. Weinstein says he uses virtual machines to test software in older Mac operating systems, but creating one is a “pain” because it requires clicking through setup steps and running the macOS setup assistant. In Codex, he types: “Make a new Mac VM in @UTM.” The “@” picker exposes local apps on the computer, and selecting UTM directs the agent to use that app. Codex opens UTM and reaches the point where macOS is downloading for the new VM. Weinstein says that after the download finishes, Codex can continue to the next step and set up macOS as well.

The same design extends to multiple apps. Weinstein starts a Spotify task — “I wanna focus. Play some good music for work for me in Spotify” — and then issues another request: “Add a reminder in the Reminders app tonight to look through my tax documents.” Codex begins using Spotify and then Reminders. Weinstein describes the result as a Mac becoming “this multitasking environment” where agents can handle things he does not want to spend time on.

The cursor is treated as more than a cosmetic layer. Weinstein says the team wanted the experience to feel natural and legible while the user watches Codex operate apps. The cursor motion uses curves that feel “kind of whimsical”; the arrow turns in the direction of travel so it appears to swim across the screen. Huet’s response is not only aesthetic: he says the cursor helps him understand what the agent is doing in each app.

Accessibility data changes both accuracy and model choice

Romain Huet asks about using computer use with a faster model, Spark, and about the combination of multimodal understanding and accessibility data. Ari Weinstein says traditional computer use has been based on screenshots: a multimodal model sees the interface and clicks or types by coordinate. Codex still uses that visual mode, but Weinstein says the team also extracts hidden information from the application interface through the macOS accessibility framework.

That accessibility data is textual and describes the interface. According to Weinstein, it lets the model understand more than a screenshot alone can provide: elements that are scrolled off-screen, and the role of elements on screen. He says this makes the model more accurate at performing tasks.
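To make the idea concrete, here is a minimal Python sketch of how an accessibility-style element tree might be flattened into text for a model, including roles and elements that are scrolled out of view. It does not call the macOS accessibility framework. The `UIElement` class, its fields, and the output format are assumptions made for illustration, not Codex's actual representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UIElement:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str                      # e.g. "button", "row", "list"
    label: str = ""
    visible: bool = True           # False if the element is scrolled out of view
    children: List["UIElement"] = field(default_factory=list)


def describe(element: UIElement, depth: int = 0) -> List[str]:
    """Flatten the tree into indented text lines, one line per element."""
    note = "" if element.visible else " (off-screen)"
    line = f"{'  ' * depth}[{element.role}] {element.label}{note}".rstrip()
    lines = [line]
    for child in element.children:
        lines.extend(describe(child, depth + 1))
    return lines


# Illustrative data only: a made-up window with one row scrolled off-screen.
window = UIElement("window", "Reminders", children=[
    UIElement("button", "New Reminder"),
    UIElement("list", "Today", children=[
        UIElement("row", "Call the bank"),
        UIElement("row", "Look through tax documents", visible=False),
    ]),
])

text_view = "\n".join(describe(window))
print(text_view)
```

A screenshot of this made-up window would not show the second row at all, while the text view still lists it and marks it as off-screen.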

It also changes the model requirement. Because the system does not necessarily require images, Weinstein says Codex can use non-multimodal models such as Codex Spark, which he describes as “super fast.” In that mode, he says, computer use can operate software faster than a person can.
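One way to picture that model-dependent behavior is a function that builds an observation from whatever inputs the chosen model accepts. This is a sketch under assumptions: the `ModelProfile` and `Observation` types, the `accepts_images` flag, and the placeholder model names are all hypothetical, and nothing here reflects how Codex actually selects its inputs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of a model's input capabilities."""
    name: str
    accepts_images: bool


@dataclass
class Observation:
    """What the agent would hand to the model on each step (assumed shape)."""
    accessibility_text: str
    screenshot_png: Optional[bytes] = None


def build_observation(model: ModelProfile,
                      accessibility_text: str,
                      screenshot_png: bytes) -> Observation:
    """Always include the text view; attach the screenshot only if usable."""
    if model.accepts_images:
        return Observation(accessibility_text, screenshot_png)
    return Observation(accessibility_text)


# Placeholder names, not real model identifiers.
vision_model = ModelProfile("vision-model", accepts_images=True)
text_only_model = ModelProfile("text-only-model", accepts_images=False)

fake_png = b"\x89PNG..."  # stand-in bytes, not a real image
obs_vision = build_observation(vision_model, "[button] Send", fake_png)
obs_text = build_observation(text_only_model, "[button] Send", fake_png)

print(obs_vision.screenshot_png is not None)
print(obs_text.screenshot_png is None)
```

The point of the sketch is that the text description is sufficient on its own, so a text-only model still receives a usable view of the interface.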

The Spark demonstration is a Messages task. The interface shows the model changing from GPT-5.5 to GPT-5.5-Spark, and the prompt asks Codex to “remind romain in @Messages to try computer use for debugging apps.” Codex opens Messages, types the message, and sends it while the user remains free to do other things. Huet receives the notification during the demonstration.

Weinstein links the implementation to earlier OpenAI computer-use products, saying that Operator and ChatGPT agent used dedicated models trained for computer use. Since then, he says, OpenAI’s research team has brought those capabilities into the main GPT models. Codex is therefore built on the same models available through the API, which Weinstein describes as both a product capability and an internal workflow simplification.

His forward-looking target is speed, but he presents it as a roadmap ambition rather than a measured benchmark. Weinstein says computer use should become “superhuman,” operating a computer “two, five, ten times as fast as a person.” In his view, that is when it becomes indispensable for many computing tasks, because it saves time and lets users focus elsewhere.

Permissions are app-by-app, not a whole-desktop handoff

Romain Huet raises the safety issue directly: Codex can now drive apps on a Mac, so how should users think about control and exposure? Ari Weinstein acknowledges that this type of technology can feel scary because it acts on the user’s own computer as the user would and can have access to a lot of material.

The safety mechanism he emphasizes is application-level permissioning. Codex can access only the applications the user allows. The first time Codex tries to use an app, it asks for permission. If the user says yes, Weinstein says Codex can see and type into that app, but cannot see or interact with other apps on the computer.
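The ask-on-first-use pattern Weinstein describes can be sketched as a small allowlist gate. This is an illustrative Python model, not Codex's implementation: the `AppPermissions` class, the `ask_user` callback, and the choice not to remember a denial are all assumptions.

```python
from typing import Callable, List, Set


class AppPermissionError(PermissionError):
    """Raised when the agent targets an app the user has not approved."""


class AppPermissions:
    """Hypothetical per-app allowlist that prompts on first use."""

    def __init__(self, ask_user: Callable[[str], bool]) -> None:
        self._ask_user = ask_user
        self._allowed: Set[str] = set()

    def can_access(self, app: str) -> bool:
        return app in self._allowed

    def ensure_allowed(self, app: str) -> None:
        """Prompt once per app; approved apps are not asked about again."""
        if app in self._allowed:
            return
        if self._ask_user(app):
            self._allowed.add(app)
            return
        # Assumption: a denial is not stored, so a later task would ask again.
        raise AppPermissionError(f"User did not allow access to {app}")


# Simulated user decisions, for illustration only.
answers = {"Spotify": True, "Notes": False}
asked: List[str] = []


def ask_user(app: str) -> bool:
    asked.append(app)
    return answers.get(app, False)


permissions = AppPermissions(ask_user)
permissions.ensure_allowed("Spotify")
permissions.ensure_allowed("Spotify")  # already approved, no second prompt

try:
    permissions.ensure_allowed("Notes")
except AppPermissionError as exc:
    print(exc)
```

In this model, approving Spotify says nothing about Notes or any other app: every application starts outside the allowlist until the user approves it.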

Huet sharpens the distinction: this is not streaming the entire desktop, and it is not granting access to all files. It is “case by case, app by app” as the user tries to be productive.

That matters because the examples are not limited to programming environments. Weinstein says he uses computer use to update financial-tracking spreadsheets, including spreadsheets he keeps in Apple Numbers. He also describes using a wide variety of software: web apps, Apple native apps, and other local applications. Huet’s broader framing is that Codex already had file-system access and plugins for online services; computer use is the missing local-app layer.

Codex computer use is available for Mac, according to Huet, with Windows support planned “very soon.” He suggests trying it on the kind of task that crosses five apps and consumes hours — work where the value is not a single click, but Codex’s ability to keep operating across local software without breaking the user’s flow.
