OpenAI Plans to Replace Chat With Persistent Personal Agents

Greg Brockman · Alex Kantrowitz · Wednesday, July 1, 2026 · 18 min read

OpenAI president and co-founder Greg Brockman argues that ChatGPT is moving beyond chat toward a persistent “personal AGI” that can understand context, use tools and act on a user’s behalf. In a Big Technology Podcast interview, Brockman says the limiting factors for that shift are not just model quality but trust, permissions, dynamic context, natural voice interaction and, above all, compute. He also makes the case that prices for a given level of intelligence will fall even as demand for frontier capability keeps rising, with health as one of the clearest early areas for widespread use.

OpenAI wants the interface to disappear into a persistent agent

Greg Brockman described OpenAI’s product direction as a move beyond chat as the central unit of interaction. ChatGPT, in his account, began as “a language model”: conversational intelligence that could talk back, but lacked memory, tools, and context. The company’s aim now is to build what he called a “personal AGI” — a persistent AI that understands goals, has access to relevant tools, and can act on a user’s behalf.

That is a sharper claim than ChatGPT becoming a larger app. Alex Kantrowitz put the “super app” framing directly: perhaps, when a user needs to do anything, the interaction starts with a prompt in ChatGPT, and OpenAI’s technology uses a browser or computer to get it done. Brockman called that “a pretty good perspective,” but zoomed out to a more radical endpoint: “almost no interface.”

“You want no product,” Brockman said. The comparison he returned to was not a dashboard or operating system, but a human assistant or co-worker: someone who can be spoken to, who knows enough context to be useful, and who can carry out tasks. In the near-term examples, that means an AI that can organize an inbox, help pursue a health plan, search for medical information, or move from discussion to execution when a user approves an action.

You want this to be like, what's the interface between you and me? Right? It's just being able to talk to a persistent entity of some form that's able to go and accomplish goals for you.

Greg Brockman

Kantrowitz made the product implication concrete. Today, ChatGPT often ends answers with suggestions: it may offer to make a nutrition plan or a travel agenda. He asked whether the next step is that, after discussing health decisions, the system might suggest seeing a specialist, offer to make an appointment, and then actually do so. Brockman answered: “That’s exactly right.”

For Brockman, Codex is the proof point. Although Codex is named and positioned around coding, he said OpenAI’s goal is “to really bring the power of Codex to everyone” and “bring agents to everyone.” He described Codex less as a coding product than as a “general purpose, tool-using harness” — an agent capable of connecting to Slack, Gmail, calendars, and other tools.

Inside OpenAI, Brockman said, Codex is already used broadly by non-technical employees. One communications-team example involved organizing an event: the agent identified attendees, asked for dietary preferences, assembled seating-chart work, and handled parts of the process so the employee could focus on the event’s intended outcome. The significance of the example is not that an AI wrote an email; it is that the work crossed applications and required context, tool access, and judgment about when to ask for approval.

In Brockman’s description, the system might use a Gmail connector, search the inbox for attendees, determine whose dietary restrictions are already known and whose are missing, draft emails, and then either ask for permission to send, leave the user to send them, or — in a higher-trust configuration — send them automatically.
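The three outcomes Brockman describes (leave the drafts for the user, ask before sending, or send automatically) can be sketched as a small permission check. This is a minimal illustration, not OpenAI's implementation: the `TrustLevel` tiers, the `Attendee` record, and the `plan_outreach` function are all hypothetical names, and nothing here touches a real Gmail connector or sends mail.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class TrustLevel(Enum):
    # Hypothetical tiers mirroring the three outcomes in the text.
    DRAFT_ONLY = "draft_only"            # user sends the drafts themselves
    ASK_BEFORE_SEND = "ask_before_send"  # agent sends only after approval
    AUTO_SEND = "auto_send"              # higher-trust: agent sends directly


@dataclass
class Attendee:
    email: str
    dietary: Optional[str] = None  # None means the restriction is unknown


def plan_outreach(
    attendees: List[Attendee],
    trust: TrustLevel,
    approve: Callable[[str], bool] = lambda draft: False,
) -> List[Tuple[str, str]]:
    """Return (email, action) pairs for attendees whose dietary info is missing.

    The action is a label only ("drafted" or "sent"); this sketch simulates
    the decision and never sends anything.
    """
    actions = []
    for person in attendees:
        if person.dietary is not None:
            continue  # restriction already known, no email needed
        draft = f"Hi {person.email}, do you have any dietary restrictions?"
        if trust is TrustLevel.AUTO_SEND:
            action = "sent"
        elif trust is TrustLevel.ASK_BEFORE_SEND and approve(draft):
            action = "sent"
        else:
            action = "drafted"
        actions.append((person.email, action))
    return actions


guests = [Attendee("a@example.com", "vegan"), Attendee("b@example.com")]
result = plan_outreach(guests, TrustLevel.ASK_BEFORE_SEND, approve=lambda d: True)
```

The design choice worth noting is that the default `approve` callback refuses, so an agent configured to ask never sends unless the user explicitly says yes.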

That last distinction led to one of Brockman’s central product constraints: delegation depends on earned trust. He did not argue for agents that automatically seize broad permissions. He described trust as something OpenAI must help users build gradually by providing “tools and control and oversight and supervision” to the person on whose behalf the AI is operating.

It's not something that we can grant.

Greg Brockman

The resulting product is not simply a smarter chatbot. It is a system where permissions, oversight, connectors, context, memory, and model capability all become part of the user experience. Brockman called trust “a key product feature and differentiator” for the agentic era.

Plugins failed because the model and context were not ready

The agent vision is not entirely new. Kantrowitz pointed to earlier attempts to make actions happen inside chat, including an OpenAI move to let users call an Uber within ChatGPT, as part of a longer line of companies trying to make chat into an action layer. Those attempts did not take off. Brockman’s explanation, in the related case of plugins, was blunt: the form factor was directionally right, but the models were not ready.

He cited OpenAI’s 2023 plugin launch as an example. The idea — an AI connected to Gmail or other external services — was “obviously” correct. But the early systems had severe limitations. Brockman said OpenAI could expose only a few connectors at a time before the model started forgetting things. Context windows were around 2,000 or 4,000 tokens. The system had no memory. The model could not reliably manage many tools or enough context to make the experience useful.

He compared that stage to early computers with tiny memory banks. The trajectory, in his account, is toward models with far more working memory, many more accessible tools, and enough capability to use them. He said current systems can have “hundreds of different tools accessible,” can be connected to whole file systems, and can have “almost the full power of the internet” and many applications at their fingertips.

Brockman also described much larger contexts — “five, twelve million token context, depends how you squint at it” — as part of why agents are now becoming practical. The capability level, he said, has improved enough that models are solving unsolved math and physics problems and helping people achieve things they could not otherwise achieve.

That combination — larger context, more tools, better reasoning, and greater reliability — is what he believes changes the outcome from failed plugin demos to useful agents. The old plugin pattern assumed an AI could trigger an external service. The agentic pattern assumes an AI can understand a goal, inspect relevant context, choose tools, draft or execute work, and escalate decisions when trust boundaries require it. That is why Brockman said agents are “not sci-fi anymore.”

The operating-system analogy is useful, but Brockman thinks it is too small

Kantrowitz pushed Brockman on whether OpenAI is effectively becoming an operating system. If ChatGPT becomes the interface through which users interact with other apps, it begins to occupy the role that iOS has historically played on the phone: the layer that mediates access to software and user intent. Kantrowitz described a future where users do not tap from app to app, but route interactions through ChatGPT.

Greg Brockman accepted that one “could describe it that way,” but said he thinks about it differently. To him, an operating system belongs to an older way of understanding computing. The core question is not what layer of the stack OpenAI occupies, but what the ideal interface to an AGI should be.

His answer again was conversation with a persistent assistant. The AI may have its own computer. It may have delegated access to the user’s system. It may have an inbox of its own, or a way for users to forward information to it. It may sometimes operate directly on the user’s machine, in the way a co-worker might sit down and type something. But Brockman framed all of those as variants of working with another actor, not as variants of app launching.

That view also shaped his response to potential competition with Apple and Siri. Alex Kantrowitz raised Apple’s position directly: Siri may become an intelligence layer over iPhone apps, while ChatGPT remains an app on the iPhone. Brockman did not dwell on platform dependence. Instead, he argued that a new level of AI capability creates an opportunity to rethink “everything”: the interface, what the technology can do, and what users can become capable of doing.

The examples he reached for were not reminders, timers, or app actions. They were scientific and medical. He said OpenAI had announced that, according to peer-reviewed literature, doctors were using o3, one of OpenAI’s earlier reasoning models, to find diagnoses for patients who had lacked answers for years. Brockman cited an example of someone who had spent 20 years with a mysterious ailment and, according to him, finally received a diagnosis through use of the technology. His point was that, if models can produce that kind of value, the competitive question cannot be reduced to whether one app has distribution through another company’s device.

Brockman did not deny competition. He said there will be competition and that it will be good. But he rejected the premise that the market is mainly about who owns the screen. The more consequential shift is from conversational intelligence to agents that can do useful work, provided they have the right context and the right trust boundaries.

This also shaped his discussion of hardware. Asked about publicly reported OpenAI device work, Brockman replied only that it had “certainly” been publicly reported. Kantrowitz then said he had been in OpenAI’s office in December and that Sam Altman had told him the work was happening and involved multiple devices. Brockman did not confirm those details or describe a device roadmap. Instead, he returned to the agentic form factor.

A device, in Brockman’s framing, is an interface to an AI, not the AI itself. He compared it to a phone: a phone is not the person, but a way to call, text, or email the person. Similarly, users may access agents synchronously or asynchronously, through different surfaces. A dedicated device could matter because it makes the agent easier to reach and helps feed context into the system, but Brockman argued that the decisive layer is not the physical object. It is the agent’s access to evolving context and its ability to operate within trusted boundaries.

The co-worker analogy became more explicit when he described business use. Imagine, he said, hiring someone with a PhD in every field and multiple Nobel prizes — or hiring a hundred such people — and then not inviting them to meetings. They would not be useful. The same applies to AI: raw intelligence is not enough if the model lacks dynamic context about the business, its meetings, its workflows, and its processes.

The problem, then, is building a context layer that evolves as the organization evolves, while making access ergonomic and permissions trustworthy. That is a different problem from shipping a smarter app. It is closer to designing how another worker enters the organization.

Voice is still constrained by turn-taking, and Brockman expects that to break

Voice is central to Brockman’s idea of an interface that “melts away,” but he was clear that current voice products still reflect machine limitations. Asked about reports of bidirectional voice models, Greg Brockman declined to share specifics, then described the general direction of the field.

The older voice architecture chained together separate systems: speech-to-text, text-to-text, and text-to-speech. Brockman called that “horribleness.” Even with more unified models that can take input and produce a response, he said, the experience is still constrained by turn-taking. The user speaks, waits, the model speaks, and interruption is awkward.

Human conversation does not work that way. People overlap, interject, interrupt, hesitate, and respond while still listening. Current systems approximate that with hacks: models that guess when a turn has ended or started. Brockman’s objection was conceptual as much as technical: “Why are we talking about turns?” To him, turns are another example of humans bending themselves around machine constraints.

The desired model is one that can process input and output at the same time, enabling more fluid, natural conversation. Brockman said many people in the field are pursuing that goal. He described the current ChatGPT Voice experience as “magical” in some settings, such as asking questions during a commute, but also frustrating when the system breaks the illusion by talking over the user or failing to handle an interjection.

He connected voice not only to consumer use, but to work with agents. Some of his “most magical” Codex experiences, he said, have involved operating it through voice. The reason is simple: short typed feedback is easy, but writing a paragraph of instructions is often unpleasant. A real-time spoken feedback loop changes the experience of supervising an agent. If agents are to become co-workers, voice is one of the ways the user manages, corrects, and directs them without turning the interaction back into clerical labor.

Brockman sees no model wall; the constraint is building enough compute

Asked whether large language models will hit a wall, Greg Brockman gave a two-part answer. First, he argued that the empirical record supports continued scaling. Second, he said the practical challenge is not whether improvement is possible, but whether the necessary infrastructure can be built.

Brockman described scaling laws as one of the most mysterious and important empirical observations in modern science. OpenAI’s experience, he said, is that models continue to improve with more data, more compute, and better architectures. When results have failed to scale as expected, he said, the problem has typically been a bug, a mismatch between implementation and math, or some other issue in execution rather than a fundamental wall.

He widened the claim historically. Neural nets, he said, were designed in the 1940s before computers, with the first hardware implementation in 1959 through the Perceptron. Looking across landmark results in the field, Brockman said they follow an “incredibly smooth, deterministic path” as more compute is applied. For 70 or 80 years, he said, people have predicted neural networks would not work, would not scale, or would hit a wall. “There’s still no wall in sight.”

But he paired that confidence with an account of operational difficulty. Building massive supercomputers is hard and expensive. OpenAI has teams working across the stack, including designing its own network protocol. Brockman described neural-network systems as having few clean abstractions: a small issue in one layer can ripple through and appear much later in a training run or performance graph. The work requires people who deeply understand the full stack and can grind through hard technical problems.

When Alex Kantrowitz asked where differentiation comes from if multiple model makers can produce extremely capable systems — one with the equivalent of “15 PhDs,” another with “13 PhDs” — Brockman first answered in terms of compute scarcity. He said the market is heading toward an “attractor state” in which every provider sells out all available compute.

His claim was not merely that OpenAI needs more chips. It was that AI demand will grow faster than global compute supply. He said current agent usage is on the order of 10 million or 20 million users, while ChatGPT has around a billion users, and the agentic capability has not yet been brought to that full scale. Usage depth is also, in his view, still tiny compared with where it is going. If agents become part of everyday work across the economy, compute becomes a scarce input to economic activity.

10–20 million: approximate current agent users cited by Brockman
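The gap between those two figures is easier to see as a ratio. The short calculation below uses only the numbers Brockman cited and treats "around a billion" as exactly one billion, which is a simplifying assumption; the helper name `share_percent` is ours.

```python
def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Figures as cited in the interview; "around a billion" is rounded here.
chatgpt_users = 1_000_000_000
agent_users_low = 10_000_000
agent_users_high = 20_000_000

low_share = share_percent(agent_users_low, chatgpt_users)
high_share = share_percent(agent_users_high, chatgpt_users)

# How many times agent usage would multiply if it reached every ChatGPT user.
headroom_min = chatgpt_users / agent_users_high
headroom_max = chatgpt_users / agent_users_low

print(f"Agent users are {low_share:.0f}% to {high_share:.0f}% of ChatGPT users")
print(f"Reaching all users would be a {headroom_min:.0f}x to {headroom_max:.0f}x increase")
```

On these figures, agent users are roughly 1 to 2 percent of ChatGPT's user base, so reaching everyone would mean a 50x to 100x increase in agent users before counting any growth in usage per person.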

On that basis, Brockman said model-making remains a good business and that new entrants can still find room. Even providers with lower capability could sell useful capacity into a market where demand exceeds supply.

But he also rejected the idea that intelligence is one-dimensional. Raw general intelligence helps, but domain expertise still matters. A model that has never practiced pitching will not be good at pitches on its first attempt; a model that has never operated spreadsheets will not automatically succeed at complex financial modeling. OpenAI, he said, must prioritize domains because it cannot be excellent at every area at once.

The result is a second source of differentiation: depth in domains. Brockman invoked AlphaGo’s move 37 as a model for how AI can deepen human understanding rather than end a field. In his telling, the lesson is that once an AI gets good enough to push a domain forward, it opens more work rather than less. Science, he said, is not a finished map; each solved mystery unlocks more mysteries. That leaves room for companies to differentiate by how deeply their systems can advance fields, not just by general leaderboard performance.

Falling prices do not resolve the compute shortage

Kantrowitz pressed the financial risk of OpenAI’s compute buildout. If the company is spending heavily on infrastructure in a new category, is there a possibility it cannot pay back the commitments? Greg Brockman answered from “fundamentals”: compute takes years to arrive, demand is already visible, and OpenAI believes it must invest ahead of that demand.

He said OpenAI has been investing in its own chip program for multiple years and expects to have more to announce soon. He described that work as part of “full vertical integration of the supply chain,” and argued that the future economy will not have enough compute to satisfy AI demand. He pointed to the exponential growth of ChatGPT, current growth curves, and a recently announced chemistry result in which a reaction was improved as signs of underexplored demand.

The tension is straightforward: if demand is so large and compute is so expensive, how does the math work amid reports of price cuts and a brewing AI price war?

Brockman’s answer was that OpenAI has always tried to increase intelligence while cutting the price for a fixed level of intelligence. He invoked Jevons paradox: as the cost of a capability falls, usage expands so much that total demand grows rather than shrinks. Frontier intelligence, he said, will remain the priciest tier. But today’s premium intelligence will become far cheaper over time as a newer, better frontier appears above it.
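One common way to make the Jevons argument concrete is a constant-elasticity demand curve. The model below is an illustrative assumption, not something Brockman specified: quantity demanded is `scale * price ** -elasticity`, and the elasticity values are made up for the example. Under this model, total spending rises when price falls only if elasticity is greater than 1.

```python
def demand(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Constant-elasticity demand: quantity = scale * price ** -elasticity."""
    return scale * price ** -elasticity


def total_spend(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spending is price multiplied by quantity demanded."""
    return price * demand(price, elasticity, scale)


old_price, new_price = 1.0, 0.5  # a hypothetical 50% price cut

# Elastic demand (elasticity > 1): usage grows faster than price falls.
elastic_before = total_spend(old_price, elasticity=2.0)
elastic_after = total_spend(new_price, elasticity=2.0)

# Inelastic demand (elasticity < 1): usage grows, but spending still shrinks.
inelastic_before = total_spend(old_price, elasticity=0.5)
inelastic_after = total_spend(new_price, elasticity=0.5)
```

With an elasticity of 2.0, halving the price quadruples usage and doubles total spending (from 1.0 to 2.0). With an elasticity of 0.5, usage still rises but spending falls to about 0.71. Brockman's claim amounts to saying demand for intelligence sits in the first regime.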

When asked directly whether OpenAI will cut prices, Brockman said, “the answer is always yes,” but rejected the idea of a sudden short-term collapse. Over a year-long time horizon, today’s premier level of intelligence should become much cheaper. But there will be a new model that is much better, making users ask why they would use the older one.

The enterprise side of the answer was about value measurement. Brockman said companies initially approached AI agents with fear of being left behind. More recently, customers have begun asking for ROI, spend controls, and observability. He called that a healthy shift because it means customers are asking the right questions. He said OpenAI had just released spend controls and is investing in enterprise readiness, not merely releasing models.

That pricing logic also shaped his response to the “models are a commodity” argument Kantrowitz attributed to Microsoft CEO Satya Nadella. Brockman did not directly engage the interpersonal or strategic tension with Microsoft. Instead, he argued that no layer of the AI stack will be removed from the value chain. Compute, models, orchestration, domain-specific workflows, and enterprise context all multiply together.

He used compute as the analogy. One could say compute is commoditized because it is “just flops.” But in practice, he argued, compute has fundamental value because “no compute, no AI.” The market’s valuation of chip companies and compute providers reflects that. He cited H100s and Hopper-generation chips as an example: in a normal, non-supply-constrained environment, prior-generation chips would not command such demand. But in the current market, he said, prices are up relative to where they were before because everyone faces an avalanche of demand.

His conclusion was that commoditization does not necessarily erase value, margins, or strategic importance. A layer can be widely supplied and highly competitive while still being critical and profitable.

He applied the same reasoning to models. Brockman acknowledged that competition among models is real and good for enterprises, customers, and consumers. But he said OpenAI’s models have “always been the sort of smartest ones,” especially for solving very hard problems. The transformative impact of that intelligence, in his view, is only beginning. If a model can speed up science, the smarter model speeds it up more. That is a different use case from a conversational model that books travel or organizes a calendar.

At the same time, Brockman agreed that enterprise systems and domain workflows create substantial value. Regulated industries, education, and other complex domains require expertise in how workflows should operate and how different parties interact. In education, he noted, parents, teachers, and students all have distinct roles that must be handled thoughtfully. For such domains, companies with expertise can build value by orchestrating models in the right way.

The economic argument, then, has two parts that remain in tension. OpenAI expects unit prices for a given intelligence level to fall. It also expects demand to expand so quickly, and frontier capability to improve so much, that revenue and compute needs continue to climb. Brockman said the market size, OpenAI’s revenue ramp, and industry growth remain early enough that “none of us are anticipating how steep” the trajectory will become.

When Kantrowitz returned to Nadella specifically — characterizing him as calling models a commodity, trying to build frontier intelligence, talking to OpenAI’s potential customers about learning loops from their data, and having access to OpenAI IP until 2032 — Brockman again sidestepped conflict. He said the most important thing is that AI be used across the economy to transform it and uplift people, and that more people trying to make that happen is better for everyone.

Health is where Brockman thinks AI becomes standard, not exceptional

Health was the area Greg Brockman returned to most personally and with the least qualification. Alex Kantrowitz asked whether recent striking stories about AI-assisted medical interventions are outliers or signs of what will become standard. He mentioned GitLab co-founder Sid Sijbrandij using extensive diagnostic testing and ChatGPT, with assistance from people who had built a purpose-built application, in connection with fighting cancer. He also mentioned “Rosie the dog” in Australia, while explicitly cautioning that he might get some details wrong: a dog with cancer whose mutations, in his telling, were run across AlphaFold, leading to an mRNA vaccine designed with chatbot assistance and followed by tumor shrinkage and improved mobility. The question was whether such cases are anomalies that make good headlines or early examples of a normal future.

Brockman’s answer was direct: “Absolutely going to become standard.” He said he personally knows several friends who have done similar things: collecting health diagnostics and using Codex or related models to draw insights from them.

He also gave a usage figure: about 230 million people each week use ChatGPT for health queries. In his description, those uses range from uploading scans to navigating conflicting information from doctors. He framed this as a response to an existing imbalance: patients, he said, are not empowered, yet they remain accountable for outcomes. If a doctor misses something, the patient may bear the consequences for life.

230 million: weekly ChatGPT health-query users cited by Brockman

Brockman made the issue personal. His wife has several health conditions, he said, and he does not know how they would manage many of them now without chat. He argued that even patients with excellent medical teams and access to top experts face limits: sometimes a detail is missed, sometimes the chart is not fully read, and sometimes the relevant knowledge is beyond current human reach.

The opportunities he described span several levels. Some AI health work will involve mass-market drugs and drug discovery. Some will involve “N of one” cases, such as rare-disease diagnosis. Some will involve understanding conditions and proposing possible therapeutics. He said these are already happening, not merely theoretical.

Brockman also emphasized the system-level effects. Health care is a large share of the economy, doctors and nurses are burned out, and prevention or earlier intervention could reduce strain. He did not argue that AI automatically solves those problems; he said AI has the potential to help if deployed and used “wisely and well.”

For Brockman, medicine is not just another vertical for agent adoption. He described it as one of AI’s most astounding possibilities and a personal motivation for his work at OpenAI. The same ingredients discussed elsewhere — models that can reason, agents with context, user trust, domain depth, natural interaction, and enough compute — converge in health because the stakes are high, the information is fragmented, and the cost of missed connections can be lifelong.
