
AI’s Bottlenecks Move From Models To Infrastructure And Control

Applied AI · Saturday, May 9, 2026 · 1h 6m to watch · 14 min read

Today’s sources describe an applied-AI market increasingly constrained by compute, power, chips, and governed deployment rather than demand alone. Reports on Anthropic’s access to Colossus capacity, Apple-Intel talks, Three Mile Island’s planned restart, GPT-5.5 Instant safety plumbing, Codex in Chrome, and ServiceNow’s governance pitch all point to the same shift: scaling AI now depends on physical capacity and reliable control over actions.

Compute is now the revenue bottleneck

The day’s applied-AI stories point less to a single model breakthrough than to a change in where the bottlenecks sit. In the source material, the binding constraints are no longer only algorithmic capability or user demand. They are compute, electricity, chips, physical sites, safety systems, browser permissions, and enterprise governance.

The SpaceX-Anthropic agreement is the sharpest expression of that shift. On All-In, Chamath Palihapitiya framed the deal as evidence that frontier AI revenue is being governed by supply constraints rather than demand. In his telling, Anthropic and OpenAI’s revenue curves would likely be even steeper if they had “infinite power.” The reported arrangement gave Anthropic access to all compute capacity at Colossus 1, including more than 300 megawatts of new capacity and over 220,000 NVIDIA GPUs coming online within the month. The announced customer-facing effect was practical: higher Claude rate limits, fewer peak-hour reductions, and larger API limits for some Claude Opus users.

300+ MW
new Colossus 1 capacity Anthropic was reported to access through SpaceX

That turns a capacity lease into a market-structure signal. If compute is scarce, the companies that can build, power, and operate large clusters gain leverage over model companies that otherwise have customers waiting. The All-In panel did not establish a formal new hyperscaler entity around SpaceX, xAI, or Colossus. But Brad Gerstner and Jason Calacanis both described an emerging infrastructure layer in which Musk-linked companies could monetize data centers, power access, factory-building ability, and AI demand before xAI itself has converted its model investments into comparable revenue.

Gerstner argued that leasing capacity to Anthropic could derisk the compute layer by turning infrastructure spending into near-term revenue. David Sacks made the same point more directly: xAI carries large frontier-AI capital commitments, while SpaceX has profitable businesses elsewhere; renting capacity to Anthropic helps convert some of that commitment into cash. Sacks also emphasized where he thinks the immediate revenue engine sits: coding and enterprise work, not every speculative AI use case.

The market debate then moved from infrastructure scarcity to monopoly risk. Sacks argued that Anthropic’s reported trajectory, if sustained, could make it “the most powerful monopoly ever created in human history.” His warning rested on reported annualized revenue growth: from roughly $10 billion at the start of the year to $30 billion at the end of March, then $44 billion in April, with a possible path toward $100 billion by year-end. The claim was not that Anthropic already has an established monopoly. It was that revenue curves this steep, if capacity keeps arriving, could force market-structure questions before the market feels mature.

Gerstner pushed back on the timing and framing. He argued that AI has “not even left the starting gate,” and that OpenAI, Google, Amazon, Microsoft, and others still have capital, distribution, talent, and active products. In his view, preemptive monopoly language risks becoming winner-picking before the market has stabilized. Sacks’s counter was that regulators usually notice market power only after it has already formed, and that safety rhetoric can become a moat if it turns into a standing approval regime or a Washington gatekeeper.

Speaker or source | Emphasis | Why it matters
Chamath Palihapitiya | Frontier-AI revenue is supply-constrained by data centers and power. | Demand may not be the limiting factor for leading labs.
David Sacks | Coding and enterprise work are the immediate revenue engine; Anthropic’s trajectory could invite monopoly concerns. | The compute bottleneck becomes a market-structure and policy question.
Brad Gerstner | The market remains early and competitive; leasing compute can derisk infrastructure spending. | Infrastructure monetization may precede a clear model-market winner.
Jason Calacanis | Musk-linked assets could resemble an emerging compute business. | Physical infrastructure, energy, and factory-building become strategic advantages.
The All-In panel differed on implications but shared a compute-scarcity premise

That disagreement matters because the panel largely agreed on the underlying constraint while disagreeing about its implications. Palihapitiya emphasized power and data-center supply. Sacks emphasized enterprise and coding demand. Gerstner emphasized early competition and the need to let the market play out. Calacanis extended the infrastructure thesis into a broader “Elon Web Services” frame. The common premise was that frontier-AI demand is no longer the only question. The applied-AI business problem is whether companies can obtain enough usable compute to turn demand into revenue.

John Coogan and Jordi Hays reached the same bottleneck from a different route. In their discussion of Intel, Apple, Nvidia, DeepSeek, and Anthropic’s access to xAI-linked compute, Coogan’s shorthand was that “demand for compute finds a way.” DeepSeek’s reported $7 billion raise at a $50 billion valuation was discussed mainly as a compute-acquisition story: capital to buy capacity, release models more often, and return to the frontier curve. The xAI-Anthropic arrangement, which Coogan said he had not expected because of cultural tension between Elon Musk and Anthropic, became another example of the same force overriding softer obstacles.

The useful synthesis is not that one company is destined to win the market. The sources do not establish that. It is that scarce compute is changing the order of business questions. Infrastructure owners get negotiating power. Model companies’ revenue becomes capacity-bound. Policy and monopoly questions arrive while the competitive field is still unsettled.

SpaceX-Anthropic Deal Highlights Compute as AI’s Revenue Bottleneck (All-In Podcast)
Apple’s Reported Intel Deal Shows Compute Bottlenecks Driving Industrial Policy (TBPN)

Power and fabs turn AI into industrial policy

Once compute becomes the bottleneck, the story broadens from cloud contracts to the physical supply chain. The Apple-Intel and Three Mile Island pieces show two sides of the same pressure: AI workloads require not only models and chips, but advanced fabs, grid capacity, power contracts, and political support.

Coogan framed Intel’s reported preliminary manufacturing agreement with Apple as industrial policy as much as a supplier story. The reported deal was described as the product of intensive talks, with the Trump administration pushing Apple and Intel toward an arrangement that would support domestic semiconductor capacity. Coogan emphasized that Intel’s foundry ambitions depend on convincing major external customers to bet on its next advanced manufacturing process.

Apple’s interest, in his telling, was not simply patriotic. Apple remains heavily dependent on TSMC for chips across major devices, while TSMC capacity is increasingly contested by Nvidia and other AI-chip designers. Coogan noted that Apple had historically enjoyed unusual leverage as a top TSMC customer, but the surge in AI-chip demand changes that position. A reported Intel relationship therefore functions as a capacity hedge, a geopolitical hedge, and a way to reduce dependence on Taiwanese supply chains.

The government’s role was explicit in the source material. Coogan said the administration converted nearly $9 billion in federal grants into Intel stock, giving the U.S. government a 10% stake. Nvidia had also invested $5 billion in Intel and announced a partnership under which Intel would build custom data-center CPUs for Nvidia. The reported Apple talks sat alongside those moves and Musk-linked fab ambitions. In Coogan’s account, the state was not merely subsidizing an industry in the abstract; it was trying to create customer commitments around a domestic manufacturing base that AI demand has made strategically important.

Infrastructure pressure | Source example | Why it matters
Advanced manufacturing | Reported Apple-Intel manufacturing talks | Apple gets a capacity and geopolitical hedge; Intel gets a possible anchor customer for foundry ambitions.
State capital | Nearly $9B in federal grants converted into Intel stock | The government becomes a direct stakeholder in rebuilding domestic semiconductor capacity.
Strategic investment | Nvidia’s $5B Intel investment | AI-chip leaders support domestic supply options while demand for TSMC capacity intensifies.
Electricity | Three Mile Island restart planning | AI demand pulls old nuclear assets back into the power mix before new reactor designs are ready.
How the day’s infrastructure stories connect fabs, capital, and power

The Three Mile Island story is the energy-side version of the same infrastructure problem. Bloomberg’s Will Wade reported that the site associated with the 1979 accident is being prepared to return to service as soon as mid-2027, rebranded as the Crane Clean Energy Center, to supply electricity for chatbots and other AI applications. Wade’s framing was not that nuclear’s old problems have disappeared. It was that AI electricity demand and the money behind it have changed the market enough to make old assets relevant again.

mid-2027
earliest reported timing for Three Mile Island to return to service

Wade distinguished the earlier climate rationale for nuclear from the present commercial driver. Technology companies once emphasized nuclear power because it was clean and useful for climate goals. In Wade’s telling, that remains part of the context, but the stronger force now is AI electricity demand. Caroline Hyde noted that $30 billion had been invested in nuclear since 2020, using the figure to frame renewed interest in the sector.

The tension in Wade’s reporting is that a frontier software buildout may depend, in the near term, on a decades-old nuclear site whose public meaning is still shaped by the worst nuclear accident in U.S. history. Wade also stressed that the waste problem has not been solved: spent fuel remains stored in casks on site, while a central U.S. repository remains politically stalled. New reactor designs, including small modular reactors, may arrive around 2030 or the mid-2030s, in Wade’s estimate. That is too late for the current AI power demand cycle.

The paired stories do not amount to a policy endorsement. They show how the AI buildout is pulling in actors well beyond model labs: governments trying to rebuild domestic semiconductor capacity, chipmakers investing in alternative supply, utilities and nuclear operators seeking new demand, and technology companies searching for enough electricity to run data centers. In the sources’ framing, applied AI at scale is becoming an industrial project because the bottlenecks are industrial.

Apple’s Reported Intel Deal Shows Compute Bottlenecks Driving Industrial Policy (TBPN)
AI Power Demand Is Bringing Three Mile Island Back Online (Bloomberg Technology)

Default models are becoming powerful enough to make safety plumbing matter

The GPT-5.5 Instant article shifts the Brief from infrastructure supply to deployment risk. Its importance, as Károly Zsolnai-Fehér framed it, is not that GPT-5.5 Instant is the flashiest frontier model; it is that GPT-5.5 Instant is the default ChatGPT model, used at large scale. Improvements and weaknesses in that layer matter because ordinary users touch it directly, including in high-stakes contexts such as medicine and law.

The straightforward improvement is factuality. In the system-card chart OpenAI showed, high-stakes responses with factual errors fell from 10.1% in GPT-5.3 Instant to 4.4% in GPT-5.5 Instant, and high-stakes claims with factual errors fell from 7.4% to 4.8%. Zsolnai-Fehér summarized the medical and legal improvement as hallucination rates being roughly cut in half.

10.1% → 4.4%
high-stakes responses with factual errors, GPT-5.3 Instant to GPT-5.5 Instant

The harder issue is that the default instant model is also approaching expert or thinking-model performance in some sensitive domains. Zsolnai-Fehér said GPT-5.5 Instant is OpenAI’s first Instant model treated as “High capability” in the biological domain. On TroubleshootingBench, a biology evaluation built around tacit wet-lab knowledge and expert-written protocols, he described GPT-5.5 Instant as a bit below top PhD experts, while slower thinking models remained stronger. On a professional Capture the Flag cybersecurity evaluation, GPT-5.5 Instant was shown at 94.11% pass@12, close to GPT-5.5 Thinking at 96.3% and above GPT-5.4 Thinking at 88.23%.
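
The pass@12 figures come with one piece of background the video does not spell out. For n attempts per task of which c succeed, pass@k is conventionally estimated with the unbiased formula from OpenAI’s HumanEval work: pass@k = 1 − C(n−c, k)/C(n, k). A minimal sketch, using illustrative sample counts rather than anything from this evaluation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples drawn without replacement from n attempts is correct,
    given that c of the n attempts succeeded."""
    if n - c < k:
        return 1.0  # every size-k draw must include at least one success
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only: 20 attempts per task, 5 correct, scored at k=12.
print(round(pass_at_k(20, 5, 12), 4))  # 0.9964
```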

That combination — default reach plus high capability — makes safety architecture part of the product rather than an appendix to it. The most consequential table in the source was not the capability benchmark; it was the biological-safety refusal comparison. On production data, GPT-5.5 Instant’s model-only refusal rate was very high, at 0.989. On easy synthetic biological-safety prompts, it remained high at 0.944. But on hard synthetic prompts, GPT-5.5 Instant fell to 0.481, compared with 0.894 for GPT-5.4 Thinking and 0.813 for GPT-5.5 Thinking.

Classifier-based safeguards changed that result substantially. After OpenAI’s patch, GPT-5.5 Instant’s hard synthetic refusal rate rose from 0.481 to 0.923, and its easy synthetic refusal rate rose from 0.944 to 0.989. Zsolnai-Fehér said the patch worked “spectacularly well,” while still expressing concern that the underlying base-model weakness remained behind guardrails.

Safety condition | GPT-5.5 Instant refusal rate
Production biological-safety data, model-only | 0.989
Easy synthetic biological-safety data, model-only | 0.944
Hard synthetic biological-safety data, model-only | 0.481
Hard synthetic biological-safety data, after classifier patch | 0.923
The GPT-5.5 Instant safety tension is strongest under hard adversarial biological prompts
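
The source describes the fix only as “classifier-based safeguards,” so the mechanics are not public; the sketch below is a schematic of the layering the table implies, with every name hypothetical. The point is that a separate classifier can gate prompts before the model’s own refusal behavior, the layer the “model-only” rows measure in isolation, is relied on.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    refused: bool
    text: str = ""

def bio_risk_classifier(prompt: str) -> float:
    # Hypothetical stand-in: a real deployment would call a trained
    # safety classifier; this placeholder scores every prompt 0.0.
    return 0.0

def answer_with_safeguards(prompt: str, model, threshold: float = 0.5) -> Reply:
    """Two layers: a classifier gate in front, then the model's own
    refusals (the 'model-only' rows measure the second layer alone)."""
    # Layer 1: block before the model is even consulted.
    if bio_risk_classifier(prompt) >= threshold:
        return Reply(refused=True, text="blocked by safety classifier")
    # Layer 2: the model may still refuse on its own.
    return model(prompt)
```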

This is not the same as saying the model is unsafe in ordinary use. The production-data refusal rate was high. The weakness appeared in hard synthetic adversarial cases, which Zsolnai-Fehér associated with multi-turn role-playing or reframing attacks. His concern was operational: a skilled attacker may discover a pattern that others can copy. OpenAI’s disclosure of the unflattering model-only result also matters; Zsolnai-Fehér gave the company credit for publishing it.

The broader applied-AI point is that model capability and deployment surface are now inseparable. Once a high-capability model becomes the default answer engine for hundreds of millions of users, the question becomes less “can it answer?” and more “what systems surround the answer?” That is also the bridge to the day’s agent stories. A model that can answer sensitive questions needs safety plumbing around the answer; an agent that can act in a browser needs permission and control plumbing around the action.

GPT-5.5 Instant Cuts High-Stakes Errors but Exposes Safety Gaps (Two Minute Papers)

Agents are entering real user environments

Codex’s Chrome extension moves the same systems question into day-to-day work. OpenAI’s Dominik Kundel presented the extension as a way for Codex to operate inside a user’s actual Chrome session on macOS and Windows: signed-in apps, cookies, open tabs, local context, and browser-native workflows. The product point is narrower than general autonomy, but operationally important. Agents are getting closer to the surfaces where knowledge workers already act.

Kundel was careful to preserve a distinction between structured connectors and browser access. If there is a plugin for a known task, he said, that is usually faster because Codex does not have to click through an interface to read a document, check a message, or create a file. The visible plugin list included tools such as Computer Use, Browser Use, Spreadsheets, GitHub, Slack, Notion, Linear, Gmail, Google Calendar, Google Drive, Teams, and SharePoint. Chrome matters when the work depends on the user’s actual browser state, a full web-app feature not available through a plugin, or an application with no adequate connector.

Route | Best fit in Kundel’s framing | Example from the source
Plugin or connector | Known tasks where structured access is faster than clicking through a UI. | Reading email context, using spreadsheets, or connecting to tools such as GitHub, Slack, Gmail, Drive, Teams, or SharePoint.
Live Chrome session | Tasks that depend on logged-in browser state, cookies, open tabs, local files, forms, uploads, or full web-app features. | Filling a Navan expense report using Gmail context and receipt PDFs from the desktop.
Parallel browser tabs | Tasks where separate agents need separate sessions inside the same live application. | Four subagents joining the same multiplayer drawing app in separate tabs.
Kundel’s distinction between structured connectors and live-browser work
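
Kundel did not show how Codex chooses between these routes, but his framing implies a simple precedence rule, which the hypothetical sketch below makes concrete. The Task fields and connector set are illustrative assumptions, not the product’s API.

```python
from dataclasses import dataclass

# Illustrative subset of the connectors visible in the source's plugin list.
CONNECTORS = {"github", "slack", "notion", "linear", "gmail", "google_drive"}

@dataclass
class Task:
    target_app: str
    needs_live_session: bool = False  # logged-in state, cookies, open tabs
    needs_full_web_ui: bool = False   # feature no connector exposes
    parallel_sessions: int = 1        # subagents needing separate tabs

def choose_route(task: Task) -> str:
    """Pick the cheapest adequate route: structured connectors first,
    the live browser only when browser state actually matters."""
    if task.parallel_sessions > 1:
        return "parallel browser tabs"
    if task.needs_live_session or task.needs_full_web_ui:
        return "live Chrome session"
    if task.target_app in CONNECTORS:
        return "connector"  # structured access beats clicking through a UI
    return "live Chrome session"  # no adequate connector for this app

print(choose_route(Task("gmail")))                           # connector
print(choose_route(Task("navan", needs_live_session=True)))  # live Chrome session
```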

The extension also tries to avoid taking over the user’s whole browser. Kundel said Codex can create its own Chrome tab group, work in the background, open multiple tabs, inspect pages, scroll, take screenshots, and adapt when a route fails. In the research example, Codex was asked to review the last week of Codex-related posts on OpenAI’s developer forum, assess sentiment around launches, identify user issues, and produce a spreadsheet. The task log showed Codex switching from a blocked JSON search route to rendered forum search and topic pages. The completed message said Codex worked for 9 minutes and 51 seconds and produced a spreadsheet with 67 forum result rows and 34 high-signal topic deep reads.

9m 51s
shown Codex working time for a forum-research spreadsheet task
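
The task log itself is not reproduced in the source, but the behavior it describes, trying a structured endpoint first and falling back to rendered pages when it is blocked, is a common agent pattern. A minimal sketch with placeholder URLs and an omitted HTML parser:

```python
import json
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def parse_rendered_results(html: str) -> list[dict]:
    # Placeholder: a real agent would extract topics from the rendered
    # page with an HTML parser; omitted here for brevity.
    return []

def fetch_forum_results(query: str) -> list[dict]:
    """Try a structured JSON endpoint first; if it is blocked or missing,
    fall back to the rendered search page. Both URLs are hypothetical."""
    try:
        with urlopen(f"https://forum.example.com/search.json?q={query}") as resp:
            return json.load(resp)["topics"]
    except (HTTPError, URLError):
        with urlopen(f"https://forum.example.com/search?q={query}") as resp:
            html = resp.read().decode("utf-8", errors="replace")
        return parse_rendered_results(html)
```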

The expense-report example showed why browser access is different from a chat interface. Codex used Gmail context, local receipt PDFs, and a live Navan web form. It found receipts on the user’s desktop, checked trip context in email, filled expense fields, uploaded receipts, and submitted the form. The point was not that expense reporting is a frontier task. It was that many real workflows combine authenticated web apps, local files, forms, and institutional context in ways that pure APIs or chat responses do not cover.

A multiplayer drawing-game example pushed the same idea into parallel work. Kundel asked Codex to use four subagents, each in a separate Chrome tab, to join the same game and draw a tiny lighthouse collaboratively. The example is playful, but the capability it illustrates is not: separate agents can operate in different tabs inside the same live web application.

This is where the infrastructure and safety themes narrow into user control. Once an agent can operate inside a real browser session, permissions, reliability, observability, and reversibility become product requirements. A plugin failing to fetch a document is one kind of problem. An agent acting inside a logged-in enterprise app, uploading a file, submitting a form, or coordinating several tabs is another.

The source does not claim general autonomy, and the distinction matters. Kundel’s strongest claim is that agents can now work closer to the actual surfaces where knowledge workers operate. That makes the next question less about demonstration value and more about what the user can see, constrain, approve, audit, or undo.

Codex Can Now Work Inside Users’ Live Chrome Sessions (OpenAI)

Enterprise software becomes the control layer

ServiceNow’s Amit Zavery supplied the day’s counterweight to the idea that agents simply replace enterprise software. In a ServiceNow-sponsored interview, he argued that agentic AI will change enterprise software, but not by removing the need for platforms that provide context, permissions, compliance, governance, and orchestration across many systems.

His core distinction was between probabilistic AI and deterministic enterprise workflows. A consumer can tolerate some uncertainty or move to a different product. A large company cannot have an AI system return different financial numbers each time it is asked, or change production systems without authorization. Zavery did not argue that agents are weak. He argued that “can the agent perform the task?” is not the enterprise question. The question is whether the task can be performed with the right permissions, audit trail, compliance posture, rollback path, and operational owner.
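
One way to make that distinction concrete is to wrap the probabilistic model call in deterministic validation and a consistency check, so a production action fires only when repeated answers agree. This is a sketch of the pattern Zavery’s constraint implies, not ServiceNow’s implementation; ask_model and the amount bounds are hypothetical.

```python
from decimal import Decimal

def parse_amount(model_reply: str) -> Decimal:
    """Deterministic validation around a probabilistic call: accept the
    reply only if it parses as an exact, in-range amount."""
    value = Decimal(model_reply.strip().lstrip("$"))  # raises on junk input
    if not Decimal("0") <= value < Decimal("10000000"):
        raise ValueError(f"amount out of expected range: {value}")
    return value

def consistent_amount(ask_model, prompt: str, runs: int = 3) -> Decimal:
    """Zavery's constraint made literal: act only if repeated calls agree
    after parsing; otherwise escalate instead of touching production."""
    answers = {parse_amount(ask_model(prompt)) for _ in range(runs)}
    if len(answers) != 1:
        raise RuntimeError(f"model disagreed with itself: {sorted(answers)}")
    return answers.pop()
```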

That maps directly onto the Codex Chrome story. Browser agents show how AI can enter real work surfaces. Zavery’s argument is about what must surround those agents when the work is corporate, regulated, multi-system, and consequential.

He said many large enterprises run more than 300 systems. Onboarding a single employee might involve Workday, Fidelity, Concur, identity systems, device provisioning, badges, access privileges, and role-specific permissions. ServiceNow’s claimed role is not to be the system of record for all of those functions. Zavery described it as a “system of action” that orchestrates across systems of record. In his view, enterprise value lies in sense, decide, act, and secure: understand the request, make a decision, perform the action, and secure the process.

The model and the agent are not the whole enterprise product. In Zavery’s framing, the product is the governed action.

The governance layer is the most relevant part of that claim for applied AI. Zavery described ServiceNow’s AI Control Tower as a way for CIOs and risk managers to see who used AI, what it changed, what it cost, what privileges it had, what vulnerabilities exist, and whether a process should be shut off or access revoked. That is a different product thesis from “better model” or “more autonomous agent.” It is AI as governed action.
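
The source describes AI Control Tower only at the level of what a CIO can see, not how it is built. As a schematic of what “governed action” implies in code, with every name hypothetical, the control layer reduces to three obligations: check privileges before the side effect, log the action, and keep a kill switch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    agent: str
    action: str
    target: str
    cost_usd: float
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

class GovernedExecutor:
    """Hypothetical control layer: check privileges before acting, record
    every action, and keep a per-agent kill switch."""

    def __init__(self, privileges: dict[str, set[str]]):
        self.privileges = privileges            # agent name -> allowed actions
        self.audit_log: list[AuditRecord] = []
        self.revoked: set[str] = set()

    def execute(self, agent: str, action: str, target: str, run, cost_usd=0.0):
        if agent in self.revoked:
            raise PermissionError(f"{agent}: access revoked")
        if action not in self.privileges.get(agent, set()):
            raise PermissionError(f"{agent} lacks privilege: {action!r}")
        self.audit_log.append(AuditRecord(agent, action, target, cost_usd))
        return run()                            # the governed side effect

    def revoke(self, agent: str) -> None:
        # The 'shut it off or revoke access' path described for risk managers.
        self.revoked.add(agent)
```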

Layer | What the day’s sources show | Applied-AI question
Compute | Anthropic seeks Colossus 1 capacity; DeepSeek reportedly raises to buy compute. | Who can get enough GPUs, power, and data-center capacity to serve demand?
Power and fabs | Apple-Intel talks, Nvidia’s Intel investment, and Three Mile Island restart planning. | Who controls the physical supply chain and electricity required for scale?
Safety systems | GPT-5.5 Instant improves high-stakes factuality but relies on classifiers for hard adversarial bio refusals. | What guardrails surround powerful default models?
User environment | Codex works inside live Chrome sessions, tabs, local files, and logged-in apps. | What permissions and controls apply when agents act where users work?
Enterprise control | ServiceNow argues platforms must orchestrate, audit, govern, and secure agentic workflows. | Who makes AI action reliable enough for production systems?
The day’s stories move from raw capacity to governed action

Zavery used the PocketOS anecdote as a warning, not as proof of a universal pattern. In his account, a travel agency using Cursor and AI agents had its customer database and production system wiped out in nine seconds. The point was that an agent acting quickly can create large damage if it lacks permissions management, testing, rollback, and accountability. He also argued that build-versus-buy remains an economic question: ServiceNow’s internal calculations, he said, suggest building these capabilities internally can cost five to ten times as much as buying them, once compliance, security, connectivity, testing, and changing model behavior are included.

His position is self-interested, and the source makes that clear: he is ServiceNow’s president, COO, and chief product officer, speaking in a sponsored interview. But the argument is useful because it names the control problem that runs through the day’s other stories. Powerful models require safety systems. Browser agents require permissions and user control. Enterprise agents require auditability, orchestration, and compliance. Infrastructure scale requires power and policy.

The sources do not resolve whether ServiceNow’s specific platform will capture that value, whether enterprises will build more of it themselves, or whether model providers will move up into the control layer. They do show that the agentic-AI debate is becoming less about whether software can be generated or tasks can be attempted, and more about whether actions can be authorized, observed, reversed, and trusted inside production systems.

Agentic AI Is Making Enterprise Software a Control Layer (Alex Kantrowitz)
