Companies Can Build Frontier Intelligence Without Owning the Frontier Model

Sarah Guo · Latent Space · Wednesday, June 3, 2026 · 14 min read

Satya Nadella used Microsoft’s Build 2026 AI announcements to argue that the next phase of AI will be defined by ecosystems, not by companies consuming a single frontier model. In a crossover conversation with No Priors and Latent Space, Microsoft’s chief executive said enterprises and startups should be able to build their own “frontier intelligence” from models, tools, data, context, and private evaluations. His case is that durable value will accrue to companies that control those loops, rather than simply rent intelligence from a general-purpose provider.

Microsoft wants AI to be an ecosystem, not a single model to consume

Satya Nadella framed Microsoft’s AI strategy around a platform test he said he learned from earlier technology shifts: a platform matters when it enables more value to be created above it than is captured inside it. For AI, that means the important question is not whether companies use someone else’s model. They will. The question is whether an AI-native startup or a traditional enterprise can become “a first-class participant” in creating its own intelligence.

That is the organizing idea behind Microsoft’s Build announcements, in Nadella’s telling. Microsoft’s job is to provide the path: the stack, the tooling, the models, the data layer, and the “recipe” that lets companies point to AI they created themselves.

Can everybody operate at the frontier with their frontier intelligence?

Satya Nadella

Nadella contrasted that with a world in which developers are asked to “worship at the altar of one model.” That, he said, is not a developer conference. The stable equilibrium he wants is one where companies can compound their own value on top of a platform that itself keeps improving. He compared the pattern to Windows enabling Adobe and Autodesk, and to Microsoft’s DirectX becoming part of the base on which Nvidia built CUDA. In this AI shift, the analogous extension layer is a company’s own intelligence layer.

Shawn Wang described Microsoft’s possible “third act” as becoming a company around harnesses and evals, after operating systems and cloud. Nadella’s answer returned to the platform promise rather than the corporate-label framing: customers should be able to build “frontier intelligence” for their own data, rather than simply rent intelligence from a general-purpose provider.

The MAI training strategy starts with clean lineage and ends with private evals

Sarah Guo raised a complication of the ecosystem strategy: Microsoft is building some components, partnering on others, and supporting still others. Nadella said the MAI models begin with “great lineage”: high-quality pre-training data, careful ablations, and a clean enough training record to understand what is actually in the model. He argued that this has become harder, not easier, because so much material now exists in the world that has to be ablated out.

That concern shaped his critique of some open-weight models. They may look strong on one or two benchmarks, he said, but fail to be as useful “in practice.” Microsoft’s field engineers, in his account, became excited about MAI models because even a small 5B model could “hill-climb” when placed inside the right scaffold.

The model itself is not the full product. Nadella described a system in which companies start from a clean model, then build specialist capability around it: a hill-climbing scaffold, a reinforcement learning environment, collected traces, and, most importantly, private evaluations. Public evals are still of interest, but Nadella said many can now be “maxed.” The evals that matter are the ones only a company can define for the work it uniquely values.

He gave the Build demo with Land O’Lakes as his example of how he thinks about operating at the frontier over time. In Nadella’s account, Microsoft used GPT-4, collected traces, then used a 5B reasoning model to achieve higher performance. His point was that “frontier” is not only a static model leaderboard. With temporality, traces, and specialization, a smaller model inside the right process can become frontier for a particular company’s task.

5B: size of the reasoning model Nadella cited as capable of hill-climbing inside the right scaffold

This view also affects what Nadella thinks companies should treat as intellectual property. Private evals may become one of the biggest sources of defensibility. His acid test was simple: if a company can use its private eval to switch from model A to model B and keep climbing, it is in control. If it cannot, it is not.
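
The acid test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not anything Microsoft showed: a private eval is treated as a list of company-defined cases with exact-match scoring, and a model is anything that maps a prompt to an answer. The names `EvalCase`, `run_private_eval`, and `in_control` are hypothetical, and the two "models" are toy stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical types for illustration; not from any real library.
Model = Callable[[str], str]  # any model: prompt in, answer out


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


def run_private_eval(model: Model, cases: Sequence[EvalCase]) -> float:
    """Return the fraction of private cases the model answers exactly."""
    if not cases:
        raise ValueError("a private eval needs at least one case")
    passed = sum(1 for c in cases if model(c.prompt).strip() == c.expected)
    return passed / len(cases)


def in_control(current: Model, candidate: Model, cases: Sequence[EvalCase]) -> bool:
    """Acid test: can the company swap models without losing ground on its own eval?"""
    return run_private_eval(candidate, cases) >= run_private_eval(current, cases)


# Toy stand-ins for "model A" and "model B" (assumptions, not real models).
def model_a(prompt: str) -> str:
    return {"2+2": "4"}.get(prompt, "unknown")


def model_b(prompt: str) -> str:
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "unknown")


cases = [EvalCase("2+2", "4"), EvalCase("capital of France", "Paris")]

print(run_private_eval(model_a, cases))     # 0.5
print(run_private_eval(model_b, cases))     # 1.0
print(in_control(model_a, model_b, cases))  # True
```

The design point is that the eval depends only on the `Model` call signature, so the same cases can score any provider's model. That independence is what makes the switch from A to B measurable.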

The hard lesson was not scaling intelligence, but deploying it into real work

Satya Nadella said he originally became excited by the scaling laws paper and by OpenAI’s plan to “throw a lot of compute at transformers.” The broad scaling intuition, he said, has held up: in his crude formulation, “intelligence is a log of compute.” What he thinks the industry underestimated was not the capability curve, but the real-world complexity of deploying models so they produce measurable value.
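
Read literally, the crude formulation can be written as a one-line relation. The symbols here are placeholders introduced for illustration: I for capability, C for compute, and constants a and b that Nadella did not specify.

```latex
I(C) \approx a + b \log C, \qquad I(2C) - I(C) \approx b \log 2
```

Under that reading, each doubling of compute adds a constant increment of capability rather than a proportional one, which is consistent with his point that the harder problem lies in deployment rather than in the curve itself.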

Benchmarks are interesting and important, he said, but the “true eval” is whether people can do valuable things in the real world that only they can measure. He connected that to customer resistance to token-max pricing. If customers do not want a token meter, he suggested, part of the reason is that the industry has not fully taught itself to think of tokens as being used to create value at every step.

The most obvious area where value has shown up is coding. But Nadella argued that coding’s success has created a second-order problem: the tools now need to be rebuilt. If a developer has “a hundred agent sessions,” the cognitive load returns to the human in a new form. A chat transcript is not enough as the only artifact, which is why Microsoft is moving toward new IDE and canvas interfaces.

The same pattern, he said, will apply outside code. Nadella pointed to “autopilot” work and Klarna as examples of agents handling the glue work that consumes much human capital inside organizations. If agents become long-running and durable, operating with delegated authority, they can work through the night on tasks that still require judgment and coordination. The next user interface problem becomes: what did the agents do, should the human approve it, and how does the human inspect the work?

The enterprise harness is where models, tools, data, and context meet

Elad Gil pressed Nadella on the idea of a “harness”: in coding, the agent is surrounded by an environment, context, and developer setup. Nadella generalized that concept to enterprise AI. The harness defines the models, data, and tools, and creates a loop among them.

Microsoft’s own AI products, he said, are built as multi-model harnesses. He named GitHub Copilot, Security Copilot, a product the transcript renders as M-Dash, and Discovery for Science. Each uses multiple models, tool access, and progressive tool disclosure to remain token-efficient. The other hard lesson of the last two years, he said, is the importance of the context layer. Preparing the right context so a plan can execute efficiently is “where the magic is.”
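
One way to picture progressive tool disclosure is the sketch below. It is an assumption about the general idea, not a description of how any Microsoft harness implements it: every tool contributes a one-line summary to the model's context, and the full schema is added only for tools the plan actually requests. The tool names, schemas, and the word-count proxy for tokens are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    summary: str      # one line, always shown to the model
    full_schema: str  # detailed usage, shown only on request


# Hypothetical tool registry; names and schemas are invented for illustration.
TOOLS = {
    "search_code": Tool(
        "search_code",
        "Search the repository for a string.",
        "search_code(query: str, path_glob: str = '**/*', max_results: int = 20) -> list of matches",
    ),
    "run_tests": Tool(
        "run_tests",
        "Run the test suite.",
        "run_tests(target: str = 'all', fail_fast: bool = False, timeout_s: int = 600) -> test report",
    ),
}


def build_context(requested: set[str]) -> str:
    """Show every tool's summary, but the full schema only for requested tools."""
    lines = []
    for tool in TOOLS.values():
        detail = tool.full_schema if tool.name in requested else tool.summary
        lines.append(f"- {tool.name}: {detail}")
    return "\n".join(lines)


def rough_token_count(text: str) -> int:
    """Crude proxy for tokens: whitespace-separated words (an assumption)."""
    return len(text.split())


initial = build_context(requested=set())           # summaries only
expanded = build_context(requested={"run_tests"})  # one schema disclosed
everything = build_context(requested=set(TOOLS))   # naive: disclose all up front

print(rough_token_count(initial), rough_token_count(expanded), rough_token_count(everything))
```

With only two tools the saving is small, but the gap between `initial` and `everything` grows with the size of the registry, which is where the token efficiency would come from.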

The GitHub harness is now used across Microsoft products and is available in Foundry, according to Nadella. But he emphasized that Microsoft is not requiring one harness. A company could use a Llama harness, an open harness, or its own harness, then train it with its tools, models, and context.

Nadella offered M-Dash as evidence that a multi-model harness can outperform a more vertically trained setup in the real world. The name is hedged here because the source transcript does not make the product spelling certain; the same is true of the comparison system it renders as Mythos. His substantive claim was clear: when M-Dash launched, it found bugs or vulnerabilities not found by Mythos. He presented that as an existence proof that training the harness, tools, and model together is not the only path to strong real-world performance.

The harness also changes the skill set required of developers and enterprises. In Nadella’s formulation, the valuable AI-native company knows how to combine private evals, context, tools, and model choice into a hill-climbing loop. That is true for startups, SaaS companies, and traditional enterprises alike.

Company knowledge becomes traceable, trainable, and possibly balance-sheet-like

Satya Nadella did not argue that human capital becomes irrelevant. He argued almost the opposite: humans continue to create value by finding gaps, while token capital increases the leverage of that work. The important question is how the two compound.

He described an enterprise where Teams contains both humans and agents doing work, and the traces among them become a record of how the company creates value. Those traces can train not a generalist model, but a “company veteran agent.” That agent would encode some of the tacit knowledge that has historically lived in experienced employees and informal organizational memory.

Nadella said a company leader had suggested such an asset should go on the balance sheet. He took the idea seriously. Human capital itself was never easy to place on a balance sheet because tacit knowledge was hard to capture. With agents learning from long-running traces, he argued, that capture becomes more plausible.

SaaS will be unbundled, rebundled, and priced in more than one way

The “end of software” debate, in Satya Nadella’s view, is really a debate about how software’s layers get repackaged. Sarah Guo put the question in terms of SaaS: common horizontal and vertical workflows were captured in applications, while each enterprise also has differentiated internal work. If workflows become cheap to generate, the equilibrium between vendor software and enterprise-built agents changes.

Nadella answered by decomposing what SaaS actually captured: a data model, a schematized business process, business logic, and UI. He does not think all of that disappears. The data model underneath many SaaS applications remains valuable. A general ledger, he said, should still be a general ledger; there is no need to reinvent that schema. Business logic also remains valuable. He used Power BI as an example: the visible dashboards matter less than the rich semantic model and measures underneath them, which someone took the time to create.

The challenge for SaaS companies is packaging. Software was bundled one way for roughly two decades. AI agents force companies to unbundle those layers and rebundle them in new ways, with new business models.

Microsoft 365 is Nadella’s central example. Work IQ exposes what he called perhaps the most important database in a company: the corpus of email, Teams, Word, Excel, PowerPoint, and SharePoint activity that previously served only Microsoft’s own apps. With Work IQ, he said, he can ask an agent to connect design meetings from the previous week to a GitHub repo and return a plan for code changes. That use of Microsoft 365 would not previously have been imaginable.

He said this creates a much larger value opportunity, but also forces architectural changes. The systems built to serve a mailbox or inbox are not necessarily the systems required to serve an agent. Agent usage of M365, he said, is going to be “perhaps more than even the end users,” and the infrastructure must be rethought accordingly.

Pricing will change as well, but Nadella rejected a single future model. Per-user pricing exists because customers need budget certainty; it is a bundle of entitlements to usage. Subscriptions will remain. Consumption pricing will grow. Outcomes-based pricing will appear, but he warned that customers often like outcome pricing only until the outcome is valuable enough that the provider appears to be taking a royalty. In his phrasing, “most people love outcomes until they have an outcome.”

GitHub Copilot forced this adjustment. It was originally built around per-user pricing for interactive code completion and developer tasks. It was not designed for a world where a customer might launch 10,000 agents running all day. Nadella said there will still be per-user pricing, but there also has to be a consumption meter.
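
The combination of per-user pricing and a consumption meter can be sketched as a simple bill calculation. This is an illustrative model only: the `Plan` fields, the pooled allowance, and every number are assumptions, not GitHub Copilot's actual pricing. Amounts are kept in integer cents to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    # All prices and allowances are made-up numbers for illustration.
    seat_price_cents: int     # flat monthly fee per user
    included_units: int       # usage entitlement bundled into each seat
    overage_price_cents: int  # price per unit beyond the pooled entitlement


def monthly_bill_cents(plan: Plan, seats: int, units_used: int) -> int:
    """Seats give budget certainty; the meter charges only past the pooled allowance."""
    pooled_allowance = seats * plan.included_units
    overage_units = max(0, units_used - pooled_allowance)
    return seats * plan.seat_price_cents + overage_units * plan.overage_price_cents


plan = Plan(seat_price_cents=2_000, included_units=1_000, overage_price_cents=1)

# Interactive use stays inside the entitlement: the bill is just the seats.
print(monthly_bill_cents(plan, seats=50, units_used=30_000))     # 100000

# A fleet of agents running all day goes far past it: the meter dominates.
print(monthly_bill_cents(plan, seats=50, units_used=5_000_000))  # 5050000
```

In the second case the seat fee is about 2% of the bill, which illustrates why a product priced only per user is not designed for agent-scale usage.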

Elad Gil pushed the same issue from the buyer side: internal teams are realizing they can generate software quickly and are beginning to threaten to rebuild applications rather than pay SaaS vendors. Nadella said the market needs at least one full budget cycle to find its equilibrium. His decision rule was economic rather than ideological: a company should acquire software when the marginal cost of building and maintaining it internally is higher than buying it.
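
The decision rule reduces to a comparison, sketched below. The cost categories and figures are hypothetical; the only part taken from Nadella's account is the rule itself: buy when building and maintaining internally costs more than the vendor's price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildEstimate:
    # Hypothetical yearly cost categories, in whole currency units.
    generation_cost: int   # generating the software, amortized per year
    maintenance_cost: int  # tokens and tooling for fixes, including security patches
    ownership_cost: int    # human time to own the cycle

    def total(self) -> int:
        return self.generation_cost + self.maintenance_cost + self.ownership_cost


def should_buy(build: BuildEstimate, vendor_price: int) -> bool:
    """Buy when building and maintaining internally costs more than the vendor."""
    return build.total() > vendor_price


# Made-up example: a general ledger is expensive to own, so buying wins.
ledger = BuildEstimate(generation_cost=40_000, maintenance_cost=90_000, ownership_cost=120_000)
print(should_buy(ledger, vendor_price=100_000))  # True

# Made-up example: a narrow internal workflow is cheap to own, so building wins.
workflow = BuildEstimate(generation_cost=5_000, maintenance_cost=10_000, ownership_cost=20_000)
print(should_buy(workflow, vendor_price=60_000))  # False
```

Keeping maintenance and ownership as separate inputs reflects the point that generation is only the first cost: someone still has to fix security issues and own the cycle.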

The maintenance piece matters. AI may generate the software, but security issues still have to be fixed quickly. Coding agents can help, but that consumes tokens. Someone has to own the cycle. Nadella expects the first wave of excitement — “I can generate a lot of software” — to give way to a more selective question: what software should a company generate, what should it use from others, and how should the two compose into agentic workflows the company controls?

He predicted little tolerance for inflexible vendors. But vendors that remain flexible, deliver value, and support new business models will still sell software. The product category does not vanish; the packaging and commercial model change.

Engineering roles widen as agents raise the leverage of generalists

Satya Nadella said agentic tools have made it possible for even a CEO to exercise more agency over software artifacts. He described using GitHub Copilot and newer sessions-style apps to inspect, learn from, and touch codebases that would previously have been inaccessible to many non-engineers.

He said he recently built a long-running Foundry agent that functions as a chief-of-staff autopilot. It used Work IQ, stored memory through a backend service the transcript renders as Rayfin, and published into Teams. The point was not that CEOs should become production engineers, but that knowledge workers can now move more fluidly from documents and spreadsheets into app-like artifacts.

Elad Gil asked whether engineering roles collapse into a smaller set: agent managers, forward-deployed engineers, security engineers, and large-scale infrastructure engineers. Nadella said organizations will have to experiment, but he gave LinkedIn as one example of structural change. LinkedIn created a discipline called “full stack builder,” combining design, product management, and frontend engineering while preserving each person’s “edge.”

Infrastructure, meanwhile, becomes more important even inside teams that used to be considered end-user application teams. Nadella said the Excel team’s reinforcement learning environment — the system in which a reward can be learned — is now one of its hardest infrastructure problems. That requires distributed systems talent in places where it might not previously have been expected.

Specialists will remain, but Nadella said the generalist role may become the most exciting because its leverage rises sharply. A person who previously produced Word documents, spreadsheets, or other knowledge-work artifacts can now build apps in the same workflow. The generalist does not replace expertise; the generalist’s scope expands.

Ambition means redesigning the work, not just doing it faster

For Satya Nadella, ambition in an AI era requires a new conceptual model for how work changes, not just a productivity tool for doing existing work more cheaply.

He cited a line from Kevin Scott: making hard things easier is one kind of leverage, but real ambition is making the impossible possible. The missing piece in many organizations, Nadella said, is the model for what formerly impossible work can now be built.

His example came from Azure networking. Microsoft built more Azure capacity in the prior 15 months than it had built in the first 15 years, he said. The same team responsible for Azure networking realized the old way of working could not scale. They reframed their job: not doing Azure networking directly, but building the agentic system that does Azure networking.

That system, called Miles, helps manage a global WAN involving more than 500 fiber operators. Fiber operations remain physical: lines get cut, repairs have to happen, emails arrive, responses are needed. The team’s conclusion, according to Nadella, was not that it needed more headcount, but that it needed more tokens.

He described this as making the work “meta.” The team’s new work is to build and supervise the system that performs the operational work. Nadella compared it to the 1980s: if someone had predicted that four billion people would wake up and type every day, the wrong conclusion would have been that the world needed four billion typists. The right conclusion was that typing became part of knowledge work.

True ambition is about making the impossible possible.

Satya Nadella

The data center build-out requires community permission

Shawn Wang turned the discussion to data centers, and Satya Nadella acknowledged the scale as extraordinary. But he argued that the industry’s permission to build will depend on whether communities experience tangible benefits, not just whether hyperscalers can finance and construct capacity.

The concerns he named were direct: energy prices, grid capacity, water use, jobs, training, and the local tax base. Communities need to believe data centers are not raising their energy costs and may, over the long term, help produce a stronger grid and more energy. They need to understand closed-loop water systems and replenishment. They need to see construction jobs, post-construction jobs, and tax-base effects as real rather than promotional.

If those conditions are met, Nadella said, the industry will have permission. If not, it will not. He said community skepticism is appropriate: communities should ask hard questions, and companies should have to earn trust.

His broader argument was that energy-intensive industries have historically been accepted when they create broad societal value. If a token economy drives productivity, economic growth, wider participation, and better health outcomes, he thinks the story can be positive. If it consumes resources without broad value, it will not be.

That led to his broader update on AI’s societal impact. He does not think the technology industry can rely on a “trust us” story about a glorious future. In the next 12 to 18 months, he said, people need to see that they have a real shot to participate as first-class actors in the new economy: creating startups, running local stores more efficiently, seeing healthcare benefits, or otherwise experiencing tangible gains. Political support will depend on that reality being visible. As Nadella put it, the industry has to “deliver tangible benefits.”

Education may need new credentials, not just better tutoring

Sarah Guo suggested that wealth creation and healthcare benefits from AI are becoming easier to see, but education has not yet shown the same impact. Satya Nadella said the question requires rethinking education itself: incentives, credentials, the value placed on those credentials, and the employment opportunities attached to them.

He mentioned meeting the founders of Alpha School and being struck by their approach to rethinking what education can look like. At the same time, he cautioned against assuming traditional learning no longer matters. He referred to a Stanford computer science class’s AI guidelines as an example: students still need to learn concepts such as applying softmax appropriately, rather than simply asking a model to fix a training run.

Access to information, self-education, and continuous updating have changed substantially. The unresolved institutional question is how to convert that into trusted credentials and economic opportunity. Nadella suggested the next major startup success might be a new university, or a new pedagogy that guides people through a curriculum and into valuable work.
