IBM Bets AI Value Will Come From Orchestration, Not Model Size
IBM chief executive Arvind Krishna argues that the next phase of enterprise AI will be defined less by ever-larger foundation models than by using the right model and infrastructure for each task. Speaking with Masters of Scale host Bob Safian, Krishna says model switching will become easier, AI compute costs are likely to rise, and companies should move from pilots to scaled use cases while questioning the economics of the broader AI buildout. He also frames IBM’s $10bn quantum push as a bid to get ahead of the next hard technology curve.

IBM is betting that AI value will come from fit, not size
Krishna’s AI strategy joins two claims that many enterprise leaders are still treating separately: foundation models will become easier to switch among, while the cost of using them is likely to rise. If both are true, the advantage will not come from using the biggest model for every task. It will come from routing work to the right model, infrastructure, and governance setup for the job.
That is also how Krishna connects IBM’s present AI posture to its longer-term quantum bet. IBM is not trying to become OpenAI, Anthropic, Google, or Microsoft. It is “not going to be a hyperscaler,” he said, and it is not trying to be a foundation-model provider. His more provocative view is that foundation models will become commodities within “a year,” “two years,” or “three years” — not valueless, but easier to switch among.
“Gold is a commodity,” Krishna said. “So is iron.” The point is not that commodities are unimportant; it is that buyers will optimize across them when switching costs fall.
That view underpins IBM’s enterprise AI strategy. Krishna expects token prices to rise because the companies building large models need to justify their capital investment. At the same time, he said, the underlying GPU pricing that these systems depend on has “doubled in the last six months” on a per-hour basis. The source also showed VICE and Yahoo Finance/Digital Trends headlines about possible 2026 AMD and Nvidia GPU price increases. If AI compute becomes more expensive, Krishna argued, enterprise users will have stronger incentives to route each task to the cheapest adequate tool.
His analogy was deliberately simple. A car and an 18-wheeler are both automobiles if viewed from far enough away. A family could use an 18-wheeler to take children to school or buy milk, but that would be absurdly inefficient. The same truck is the right tool for moving house. Krishna’s claim is that enterprises are currently using the AI equivalent of the 18-wheeler for too many tasks.
“I think right now we’re using the 18-wheeler for everything,” Krishna said.
Krishna predicted that a shift toward more task-appropriate AI use will happen within 24 months, though not necessarily within 12. That is where he sees IBM’s role: helping enterprises use different models “in the most economic way possible,” while doing so safely. He tied that to IBM’s broader hybrid-cloud posture and to a “fit for purpose” view of technology: different workloads may call for different infrastructure depending on cost, security, sovereignty, and the nature of the work.
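The routing idea can be illustrated with a small sketch. It is not IBM's method or any product's API: the model names, cost units, and capability scores below are all hypothetical, chosen only to show what "cheapest adequate tool" means as a selection rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str              # hypothetical label, not a real product
    cost_per_task: float   # illustrative cost units, not real prices
    capability: int        # illustrative 1-10 score of what the model can handle


# Hypothetical catalog: roughly the "car", the "van", and the "18-wheeler".
CATALOG = [
    ModelOption("small-model", cost_per_task=0.01, capability=3),
    ModelOption("mid-model", cost_per_task=0.10, capability=6),
    ModelOption("large-model", cost_per_task=1.00, capability=9),
]


def route(required_capability: int, catalog=CATALOG) -> ModelOption:
    """Return the cheapest model whose capability meets the task's requirement."""
    adequate = [m for m in catalog if m.capability >= required_capability]
    if not adequate:
        raise ValueError("no model in the catalog is adequate for this task")
    return min(adequate, key=lambda m: m.cost_per_task)


# A routine task goes to the small model; a hard one goes to the large model.
print(route(2).name)
print(route(8).name)
```

In practice the hard part is estimating what a task requires, and a real router would also weigh the security and sovereignty constraints Krishna mentions. The sketch covers only the cost comparison.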
Bob Safian pressed whether AI tools are simply getting more expensive to use. Krishna said the pressure is rising, and he compared the current moment to earlier internet-era business models where user growth came before economics. Pre-public companies can tolerate losses while they pursue customer counts, he said, but eventually “the economics are important.” He thinks that reckoning may be about a year away.
Watson’s lesson was not that IBM was too early, but that it built the wrong way
Krishna was unusually direct about IBM’s earlier AI history. Watson’s Jeopardy! win, in his account, “woke the world up” because AI had done something many people thought it could not do. But IBM’s follow-through failed strategically.
The mistake was not simply that the technology was early. Krishna said IBM tried to create vertical solutions too quickly instead of building reusable blocks. It pursued “a monolithic application,” and chose healthcare — “perhaps the hardest” domain. Then it compounded that by entering a market where IBM did not know the customers or the regulatory environment.
“How many IBMers sell to doctors and how many IBMers deal with the FDA? Like none,” Krishna said. “So, you pick the wrong solution set in an industry you know nothing about, and with a customer you know nothing about. Other than that, it was pretty good.”
The Watson example matters because it clarifies how Krishna distinguishes invention from commercialization. IBM can identify a wave and still fail to turn it into durable advantage if the business model, domain knowledge, and product architecture do not align. He put public cloud and client-server computing in the same category of missed waves: not failures of technical awareness, but failures to make investment, return, and business model fit together.
That experience appears to shape his current AI discipline. IBM is avoiding the foundation-model arms race not because Krishna dismisses AI, but because he believes enterprise clients will need help using AI safely, economically, and at scale. The error with Watson, in Krishna’s account, was jumping too quickly to a vertical application in a domain IBM did not deeply understand. The intended correction is to help enterprises apply AI inside domains they already understand, rather than to make IBM the owner of the largest general-purpose model.
“Day zero” means fewer pilots and more scaled work
Krishna’s “day zero” phrase does not mean companies can relax. He explicitly rejected the idea that skepticism about AI economics means skepticism about AI’s usefulness.
AI, he said, is “an incredible productivity tool,” and organizations that do not take advantage of it will be “perpetually disadvantaged” compared with those that do. He expects it to optimize marketing, coding, enterprise operations, sales, and daily work. The “day zero” point is that the race is starting in earnest: leaders should stop treating AI as a school experiment and start learning how to scale it.
His advice is not to launch a hundred projects. It is to pick “three, four, five things” and learn how to do them at scale. That forces the harder organizational questions: how to handle change management, organize data, motivate people to change a process, and build the confidence to expand from a few scaled use cases to 10 and then 20.
Krishna used a power-law framing for enterprise AI adoption. Bob Safian suggested a K-shaped pattern, with tech companies advancing quickly while many other businesses fall behind. Krishna said corporate performance is even more differentiated than that: closer to a 20-80 rule, with 20% of companies understanding and pursuing returns, and 80% either not getting returns or not knowing what to do.
For companies in the 80%, Krishna’s advice was blunt: start somewhere. He argued that the first AI leader inside a business does not need to be a PhD-level AI expert. When one client asked him for “a deep AI expert,” Krishna said he recommended someone from the client’s domain instead — someone who may not know AI deeply but understands how AI could change that domain.
That distinction was central. Inventing AI requires computer science depth. Deploying AI inside a company requires curiosity, willingness to adapt, and domain knowledge. Every company already has some people who understand its work and are motivated to learn a new way to do it. Krishna’s view is that those people are often more useful for adoption than outside AI specialists who lack context.
The ROI comes after the organization learns to repeat itself
Krishna did not present IBM’s internal AI savings as instant or frictionless. IBM has discussed unlocking roughly $4 billion to $4.5 billion of efficiency from AI, while many companies implementing AI do not see immediate savings. Krishna said IBM’s first six months to a year probably cost more than it saved.
The costs were not limited to tokens. IBM had “a couple of hundred engineers” working on the effort, which was incremental expense. Infrastructure was another cost. There was also opportunity cost: those people could have been working on revenue-generating projects.
The return came after IBM learned a repeatable method. Once the company was applying the approach not across two or three areas but across 10 or 20, the economics changed. Krishna said that when savings reach “a billion dollars a year,” the cost of a few hundred people is manageable. After year two, IBM was “definitely getting a return that was 10x” what it was spending. By year four, he expects IBM to be over $5 billion in savings compared with its year-end 2022 spending baseline.
One practical objection came from another CEO: AI-generated code may save engineering time, but if a small number of errors require large amounts of work to find and fix, the company may not come out ahead. Krishna’s response was that this is often the wrong kind of use case. Companies should not begin with applications where a mistake means undoing six months of work or hundreds of millions of dollars of investment.
Customer service is a better example, in his view, because a wrong answer is bounded: the company has to fix one customer interaction. The system can also include evaluations and checks. AI can be constrained, monitored, and made to hand off uncertain cases to people.
Krishna compared this with human performance. In customer service, he suggested, humans may be right around 85% of the time and wrong 15% of the time for ordinary human reasons: anger, tone, overconfidence, imperfect memory. AI, if constrained, may be “probably 95% correct,” he said. The important design principle is not to let it operate freely in “the under-confident range.” If the model is uncertain or appears far off, it should punt to a human.
Bob Safian objected that AI is always confident. Krishna disagreed. If instructed not to pretend confidence, he said, it can report uncertainty. A second model can also be used to check the first, with the checker rewarded for finding mistakes. Krishna likened this to “four eyes” in software development: one person codes, another reviews. In AI systems, one model can produce and another can check.
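A minimal sketch of that design, under stated assumptions: the producer reports a confidence score, a second checker reviews the answer, and anything uncertain or flagged goes to a person. The 0.8 threshold and the toy producer and checker functions are hypothetical stand-ins so the example runs without any model. Krishna did not describe a specific implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    answer: str
    confidence: float  # self-reported, 0.0 to 1.0 (assumed to be available)


def handle(question: str,
           producer: Callable[[str], Draft],
           checker: Callable[[str, str], bool],
           min_confidence: float = 0.8) -> str:
    """Answer automatically only when the producer is confident and the checker agrees."""
    draft = producer(question)
    if draft.confidence < min_confidence:
        return "HANDOFF: low confidence"
    if not checker(question, draft.answer):
        return "HANDOFF: checker flagged the answer"
    return draft.answer


# Stand-ins for the two models; a real system would call models here.
def toy_producer(question: str) -> Draft:
    known = {"reset password": Draft("Use the reset link on the sign-in page.", 0.95)}
    return known.get(question, Draft("Not sure.", 0.30))


def toy_checker(question: str, answer: str) -> bool:
    # Toy rule: an answer to a "reset" question must mention "reset".
    if "reset" in question:
        return "reset" in answer.lower()
    return True


print(handle("reset password", toy_producer, toy_checker))
print(handle("refund for a cancelled order", toy_producer, toy_checker))
```

The first call returns the drafted answer; the second is handed off because the producer's confidence falls below the threshold. Keeping the checker as a separate function mirrors the "four eyes" idea: the reviewer is not the author.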
Productivity does not imply one labor outcome across the company
Krishna drew a sharp line between work that creates value and work required to run operations. That distinction shaped his answer on job displacement.
IBM software developers, he said, are about 40% more productive than they were two years ago. One could infer that IBM therefore needs 40% fewer developers. Krishna said IBM did the opposite: it tripled college-level entry hiring compared with the previous year.
His explanation was economic. If software development becomes cheaper, products that were not economically viable three years ago may become worth building. Those products can generate revenue and margin, which creates demand for more people. Krishna put software development, sales, marketing, and consulting in the value-creating category.
The other category is the operational work needed to run the company: compliance, accounts payable, procurement, and similar functions. Krishna estimated that this may be about 20% of the enterprise. He does not think it goes to zero, but he said he would not be surprised if about 30% of headcount in those areas is not needed within a few years.
His overall forecast was not “no displacement.” It was displacement in some areas and increased demand in others. Net, he said, he expects increased demand for jobs, but “there is some displacement, which is always a little bit painful.”
On responsibility, Krishna argued that business leaders should provide opportunity: upskilling, reskilling, and access to other jobs. But he drew a boundary. Companies cannot force people to take those opportunities. In his experience, the response is roughly 50/50: some people step up, while others say they do not want retraining and want the old job back.
Krishna said IBM tries to be compassionate and does not force people out “in a day.” But if, over six or nine months, employees are not willing to learn the skills needed elsewhere, he said retaining them can become unfair to the “other 90%” who did.
The AI buildout math does not work for everyone
Krishna’s skepticism about AI economics became most explicit in his discussion of data centers. He said that if one takes all the verbal promises seriously, about 125 gigawatts of AI data centers are expected to come online in the next two to three years. He translated that into $8 trillion to $12 trillion of total capital expenditure, not all in one year.
He does not see the economics for that scale. Such investment would imply close to $1 trillion of profit, which in turn would require, in his framing, perhaps $4 trillion of additional revenue. “Where exactly is that going to come from?” he asked.
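The arithmetic behind that question can be laid out directly. The inputs below are the figures Krishna cited; the ratios are derived here for illustration and were not stated in the interview. The interview does not say over what period the profit would need to be earned.

```python
# Figures as quoted in the interview; the ratios below are derived here, not stated by Krishna.
GIGAWATTS = 125
CAPEX_LOW, CAPEX_HIGH = 8e12, 12e12   # total capital expenditure, USD
PROFIT_NEEDED = 1e12                  # "close to $1 trillion", USD
REVENUE_NEEDED = 4e12                 # "perhaps $4 trillion", USD

capex_per_gw_low = CAPEX_LOW / GIGAWATTS
capex_per_gw_high = CAPEX_HIGH / GIGAWATTS
implied_margin = PROFIT_NEEDED / REVENUE_NEEDED
profit_to_capex_low = PROFIT_NEEDED / CAPEX_HIGH
profit_to_capex_high = PROFIT_NEEDED / CAPEX_LOW

print(f"Capex per gigawatt: ${capex_per_gw_low / 1e9:.0f}B to ${capex_per_gw_high / 1e9:.0f}B")
print(f"Implied profit margin on the new revenue: {implied_margin:.0%}")
print(f"Profit as a share of capex: {profit_to_capex_low:.1%} to {profit_to_capex_high:.1%}")
```

On these numbers, each gigawatt carries tens of billions of dollars of capital, and the $1 trillion profit figure corresponds to a 25% margin on $4 trillion of revenue that does not yet exist.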
He does not predict that all of it fails. At least half may work out well, he said, and some companies will thrive. But some will disappoint. The commodity logic applies here too: if foundation models become easier to switch among, Krishna doubts there is room for a dozen successful foundation-model businesses globally. There may be room for three or four. Since around a dozen are being pursued, he sees inevitable losers.
The infrastructure distinction matters because different computing systems fit different kinds of work. A modern data center generally means a collection of similar boxes — hundreds, thousands, tens of thousands, perhaps hundreds of thousands — where work can be divided across machines and coordinated through networking or optics. AI inference queries, for example, can be parallelized because one user’s answer does not need to know another user’s answer.
A mainframe, by contrast, is “one box” designed for a different kind of workload: one piece of work with enormous volume and strict consistency requirements. Krishna’s example was airline reservations. If an airline sells one seat on one flight, it should not sell the same seat to someone else.
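The consistency requirement in the airline example can be shown with a toy sketch. It illustrates only the rule that a seat is sold at most once, even when many buyers race for it. It is not a description of how a mainframe or a real reservation system works, and the class and seat names are hypothetical.

```python
import threading


class SeatInventory:
    """Toy single-flight inventory where each seat can be sold at most once."""

    def __init__(self, seats):
        self._owner = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def sell(self, seat: str, passenger: str) -> bool:
        # The check and the write happen under one lock, so two buyers
        # racing for the same seat cannot both succeed.
        with self._lock:
            if seat not in self._owner or self._owner[seat] is not None:
                return False
            self._owner[seat] = passenger
            return True


inventory = SeatInventory(["12A", "12B"])
results = []


def buyer(name: str) -> None:
    results.append(inventory.sell("12A", name))


# Fifty buyers try to purchase the same seat at the same time.
threads = [threading.Thread(target=buyer, args=(f"passenger-{i}",)) for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results.count(True))  # exactly one buyer gets seat 12A
```

Unlike the inference queries described earlier, these requests cannot be answered independently: every sale has to see the result of every earlier sale for the same seat.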
He connected this back to his larger “fit for purpose” doctrine. Mainframes were declared dead repeatedly, he said. Krishna recalled that in 1993, he thought Time magazine had shown a mainframe dressed as a dinosaur under a “death of the mainframe” framing. But the right question, in his view, is not whether one architecture is fashionable. It is which workload is suited to which architecture. GPUs are not ideal for running a smartphone if the battery would die in three minutes. AI training, inference, web serving, streaming, and transactional systems all have different infrastructure needs.
Sovereignty adds another constraint. Especially outside the United States, Krishna said, organizations care deeply about which government has control over the technology stack. That affects where workloads should run, alongside cost and security.
AI lowers the skill threshold for cyberattack — and for defense
Bob Safian described IBM’s Project Lightwell as a $5 billion initiative to identify and fix vulnerabilities in the open-source world, and said it was reportedly triggered by Anthropic’s release of Mythos. Krishna’s first point was reassuring but limited: in IBM’s own work over recent months, Mythos had not found anything that other models could not find.
The problem was usability. IBM already has tens of thousands of experts using models to find and fix vulnerabilities. Experts could do this work before Mythos. But Mythos made the capability “way easier to use,” Krishna said. That expands the attack surface because a task that previously required one of a small number of experts can now be attempted by someone with average skills.
Krishna’s response was to turn the same capability toward defense. If foundation models can help write and understand code, they can also help patch open source. IBM’s idea, as he described it, is to operate as a kind of clearing house: a customer can provide a vulnerability, IBM can generate what it believes is a well-constructed patch, and then other members of a closed set can be alerted that a vulnerability has been found and receive the fix without learning who found it or where it was being used.
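The clearing-house flow Krishna described can be sketched as a small data structure. Everything here is a hypothetical illustration, not IBM's design: the class, the member names, and the `generated_patch` string stand in for whatever the real service would use. The point is that other members receive the fix while the advisory carries no information about who reported it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Advisory:
    package: str
    vulnerability: str
    patch: str   # what other members receive; there is deliberately no reporter field


@dataclass
class ClearingHouse:
    members: set = field(default_factory=set)
    inbox: dict = field(default_factory=dict)   # member name -> list of advisories

    def join(self, member: str) -> None:
        self.members.add(member)
        self.inbox.setdefault(member, [])

    def report(self, reporter: str, package: str,
               vulnerability: str, generated_patch: str) -> Advisory:
        """Record a fix and alert every other member without naming the reporter."""
        if reporter not in self.members:
            raise PermissionError("only members of the closed set can report")
        advisory = Advisory(package, vulnerability, generated_patch)
        for member in self.members - {reporter}:
            self.inbox[member].append(advisory)
        return advisory


hub = ClearingHouse()
for name in ("bank-a", "retailer-b", "airline-c"):
    hub.join(name)

# "patch-001.diff" is a placeholder for a patch the service would generate.
hub.report("bank-a", "example-lib", "buffer overflow in parser", "patch-001.diff")
print(len(hub.inbox["retailer-b"]), len(hub.inbox["bank-a"]))
```

The sketch leaves out the hard parts, including generating and validating the patch and deciding who may join the closed set.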
Krishna emphasized that the project is not purely altruistic. He said it is good for society, but IBM intends to charge “a fair price,” not “a usurious price.” Without current AI tools, he said, IBM could not plausibly offer to take a piece of open source and return a strong patch against a vulnerability at that scale.
The emerging cyber environment is not, in Krishna’s view, driven by a new motive. He invoked the Willie Sutton line about robbing banks because that is where the money is: cyber infrastructure is attacked because that is where the data is, and data is where the money is.
The good news, in Krishna’s formulation, is that this is not brand new. The bad news is that AI makes it faster and lowers the threshold. Capabilities once associated with three or four nation-states could open up to a couple dozen. Smaller and midsize organizations may become more viable targets because the “potential function” has come down. His warning was simple: if an organization does not think it is protecting itself, “it is only a matter of time.”
Quantum is IBM’s attempt to get ahead of the next hard curve
Krishna tied IBM’s quantum investment to the same strategic logic he used for AI and cloud: if a company can get ahead of a difficult technology curve by a couple of years, it can create outsized returns for itself and its clients. IBM believes quantum is such a curve.
Bob Safian described IBM as partnering with the U.S. government on a new Quantum Foundry and investing $10 billion in a large-scale commercial quantum computer. Krishna said IBM’s investment means the company expects a real return. He also interpreted the government’s willingness to invest as a sign that it had done its homework and agreed “it is now time to scale this as an industry.”
Krishna explained quantum’s potential by comparing it with CPUs and GPUs. CPUs have handled many important problems for 60 or 70 years. GPUs handled a different class of problem, especially matrix math, that made AI and other applications practical. CPUs could do some of that work, he said, but might be “10,000 times slower.”
He then used molecular simulation to describe quantum progress. In summer 2025, he said, quantum systems could simulate a five-atom molecule — useful as a proof point, but something an expert computational chemist might solve by hand if it were simple. By November or December, they could handle a 300-atom molecule, beyond hand calculation but still feasible on a normal supercomputer. By April, they reached 12,000 atoms, entering what Krishna called “the protein realm.” Proteins, he said, are in the 10,000 to 30,000 or 40,000 atom range.
| Moment | Molecular scale Krishna described | Why it mattered in his explanation |
|---|---|---|
| Summer 2025 | 5 atoms | A proof point, but possibly solvable by hand for a simple molecule by a strong computational chemist |
| November or December | 300 atoms | Beyond hand calculation, but still something a normal supercomputer could handle |
| April | 12,000 atoms | Entering the protein realm, using a piece of trypsin as the example |
The example was a piece of trypsin, a protein. Krishna said IBM was “pretty sure” it would be at double that range in another month or two, enough to solve trypsin. If a protein’s properties can be understood with a few minutes of computation, he said, researchers may be able to understand which drug molecule could bind to it and stop its harmful behavior. That would open “a new pathway, possibly for health.”
He also named other areas where quantum may matter: fluid dynamics, aerodynamics, and calculations involving how liquids flow inside pipes. His advice to leaders was not to choose quantum instead of AI, but to understand where quantum may be in two or three years and begin developing the algorithms they will need. Waiting until the machine is ready could mean losing another two years.
The largest risk is protecting the current profit pool
Krishna’s final management argument was that avoiding risk is itself the riskiest strategy. A business that takes no risk is extracting profit from what it already has. That gives competitors time to clone it, copy it, and attack the most profitable pieces from below. The profit pool declines, conservative leaders invest even less because the company is shrinking, and decline accelerates.
“The most risky route is taking zero risk,” Krishna said.
He described that path as approaching a cliff without realizing it. In his view, history is full of companies that spent five years drifting into decline and then, five or 10 years later, were bought for parts or disappeared.
Innovation is the counterweight, but Krishna was clear that it cannot be made risk-free. Not all innovation pays off. The question is whether a company generates enough profit to fund enough innovation that some of it produces long-term return.
Bob Safian pointed out that many people say they are comfortable taking risks only if they do not lose anything. Krishna answered that humans are “incredibly loss averse.” Even if offered a chance to gain $10 while risking $1, many people fixate on losing the dollar. Leadership’s job is to counteract that tendency inside organizations.
That means changing what probability of success is acceptable. Krishna said he wants a “50% probability win,” not 90%. If leaders demand 90% certainty, they are not asking for risk. They must also avoid punishing people publicly or making them feel their jobs are on the line for reasonable failures. If a team is doing six things and four work, that can be acceptable.
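The loss-aversion point can be made concrete with a short expected-value calculation. The $10 gain, the $1 loss, and the 50% win rate come from the interview; combining them into one bet is an assumption made here for illustration, not something Krishna computed.

```python
def expected_value(p_win: float, gain: float, loss: float) -> float:
    """Expected payoff of a bet that wins `gain` with probability p_win, else loses `loss`."""
    return p_win * gain - (1 - p_win) * loss


def break_even_probability(gain: float, loss: float) -> float:
    """Win probability at which the expected payoff is exactly zero."""
    return loss / (gain + loss)


GAIN, LOSS = 10.0, 1.0   # the $10-versus-$1 bet from the interview

print(f"Break-even win probability: {break_even_probability(GAIN, LOSS):.1%}")
print(f"Expected value at a 50% win rate: ${expected_value(0.5, GAIN, LOSS):.2f}")
```

A bet with that payoff breaks even at a win rate of about 9%, so insisting on 90% certainty turns down bets that would pay off handsomely on average.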
IBM has had its own repeated failures, Krishna said. Its digital sales channel is still a work in progress and is “probably” on its fourth try during his tenure. The response to failure should not be merely to push harder. Leaders have to ask what structural issue prevented success, or whether the market was different from what they assumed, and then keep trying.
