Orply.

Generative AI’s Revenue Stack Is Still Inverted Toward Chips

Apoorv Agrawal | Stanford Online | Wednesday, May 20, 2026 | 11 min read

Stanford adjunct lecturer and Altimeter partner Apoorv Agrawal argues in MS&E435 that generative AI’s economics still look unlike the software and cloud cycles investors often use to value it. In his estimates, AI revenue has grown sharply, but gross profit remains concentrated in semiconductors, while applications face inference costs, thin monetization and uncertain paths to mass-market utility. The question he puts to students is not whether AI demand exists, but how long the stack’s inverted shape can persist before applications and infrastructure capture more of the value.

AI value is still concentrated where the old software model least applies

Apoorv Agrawal frames the economics of generative AI around a simple question: “Where’s the money?” His answer, based on the estimates he presents, is that the money is not yet where the software industry has learned to expect it.

On Agrawal’s slide, the cloud stack has a familiar-looking distribution: roughly $600 billion of annual app revenue, $300 billion of infrastructure revenue, and $80 billion of semiconductor revenue. The AI stack, by contrast, is shown as an inverted triangle: about $60 billion in applications, $75 billion in infrastructure, and $300 billion in semiconductors. The slide attributes semiconductor revenue to recent Nvidia, Broadcom, and AMD results; the AI cloud, infrastructure, and application estimates are based on Altimeter internal estimates.

Stack                             Apps     Infra    Semis
Cloud estimated annual revenue    $600B    $300B    $80B
AI estimated annual revenue       $60B     $75B     $300B
Agrawal's central slide compares a cloud stack with large application revenue to an AI stack still concentrated in semiconductors.
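To make the inversion concrete, the revenue figures in the table can be converted into each layer's share of its stack. This is a minimal sketch that uses only the estimates quoted above; the names `STACKS` and `layer_shares` are illustrative and do not come from Agrawal's materials.

```python
# Revenue figures in billions of US dollars, copied from the table above.
STACKS = {
    "Cloud": {"Apps": 600, "Infra": 300, "Semis": 80},
    "AI": {"Apps": 60, "Infra": 75, "Semis": 300},
}


def layer_shares(layers):
    """Return each layer's share of the stack's total revenue, in percent."""
    total = sum(layers.values())
    return {name: 100 * revenue / total for name, revenue in layers.items()}


for stack, layers in STACKS.items():
    shares = layer_shares(layers)
    summary = ", ".join(f"{name} {share:.0f}%" for name, share in shares.items())
    print(f"{stack}: {summary}")
```

On these estimates, applications are about 61% of cloud revenue but about 14% of AI revenue, while semiconductors are about 8% of cloud revenue and about 69% of AI revenue.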

That inversion is the organizing problem. The industry is spending heavily on the physical substrate: energy, chips, power, interconnects, memory, and data centers that can be rented by the hour or by the token. Those data centers train and serve models. The economic question is whether the models, and the applications built on them, create enough value to justify the capital flowing into the base of the stack.

Agrawal contrasts this with the software businesses that “ate the world.” Traditional software could be built once, distributed to millions, and run at marginal costs close to zero. Many software companies operated at 80% or even 90% gross margins. Generative AI does not inherit that economic structure automatically. Each incremental user of an AI application consumes inference. “Turns out you’ve got to burn those GPUs,” he says. That changes the marginal-cost profile of the application layer, including companies with billions of dollars in revenue that still face profitability questions.

Students offer several explanations for the inversion: AI is still early in the cycle; Nvidia has a dominant position in compute; and cloud infrastructure has had time to translate hardware into high-margin application value in a way AI has not yet done. Agrawal accepts all three as part of the current picture. But he adds that the physics of inference make AI different from cloud software in a deeper way.

The comparison to AWS is meant to discipline expectations. Agrawal says AWS began in 2004, had its first customer in Netflix in 2010, and Amazon shifted fully to AWS in 2012. By that framing, the cloud infrastructure buildout took roughly eight years from the first major capital cycle to a more complete internal and external platform transition. The public debate at the time, he notes, included whether Amazon might go bankrupt under the weight of that buildout. AI is not yet producing that bankruptcy debate, but the capital numbers are large enough that the analogy matters.

The stack has grown fivefold, but Agrawal’s estimated shape barely moved

Agrawal’s updated estimate shows AI revenue growing from about $90 billion in Q1 2024 to about $435 billion in Q1 2026. The growth is “heroic” by his description: applications grew more than 10x, infrastructure expanded, and the total revenue pool increased by roughly $350 billion.

5x: estimated AI revenue growth from Q1 2024 to Q1 2026

The striking point in Agrawal’s analysis is not the growth rate. It is that the estimated distribution of value barely changed. In his estimate, roughly 75% of the incremental revenue added over those two years flowed to semiconductors. Most of the $300 billion semiconductor layer is Nvidia. The application layer, meanwhile, is heavily concentrated: Agrawal says two companies account for about 90% of app revenue, though he does not name them in the excerpted discussion.
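The growth arithmetic behind these figures can be checked in a few lines. The sketch below uses only the totals and the "roughly 75%" share quoted in the text. The annualized growth rate is a derived illustration rather than a number from the slides, and the constant names are hypothetical.

```python
# Figures in billions of US dollars, as quoted in the text.
REVENUE_Q1_2024 = 90
REVENUE_Q1_2026 = 435
SEMIS_SHARE_OF_INCREMENT = 0.75  # "roughly 75%" per the text
YEARS = 2

growth_multiple = REVENUE_Q1_2026 / REVENUE_Q1_2024
increment = REVENUE_Q1_2026 - REVENUE_Q1_2024
# Annualized growth is a derived illustration, not a figure from the slides.
annualized_growth = growth_multiple ** (1 / YEARS) - 1
semis_increment = SEMIS_SHARE_OF_INCREMENT * increment

print(f"Growth multiple: {growth_multiple:.1f}x")
print(f"Incremental revenue: ${increment}B")
print(f"Implied annualized growth: {annualized_growth:.0%}")
print(f"Increment flowing to semiconductors: about ${semis_increment:.0f}B")
```

The multiple works out to about 4.8x, which the article rounds to fivefold, and the increment of $345 billion matches the "roughly $350 billion" in the text. Taking the 75% share at face value, about $259 billion of that increment went to the semiconductor layer.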

The gross-profit picture in his slides is even more concentrated than revenue. A slide based on Altimeter internal estimates shows semiconductors accounting for 87% of AI gross profit in 2024 and 79% in 2026. Applications rise from 3% to 7%, and infrastructure from 10% to 14%. The comparison with cloud is stark: in the cloud stack shown on the slide, applications account for 70% of gross profit, infrastructure for 24%, and semiconductors for 6%.

Stack                              Apps    Infra    Semis
AI gross profit share, 2024        3%      10%      87%
AI gross profit share, 2026        7%      14%      79%
Cloud gross profit share           70%     24%      6%
Agrawal's slide shows estimated AI gross profit remaining concentrated in semiconductors, unlike cloud.

Asked directly where profitability sits, Agrawal answers without qualification: semiconductors. He estimates Nvidia data center gross margins at about 75%, “plus-minus a couple percentage points,” while application-layer gross margins may range from 0% to 30%, depending on the company and the estimate. The spread, in his view, is a direct consequence of market structure: one player “kind of runs the tables” on the semiconductor layer.

That is why the course’s central question is not simply whether AI demand exists. In Agrawal’s presentation, demand is already visible in the scale of revenue growth and consumer usage. The harder question is how long it takes for the distribution of revenue and profit to look more like prior technology supercycles — or whether AI remains structurally different for longer than investors and founders expect.

He says he believes the stack will eventually move toward a cloud-like shape, but “it’s not happening nearly fast enough.” Asked what stable equilibrium might look like, Agrawal says AI is unlikely to be a fad or an unsuccessful technology. But he also says the inverted triangle may persist longer than he initially expected because getting the substrate right is so hard.

The destabilizers are ASICs, capex guidance, and the training-to-inference mix

Agrawal identifies several forces that could reprice the AI stack.

The first is custom silicon. If an ASIC program at a hyperscaler breaks out, he expects that to be the biggest repricing catalyst for the semiconductor layer. He names Google’s TPU, Meta’s MTIA, and efforts at Amazon, OpenAI, and Microsoft, along with other lab programs that may not be publicly visible. In other words, Nvidia’s current concentration is not assumed to be permanent, but the path away from it depends on serious silicon programs actually succeeding.

The second is hyperscaler capital-expenditure guidance. Agrawal recommends listening to hyperscaler earnings calls four times a year because public-company CEOs state their major questions and commitments there. If hyperscalers stop guiding to large capex numbers, he says, that would imply the current equilibrium is not working. This is why capex guidance has become so closely watched in AI: it is not just an accounting line; it is a signal about whether the infrastructure buildout continues.

The third is the shift from training to inference. A student suggests that the triangle flips only if inference becomes meaningfully larger than training. Agrawal calls this a good hypothesis and says one of the most sought-after disclosures in Nvidia earnings is the share of its fleet used for inference. He says the last figure he checked was about 40% inference and 60% training, assuming full utilization.

He expects the inference share to rise over time, but he does not claim to know when or how quickly. Training and inference workloads behave differently. Training is predictable, highly utilized, and concentrated over a defined period. Inference is bursty, often tied to human waking hours, and harder to forecast. Usage falls around Christmas and Thanksgiving “for some reason,” he notes. If agents eventually operate continuously, that may change the utilization profile, but that is posed as a possibility, not a current fact.

Those differences matter because the economics of a data center depend not only on the number of GPUs installed but on how predictably they can be used. A training cluster that runs flat out for a short period is a different asset from infrastructure serving unpredictable consumer and enterprise requests.

Infrastructure is contested because it may become someone else’s feature

Agrawal describes the infrastructure layer as the most competitive and unstable part of the AI ecosystem. It has many startups, many of which are “doing really well” and “winning so far,” but it also attracts the hyperscalers, which want a dominant role in the same layer.

The strategic question he poses for infrastructure companies is blunt: are they features or platforms? Many new infrastructure businesses, in his view, look like good ideas until one asks why the capability would not simply become part of AWS. That does not mean every infrastructure startup is doomed, but it sets the burden of proof. A company in the middle layer must show that it can remain an independent control point rather than an eventual product line inside a hyperscaler.

The competitive structure is different again in chips. Asked where ASIC inference startups can sell if Google, AWS, OpenAI, and others are building their own silicon, Agrawal says there is still $300 billion of revenue to fight over. But he also emphasizes the unusual customer shape. About half of that revenue, he says, is from the big hyperscalers, based on Jensen Huang’s earnings-call disclosures. A chip startup therefore sells into a market with a very small number of extremely large customers and orders. It is not like building consumer software or enterprise SaaS.

For someone starting a chip company, Agrawal says the first consideration should be: which of the five major buyers will you sell to first? The long tail of enterprises may exist, but he would not bank on it, because those enterprises are likely to buy through cloud providers.

The question of vertical integration complicates the whole stack. One student asks whether this cycle could produce a fully vertically integrated winner dominating multiple layers. Agrawal responds by looking backward. Google, the likely winner of the internet supercycle by his account, is already close to that model: it runs from its own server infrastructure through search, ads, and user experience, with roughly $3 trillion in market capitalization and near-99% search share by his estimate. Apple, the winner of mobile in his framing, is another integrated winner at around $2.5 trillion. Meta is dominant in social but less vertically integrated; Agrawal speculates it “maybe lost a trillion” because it did not go all the way down to servers. Cloud, by contrast, is more heterogeneous, with AWS, Google Cloud, and Azure forming an oligopoly.

In AI, Nvidia has tried to move upward. Agrawal points to DGX Cloud as Nvidia’s attempt to build a cloud ecosystem, along with vertical applications. He does not present a settled answer. The student may be “onto something,” he says, but the balance of power remains unresolved.

Consumer AI usage is mainstream, but not yet a core utility

The application layer has a distribution problem as much as a revenue problem. Agrawal shows consumer AI usage from Sensor Tower, including ChatGPT, Gemini, DeepSeek, Character AI, Perplexity, and Claude. He says consumer AI is the largest AI market outside coding, and that ChatGPT usage is extremely high. But he also says about 95% of ChatGPT users are free.

To understand the upside, Agrawal compares AI apps with major consumer franchises. He groups consumer apps into three rough categories. The first is “core utility”: products like YouTube, Chrome, and WhatsApp, which have reached around 3 billion users and become close to mandatory in daily life. The second is social: Instagram, TikTok, and Facebook, at roughly 1.5 billion to 2 billion users, with strong network effects but less necessity. The third is niche: products such as Spotify, Amazon, and Twitter, which people use for specific jobs rather than as universal utilities.

On the combined chart, ChatGPT has just overtaken the niche category. Gemini has not yet done so. Agrawal says ChatGPT appears to be heading toward the social category, though as an OpenAI investor he would prefer to see it move toward core utility.

The unresolved issue is whether knowledge work is universal enough to support that trajectory. ChatGPT is not yet where users message each other, receive email, or get a dopamine feed. It is a place where users do active work by asking questions. Agrawal argues that the number of people willing to ask active questions of technology is not the same as the entire online population.

His rough consumer-economics comparison is direct. Alphabet has about 4 billion users and monetizes them at about $100 per user per year. Meta has about 3.5 billion users and monetizes them at about $70 per user per year. ChatGPT, by contrast, has about 1 billion users and monetizes them at about $10 per user per year.

Company or product    Users         Revenue per user per year
Alphabet              About 4B      About $100
Meta                  About 3.5B    About $70
ChatGPT               About 1B      About $10
Agrawal's consumer-app comparison frames the application-layer monetization gap.
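The comparison can be broken into a reach gap and a monetization gap by multiplying users by revenue per user. This is a back-of-envelope sketch using only the rounded figures in the table. The implied revenue totals are products of those rounded inputs, not reported company results, and the names `PRODUCTS` and `implied_revenue_b` are illustrative.

```python
# Users in billions and revenue per user per year in US dollars, from the table above.
PRODUCTS = {
    "Alphabet": {"users_b": 4.0, "arpu": 100},
    "Meta": {"users_b": 3.5, "arpu": 70},
    "ChatGPT": {"users_b": 1.0, "arpu": 10},
}


def implied_revenue_b(product):
    """Implied annual revenue in billions of dollars: users times revenue per user."""
    return product["users_b"] * product["arpu"]


for name, product in PRODUCTS.items():
    print(f"{name}: about ${implied_revenue_b(product):.0f}B per year")

# Decompose the ChatGPT-to-Alphabet gap into a reach gap and a monetization gap.
user_gap = PRODUCTS["Alphabet"]["users_b"] / PRODUCTS["ChatGPT"]["users_b"]
arpu_gap = PRODUCTS["Alphabet"]["arpu"] / PRODUCTS["ChatGPT"]["arpu"]
print(f"User gap {user_gap:.0f}x, revenue-per-user gap {arpu_gap:.0f}x, "
      f"combined {user_gap * arpu_gap:.0f}x")
```

On these rounded inputs the gap to Alphabet is about 4x in users and 10x in revenue per user, or roughly 40x combined. Most of the distance is therefore in monetization rather than reach.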

That creates two separate problems. First, how does ChatGPT or another leading AI app grow from 1 billion users toward 4 billion? Agrawal is not convinced knowledge work alone can do it. AI may need to move beyond knowledge work to become a core utility. Second, how does monetization rise from $10 per user per year toward $100? Agrawal is not convinced subscriptions can get there.

His answer is ads.

He says he suspects AI applications will have to move into advertising, and that ads in ChatGPT or Claude could command strong pricing because the systems may understand intent, operate with logged-in users, provide attribution, and benefit from trust. He calls this likely to become a major headline in the year ahead.

The objection is obvious, and he states it himself: AI conversations can be personal, and users may not want interruptions from advertising. But he compares the skepticism to Facebook’s IPO era, when critics argued mobile ads would not work because phone screens had no room. “Shocker. We found the space on a phone,” he says. He does not know what the AI ad format will look like, but he is optimistic that the industry will find it.

The course treats AI as a stack, not a single market

Agrawal’s seminar is organized around the premise that AI is the largest technology supercycle since PC, internet, mobile, and cloud, but that its economics must be studied layer by layer. The schedule moves from chips and the GPU economy to energy and gigawatt-scale AI factories, enterprise infrastructure, frontier labs, internal knowledge models, agent monetization, coding AI, and life sciences applications.

That structure reflects his central claim: founders, investors, and operators need different questions for different parts of the stack. For semiconductor speakers, he wants to ask how long dominance can last, which ASICs matter, and where pricing compression could come from. For OpenAI and Anthropic, he wants to ask about profitability, serving large user bases, whether current users are profitable, and whether ads can become larger than subscriptions. For inference and infrastructure companies, he wants to examine whether they are building independent platforms or features that hyperscalers will absorb.

Agrawal tells students that the purpose is to develop mental models for judging AI businesses: where a company sits in the cycle, what laws of physics govern its economics, what questions to ask before starting, funding, or joining it, and what not to do. He expects many students either to start AI companies or fund them. His practical warning is that they should at least know where the Series A money will go.
