
Cerebras’s Higher IPO Range Tests AI Infrastructure Demand

Alex Wilhelm and Jason Calacanis treat Cerebras’s raised IPO range as a test of how much public investors will pay for future AI inference demand and the quality of contracts with customers such as OpenAI. Ori Goshen makes a parallel case that enterprise AI’s hard problem is no longer choosing one model, but routing work across models, tools, and inference strategies for cost, latency, and accuracy. Across OpenAI’s deployment spinout, AI21’s orchestration pitch, Magrathea Metals’ brine-based magnesium plan, and OpenClaw’s fading momentum, the article frames deployment as a question of incentives, constraints, and where the bottleneck actually sits.

Cerebras’s IPO pricing is a bet on inference demand and contract quality

Cerebras’s revised IPO range turned the AI infrastructure boom into a more precise question: are investors buying a chip company’s current revenue, or buying the possibility that inference demand and AI-lab contracts will make today’s revenue multiples look temporary?

Alex Wilhelm said Cerebras raised its IPO price range from $115–$125 per share to $150–$160 per share. At the top of that range, he said, the company could be valued at as much as $34.4 billion on a simple basis, or $48.8 billion fully diluted. He called the repricing bullish for the IPO market, but expensive.

The revenue math is the hard part. Wilhelm said Cerebras reported $171.4 million in revenue in the fourth quarter of the prior year, the most recent figure he had. Annualized, that is a $686 million run rate. At a $34.4 billion valuation, he calculated roughly 50 times run-rate revenue. On a fully diluted basis, he said the multiple could reach about 71 times.
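
For readers checking the math, a quick sketch reproduces the multiples from the figures discussed; the rounding matches the hosts’ “roughly 50 times” and “about 71 times”:

```python
# Reproduce the run-rate multiples discussed on the show.
q4_revenue = 171.4e6               # Q4 revenue, dollars
run_rate = q4_revenue * 4          # simple annualization: ~$685.6M

simple_valuation = 34.4e9          # valuation at $160/share, simple basis
diluted_valuation = 48.8e9         # valuation at $160/share, fully diluted

print(f"run rate: ${run_rate / 1e6:.1f}M")                       # 685.6M
print(f"simple multiple: {simple_valuation / run_rate:.1f}x")    # ~50.2x
print(f"diluted multiple: {diluted_valuation / run_rate:.1f}x")  # ~71.2x
```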

| Metric | Figure discussed |
| --- | --- |
| Revised IPO price range | $150–$160 per share |
| Simple valuation at $160 | $34.4 billion |
| Fully diluted valuation at $160 | $48.8 billion |
| Q4 revenue | $171.4 million |
| Annualized run rate | $686 million |
| Simple valuation multiple | About 50x run-rate revenue |
| Fully diluted multiple | About 71x run-rate revenue |

The Cerebras IPO discussion turned on how much future growth is implied by the revised price range.

Jason Calacanis emphasized that these were revenue multiples, not earnings multiples. Investors were not paying 50 times earnings; they were paying 50 times top-line run-rate revenue. In Calacanis’s view, a multiple that high implies a company must be growing extraordinarily fast. The discussion did not establish a current year-over-year growth rate.

Wilhelm’s defense of the pricing was not that Cerebras’s current revenue alone justifies the valuation. It was that Cerebras sits in the path of a major compute shift. He pointed to a 750 megawatt deal with OpenAI in January, a deal with Mistral in February, a March collaboration with AWS to put Cerebras chips inside AWS data centers, and work with Cognition. His argument was that AI companies increasingly need fast inference chips, not only GPUs for training. If agentic AI keeps driving compute demand, especially inference demand, Cerebras becomes a levered bet on that demand.

Wilhelm still did not present the price as easy. He said he understood the logic, but would not “bet the farm” at that valuation. His line was that the IPO was “not just insanity. It’s insanity with a twist.”

Calacanis treated the OpenAI contract as the key uncertainty. The headline number, he said, is around $10 billion over several years, but the contract structure determines what it is actually worth. Is the deal guaranteed? Is it optional? Can it be deferred? He compared the issue to Nvidia’s arrangement in which Nvidia had an option to invest in OpenAI, stressing that optional commitments are materially different from contracted revenue.

The broader risk is a chain of commitments across the AI infrastructure market. Calacanis referenced Brad Gerstner’s challenge to Sam Altman about how a company with $20 billion in revenue could support hundreds of billions of dollars in buildout. He also referenced Dario Amodei’s warning that companies must be careful: “if the revenue doesn’t show up, you’re out of business and you go bankrupt.”

That is the contract-quality problem underneath the IPO enthusiasm. If an AI lab cannot fund its commitments, does Cerebras, Oracle, or another infrastructure provider sue? Do the parties push out the contract? Does the customer have an option to turn a three-year purchase into a nine-year commitment? Calacanis called it “one of the highest stakes games” he has seen.

Wilhelm added a market-memory concern: investors once paid roughly 70 times revenue for SaaS companies, and that ended badly. He worried the market may be too enthusiastic again. Calacanis agreed that the next two years will show whether the AI infrastructure buildout is ahead of demand or still insufficient. At the moment, he said, it is starting to feel like not enough compute, pointing to cloud and neo-cloud demand that suppliers are struggling to meet.

Benchmark was named as one of Cerebras’s major shareholders, giving the IPO another implication: whether late-stage venture capital can still produce major returns. Calacanis listed SpaceX, Anthropic, OpenAI, Cerebras, and others as companies that could send strong returns back to limited partners and keep venture moving.

OpenAI’s deployment company raises the question of where value accrues

OpenAI’s enterprise deployment push drew more skepticism than Cerebras. Wilhelm described OpenAI and Anthropic as partnering with investment firms and private equity giants to push AI into portfolio companies. OpenAI’s effort, he said, has been named the OpenAI Deployment Company, a name Calacanis mocked as bland before turning to the structure itself.

According to Calacanis, the company is majority owned by OpenAI. He said OpenAI has raised money from TPG, Warburg Pincus, Bain, and “a bunch of other people,” and that Bain, Capgemini, and McKinsey are working with them to use the company to push OpenAI’s products into the market. The discussion also identified Tomoro, “tomorrow without a W,” as a company OpenAI bought in connection with the deployment effort.

Wilhelm’s interpretation was straightforward: OpenAI wants enterprise deployment without turning the core company into a services organization. Forward-deployed engineers are needed to identify what can be automated and where AI fits inside customer workflows. But OpenAI may not want that cost structure on its own P&L, particularly if it is preparing for public-market scrutiny. So, in Wilhelm’s framing, the separate vehicle can take private-equity demand, send engineers into portfolio companies, route inference calls back to OpenAI, and create value for multiple parties, except perhaps the employees whose work gets automated.

Calacanis objected to the spinout structure. He compared it to dot-com-era separations such as Barnes & Noble and barnesandnoble.com or Toys R Us and toysrus.com, where internet subsidiaries confused where value accrued. His questions were governance and incentives: Will this company go public? Will it distribute dividends? Does value accrue to the deployment company’s shareholders or to OpenAI’s model business? Who does its board answer to?

The sharpest operational critique was model neutrality. A true enterprise AI advisor may need to recommend Claude, Kimi, DeepSeek, or another model if it is better for a customer’s use case. Calacanis asked what happens if the best advice is that OpenAI is not the right software for a particular application, or that a customer should not feed its data into OpenAI. A deployment company tied to OpenAI may have incentives that conflict with that advice.

Wilhelm noted that Anthropic has a similar effort, with $1.5 billion compared with OpenAI’s $4 billion push. Calacanis said his objection applies there too. He did not understand why these businesses are being built as spinouts rather than as divisions or channel strategies. Oracle and IBM have services divisions inside the company, he said. Microsoft has channel partners that implement products and generate services revenue around licenses. In his view, those are the standard approaches: build services internally and explain the lower-margin revenue, or enable an ecosystem of implementation partners.

Wilhelm offered the strongest counterargument. Private equity firms may urgently need help improving portfolio-company efficiency, especially in software businesses that burn cash while chasing growth. If there is strong pull from private equity to “please come fix our software companies,” a separate CEO and dedicated deployment company might move faster than a division inside OpenAI.

Calacanis still saw the structure as suboptimal. When Wilhelm looked up IBM’s trailing price-to-sales ratio and found 3.2 times, Calacanis said that illustrated the incentive to keep services revenue away from a high-multiple AI model company. But he treated that as financial engineering rather than a clean business-design reason.

Enterprise AI is becoming a routing problem, not just a model problem

Ori Goshen described AI21’s current enterprise strategy as a bet on orchestration: enterprises will not standardize on a single model, and the hard problem becomes deciding which model, tool, prompt, or inference strategy should handle each task at the right cost, latency, and accuracy.

AI21’s Maestro platform is built around what Goshen called a “meta-model.” It is not another general-purpose model meant to answer user queries directly. It is a model trained to learn the behavior and patterns of other models: their cost, latency, and accuracy characteristics. The goal is to predict which call is likely to succeed, how expensive it will be, and how long it will take. From there, Maestro can route a task to a frontier model, an open-weight model, or a combination of models and tools.

Goshen emphasized that this is more than a conventional model router. Maestro can call a model multiple times and select the best answer, call several models in parallel, or use different retrieval and prompting techniques against the same model. In his framing, the system executes “different inference strategies at scale,” rather than simply choosing model A or model B.
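
Maestro itself is proprietary, but the routing idea Goshen described can be sketched. Everything below, the `Strategy` fields, the cost and accuracy estimates, and the budget-based selection rule, is a hypothetical illustration rather than AI21’s actual interface:

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str            # e.g. "frontier model", "open-weight + best-of-3"
    est_cost: float      # predicted dollars per call
    est_latency: float   # predicted seconds per call
    est_accuracy: float  # predicted probability the call succeeds

def route(strategies: list[Strategy],
          max_cost: float, max_latency: float) -> Strategy:
    """Pick the highest predicted accuracy among strategies that fit
    the task's cost and latency budget."""
    feasible = [s for s in strategies
                if s.est_cost <= max_cost and s.est_latency <= max_latency]
    if not feasible:
        # Nothing fits the budget: fall back to the cheapest option.
        return min(strategies, key=lambda s: s.est_cost)
    return max(feasible, key=lambda s: s.est_accuracy)
```

In Goshen’s framing, the interesting part is that the `est_*` values are not static: a trained meta-model predicts them per task from observed behavior, and the output can be a full plan (parallel calls, retries, best-of-n) rather than a single choice.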

That distinction matters because, according to Goshen, enterprise AI usage is entering a phase where token consumption is no longer treated as an unbounded experiment. He said companies have been “token-maxing like crazy,” especially with agentic coding and other agentic workflows, but token bills are now large enough that customers are asking whether the return on investment is there. The question has shifted from whether tokens can produce useful outputs to whether they can produce them efficiently.

AI21 still builds its own models. Goshen discussed Jamba, the company’s open-weight model family, which includes a 400 billion parameter large model and a smaller 13 billion parameter mixture-of-experts model. He did not call the 400 billion parameter version a small language model, but said the smaller Jamba model probably fits that category. The architecture combines transformer-based attention with Mamba, which Goshen described as efficient at processing long sequences. Jamba was built with long-context processing in mind.
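
As a rough picture of the hybrid design, a schematic stack interleaves the two layer types; the counts and the one-attention-layer-in-four ratio here are illustrative placeholders, not Jamba’s published configuration:

```python
def build_hybrid_stack(n_layers: int = 16, attn_every: int = 4) -> list[str]:
    """Schematic hybrid stack: mostly Mamba state-space layers, which
    process long sequences cheaply, with periodic attention layers for
    precise token-to-token recall. Ratios are illustrative only."""
    stack = []
    for i in range(n_layers):
        stack.append("attention" if (i + 1) % attn_every == 0 else "mamba")
        stack.append("moe_ffn")  # mixture-of-experts feed-forward block
    return stack
```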

The open-weight strategy is separate from the company’s proprietary value capture. Calacanis pressed Goshen on whether Jamba could be taken and used without AI21, in the same way companies can run other open-weight models on their own. Goshen said yes: AI21 pre-trained and post-trained Jamba and shared it with the community. Some companies use it in private or on-premises deployments where memory and compute constraints matter. But Maestro, the orchestration system and its internal meta-model, remains proprietary.

Goshen’s clearest strategic claim was that “there is no one model to rule them all.” Enterprise builders, he said, currently begin by picking a model, optimizing around it, and then trying to adapt as better models are released. He called that process painful. Maestro is meant to automate the search across models and inference strategies, so that when Anthropic, Mistral, OpenAI, or another provider releases a new model, the orchestration layer can learn how to use it and incorporate it into the enterprise’s existing workflows.

The pricing model is still being tested. Goshen said Maestro launched late last year and AI21 is experimenting, but the structure is a combination of fixed license fees and consumption-based pricing. He named FNAC, a large European retailer, as one customer using the system for mission-critical workflows, and said AI21 also works with some large U.S. technology companies and Israeli companies.

The company’s path to this point began before ChatGPT. Goshen said AI21 released Wordtune in 2020, when Grammarly was the dominant writing-assistant reference point. At the time, AI21 built its own 178 billion parameter foundation model, Jurassic-1, and applied it to reading and writing assistance. Wordtune reached tens of millions of users and meaningful revenue. ChatGPT changed the modality: users began interacting with AI through chat rather than plugins that checked or rewrote text. AI21’s current enterprise focus is presented as an evolution from that earlier application layer into orchestration.

The demo showed why ensembles can beat single-model optimization

Goshen’s demonstration centered on a practical enterprise tradeoff: cost versus quality, and latency versus quality. He showed charts from AI21’s Maestro materials in which an enterprise can place its available models into a portfolio, including frontier models and open-weight models, and then let the system search possible operating points.

On the cost-quality chart, grey boxes represented baselines such as GPT-5, GPT-5 mini, and GPT-5 nano. The red curve represented combinations of models and strategies. Goshen said the red curve created a “new Pareto frontier”: some queries go to larger models, others to smaller models, and the aggregate can reach higher success rates at lower average cost than relying on a single more expensive model.

Wilhelm interpreted the top of the curve as showing that model combinations can “essentially get to perfect completion at a lower cost than some individual models that cost more.” Goshen agreed and said the same logic applies to latency. The relevant action space is broad: number of models, model choice, retrieval method, prompting approach, and activation pattern. His claim was that automated exploration can surface operational points that a human builder would be unlikely to find manually.
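
The “new Pareto frontier” claim reduces to set arithmetic over operating points. Each model-and-strategy combination yields a (cost, success-rate) pair, and the frontier is the subset no other point beats on both axes. The numbers below are invented for illustration, not AI21 benchmark data:

```python
# (avg cost per query in dollars, success rate); values are made up.
points = {
    "big model only":    (0.020, 0.92),
    "small model only":  (0.002, 0.71),
    "route small->big":  (0.008, 0.90),
    "best-of-3 small":   (0.006, 0.88),
    "parallel ensemble": (0.012, 0.95),
}

def pareto_frontier(pts: dict[str, tuple[float, float]]) -> list[str]:
    """Keep points that nothing else beats on both cost and quality."""
    frontier = []
    for name, (cost, acc) in pts.items():
        dominated = any(c <= cost and a >= acc and (c, a) != (cost, acc)
                        for c, a in pts.values())
        if not dominated:
            frontier.append(name)
    return frontier

print(pareto_frontier(points))  # "big model only" drops out
```

In this toy data the expensive single model is dominated by a cheaper ensemble, which is exactly the shape of the chart Goshen showed.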

When Wilhelm asked whether the savings are closer to 10% or materially larger, Goshen said it depends on the task, but “definitely can get to 50%.” He pointed to benchmark results showing significant savings.

The more interesting part of the demo was the overlap analysis. Goshen showed a Venn diagram comparing three model-and-tool variants: MiniMax with late interaction, GPT-4 with late interaction, and GPT-4 with dense retrieval. The diagram showed that different variants solved different subsets of problems, with some queries solvable only by one setup. One visible label indicated that 7.2% of the problems could be solved uniquely by one GPT-4 variant. The figure’s caption said the analysis was performed on the BrowseComp-Plus dataset and showed how different model and tool variants complement each other. It also referenced the broader idea that ensemble size and accuracy can scale together even when individual agents are smaller and cheaper.
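
That Venn diagram is also set arithmetic: record which queries each variant solves, then measure unique contributions and combined coverage. The solve sets below are made up; the real analysis used BrowseComp-Plus results:

```python
# Hypothetical solved-query IDs per variant (stand-ins for real results).
solved = {
    "minimax + late interaction": {1, 2, 3, 5, 8},
    "gpt4 + late interaction":    {2, 3, 4, 5, 9},
    "gpt4 + dense retrieval":     {3, 5, 6, 7, 9},
}

union = set().union(*solved.values())
for name, s in solved.items():
    others = set().union(*(v for k, v in solved.items() if k != name))
    print(f"{name}: solves {len(s)}, uniquely {len(s - others)}")

best_single = max(len(s) for s in solved.values())
print(f"best single variant: {best_single} solved; portfolio: {len(union)}")
```

The uniquely-solved counts are the slivers in the Venn diagram, and the gap between the best single variant and the union is the coverage argument for running a portfolio.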

Goshen’s conclusion was that a portfolio can improve both cost efficiency and coverage. If models have partially overlapping capabilities, the enterprise gains by learning which model or technique covers which query type. That is why he argued the approach works best in enterprise settings: companies have specific workflows, specific evaluation criteria, and measurable success, cost, and latency. Where the evaluation is precise, the orchestration layer has something concrete to optimize.

Magrathea wants to make magnesium a domestic industrial input again

Alex Grant framed magnesium as a critical but under-discussed input for U.S. industrial capacity. It is in every aluminum alloy, he said, and therefore necessary for cars, planes, helicopters, construction materials, and other aluminum-based products. It is also used to make titanium for aerospace and in the production of steel, hafnium, zirconium, beryllium, boron, and other materials tied to defense, aerospace, and national security applications.

Grant’s supply-chain claim was stark: the United States has zero magnesium production, while China controls 95% of global supply. Magrathea Metals is trying to change that by producing magnesium from seawater or brines using a molten-salt electrolyzer process. Grant showed video from the company’s Oakland facility, which he described as the largest U.S. pilot for magnesium metal in two generations. The video showed workers in protective gear pulling metal from molten salt. Grant said it was “reasonable scale,” not bench-scale lab work, and that the company is now scaling toward commercial production.

The process is not built on the novelty of using seawater. Grant said the largest magnesium smelter ever built was constructed in Freeport, Texas, during World War II, using seawater as a feedstock. Magnesium was precipitated from seawater, converted into a magnesium chloride salt concentrate, dried, and then electrolyzed into metal. Similar processes were later built in Utah, Israel, Europe, and Canada, with different implementations each time.

Magrathea’s claim is that it identified why those historical processes were capital-intensive and environmentally problematic, and solved the pieces that would make them difficult to build in the West today. Grant singled out drying the salt as the hardest part because magnesium salt holds onto water. To electrolyze it into metal, the salt must be dry.
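
The underlying chemistry is textbook molten-salt electrolysis; in simplified form, dried magnesium chloride splits into metal and chlorine gas:

$$\mathrm{MgCl_2\,(molten)} \xrightarrow{\text{electrolysis}} \mathrm{Mg\,(l)} + \mathrm{Cl_2\,(g)}$$

The drying step is hard because the hydrated salt does not dehydrate cleanly: heated carelessly, magnesium chloride hexahydrate tends to hydrolyze into magnesium oxide and hydrogen chloride rather than dry salt, and the oxide is unusable in a chloride electrolyzer:

$$\mathrm{MgCl_2 \cdot 6H_2O} \xrightarrow{\Delta} \mathrm{MgO} + 2\,\mathrm{HCl} + 5\,\mathrm{H_2O}$$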

The first commercial project will not use ocean seawater. Magrathea has formed a joint venture with Tetra, a publicly traded industrial minerals company serving the oil and gas industry, to build a smelter in Arkansas. The feedstock there is a high-grade brine 10,000 feet underground. Grant said this brine is far below freshwater aquifers and is already used today to produce bromine, calcium chloride, magnesium hydroxide, and other minerals.

Calacanis pressed on environmental byproducts, especially what happens to the water after mineral extraction. Grant said there is “no issue”: the brine comes up, minerals are extracted, and the brine goes back underground with virtually nothing added. He said this is already done daily in Arkansas by companies making bromine and other minerals, and characterized the process as environmentally benign and well understood.

The contrast with current production is central to Magrathea’s pitch. Grant described China’s dominant Pidgeon process as “a Rube Goldberg device of coal,” saying it relies on ties between magnesium production and coal processing, free waste heat, and CCP subsidies. That, in his telling, is what Western producers have been trying to compete against for 30 years.

Magrathea’s intellectual property came from a two-year review of 110 years of magnesium smelting technologies. Grant said the company assembled engineers who could understand the physical chemistry from first principles, identified white space, and built an IP portfolio with 11 patent applications and “a huge mountain of trade secrets.”

The economics are what make the company venture-backable in Grant’s telling. Magnesium sells for roughly $7,000 per ton in the U.S., he said, while Magrathea expects to produce it in Arkansas for around $3,000 per ton. The U.S. market is about 100,000 tons per year. The defense industrial base need is about 10,000 tons per year, which is also the planned size of Magrathea’s first commercial smelter. The first plant is therefore aimed at removing foreign-dependence risk for U.S. defense needs before addressing the broader domestic market.

$3,000/ton: Magrathea’s expected Arkansas production cost, versus about $7,000/ton U.S. market price.
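
At the figures discussed, the unit economics of the first plant are simple arithmetic. This assumes the quoted price and per-ton cost hold at full output, and it ignores capex and any operating costs beyond the quoted per-ton figure:

```python
price = 7_000            # $/ton, approximate U.S. market price
cost = 3_000             # $/ton, Magrathea's expected Arkansas cost
plant_capacity = 10_000  # tons/year, planned first smelter
us_market = 100_000      # tons/year, approximate U.S. demand

margin_per_ton = price - cost                   # $4,000/ton
gross_margin = margin_per_ton * plant_capacity  # $40M/year
market_share = plant_capacity / us_market       # 10% of U.S. demand

print(f"${gross_margin / 1e6:.0f}M/yr gross at {market_share:.0%} of market")
```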

Calacanis treated the company as part of a larger American industrial-independence trend. Wilhelm compared it favorably to startups that “cosplay industry,” saying Magrathea appeared to be doing the “in the weeds, nitty-gritty” work rather than putting a thin industrial veneer over a software business.

OpenClaw may have proved the category before losing attention

Wilhelm raised the decline of OpenClaw through two pieces of trend evidence shown on screen. The first was a Google Trends chart showing search interest peaking in mid-March and falling steadily afterward. The second was a 30-day usage chart from an unnamed API model router, with branding covered, also showing a consistent downward trend.

Wilhelm’s question was whether OpenClaw was directionally correct but too early to sustain its hype. Calacanis’s answer was that the OpenAI acquisition “took the wind out of the sails.” The excitement around Peter and a potential revolution changed when “the Empire bought it,” as Calacanis put it. At the same time, competing tools arrived quickly: Perplexity’s computer-use product, Claude Cowork improving, Grok’s desktop tool, and people writing their own agents.

Inside Calacanis’s own organization, the OpenClaw work lost some momentum because the top two people working on it were redeployed to vibe-code internal software projects. Other team members became interested in Perplexity Computer and Cowork because the interfaces were better. Calacanis’s conclusion was not that OpenClaw is finished, but that users are promiscuous with tools. When five tools arrive in 90 days, people test them, and a bake-off follows.

Wilhelm argued that OpenClaw may have already “won” in a different sense. It signaled where personal computer-use agents are going, even if it is not the final technology. He compared it to the first airplane: evidence that flight is possible, not the endpoint of aviation. In his view, OpenClaw produced moments of magic amid API failures and crashes, and those moments helped establish the category.

Calacanis cautioned against overreading Google Trends. Search declines can reflect familiarity, not usage decline. People may stop searching for a product once they know it. He used Airbnb, Uber, Lyft, and Amazon Prime as examples of products whose search trends do not cleanly map to usage. Google Trends is more a proxy for novelty and lagging awareness than retention. Still, Wilhelm’s API-router usage chart suggested the decline was not only search behavior.

The competitive issue is usability. Calacanis compared OpenClaw to Linux desktop efforts such as Lindows: powerful, but too hard for mainstream users to set up. The winning product, in his view, needs to be an OpenClaw client that a user can download, install, and have working immediately.

The broader category remains alive. Calacanis expects every major platform to have a desktop agent: Google, Microsoft, xAI, and others. Grok has connectors for services such as Notion, Outlook, Google Workspace, and Gmail. These integrations matter because users need agents to answer questions like what is on the calendar, generate to-do lists, and work across apps. Wilhelm also noted the possibility that OpenAI will bake OpenClaw-like capabilities into Codex, creating a different bundling strategy than Anthropic’s separation between Claude, Cowork, and Claude Code.

The sidebar bounty was narrowed to real-time fact checking

Calacanis and Wilhelm also revisited a product challenge that began as an offhand bounty: build a real-time AI sidebar for live shows. The original ambition was broad. The system would listen to a podcast or live show, watch a transcript stream in real time, and provide persona-specific outputs such as fact checks, jokes, roasts, deep research, or suggested follow-up questions.

Calacanis compared the idea to live radio production, where Howard Stern had people feeding him jokes during a show. The AI version would provide a similar live assistive layer, but for facts, research, and show production.

Wilhelm showed Glass Sidebar by Oliver Choy, which he built from a GitHub link by asking Codex to assemble the app for him. The demo captured Wilhelm’s live microphone audio, transcribed it, and generated real-time cards. The interface included modes such as “AI writer’s room,” “Fact-checker,” and “Comedy Writer.” When Wilhelm mentioned Zoom’s camera problems, the comedy writer produced the line, “If Zoom had a mood today, it’d be camera shy.”

Calacanis then narrowed the product specification. For the final sprint, he wanted to remove the jokes and other personas and judge only real-time fact checking. His reasoning was practical: the broader set of personalities made it harder for builders to refine a working production product. A fact-checking-only scope would be easier to judge and closer to what the show could actually use.
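
The narrowed spec maps onto a small streaming pipeline: accumulate transcript text, cut it into sentences, filter for checkable claims, and emit a card per claim. The sketch below is a hypothetical skeleton; `transcribe_chunks` and `llm_fact_check` are placeholder stand-ins for whatever speech-to-text and model APIs a builder picks, with canned behavior so the sketch runs:

```python
from typing import Iterator

def transcribe_chunks() -> Iterator[str]:
    """Placeholder: in a real build, yield live speech-to-text output."""
    yield "Cerebras raised its range to 160 dollars a share."
    yield "Annualized, that is a 686 million dollar run rate."

def llm_fact_check(claim: str) -> dict:
    """Placeholder for a real model call; returns a dummy card."""
    return {"claim": claim, "verdict": "unverified", "evidence": []}

def run_sidebar() -> None:
    buffer = ""
    for chunk in transcribe_chunks():
        buffer += " " + chunk
        # Emit a card as soon as a complete sentence arrives.
        while "." in buffer:
            sentence, buffer = buffer.split(".", 1)
            sentence = sentence.strip()
            if any(ch.isdigit() for ch in sentence):  # crude claim filter
                print("[fact-check]", llm_fact_check(sentence))

run_sidebar()
```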

TikTok’s ad-free tier is both a product and a regulatory response

TikTok’s new ad-free subscription in the UK was discussed as a consumer product and a privacy maneuver. The reported price was £3.99 per month. Calacanis viewed it favorably, comparing it to YouTube Premium and saying ad-free products can remove some privacy pressure because users who do not want targeted advertising can pay instead.

Wilhelm said the price is “incredibly cheap” and that he would pay it, but also said it feels like being charged to make the platform stop hitting him with ads. Calacanis said he could never use YouTube with ads because the ad load is too high, and he sees the pain of the free tier as partly intentional. Companies make non-premium experiences unpleasant enough that some users upgrade. He compared the tactic to airlines making coach progressively more painful through bag fees, paid drinks, and other nickel-and-diming.

The regulatory angle was Calacanis’s main point. If users complain about privacy, TikTok can tell UK and EU regulators that consumers have choice: use a free service with tracking and targeted ads, or pay for an ad-free version where the service “will know nothing about them.” In his view, ad-free subscriptions help platforms get out of regulatory trouble.

Wilhelm raised Meta’s paid ad-free option in the EU, noting that regulators found Instagram’s paid ad-free option breached rules. Calacanis said these outcomes show how regulation can force companies into unnatural product structures to appease authorities. He expects most major products to offer ad-free, privacy-preserving versions for regulators and for the top few percent of users willing to pay.

The off-duty thread turned into a debate about collapse, federalism, and housing

Wilhelm recommended There Is No Antimemetics Division, a novel by qntm that began as part of the SCP Foundation, a collaborative speculative-fiction wiki. The SCP Foundation is framed as a clandestine organization that secures and protects humanity from anomalies. The book focuses on antimemes: ideas or phenomena that consume memory and information, making them hard to perceive, track, or remember. Wilhelm said the premise becomes a story about memory, collaboration, technology, and near-future science fiction.

He also recommended the Fall of Civilizations podcast, which he described as long-form history about the rise and collapse of societies. Episodes can run three hours, with quotes, stories, and narration by one creator. Wilhelm said the historical record often shows civilizations failing when technology no longer fits a changed world, with drought frequently playing a central role.

That led to a discussion of whether the United States is in decline. Calacanis said he does not buy the idea that the American empire is ending, though he accepts that the country has serious problems. Wilhelm predicted a difficult decade because the federal government has to figure out how to spend less money.

Calacanis then argued for state-level experimentation on problems he sees as federally nonfunctional: education certification, universal healthcare, poker regulation, cannabis, psychedelics, and housing. He acknowledged abortion as controversial and said he did not want Roe v. Wade changed, but argued that after the issue moved to states, state-level politics and workarounds began processing the dispute in real time.

Wilhelm pushed on the limits of states’ rights, using anti-discrimination law and school segregation as an example. Calacanis drew the line at basic human rights and constitutional protections, saying that was an area where federal intervention is appropriate.

Housing became the practical example. Calacanis said Texas, Florida, and Nevada are “killing it” on housing, while New York cannot build enough units and relies on rent freezes. In his view, rent freezes undermine construction because developers will not build if returns are capped. Wilhelm agreed and criticized rent control from personal experience living in a rent-controlled building in San Francisco, saying maintenance suffered because landlords had little incentive to invest.

For Calacanis, Austin and Dallas were evidence that housing supply can absorb population growth. He said rents have gone down for three years in a row in Austin, and that Dallas added roughly 100,000 to 200,000 people while housing prices stayed the same. The practical case was simple: build more units, let supply and demand work, and use state and local systems to test what actually fixes constrained markets.
