Cerebras IPO Puts a Public Price on Fast AI Inference
Jordi Hays
John Coogan
Tyler Cosgrove
Ben Hylak
Andrew Feldman
Amy Reinhard
Doug O'Laughlin
Steve Vassallo
Eric Vishria
TBPN
Thursday, May 14, 2026
33 min read
TBPN’s John Coogan and Jordi Hays use Cerebras’s first day as a public company to frame a narrower AI hardware argument: the market is beginning to price low-latency inference as a product in its own right. Cerebras founder Andrew Feldman argues that fast inference will eventually consume demand for slow AI responses, while SemiAnalysis’s Doug O’Laughlin cautions that the company’s wafer-scale SRAM architecture may be limited by memory scaling and model size. The result is a public-market test of whether owning a valuable slice of the AI compute stack is enough.

Cerebras is being valued on speed, not just chips
Cerebras’s public debut turned a long-running AI hardware argument into a market price. By the time John Coogan and Jordi Hays opened the Cerebras discussion, Hays said the company was sitting around a $64 billion market cap. Coogan said he had written only days earlier about a possible $50 billion IPO “being optimistic,” and that the listing had beaten those expectations.
The first-day repricing was abrupt. Coogan said the IPO range had moved from $115–$125 to $150–$160 before Cerebras priced at $185 per share. On its first day of public trading, the stock opened at $350, traded there briefly, and then moved down toward roughly $300–$320 while remaining far above the IPO price. A market chart shown on screen listed Cerebras at $307, up $122, or 65.95%, with a previous close of $185, an open and high of $350, and a low of $300.
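The chart figures are internally consistent, which a few lines of arithmetic confirm. The sketch below uses only the prices quoted above; `pct_gain` is an illustrative helper, not something shown on the broadcast.

```python
# Prices quoted in the discussion, in dollars per share.
IPO_PRICE = 185.0
FIRST_DAY_OPEN = 350.0
CHART_PRICE = 307.0


def pct_gain(price: float, base: float) -> float:
    """Percentage change of `price` relative to `base`."""
    return (price - base) / base * 100.0


chart_change = CHART_PRICE - IPO_PRICE
chart_gain = pct_gain(CHART_PRICE, IPO_PRICE)
open_gain = pct_gain(FIRST_DAY_OPEN, IPO_PRICE)

print(f"Chart level: +${chart_change:.0f}, {chart_gain:.2f}% over the IPO price")
print(f"Open: {open_gain:.2f}% over the IPO price")
```

The first line reproduces the $122 and 65.95% shown on the chart. The $350 open works out to roughly 89% above the IPO price.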
Coogan called Cerebras a “true overnight success” in the startup sense: a company that spent roughly a decade moving through technical skepticism, financing cycles, and product-market uncertainty before landing in the right part of the AI cycle. Cerebras, he explained, is built around a simple but radical chip architecture. Instead of manufacturing many smaller chips on a wafer and cutting them apart, Cerebras uses the whole wafer as the chip.
The early objections were also simple. If a standard wafer is divided into many chips, a defect on one area can mean discarding only the affected die. If the entire wafer is one chip, a defect anywhere could theoretically ruin the product. Coogan said early skepticism focused on whether wafer-scale yield could work at all. He credited Andrew Feldman and the team with addressing that through redundant cores and by not activating all cores, creating fault tolerance on the wafer.
The public-market argument was not just that the chips work. It was that AI customers are revealing a willingness to pay for latency reduction. Coogan leaned on SemiAnalysis’s work to argue that businesses are paying disproportionately for speed. SemiAnalysis, he said, was on a run rate of roughly $10 million in annual AI spend as of April, with 80% of that spend going to Anthropic’s Opus 4.6 Fast mode. That mode charged six times the price for roughly two and a half times the interactivity at first, and later for less than two times the speed. In Coogan’s phrasing, customers were effectively paying six times the price for about two times the speed.
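The premium is easier to see as price per unit of speed. The sketch below uses only the multiples and spend figures Coogan quoted. Treating "interactivity" and "speed" as a single speedup factor is a simplifying assumption, and `price_per_unit_speed` is a hypothetical helper.

```python
def price_per_unit_speed(price_multiple: float, speed_multiple: float) -> float:
    """How many times more a customer pays per unit of speed than at the base tier."""
    return price_multiple / speed_multiple


ANNUAL_AI_SPEND = 10_000_000  # roughly $10M run rate, as cited
FAST_MODE_SHARE = 0.80        # 80% of spend on the fast mode, as cited

fast_mode_spend = ANNUAL_AI_SPEND * FAST_MODE_SHARE

at_first = price_per_unit_speed(6.0, 2.5)  # 6x price for about 2.5x interactivity
later = price_per_unit_speed(6.0, 2.0)     # 6x price for about 2x speed

print(f"Fast-mode spend: ${fast_mode_spend:,.0f} per year")
print(f"Price per unit of speed: {at_first:.1f}x at first, {later:.1f}x later")
```

On these figures the buyer pays between two and three times as much per unit of speed as on the standard tier, which is the sense in which the spend is disproportionate.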
That matters because Cerebras is positioned around fast inference. Hays noted that Cerebras chips were already usable through Codex 5.3 Spark. Coogan said OpenAI was “clearly very pilled on Cerebras,” pointing to a 750 megawatt deal and to Cerebras chips serving GPT-5.3 and Codex under the Spark name.
The user experience changes when output arrives immediately rather than token by token. Coogan described the familiar LLM interface, where text streams slowly and feels like a person typing, as less useful than a page loading all at once. He compared the desired experience to landing on a Wikipedia page: the full content is present immediately and the user can scroll. For coding, he said, the preference is clearer still. Users do not want a simulated typing experience; they want the code.
Hays put the point in labor terms: if two employees have the same capability but one is five times faster, the faster employee creates more organizational value. Coogan extended the analogy. In many occupations, a worker who is twice as effective can command far more than twice the salary. He also invoked the familiar e-commerce latency idea that every 100 milliseconds of delay can cost Amazon a percentage point of sales, while acknowledging he did not know the exact provenance of the quote. His underlying point was that delay causes users to lose intent. In LLM workflows, a slow response creates the same leakage: a user asks a question, waits, gets distracted, and never returns to the task.
| Event | Valuation or price | Source basis |
|---|---|---|
| IPO pricing | $185 per share | Coogan cited the IPO price. |
| First-day open | $350 per share | The intraday chart displayed an open and high of $350. |
| Intraday level shown | $307 per share | The chart displayed a 65.95% gain versus the prior close. |
| May 2026 IPO valuation | $48.8B | Coogan read this as the IPO valuation in the company’s round history. |
| Market cap discussed live | ~$64B | Hays said the company was sitting around this level. |
The Cerebras case presented by Coogan and Hays was narrower than a claim that wafer-scale systems replace every GPU workload. They argued that low-latency inference is becoming a premium product across a wide range of business and consumer uses, and that Cerebras is one of the companies positioned to sell that speed.
The memory question is the central technical overhang
The Cerebras debate did not stop at the IPO pop. Coogan and Hays spent considerable time on SemiAnalysis’s concerns, and Doug O'Laughlin later sharpened them. The core issue is memory.
Cerebras’s design relies heavily on SRAM, or static random-access memory, directly on the wafer. Coogan explained that SRAM is fast, but it is no longer shrinking dramatically with each semiconductor node. Cerebras’s WSE-2 had 40 gigabytes of memory. WSE-3 had 44 gigabytes. That is a 10% increase, not the doubling or order-of-magnitude improvement one might hope for in a scaling story.
Because the chip is a wafer, adding more SRAM means giving up wafer area that could otherwise be used for compute. Coogan described it as a direct tradeoff: if Cerebras wants more memory, it may have to sacrifice compute area. TSMC’s wafer size is standardized, so the company cannot simply make the wafer larger. More exotic approaches such as 3D wafer bonding or stacking might exist, but he stressed that there is no obvious linear path where every new generation doubles memory.
That matters because the broader industry is moving toward larger context windows and more demanding agentic workloads. Coogan quoted SemiAnalysis’s view that 128,000-token context windows “will certainly not be acceptable for long,” especially as agents need to maintain longer histories and operate over larger task states. Cerebras’s current strengths may fit fast inference for certain model sizes, but the question is whether that is enough as frontier models grow.
O’Laughlin was direct. Cerebras is “about SRAM,” he said, and SRAM is the fastest possible memory. But “SRAM scaling is dead,” meaning it is not getting smaller at the pace Cerebras would need for a simple scaling story. The company built the biggest scale-up domain in the form of a wafer, but models grew larger than a single wafer. That leaves Cerebras with “really, really fast inference, but only at a certain size.”
The capability problem, in O’Laughlin’s formulation, is whether Cerebras can inference models larger than a trillion parameters. His answer was “pretty unlikely in the near term.”
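A rough capacity estimate shows why model size is the sticking point. The sketch below counts how many wafers it would take to hold a model's weights alone in on-wafer SRAM, using the 40 GB and 44 GB figures cited above. The bytes-per-parameter values are assumptions (2 bytes for 16-bit weights, 1 byte for 8-bit). The estimate ignores KV cache, activations, and any off-wafer memory, so it is a floor on one dimension rather than a description of how Cerebras actually serves large models.

```python
import math

WSE2_SRAM_GB = 40  # as cited
WSE3_SRAM_GB = 44  # as cited


def growth_pct(old_gb: float, new_gb: float) -> float:
    """Generation-over-generation memory growth, in percent."""
    return (new_gb - old_gb) / old_gb * 100.0


def wafers_for_weights(num_params: int, bytes_per_param: int,
                       sram_gb: int = WSE3_SRAM_GB) -> int:
    """Minimum wafers needed to hold the weights alone in on-wafer SRAM."""
    weight_bytes = num_params * bytes_per_param
    sram_bytes = sram_gb * 10**9  # decimal gigabytes (assumption)
    return math.ceil(weight_bytes / sram_bytes)


ONE_TRILLION = 10**12

print(f"WSE-2 to WSE-3 memory growth: {growth_pct(WSE2_SRAM_GB, WSE3_SRAM_GB):.0f}%")
print("1T params, 16-bit weights:", wafers_for_weights(ONE_TRILLION, 2), "wafers")
print("1T params, 8-bit weights:", wafers_for_weights(ONE_TRILLION, 1), "wafers")
```

Under these assumptions, a trillion-parameter model needs dozens of wafers before any context is stored, which is why the discussion moves from the single chip to how wafers are connected.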
He was not dismissive of the company’s current opportunity. His updated view was that Cerebras may have a viable path as a disaggregated prefill chip, or in attention/feed-forward disaggregation, where different parts of inference are handled by specialized hardware. But that is a narrower thesis than “Cerebras replaces the GPU.”
The hosts tried to map Cerebras to an analogy: perhaps it becomes more like a CPU in the AI stack. When the AI boom began, GPUs were the obvious bottleneck. Then agentic systems created demand for CPUs as well, because CPUs help keep GPUs fed and handle surrounding work. Coogan asked whether Cerebras might similarly occupy a durable role even if the largest models run elsewhere. Hays suggested a hierarchy where a big model gives orders and delegates workloads to smaller models.
O’Laughlin accepted the possibility only with a qualification. In a perfect world without silicon constraints, that kind of delegation architecture might make sense. But in the real world, Cerebras is optimized for a specific problem: very fast inference at a certain model size. The question is whether that market is large enough. His updated answer was that even 1% of a very large market may be enough.
“Non-ironically, 1% of a very large market works,” O’Laughlin said.
He described the broader AI compute market as a shortage environment. In a shortage, the best company is not the only winner; demand overflows to the second-, third-, and fourth-best providers. That overflow is part of the opportunity for Cerebras and other specialized hardware companies.
Coogan asked whether a future model architecture shift could unexpectedly favor Cerebras. O’Laughlin said that might be “outside my pay grade,” but offered a speculative version: if future systems can distill themselves or improve compute efficiency enough, they might make it easier to run highly capable models on Cerebras. He called that the “gigabrain thesis,” not a near-term base case.
The Cerebras-Groq comparison made the specialization point more concrete. O’Laughlin explained that transformer inference can be broken into different compute- and memory-bound stages. Prefill can be compute constrained; decode can be heavily memory-bandwidth constrained. Groq, in his telling, can fit into a rack-level architecture by receiving activations from a GB200 rack and using very fast SRAM in an LPU rack. Cerebras is different: “an island of compute,” excellent for what happens on the island, but harder when work has to move off the island.
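The prefill and decode split can be illustrated with a roofline-style estimate. For a dense layer, one pass over T tokens costs roughly 2 FLOPs per parameter per token while the weights are read from memory once, so arithmetic intensity grows with T. The sketch below applies that simplification to a single request. The peak-compute and bandwidth figures are hypothetical placeholders, not specifications of any Cerebras, Groq, or Nvidia part, and attention and KV-cache traffic are ignored.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float) -> float:
    """FLOPs performed per byte of weights read, for one pass over a dense layer."""
    flops_per_param = 2.0 * tokens_per_pass  # one multiply and one add per token
    return flops_per_param / bytes_per_param


def bottleneck(intensity: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Classify a workload against the hardware's ridge point (FLOPs per byte)."""
    ridge = peak_flops / mem_bandwidth
    return "compute-bound" if intensity >= ridge else "memory-bound"


# Hypothetical accelerator, chosen only to make the contrast visible.
PEAK_FLOPS = 1e15      # FLOP/s
MEM_BANDWIDTH = 4e12   # bytes/s, giving a ridge point of 250 FLOPs per byte

prefill = arithmetic_intensity(tokens_per_pass=4096, bytes_per_param=2)
decode = arithmetic_intensity(tokens_per_pass=1, bytes_per_param=2)

print("prefill:", bottleneck(prefill, PEAK_FLOPS, MEM_BANDWIDTH))
print("decode:", bottleneck(decode, PEAK_FLOPS, MEM_BANDWIDTH))
```

Prefill reuses each weight across thousands of prompt tokens, while single-request decode touches every weight to produce one token. That asymmetry is why very high memory bandwidth matters most in the decode stage.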
That is the technical tension behind the market enthusiasm. Cerebras may be extremely valuable if fast inference at certain model sizes becomes a large, persistent market. But the most skeptical technical question is whether its wafer-scale SRAM architecture can follow the industry if frontier models, contexts, and agent state grow faster than its on-wafer memory can.
Feldman’s case: the market for slow inference goes to zero
When Andrew Feldman joined after the IPO, he was in celebration mode but still focused on the product argument. He said the day had gone better than Cerebras had hoped. The company brought a large group to Nasdaq, including employees who had been at Cerebras for more than nine years and their families. Feldman emphasized the families because, in a hardware startup, the years of patience come from more than employees alone.
He confirmed the pricing and trading arc: Cerebras priced at $185, opened at $350, and settled around $320. “What an extraordinary thing,” he said.
Asked whether Cerebras had been a straight shot, Feldman rejected the premise. In hardware, he said, anyone who claims a straight shot is not telling the truth. The first chip in a new architecture is only a little more than a prototype or proof of concept. The second chip irons out challenges and begins customer exposure. Often the third chip is where the architecture takes off. Cerebras was founded in 2016 and is more than 10 years old.
Feldman held up the wafer-scale chip on camera, describing it as “the size of a dinner plate.” He said the company tried to solve problems other people thought were impossible. For a while, he said, they were impossible. Cerebras did not solve the chip until August 2019. When the company built it, it was faster than everyone else — and “absolutely nobody cared.”
The problem, in Feldman’s telling, was not performance. It was timing. AI was still a novelty. Nobody cared how fast a system was when the use case was not yet central. Feldman said that changed with GPT and then, more decisively, in 2025, when models became useful enough that everybody wanted to use AI. That demand is inference demand.
On the IPO roadshow, Feldman said Cerebras had to explain three ideas to investors who wanted more than an “AI chip” label. First was market size. He cited Jensen Huang’s claim on Brad Gerstner’s podcast that inference demand would grow by a million times, a claim Feldman said many people did not believe. He also pointed to Sam Altman’s efforts to lock up compute, memory, data center capacity, and power as evidence that the largest AI builders saw the same demand curve.
Second was architectural pluralism. The GPU is not the only way to build AI compute. Feldman listed TPUs, Trainium, and Cerebras as examples of different routes.
Third was the claim that CUDA lock-in may be overplayed. He argued that Gemini 3 was trained on TPUs without CUDA, and Anthropic’s models were trained on Trainium without CUDA. Some of the best models and most interesting work, in his view, are happening outside the CUDA ecosystem.
The clearest commercial claim was about what users do once AI becomes real-time. Feldman argued that fast inference will cause people to do more things, stay longer, work on harder problems, and invent new products. He used Netflix as the analogy: the company began by mailing DVDs, but when the internet got fast enough, it did not merely improve DVD delivery; it became a movie studio delivering directly to the home.
“How big is the market for slow search? Zero. How big is the market for dial-up internet?” Feldman asked.
In Feldman’s view, speed does not simply command a premium within today’s workflows; it changes the workflow and eventually consumes the market. Fast inference, he said, will be “all of the market.”
On the memory and model-size question, Feldman pushed back on the idea that 10 trillion-parameter models create a problem uniquely for Cerebras. A 10 trillion-parameter model is hard and expensive for everyone to serve, he said. For larger models, Cerebras ties systems together in parallel and runs them as a pipeline. Feldman’s view was that Cerebras can train and run inference on trillion- and multi-trillion-parameter models in ways that are more intuitive than GPUs, because GPUs have off-chip memory but less compute per chip.
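Pipeline execution of the kind Feldman describes can be sketched generically: a model's layers are split into contiguous stages, each stage lives on one system, and activations flow from stage to stage. The code below illustrates only that partitioning step. The layer and system counts are hypothetical, and this is not a description of Cerebras's actual software.

```python
def partition_layers(num_layers: int, num_stages: int) -> list[range]:
    """Split layers 0..num_layers-1 into contiguous, near-equal pipeline stages."""
    if not 1 <= num_stages <= num_layers:
        raise ValueError("need between 1 and num_layers stages")
    base, extra = divmod(num_layers, num_stages)
    stages = []
    start = 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if stage < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


# Hypothetical example: an 80-layer model spread across 6 systems.
stages = partition_layers(num_layers=80, num_stages=6)
for index, layers in enumerate(stages):
    print(f"system {index}: layers {layers.start}-{layers.stop - 1}")
```

Each system then needs to hold only its own stage's weights. The cost is that every token's activations must cross a system boundary between stages, which is where the interconnect question raised by O’Laughlin comes in.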
O’Laughlin and Feldman emphasized different parts of the same constraint. SemiAnalysis focused on SRAM scaling and the difficulty of serving models larger than a trillion parameters in the near term. Feldman focused on system-level parallelism, pipeline execution, and the claim that very large models strain all architectures.
The more durable framing may be complementarity. Feldman described a future “confederacy of models,” with different models serving different roles. Cerebras connects over standard 100 gigabit Ethernet, he said, and is already deployed in environments that include Nvidia GPUs, AMD GPUs, and x86 compute from Dell or HP. The company is eager to operate in mixed environments.
On internal AI use, Feldman said frontier models are already changing Cerebras. The company uses them in coding and general administrative work. If one started a company today, he said, one would build a very different organization. HR, training, finance close processes, recruiting, and sales all change. A year ago, engineers were using approximately zero tokens; now, according to Feldman, some are using $10,000 worth of tokens a month. The number of new pull requests has increased dramatically.
On space data centers, Feldman was more cautious. Cerebras’s large chip helps because one of the hardest things in space is communicating across chips, and a wafer-scale design reduces how often that communication has to happen. But he compared the category to self-driving: the last 10% takes 80% of the time. He does not see data centers in space in the next three or four years. His estimate was eight to 12 years, with the caveat that if companies do not start working now, it becomes 25 years away.
The investors saw hard problems before the market was ready
Cerebras’s public debut also became a case study in venture timing. Hays noted that Benchmark’s Series A stake, led by Eric Vishria, was now worth many billions. Coogan cited a post from Ho Nam arguing that the IPO illustrated the power of an individual partner over the brand of a firm: Pierre Lamond had been at Sequoia and Khosla, but it was Eclipse, the firm he joined at age 84, that backed Cerebras multiple times in the early days.
The order book showed strong demand. Coogan cited Matthew Sigel’s post saying one-third of the book received zero allocation and the top 25 investors took 60% of the shares. Another on-screen post said Cerebras raised $5.9 billion in the year’s biggest IPO and priced at $185, giving it a roughly $40 billion market value at pricing.
The company’s private valuation path was shown as a long ramp: a Series A in 2016 at $100 million with Foundation, Benchmark, and Eclipse; a Coatue-led Series B in 2016; VY Capital leading Series C in 2017; a $1.6 billion valuation in 2018; $2.4 billion in 2019; $4 billion in 2021; $8 billion in 2025 with Atreides and Fidelity; $23 billion with Tiger; and then a May 2026 IPO at $48.8 billion, according to Coogan’s readout.
| Year or round | Valuation or lead detail |
|---|---|
| 2016 Series A | $100M; Foundation, Benchmark, and Eclipse |
| 2016 Series B | Coatue led |
| 2017 Series C | VY Capital led |
| 2018 | $1.6B valuation |
| 2019 | $2.4B valuation |
| 2021 | $4B valuation |
| 2025 | $8B; Atreides and Fidelity |
| 2026 pre-IPO round | $23B; Tiger |
| May 2026 IPO | $48.8B IPO valuation cited by Coogan |
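The size of each step in that ramp is easier to compare as a multiple. The sketch below uses only the dollar figures from Coogan’s readout. It treats every figure as a comparable valuation, which is an assumption: the readout does not say whether each number is pre-money or post-money.

```python
# (label, valuation in dollars), in chronological order, as read out by Coogan.
VALUATIONS = [
    ("2016 Series A", 100_000_000),
    ("2018", 1_600_000_000),
    ("2019", 2_400_000_000),
    ("2021", 4_000_000_000),
    ("2025", 8_000_000_000),
    ("2026 pre-IPO round", 23_000_000_000),
    ("May 2026 IPO", 48_800_000_000),
]


def step_ups(rounds):
    """Multiple of each valuation over the one before it."""
    return [
        (later_label, later_value / earlier_value)
        for (_, earlier_value), (later_label, later_value) in zip(rounds, rounds[1:])
    ]


overall_multiple = VALUATIONS[-1][1] / VALUATIONS[0][1]

for label, multiple in step_ups(VALUATIONS):
    print(f"{label}: {multiple:.2f}x over the prior figure")
print(f"Series A to IPO: {overall_multiple:.0f}x")
```

On these figures, most of the compounding sits at the two ends of the ramp: the jump from the Series A to the 2018 valuation and the steps from 2025 onward.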
Vishria said the investment looked obvious only with hindsight. In 2016, deep learning was clearly becoming important, but the Transformer paper had not yet appeared, TPUs had not yet been announced, LLMs had not emerged, and ChatGPT was years away. He was looking at applications such as radiology and security but found it hard to know where AI would work. When Feldman pitched Cerebras, the early slide that changed his thinking was the claim that GPUs “actually suck for deep learning” but were 100 times better than CPUs. Vishria said the point immediately made sense: why would a graphics processing unit be the ideal architecture for deep learning?
He also admitted to underestimating the difficulty of hardware. “It is very useful to be naive,” he said. Benchmark had not been a regular hardware investor. Its last hardware investment before Cerebras had been Ambarella, roughly 10 years earlier. But the team was exceptional and the idea was provocative.
The hard part came later. Vishria said that six or seven years after the investment, Cerebras had raised a great deal of money, had little revenue, and was still grinding. The company’s pivot from training to inference, the explosion of inference demand, and the rise of coding as a speed-sensitive use case all came together over the last two years. He credited the team with persistence, openness to market feedback, and never giving up.
As an investor, Vishria did not claim technical contribution. He joked that he was the “algorithm specialist” and then clarified that his role changed with the company’s stage: fundraising help, management-team building, and being a founder’s sounding board through the highs and lows. He called the role “consigliere”-like and said it is the part of venture he enjoys most.
Steve Vassallo of Foundation Capital gave the earlier version of the same story. He met Feldman and Gary Lauterbach in October 2007 when they were raising money for SeaMicro, a prior company focused on new server architecture. Vassallo passed on that investment but stayed close. After SeaMicro was acquired by AMD, he expected the founders would not remain there long and began a two-year conversation with them about new ideas. By spring 2016, Foundation wanted to be the first term sheet. Benchmark and Eclipse then joined, and the terms were adjusted to co-lead.
Vassallo said Foundation backed Cerebras because it saw AI and ML workloads ramping steeply through its portfolio. His investing lens is workload-driven: when a new computing workload spikes, it often creates an opportunity to replace or redesign the compute layer. x86 fit the personal-computing workload. GPUs fit graphics. Mobile created demand for low power and smaller form factors. AI workloads suggested purpose-built silicon.
He also described the technical risk as “five startups worth of hard problems.” The challenges were not just fundraising, suppliers, or TSMC negotiations, though those existed too. They were physical and architectural: how to yield a semiconductor the size of a dinner plate, how to power it, how to cool it, how to maintain continuity across thousands of connections, how to integrate it into systems, and how to operate dozens of them together in a data center. Those risks were stacked, making the overall risk combinatorial.
Asked what he tells founders going public, Vassallo offered three pieces of advice. First, prepare the team for volatility; the share price may move for reasons unrelated to day-to-day execution. Second, accept that the company has to grow up into quarterly operating discipline and public reporting. Third, do not let quarterly pressure destroy what made the company special. Innovation companies, he warned, can be killed by a short-term cadence if they lose sight of the larger horizon.
Netflix turned lower-cost access into an advertising machine
Amy Reinhard described Netflix’s ad tier as a business that has moved from internal angst to strategic acceptance. Reinhard has been at Netflix for about nine and a half years. She began in content, working on licensing and production, and moved into her role overseeing ads about two and a half years ago.
Netflix’s decision not to have ads had been a strategic bet for years. In 2021 and 2022, when Netflix began discussing an ads business, the shift created real internal concern because it represented both a cultural and strategic reversal. Netflix partnered with Microsoft to enter the business quickly, which got the ad tier running. About 18 months before the interview, Netflix decided to build its own tech stack, and that stack launched a year ago.
Reinhard said she has to remind herself how nascent the stack still is because the team has delivered so much in a short period. The bigger point was organizational. Netflix has “put to bed” the idea that it should not be in advertising. The ad tier has helped Netflix grow its user base by reaching consumers who want a lower-cost option and are comfortable with ads. Netflix announced at its Upfront that it would expand the ad tier into 15 more countries.
The ad pitch is full-funnel. Reinhard said advertisers are oriented around outcomes, and Netflix wants to support both brand partnerships and lower-funnel conversion, including purchase intent and consideration. The company is not positioning the product only as television-style brand advertising.
Some advertisers want to attach themselves to tentpole content. Reinhard gave examples of advertisers wanting to be associated with properties such as KPop Demon Hunters or Stranger Things, and mentioned McDonald’s in that context. The exchange did not establish that any specific placement had been sold; the point was that large cultural moments are often the easiest to sell. But KPop Demon Hunters also illustrated how unpredictable hits are: the company did not know it had one until about 60 days after release. Because Netflix has depth and variety, it can also sell audience behavior, moods, targeting, and relevance rather than only specific program adjacency.
On the tech stack, Reinhard said Netflix’s technical culture of testing and iteration transferred well. The company constantly tests hypotheses around member experience, lower ad loads, frequency caps, and reduced friction. The bigger learning was not the technology, she said, but the fact that advertising is a relationship business. Netflix had never had a sales team in the same organizational sense. The company had more to learn organizationally than technically.
Ads may also put pressure on the shape of the content itself, but Reinhard did not describe that as a top-down mandate. Some creators, such as Shonda Rhimes, spent years writing for broadcast and naturally think in breaks. Others do not. Netflix’s job is to find natural breaks without cutting mid-sentence or disrupting action.
The ad opportunity is mostly B2C for now. Reinhard said the current target clients are top enterprise advertisers, but when Coogan clarified that he meant B2B versus B2C advertisers, she said Netflix sees the opportunity primarily as B2C. The company may expand as it learns more.
On measurement and signal, Reinhard said Netflix is testing second-screen experiences and ways to meet customers without being intrusive. Privacy and member data protection are constraints. QR codes came up as a revived ad mechanic; Reinhard agreed they are not always the simplest member experience but said the backend ad-tech complexity makes their return unsurprising.
Hays asked whether Netflix might serve short-form vertical video ads in a clips tab. Reinhard said yes. Netflix announced at the Upfront that as it rolls out vertical video content, it will offer that inventory to advertisers along with Tudum.com coverage in 2027. Games advertising, by contrast, is not on the near-term roadmap. Reinhard said Netflix is watching games engagement increase, and she has learned never to say never at Netflix, but foundational ad products and other innovations already fill the roadmap for the next two to three years.
Agent infrastructure is becoming a debugging and incentive problem
Ben Hylak of Raindrop described the company as “observability for agents,” with a focus on self-healing agents. When an agent hits a production problem, Raindrop detects it and fixes it. Hylak called the company “the intelligence for your intelligence,” combining agentic systems, classic machine learning, anomaly detection, trace analysis, and customer-specific models.
The customer base, he said, has two shapes. Raindrop began with high-growth startups such as Clay, Framer, and Speak.com. Those early customers grew substantially, which helped Raindrop. Hylak compared the business to seed investing: infrastructure companies succeed when their customers succeed, and one must be selective because customers that die do not compound into good accounts or strong references. In recent months, Raindrop has also moved into Fortune 50 and Fortune 100 companies deploying agents internally.
Raindrop’s launch was a free local open-source tool called Workshop, available at raindrop.ai/workshop. Hylak said the missing piece in agent development has been local visibility. Developers build agents locally using SDKs from OpenAI, Vercel, or others before pushing to production, but there has been no standard way to see what the agent is doing. Some teams send traces to a server, some print logs to the console, and some dump logs into databases. The experience is slow and fragmented.
The second problem is that coding agents such as Claude Code cannot see the traces either. When something goes wrong and a developer says the response was wrong, the coding agent guesses. It does not know what the agent actually did. Workshop is meant to give developers and coding agents local traces they can use in a self-healing loop. Hylak said users can also connect it to production Raindrop, pull in a remote trace, replay it locally, and let Claude Code or Codium keep iterating until it works.
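The loop Hylak describes can be sketched in a few lines. Everything below is hypothetical: `run_agent`, `looks_correct`, and `propose_fix` stand in for the agent under test, a check on its trace, and a coding agent's patch step. None of these names come from Workshop or any Raindrop API; the sketch only shows the shape of a trace-driven retry loop with an iteration cap.

```python
from typing import Callable

Trace = list[str]


def self_healing_loop(
    run_agent: Callable[[dict], Trace],
    looks_correct: Callable[[Trace], bool],
    propose_fix: Callable[[dict, Trace], dict],
    config: dict,
    max_iterations: int = 5,
) -> tuple[dict, bool, int]:
    """Run, inspect the trace, patch, and retry until the check passes or the cap is hit."""
    for attempt in range(1, max_iterations + 1):
        trace = run_agent(config)
        if looks_correct(trace):
            return config, True, attempt
        # The fixer sees the actual trace instead of guessing at what happened.
        config = propose_fix(config, trace)
    return config, False, max_iterations


# Toy stand-ins: the "agent" succeeds once its timeout setting reaches 3.
def toy_agent(cfg: dict) -> Trace:
    status = "ok" if cfg["timeout"] >= 3 else "error: timed out"
    return [f"timeout={cfg['timeout']}", status]


def toy_check(trace: Trace) -> bool:
    return trace[-1] == "ok"


def toy_fix(cfg: dict, trace: Trace) -> dict:
    return {**cfg, "timeout": cfg["timeout"] + 1}


final_config, healed, attempts = self_healing_loop(
    toy_agent, toy_check, toy_fix, {"timeout": 1}
)
print(final_config, healed, attempts)
```

The iteration cap matters in practice: without it, a fixer that never converges would loop indefinitely, which is the failure mode a trace-based check is meant to surface early.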
Asked why Raindrop open-sourced the tool, Hylak said partly because it can be, and because someone else could build it anyway. More importantly, running locally gives the best experience, and Raindrop wants users to hack and shape it for their workflows.
The agent market creates a second constraint: companies want agents to act, but they do not necessarily want to become interchangeable backends. Hylak said he would use an Airbnb API if one existed, and he would book Airbnbs through Claude Code. He finds Airbnb hard to search. But he also sees a broader risk: companies may reduce themselves to APIs with no moat. He cited Photoshop and Illustrator adding Claude Code or MCP integrations. If users stop touching the UI and use the application only through an agent interface, the company has lost something important.
His Apple App Clips anecdote illustrated the incentive problem. One hero idea for App Clips was that a user in a Starbucks line could scan something and order without downloading the Starbucks app. But Starbucks does not want that. It wants the user to download the app, accumulate stars, join the loyalty system, and enter Starbucks’s own customer relationship. Companies have goals beyond simply exposing a transaction endpoint.
Coogan pushed back that tools such as Photoshop differ from marketplaces or retailers such as Airbnb, DoorDash, and Starbucks. The value of Starbucks is not in the app UI; it is in the drinks and the store network. Hylak agreed, but maintained that companies will have to decide how much of themselves they are willing to make interchangeable through agent interfaces.
On computer use, Hylak said Codex has done a good job implementing browser use, especially for debugging. He expects the next several months to be defined by “self-healing loops”: Claude Code makes a UI change, sees that it fails or looks wrong, and keeps going. To him, the AGI question can be framed partly as how many loops a system can complete before degrading or ending catastrophically.
General Catalyst’s ad raised a positioning problem
The General Catalyst ad presented two characters, “GC” and “VC,” with VC pitching “Wolf AI,” an AI-native robot dog companion platform. GC responds that people may like dogs as they are and says the firm has a high bar for responsibility. The robot dog then malfunctions and runs off.
Hays thought the ad was well shot, well timed, and funny at the end. But he also thought the premise was strange because robot dogs may be a large opportunity. There are already robot dog toys, and he had recently seen a robot dog pitch aimed at replacing or supplementing seeing-eye dogs. Service dogs can be expensive and scarce; a robot service dog that travels easily, does not bark, and does not need food could be socially valuable.
Coogan said the robot dog in the ad appeared to resemble a Boston Dynamics-style machine, which is usually used for industrial or hazardous environments rather than pets. He and Hays agreed that if General Catalyst wanted to define itself as the responsible investor, robot dogs were a weak target. Coogan said a responsibility line would more naturally target gambling, cannabis, or other contested categories — not a toy or assistive robot dog.
The sharper issue was that General Catalyst and Andreessen Horowitz overlap heavily in actual portfolios. Hays said many of their modern winners are in both firms’ portfolios. He highlighted that General Catalyst is in both Kalshi and Polymarket, which he called central to the current moral debate in tech. Both firms are also in Anduril, which was once controversial for defense technology but has become broadly normalized in Silicon Valley.
That made the ad feel less like a principled investing distinction and more like brand counter-positioning. Hays sees General Catalyst historically as more buttoned-up, East Coast, Boston, and traditional finance, while Andreessen Horowitz is louder and more media-driven. The ad seemed to signal a shift toward a louder brand strategy for GC, but Hays questioned whether attacking a category of weird consumer startups made sense when the firms syndicate deals together and share cap tables.
Coogan contrasted the ad with what he viewed as better venture thought leadership from General Catalyst CEO Hemant Taneja. Taneja’s “triple, triple, double, double, double is no longer good enough” claim had sparked debate because it asserted that AI-era companies can grow 10x annually and that venture benchmarks need to adjust. Coogan said that was provocative and controversial, but it was market analysis from a person with standing to make it. It did not require attacking a specific rival.
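The gap between the two benchmarks is easier to see as arithmetic. The sketch below is illustrative only: it normalizes starting revenue to 1 and assumes, hypothetically, that a 10x rate holds for consecutive years. The function names are made up for the example and do not come from Taneja or the hosts.

```python
from math import prod

# "Triple, triple, double, double, double": yearly revenue multiples
# over five years, with starting revenue normalized to 1 (an assumption).
T2D3 = [3, 3, 2, 2, 2]


def cumulative_multiple(yearly_multiples):
    """Total growth multiple after applying each yearly multiple in turn."""
    return prod(yearly_multiples)


def years_to_reach(target, yearly_multiple):
    """Whole years of constant growth needed to reach or exceed the target."""
    years = 0
    total = 1
    while total < target:
        total *= yearly_multiple
        years += 1
    return years


t2d3_total = cumulative_multiple(T2D3)
# Hypothetical: a company that sustains 10x per year.
ten_x_years = years_to_reach(t2d3_total, 10)

print(f"T2D3 over five years: {t2d3_total}x")
print(f"Years at 10x to match it: {ten_x_years}")
```

Under those assumptions, the older benchmark compounds to 72x over five years, while a company sustaining 10x growth would pass that mark in its second year.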
The hosts’ preferred villain for venture branding was stagnation, not another VC firm. Coogan cited Peter Thiel’s stagnation thesis as a better model: if technology does not advance, society does not get robot dogs, cancer cures, Mars, or other big projects. That kind of villain can unite the industry. Picking a fight with a firm that shares most of the same future vision risks making everyone look smaller.
The Figure robot debate turned on what “autonomous” should mean
The Figure AI livestream created another fight over AI demonstration credibility. Brett Adcock had posted that a team of humanoid robots was running a full eight-hour shift at human performance levels, “fully autonomous” and running Helix-02. The stream received millions of views. But viewers noticed a moment where the robot missed packages, then appeared to touch its own head in a gesture that resembled someone adjusting a VR headset.
Coogan and Hays treated the demo as impressive even under the skeptical interpretation. The robot moved packages quickly. Even if teleoperated, Coogan said, it would still be impressive hardware. But because Adcock claimed no teleoperation, the head-touching gesture became the point of scrutiny.
Hays laid out possible explanations. One was that a human operator adjusted a headset and the robot mirrored the movement. Another was a benign autonomous behavior: if the robot reached across its body, it might lift its other arm to avoid blocking a camera sensor or hitting nearby hardware. Coogan said Adcock’s explanation was that for cross-body reach, the policy lifts the arm to avoid hitting the metal chute.
The hosts joked about a third category: no humans in the loop, but perhaps an orangutan in a VR headset teleoperating the robot. Hays said, tongue-in-cheek, that such a system could claim no humans in the loop. Coogan argued the ape would be autonomous in some sense because it has “somewhat of a neural network.” The joke underscored the semantic problem. “Autonomous” can be stretched if the terms are not precise.
Adcock later posted more details, according to Hays’s summary. The original goal was an eight-hour run. After zero failures, Figure kept going and surpassed 24 hours of continuous autonomous operation. The task was small-package sorting: F.03 detects the barcode, picks up the package, and reorients it barcode face down on the conveyor. Humans average around three seconds per package, and F.03 was around human parity. Adcock said the robots reason directly from camera pixels, run Helix-02 onboard, and that there is no teleoperation. Every action comes directly from Helix-02.
He also explained that if the robot gets stuck or the policy goes out of distribution, Helix triggers an automatic reset. If a robot has a software or hardware issue, it autonomously leaves for maintenance and another robot takes over. Figure runs its labs that way to maximize uptime.
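For scale, the three-seconds-per-package figure can be turned into rough package counts. This is a back-of-the-envelope sketch, not a number Figure reported: it assumes a single station working continuously at exactly three seconds per package, and it ignores resets, maintenance handoffs, and gaps in package supply.

```python
# Approximate human average cited in the summary; treated here as exact.
SECONDS_PER_PACKAGE = 3
SECONDS_PER_HOUR = 3600


def packages_in(hours, seconds_per_package=SECONDS_PER_PACKAGE):
    """Packages handled by one station over a run, assuming no downtime."""
    return hours * SECONDS_PER_HOUR // seconds_per_package


per_hour = packages_in(1)
eight_hour_shift = packages_in(8)
full_day = packages_in(24)

print(f"Per hour: {per_hour}")
print(f"Eight-hour shift: {eight_hour_shift}")
print(f"24-hour run: {full_day}")
```

At that pace, one station handles about 1,200 packages an hour, or roughly 9,600 over the original eight-hour target and 28,800 over a 24-hour run.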
Coogan’s practical question was whether a humanoid is the right form for the task. He wanted to ask someone at Amazon whether fulfillment centers actually need humanoids for this kind of package orientation. Hays noted that factories are full of custom machines that flip, sort, and package goods at scale, often using durable, purpose-built equipment that lasts decades. A humanoid sorting packages may be on the path toward more general economically valuable robots, but it is not obviously the best way to solve package sorting itself.
Power, clean rooms, and local politics are turning into AI markets
O’Laughlin’s broader AI infrastructure discussion treated the boom as real but bottlenecked. On AMD, he said Lisa Su’s immediate task is getting rack-scale systems working. Because the market is in compute shortage, AMD should benefit from overflow demand. On the inference-serving side, he would not be surprised to see AMD pursue some kind of fast SRAM offload feed-forward-network chip within 12 months, though the number of candidates for that approach is small.
On Intel, O’Laughlin said the stock price may be ahead of the technical turnaround. The transcript rendered his next phrase as “liquidating clearly has like righted the ship,” likely referring to leadership changes but not clear enough to state as a named claim. His broader point was that Intel has gotten the right people involved and that a government-Intel deal changed the demand problem. He contrasted it with Pat Gelsinger’s earlier effort to build foundry demand from the bottom up: in O’Laughlin’s telling, the government approach was to sign from the top and make customers participate because the United States government was involved. He said customers are present, the process is good enough, and 14A may also be good enough given shortages at TSMC’s N3. From here, it is execution risk — and Intel has a history of execution problems.
On TSMC, O’Laughlin described the company as a kingmaker in AI supply. It has no reason to let the market get too far over its skis. He said TSMC is expanding capex significantly, but in absolute terms the numbers are already huge, and Taiwan may run short of TSMC engineers. The most specific bottleneck is clean room capacity. A clean room can take about three years to bring up, and TSMC would have needed extraordinary conviction two years earlier to perfectly match today’s demand. He expects supply to lag repeatedly, with demand signals pushing up wafer pricing and orders, causing TSMC to invest more incrementally.
On data centers, O’Laughlin said delays are already visible. His favorite clickbait framing was that “50% of all data centers in America are delayed or canceled,” which sounds like half are canceled but mostly means everything is delayed. The key political fight is local. Communities have to want the jobs or at least accept the economics. Data center dollars per megawatt are rising, and the cost leaks into labor. A place opposed to development may reconsider if the project brings thousands of jobs. Still, he called it a county-by-county fight, and some places will say no.
For site selection, he said power is the biggest bottleneck. The industry is increasingly willing to move data centers to where the power is and connect them by fiber. In the previous internet era, the major constraint was delivering video from TikTok or other services to users quickly, so points of presence near population centers mattered. If power is the largest cost and constraint, the data center should move to the power. There will still be some inference densification near populations, but the ROI favors remote locations.
On space data centers, O’Laughlin remained skeptical in the short run. Putting a pound of hardware in space costs far more than installing it on Earth, and the market would require a new specialized supply chain for a smaller near-term opportunity. In a very long-run world with enormous terrestrial GPU capacity and a desire to put a terawatt in space, space data centers could work. But before then, he expects the buildout to shift to other geographies, probably in the Western Hemisphere, where power is easier to secure and regulation is easier to navigate.
His broader AI view was that the boom is larger than he expected. He said he now believes AI will be bigger than the internet, a claim he did not believe two years earlier. The revenue and demand still look real to him. He uses Claude Code every day, expects to use it more for the rest of his life, and considers himself an early adopter. The bubble callers, he noted, had become quieter, which made the hosts more nervous, not less.
O’Laughlin also argued that AI may break existing economic measurements. GDP was invented in the 1930s as a statistical estimate to organize and measure output, including for wartime planning. If AI dramatically increases unmeasured output, the concept may become less useful. Coogan joked that SemiAnalysis should track “gross token production,” or GTP, as the new national output measure.
Warsh and the OpenAI trial showed the institutional backdrop
Kevin Warsh’s confirmation as Federal Reserve chair and the OpenAI–Elon Musk trial sat outside the Cerebras story, but both supplied a reminder that technology markets are also being shaped by central banks, courts, politics, and public trust.
Warsh was confirmed 54–45, with all Senate Republicans and one Democrat, John Fetterman of Pennsylvania, voting yes. Senator Kirsten Gillibrand did not vote. Reading from coverage, Coogan said no Fed chair had been confirmed by such a narrow margin since Senate approval became required for the job in 1977.
Warsh’s challenge, as Coogan framed it, is both economic and institutional. Trump has demanded rate cuts, but the Fed committee may be skeptical, especially given inflation news and potential economic stagnation. Coogan explained the stagflation problem in policy terms: low growth and low inflation make rate cuts easier; high growth and high inflation make rate hikes more straightforward; stagnation plus inflation creates the difficult case.
Warsh also enters after intense political pressure on Fed independence. Coogan’s readout said that during confirmation hearings, Democrats questioned how he would maintain independence from a president who values personal loyalty. Warsh said he would preserve monetary independence and made Trump no promises about policy decisions. Powell, citing concern over political attacks on the institution, plans to remain on the Fed Board of Governors after his chair tenure ends, despite Trump’s insistence that he leave.
The OpenAI–Elon Musk trial was in closing arguments, with Hays following Mike Isaac’s live posts from court and Coogan later reading from a separate news update. Prediction markets had moved against Musk: a Kalshi market on whether Elon would win his case against OpenAI had peaked at 58% on April 28 and was sitting at 30% on May 14, according to an Isaac post shown on screen.
Isaac’s courtroom thread emphasized that the judge’s instructions were clarifying the specific lens through which the jury had to view the evidence. That mattered because, in Isaac’s words as summarized by Hays, it showed how high the bar was for the plaintiff’s side to prove some claims. Isaac described Musk’s side as focusing heavily on credibility: attacking Sam Altman and Greg Brockman, portraying Altman as a liar, and using unflattering imagery.
The separate news update Coogan read described OpenAI’s response differently: Musk’s character attacks were a sideshow, and the actual claims could not be supported under law. OpenAI lawyer Sarah Eddy argued that no one other than Musk had testified to any commitments or promises that Altman, Brockman, or OpenAI made to him. Lead counsel William Savitt told the jury that Musk had no claim unless there was a specific agreement describing how his donations to the nonprofit should be spent. “That agreement does not exist,” Savitt said, according to Coogan’s readout.
The trial also produced a lower-stakes technology parable. Max Zeff posted that Musk’s lawyers brought a large monitor into the courtroom. OpenAI’s lawyers asked to use it. Musk’s lawyers said no. The judge told them they had to share. OpenAI then said it might not be possible to connect their laptops to the monitor. Coogan summarized the scene as AGI arriving while the courtroom still needed a dongle.
The best robot stops being called a robot
Vassallo’s robotics comments supplied the clearest counterweight to humanoid hype. He studied robotics and embedded systems at the intersection of mechanical and electrical engineering, then worked at IDEO on product development for companies such as Apple and Cisco. One of his old projects, Cisco voice-over-IP phones, was visible on desks at Nasdaq.
Asked how that background informs his view of the current robotics moment, Vassallo said investing in areas where one has operating experience can create scar tissue. He is generally not a big believer in the humanoid approach. There may be use cases in home companionship, but even there he sees it as a stretch. Instead, he thinks robotics should be understood more broadly as automation of human labor.
That means the form factor should fit the job. On a factory floor, people move pallets, but the human body is not a good form factor for pallet movement. One would not build a humanoid robot for that. The deeper product point is that the greatest compliment for many robotic systems is that people stop calling them robots. They become forklifts, washing machines, or other named appliances. The technology diffuses into the background, and the application becomes the product.
Hays connected this to washing machines. Watching humanoids load washers made him wonder whether the better opportunity is not a humanoid doing laundry but a first-principles redesign of the washer-dryer stack itself: if the constraint is dirty clothes in and clean clothes out, perhaps the right product is not a general humanoid plus existing machines, but a narrower automated appliance.
Vassallo agreed with the wedge logic. In hardware and hard technologies, focus is a source of leverage. The right starting point is “big enough to matter but small enough to win.” From there, the company can expand into larger markets.
That was also his read on Cerebras. The company began with training, then rotated toward inference when the workload exploded. The lesson was not that founders should start with the whole world. It was that they should identify a workload that is spiking, build a focused solution, and be willing to rotate when the larger opportunity becomes clear.



