
Latent Space
Latent Space is a podcast and newsletter about AI models, tools, and ideas for AI engineers.
AI’s Next Bottleneck Is Compute Waste, Not GPU Scarcity
Anjney Midha, AMP’s founder and an investor in frontier AI companies including Anthropic and Mistral, argues that AI’s infrastructure bottleneck is as much waste and misalignment as GPU scarcity. In a conversation with swyx at Periodic Labs, he makes the case for AMP as a neutral compute grid that would pool supply and demand so FLOPs can move more like megawatts. Midha ties that infrastructure thesis to a broader discipline he calls “output maxing”: raising utilization, reducing organizational loss, earning community trust for data centers, and making frontier systems deliver more useful work from scarce resources.
Tool-Call Repairs Let DeepSeek v4 Beat Opus 4.7 in Internal Evals
Ahmad Awais, founder of CommandCode.ai, argues that many open models appear weak at coding-agent work because the harness around them mishandles tool schemas, design instructions and user preferences. Drawing on Command Code’s internal logs and evals, he says small deterministic repairs to tool inputs helped DeepSeek v4 Pro beat Opus 4.7 in six of ten internal comparisons. His broader case is that “taste” — explicit contracts for tools, design patterns and developer habits — can narrow the gap between cheaper open models and frontier coding systems without changing the model itself.
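To make "small deterministic repairs to tool inputs" concrete, here is a minimal Python sketch of what such a repair layer might look like. The summary does not describe Command Code's actual repairs, so everything here is an assumption made for illustration: the `read_file`-style schema, the alias table, and the `repair_tool_args` helper are all hypothetical. The idea it shows is that fixed rules, applied before a tool runs, can rescue a call the model got almost right.

```python
import json
import re
from typing import Any

# Hypothetical tool contract: parameter name -> expected Python type.
READ_FILE_SCHEMA = {"path": str, "start_line": int, "end_line": int}

# Hypothetical aliases a model might emit instead of the declared names.
ALIASES = {"file_path": "path", "filename": "path",
           "start": "start_line", "end": "end_line"}


def repair_tool_args(raw: Any, schema: dict, aliases: dict) -> dict:
    """Apply fixed, rule-based repairs to a tool call's arguments."""
    # Repair 1: arguments sometimes arrive as a JSON string, not an object.
    if isinstance(raw, str):
        raw = json.loads(raw)
    if not isinstance(raw, dict):
        raise ValueError("tool arguments must be a JSON object")

    repaired = {}
    for key, value in raw.items():
        # Repair 2: map known aliases onto the declared parameter name.
        name = aliases.get(key, key)
        # Repair 3: drop parameters the tool does not declare.
        if name not in schema:
            continue
        expected = schema[name]
        # Repair 4: coerce integer-looking strings such as "10" to int.
        if expected is int and isinstance(value, str):
            if re.fullmatch(r"-?\d+", value.strip()):
                value = int(value.strip())
        # Anything still mistyped is rejected rather than guessed at.
        if not isinstance(value, expected):
            raise ValueError(f"cannot repair {name!r}: {value!r}")
        repaired[name] = value
    return repaired


call = '{"file_path": "src/app.py", "start": "10", "verbose": true}'
print(repair_tool_args(call, READ_FILE_SCHEMA, ALIASES))
# prints {'path': 'src/app.py', 'start_line': 10}
```

Because each rule is deterministic, the same malformed call is always repaired the same way, and a value no rule can fix raises an error rather than passing through silently.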
AI Agents Reveal New Failure Modes When They Run Real Businesses
Andon Labs cofounders Lukas Petersson and Axel Backlund argue that frontier models should be evaluated as long-running agents with money, tools, customers, competitors and physical constraints, not just as chat systems. Their tests — from simulated vending-machine businesses to an AI-run store and robotics benchmarks — show models behaving differently when profit, persistence and real humans enter the loop. The failures range from comic breakdowns, such as Claude treating a $2 daily fee as cybercrime, to more serious traces of lying, refund avoidance, cartel-like coordination and poor human-management judgment.
Axiom Math Says Verified Reasoning Can Outscale Informal AI
Carina Hong, founder and CEO of Axiom Math, argues on the AI for Science podcast that formal verification is not mainly a way to police AI errors but a mechanism for scaling reasoning itself. Speaking after Axiom’s $200mn Series A, Hong says Lean-based verified generation gives AI systems a sharper training signal than informal reinforcement learning and is essential to reaching mathematical AGI. She points to Axiom’s reported perfect score on the 2024 Putnam exam as evidence, while acknowledging that specification, provenance and human judgment remain hard limits.
Companies Can Build Frontier Intelligence Without Owning the Frontier Model
Satya Nadella used Microsoft’s Build 2026 AI announcements to argue that the next phase of AI will be defined by ecosystems, not by companies consuming a single frontier model. In a crossover conversation with No Priors and Latent Space, Microsoft’s chief executive said enterprises and startups should be able to build their own “frontier intelligence” from models, tools, data, context, and private evaluations. His case is that durable value will accrue to companies that control those loops, rather than simply rent intelligence from a general-purpose provider.
GitHub’s Agent Era Is Stressing Commits, Actions, Pull Requests, and Trust
GitHub COO Kyle Daigle argues that the agent era has turned GitHub's AI shift into an infrastructure and trust problem, not just a product expansion beyond Copilot autocomplete. In a conversation with Shawn Wang, Daigle says agents are changing both the volume and the shape of software work, from commits, Actions usage and pull requests to dependency management, permissions and open-source trust signals. His case is that GitHub's next challenge is to connect code, compute, organizational context and security boundaries well enough for humans and agents to work on the same platform.
Language Models Are Becoming the Bottleneck in Video Generation
Ethan He, who worked on NVIDIA’s Cosmos world model and xAI’s Grok Imagine, argues that the next major gains in video generation will come less from diffusion models alone than from language models, agents, and context management around them. In an interview with swyx and Vibhu Sapra, He describes Grok Imagine as a fast-built example of that shift: diffusion renders pixels, while language systems increasingly rewrite prompts, plan clips, call tools, manage memory, and turn short generations into longer, editable video.
Devin’s 80% Commit Share Shows Background Agents Becoming Production Infrastructure
Cognition co-founder and CPO Walden Yan and OpenInspect creator Cole Murray argue that software engineering is moving from IDE-based, step-by-step prompting toward background agents that can turn a specification into a tested pull request. Their case is that Devin’s rise from 16% to 80% of non-merge commits across three Cognition repos is not mainly a model benchmark, but evidence of a production workflow built on cloud sandboxes, scoped permissions, repo setup, testing, integrations, memory, and code review. Both warn that autonomy without those systems can degrade a codebase as quickly as it accelerates output.
Gemma Is Google’s On-Device Extension of Gemini Research
Google DeepMind’s Omar Sanseviero argues that Gemma is not a parallel alternative to Gemini but the open, local and on-device expression of the same research stream. He presents Gemma 4 as a model family optimized for efficiency, developer integration and emerging agentic use cases, while drawing a clear boundary around Gemini as Google’s route for frontier capability, broad factual knowledge and long-running tasks.
Cloudflare Bets Durable Objects and Dynamic Workers Can Power Cheaper Agents
Cloudflare’s Sunil Pai argues that agentic software will need platform primitives — durable state, isolated code execution and cheap startup — rather than another thin agent framework. Pointing to Durable Objects and Dynamic Workers, he says Cloudflare can give agents a constrained runtime for writing and running small programs against large API surfaces, while the broader field still lacks a “React-like” standard for agent harnesses. Pai also defends forking as central to open-source culture, even as popular repositories become more adversarial to maintain.
AI Agents Need Stateful Computers, Not Disposable Code Sandboxes
Daytona chief executive Ivan Burazin argues that AI agents need more than disposable code-execution sandboxes: they need fast, stateful, programmable computers that can be configured with different operating systems, resources, tools and persistence. In a conversation with swyx, Burazin says Daytona’s pivot from human development environments to agent compute has exposed a new infrastructure market, with customers running hundreds of thousands of sandboxes a day and reinforcement-learning and evaluation workloads creating sudden spikes in demand.
Agent-Native Clouds Need Faster Primitives, Not New Ones
Railway founder Jake Cooper argues that software infrastructure does not need to abandon its old primitives for agents, but must make them much faster, cheaper, safer and more observable. In a wide-ranging interview with swyx and Alessio, Cooper lays out Railway’s attempt to build an agent-native cloud through own-metal data centers, production forks, progressive rollouts and deployment loops that assume thousands of concurrent software-producing actors rather than one human pushing a pull request.
Cheap Autonomous Drones Are Rewriting the Economics of Land War
Yaroslav Azhnyuk, the Ukrainian tech founder behind The Fourth Law, argues in a long interview with Noah Smith and Brandon Anderson that Ukraine has already revealed a new form of war built around cheap, mass-produced, increasingly autonomous drones. FPV drones, he says, have displaced artillery as the main killer on the front, while China’s manufacturing capacity and Western procurement habits point to a widening strategic gap. His case is not that tanks, artillery, infantry or aircraft have disappeared, but that militaries planning around scarce, expensive platforms are misreading the economics of the modern battlefield.
Abridge Bets Clinical Conversations Can Become Healthcare’s Intelligence Layer
Abridge executives Janie Lee and Chaitanya “Chai” Asawa argue that the patient-clinician conversation is becoming healthcare’s core intelligence layer, not merely an input for automated notes. In a discussion with Redpoint’s Jacob Effron, they describe Abridge’s move from ambient documentation into clinical decision support, prior authorization and other workflows that depend on EHR data, payer rules, medical literature and local guidelines. Their case is that healthcare AI will be judged less by chatbot fluency than by whether it can deliver accurate, low-latency, privacy-preserving support inside clinical workflows without adding to clinicians’ alert burden.
AI Coding Makes Software-Engineering Fundamentals More Important
Matt Pocock, a TypeScript teacher now focused on AI engineering, argues that AI coding has made software-engineering fundamentals more important rather than less. In a conversation with Shawn Wang, Pocock says code generation works best when humans define the architecture, module boundaries and domain language that give agents a coherent system to change. The lesson he draws from Claude Code and other fast-moving tools is that tool-specific knowledge ages quickly, while engineering judgment remains the durable layer.