AI Scaling Faces an Energy Wall Without Physics-First Hardware
At AI Ascent 2026, Unconventional AI founder and CEO Naveen Rao argued that the current AI compute stack is approaching an energy wall because it is built on an 80-year-old digital computing model poorly suited to intelligence. Rao’s case is that GPUs and matrix math cannot close the efficiency gap with biological brains fast enough, and that AI hardware must instead be rebuilt around physical dynamics, time-domain computation, and architectures that blur memory and processing. He presented Unconventional AI’s coupled-oscillator chip prototype as an attempt to move compute closer to the thermodynamic limits of intelligence per watt.

The bottleneck is the substrate, not only the model
Naveen Rao opened with a deliberately blunt claim: “ASI won’t happen” unless AI compute becomes much more efficient. He did not mean better algorithms, better data, or better training tricks. He meant the physical substrate of computation — “how I do information processing at the physics level.”
The core claim is that AI is being built on an 80-year-old digital computing abstraction designed for a different era and a different purpose. Floating-point numbers, matrix math, and the broader digital computer stack came out of machines from the 1940s. Rao argued that those assumptions are a poor fit for machines whose central job is intelligence.
The urgency, in his telling, is energy. AI may make people more productive, but that does not mean the overall system is becoming more energy-efficient. AI training and inference already consume “many gigawatts,” he said, and he expects AI to hit a hard energy wall not in a decade, but in “two, three, four years.”
He framed the comparison starkly. The world has about 9,000 gigawatts of generation capacity, with roughly 1,000 gigawatts in the United States, and that capacity runs everything else too: homes, heating, electric cars, and “all the things.” By contrast, the entirety of humanity’s brainpower runs on about 160 gigawatts.
| Quantity | Value Rao gave | Context |
|---|---|---|
| All human brains | 160 GW | 8 billion people at about 20 watts each |
| World generation capacity | 9,000 GW | Capacity for all uses, not only AI |
| United States generation capacity | 1,000 GW | Comparison figure Rao gave |
| Human brain | 20 W | Per-person brain power |
| AI computer | 10^8 W (100 MW) on slide; megawatt-to-gigawatt range in remarks | Rao called the machine-side number rough and all-in |
Rao called the machine-side estimate rough. He described a modern AI “computer” in all-in terms — model building, inference, and operation — as somewhere in the megawatt to gigawatt range. The point of the estimate was the gap: the present compute paradigm is vastly less efficient than biological intelligence, and the constraint is becoming practical. The question is how much intelligence can be learned and run on a given amount of energy.
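Taken at face value, the figures are simple arithmetic (the machine-side number below is Rao's rough slide estimate, not a measurement):

$$ 8 \times 10^{9}\ \text{people} \times 20\ \text{W} \approx 1.6 \times 10^{11}\ \text{W} = 160\ \text{GW}, \qquad \frac{10^{8}\ \text{W per AI computer}}{20\ \text{W per brain}} \approx 5 \times 10^{6}. $$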
Biology is the existence proof
For Naveen Rao, biology was not decoration; it was evidence that another efficiency regime exists. He placed mammalian brains near the top of a chart of “intelligence per watt,” below the thermodynamic limit but far above current chips. He cited the Landauer principle as a way to think about the ultimate physical limit of computation.
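For reference, the Landauer limit he invoked sets a floor on the energy needed to erase one bit of information at temperature T; the room-temperature value below is the standard textbook figure, not a number from the talk:

$$ E_{\min} = k_B T \ln 2 \approx (1.38 \times 10^{-23}\ \text{J/K})(300\ \text{K})(0.693) \approx 2.9 \times 10^{-21}\ \text{J per bit erased}. $$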
Biology is not the limit in this account. Mammalian brains are “pretty darn efficient,” he said, but probably still an order or two of magnitude below the thermodynamic asymptote. On the same chart, current systems sat near the bottom. Rao labeled a higher line as the “limits of 2D litho” and said focused effort could push toward that region; the gap from where systems are today to that plotted 2D-lithography limit was, in his estimate, roughly three orders of magnitude.
The biological comparisons were concrete and all pointed in the same direction. Human brains consume about 20 watts. A macaque monkey’s brain is probably under one watt. Across mammals and insects, Rao pointed to complex behavior at milliwatt scales, with one slide summarizing animal brains as operating from roughly 20 milliwatts to 20 watts. His most vivid example was a squirrel jumping between branches in wind on less than 10 milliwatts — about one-hundredth of the power draw of a phone in someone’s pocket, by his comparison.
Current computers, he said, cannot reproduce that kind of embodied behavior with anything close to the same energy budget. He acknowledged a common objection: humans produce fewer tokens per second than machines. But he argued that biological intelligence remains higher in important respects, especially discovery. Machines may reach that level soon, he said, but on the current path it will come “at the cost of a lot of energy.”
Brains compute with dynamics
Naveen Rao said neuroscience does not yet fully explain how the brain works. “We don’t really know,” he said, speaking as both a computer scientist and neuroscientist. But he argued that neuroscience offers ideas worth extracting into hardware.
The central idea is nonlinear dynamics. Brains, he said, do not compute by doing floating-point matrix math. They have time-varying interactions between neurons, and “that’s actually where the compute lies.” They are stochastic, distributed, and slow. They are not digital systems where one wrong bit can collapse the result.
That distinction matters because the dominant AI stack is organized around matrix math. Rao credited Nvidia with owning and advancing that market, but argued that the energy efficiency of delivering, for example, an FP8 flop has improved only incrementally once memory access is included. Costs have fallen because manufacturing, packaging, and systems work have improved. But the energy per operation, especially when moving data to and from memory, has not changed enough.
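The imbalance he is describing shows up in widely cited circuit-level estimates (45 nm figures from Horowitz's ISSCC 2014 survey, not numbers Rao gave): an off-chip DRAM access costs on the order of nanojoules, while the multiply it feeds costs a few picojoules, so moving the operands can dominate the arithmetic by two to three orders of magnitude:

$$ \frac{E_{\text{DRAM access}}}{E_{\text{FP32 multiply}}} \approx \frac{1.3\text{–}2.6\ \text{nJ}}{\sim 3.7\ \text{pJ}} \approx 10^{2}\text{–}10^{3}. $$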
Dynamics exploit the time dimension of the underlying physics. Digital primitives do not.
To explain the alternative, Rao showed Kuramoto synchronization: 32 metronomes on a coupled platform beginning out of phase and then synchronizing. His point was that a coupled dynamical system can converge from many starting states based only on the coupling among its parts. If that coupling becomes trainable, the system can move through a rich state space in useful ways.
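The metronome demo corresponds to the standard Kuramoto model, in which each oscillator's phase is nudged toward every other phase through a coupling term. Below is a minimal numerical sketch; the oscillator count matches the demo, but the coupling strength, frequency spread, and integration settings are illustrative choices, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 32                     # oscillators, as in the metronome demo
K = 2.0                    # illustrative coupling strength
dt, steps = 0.01, 2000     # Euler integration settings (illustrative)

omega = rng.normal(1.0, 0.1, N)        # natural frequencies with a small spread
theta = rng.uniform(0, 2 * np.pi, N)   # start out of phase

for _ in range(steps):
    # Kuramoto update: each phase is pulled toward every other phase
    coupling = (K / N) * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
    theta = theta + dt * (omega + coupling)

# Order parameter r: 0 = fully incoherent, 1 = fully synchronized
r = np.abs(np.exp(1j * theta).mean())
print(f"order parameter after {steps} steps: {r:.3f}")
```

With coupling above the critical value, the order parameter climbs toward 1, which is the numerical analogue of the metronomes falling into step.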
He then translated the analogy into electronics: a network of coupled ring oscillators connected through a trainable fabric. Such a circuit, he argued, can produce nonlinear interactions that begin to resemble a computational property of brains. The state is not read out, operated on, and written back in the usual way. The physics “run.”
In a conventional von Neumann machine, time is simulated as discrete steps: retrieve state, operate, write state back, repeat. Rao said that shuttling is what burns most of the energy in existing systems. In a dynamical system, by contrast, the initial state is perturbed and the system evolves. Computation is implicit in the trajectory.
The prototype is the proof attempt
Naveen Rao said Unconventional AI had gone from “basically no team in January” to a full prototype in six months. He showed a chip layout and said the company planned to build the chip this summer. He attributed the speed partly to being a startup with no legacy baggage and partly to AI itself, which he said enabled a different way of building.
The chip concept was a physical realization of the coupled-oscillator idea. The company is investigating whether learned parameters and inputs can steer dynamical systems in programmable ways. One slide showed trajectories in state space becoming more controllable, including an example described as spelling “UN” with four oscillators. Rao’s answer to the programmability question was direct: yes, these systems can be trained and steered into “basically any arbitrary set of trajectories.”
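To make the trainable-fabric idea concrete, here is a toy sketch in the same Kuramoto setting as above: a small network whose per-pair coupling weights are adjusted so the phases land on a chosen pattern at a chosen time. The four-oscillator size echoes Rao's example, but the target pattern, the optimizer (finite-difference gradient descent), and every constant are illustrative stand-ins, not details of Unconventional AI's hardware or training method.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 4                       # small network, echoing the four-oscillator example
dt, steps = 0.05, 200       # integration settings (illustrative)
omega = np.ones(N)          # identical natural frequencies for simplicity
theta0 = rng.uniform(0, 2 * np.pi, N)                       # fixed initial phases
target = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])   # desired relative phases

def simulate(K):
    """Integrate a Kuramoto network with per-pair coupling weights K[i, j]."""
    theta = theta0.copy()
    for _ in range(steps):
        drive = (np.sin(theta[None, :] - theta[:, None]) * K).sum(axis=1)
        theta = theta + dt * (omega + drive)
    return theta

def loss(K):
    """Mismatch between final relative phases and the target pattern."""
    theta = simulate(K)
    rel = theta - theta[0]              # phases relative to oscillator 0
    return np.mean(1.0 - np.cos(rel - target))

# Train the coupling "fabric" by finite-difference gradient descent
# (a toy optimizer standing in for whatever training method the company uses).
K = rng.normal(0.0, 0.1, (N, N))
eps, lr = 1e-4, 0.1
for step in range(400):
    base = loss(K)
    grad = np.zeros_like(K)
    for i in range(N):
        for j in range(N):
            Kp = K.copy()
            Kp[i, j] += eps
            grad[i, j] = (loss(Kp) - base) / eps
    K -= lr * grad
    if step % 100 == 0:
        print(f"step {step:3d}  loss {base:.4f}")
print(f"final loss {loss(K):.4f}")
```

The point of the toy is only that the coupling matrix plays the role of trainable parameters: the printed loss should fall as the fabric learns to place the phases on the chosen pattern.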
He also connected the approach to familiar AI workloads. In a demo titled “Implicit Kuramoto Inference,” Rao showed a dynamics-based generative model trained on image classes. He described starting from randomness, applying backpropagation of an error signal toward a target class at time t = 1, and then letting the system evolve naturally. In the demo, representations converged into cat- or horse-like images and then morphed over time, suggesting that the model had learned meaningful regions of state space rather than static pixel templates.
The final architecture comparison was CPU, GPU, compute-in-memory, and dynamical systems. CPUs, Rao said, remain best for fast single-threaded work. GPUs expand the same basic pattern by moving many operands from memory, operating on them, and writing them back. Compute-in-memory improves locality by moving the operation onto the chip. But for Rao, those are still variants of the same separation between memory and computation.
Dynamical systems are different because “the state and the function are overlapped with the physics themselves.” There is no clean separation between state and computation. That is why he called the approach “truly non-von Neumann.”


