AI Progress Is Being Bought With Data, Not Sample Efficiency

Dwarkesh Patel

Dwarkesh Patel argues that recent AI progress is driven less by clear gains in sample efficiency than by an immense expansion of training data, including synthetic rollouts and highly specific human expert examples. In his account, frontier models can display broad professional competence because labs keep pushing more tasks into the training distribution, not because the systems learn new domains the way humans do. Patel says that this data-heavy approach may still be commercially powerful, since the cost of training a capability can be amortized across billions of uses, but it leaves unresolved whether current systems can solve their own sample-efficiency problem.