Peter Potaptchik

Peter Potaptchik is a DPhil student in statistics and machine learning at the University of Oxford, working on generative modelling, diffusion and flow matching models, and sampling. He is advised by George Deligiannidis, Saifuddin Syed, and Yee Whye Teh, and is based at Harvard University as a Fellow working with Michael S. Albergo.

Meta Flow Maps Cut Reward-Alignment Costs With One-Step Posterior Sampling

Peter Potaptchik presents Meta Flow Maps as an amortized way to remove a costly inner loop in reward-aligning generative models: repeatedly simulating trajectories to estimate expected future reward from a noisy state. The method trains stochastic flow maps to produce differentiable, one-step samples from the clean-data posterior conditioned on any time and noisy state, enabling value-gradient estimates for inference-time steering and an off-policy objective for fine-tuning. In ImageNet experiments, Potaptchik argues, this lets a single-particle steered sampler outperform Best-of-1000 baselines across several rewards with far less compute.
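To make the idea of replacing trajectory simulation with one-step posterior samples concrete, here is a minimal toy sketch. It is not the method from the talk. It assumes a one-dimensional Gaussian setting (data x1 ~ N(0, 1), noise x0 ~ N(0, 1), linear interpolant x_t = (1 - t) x0 + t x1) in which the clean-data posterior is known in closed form, so the learned stochastic flow map can be stood in for by an exact sampler. All function names, the quadratic reward, and the parameter values are hypothetical choices for illustration.

```python
import math
import random


def posterior_params(t, x):
    """Mean and std of x1 given x_t = x in the assumed toy Gaussian model."""
    denom = (1.0 - t) ** 2 + t ** 2          # Var(x_t)
    mean = t * x / denom                     # Cov(x1, x_t) / Var(x_t) * x
    std = math.sqrt((1.0 - t) ** 2 / denom)  # sqrt(1 - t^2 / Var(x_t))
    return mean, std


def one_step_sample(t, x, eps):
    """Stand-in for a learned stochastic map: (t, x_t, noise) -> posterior sample of x1."""
    mean, std = posterior_params(t, x)
    return mean + std * eps


def reward(x1, target=1.0):
    """Hypothetical reward: negative squared distance to a target value."""
    return -(x1 - target) ** 2


def value_and_grad(t, x, n_samples=20000, target=1.0, seed=0):
    """Monte Carlo estimate of V(t, x) = E[reward(x1) | x_t = x] and of dV/dx.

    The gradient uses the reparameterization trick: because each sample is a
    differentiable function of x, d reward / dx = reward'(x1) * d x1 / dx.
    """
    rng = random.Random(seed)
    denom = (1.0 - t) ** 2 + t ** 2
    dx1_dx = t / denom  # derivative of the sampler output with respect to x
    value_sum = 0.0
    grad_sum = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        x1 = one_step_sample(t, x, eps)
        value_sum += reward(x1, target)
        grad_sum += -2.0 * (x1 - target) * dx1_dx
    return value_sum / n_samples, grad_sum / n_samples


if __name__ == "__main__":
    value, grad = value_and_grad(t=0.5, x=0.3)
    print("estimated value:", value)
    print("estimated value gradient:", grad)
```

In this sketch each value estimate costs one sampler call per sample rather than a simulated trajectory, and the resulting gradient is the kind of signal that could nudge a noisy state toward higher expected reward during sampling. In the setting described above, a trained network would take the place of the closed-form `one_step_sample`.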

Microsoft Research · May 26, 2026 · 16 min read