Qiyang Li

Qiyang Li is a PhD student in computer science at UC Berkeley advised by Sergey Levine. His research focuses on reinforcement learning, robot learning, and using expressive generative models such as flow-matching policies to improve offline learning and online exploration.

Flow Policies Need New Q-Learning Methods for Online Robot Adaptation

UC Berkeley PhD student Qiyang “Colin” Li argues that the flow-matching and diffusion policies now effective for robotic manipulation expose a weakness in standard Q-learning: they model complex, multimodal action chunks well, but are hard to optimize with the reparameterized actor gradients used in efficient continuous-control RL. He presents two approaches, Flow Q-learning and Q-learning with Adjoint Matching, as ways to make off-policy RL work with these policies while reusing prior robot data. The trade-off, in Li’s account, is between the stability gained by distilling flows into one-step actors and the expressivity preserved by keeping multistep flow policies.

Microsoft Research · May 26, 2026 · 19 min read