
Carles Domingo-Enrich
Senior Researcher at Microsoft Research New England in Cambridge, working on generative AI models, including diffusion and flow models, language models, and related machine learning, statistics, and AI-for-science topics. He organizes the MSR New England Generative Modeling & Sampling Seminar.
Stochastic Control Closes the Sampling Loop for Rare-Event Analysis
Microsoft Research’s Yuanqi Du and Carles Domingo-Enrich recast rare-event simulation as a stochastic optimal control problem, arguing that the committor function at the center of Transition Path Theory can be learned by using each current estimate to steer new trajectories into the transition region. Their framework turns committor estimation into a feedback loop: a transformed value function induces a Doob-style control, that control generates more useful reactive samples, and the samples improve the estimate. They present REACT-VM, an off-policy Value Matching objective with a stated first-order optimality guarantee, as the more principled version of the method, and report stronger benchmark results than variational committor-learning baselines.
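For readers new to the committor, a one-dimensional toy case shows the two objects the summary refers to. This is a minimal sketch under stated assumptions, not the REACT-VM method. It assumes overdamped Langevin dynamics dX = -V'(X) dt + sqrt(2/beta) dW on an interval [a, b], where the committor (the probability of reaching b before a) has a closed form and the Doob-style extra drift is (2/beta) d/dx log q. The function names, the double-well potential, and the value of beta are illustrative choices that do not come from the talk.

```python
import numpy as np

def committor_1d(potential, a, b, beta=1.0, n=2001):
    """Committor q(x) = P(reach b before a | X_0 = x) on [a, b] for
    dX = -V'(X) dt + sqrt(2/beta) dW, using the 1D closed form
    q(x) = int_a^x exp(beta V) / int_a^b exp(beta V)."""
    x = np.linspace(a, b, n)
    w = np.exp(beta * potential(x))
    dx = x[1] - x[0]
    # Cumulative trapezoid rule for the running integral of exp(beta V).
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * dx)))
    return x, cum / cum[-1]

def doob_drift_correction(x, q, beta=1.0):
    """Extra drift (2/beta) * d/dx log q that conditions paths on reaching b first."""
    # Skip the endpoint x = a, where q = 0 and log q diverges.
    dlogq = np.gradient(np.log(q[1:]), x[1:])
    return x[1:], (2.0 / beta) * dlogq

def double_well(x):
    # Hypothetical test potential with minima at x = -1 and x = +1.
    return (x**2 - 1.0) ** 2

x, q = committor_1d(double_well, a=-1.0, b=1.0, beta=3.0)
xs, extra_drift = doob_drift_correction(x, q, beta=3.0)
```

In this toy case the extra drift always points toward b and grows without bound near a, which is how the conditioned process avoids falling back into the starting state. In higher dimensions no closed form is available, which is the setting where a learned estimate is needed.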
FRIGID Scales Molecular Structure Elucidation With Masked Diffusion
MIT postdoc Runzhong Wang argues that de novo molecular structure elucidation from tandem mass spectrometry is constrained less by instruments than by computation: researchers can produce high-quality spectra, but often cannot infer the molecules behind them. His talk presents DiffMS and FRIGID, two diffusion-based inverse models that decompose the task into spectrum-to-fingerprint prediction and scalable fingerprint-to-structure generation. Wang’s central claim is that scaling helps most where chemical structure data are abundant, while forward fragmentation models can guide inference by identifying parts of a generated molecule that do not match the observed spectrum.
Hard Constraints Steer Generative AI Toward Chemically Valid Materials
MIT PhD student Mouyang Cheng argues that generative models for materials discovery need explicit scientific constraints, not just larger diffusion models. In a Microsoft Research seminar, he describes two approaches: diffusion inpainting that forces generated crystals to contain target structural motifs, and CrysVCD, a valence-constrained framework that generates charge-balanced formulas before predicting structures. His case is that constraints such as motifs, valence and stability screens make generative materials design more useful in a field where data are sparse and chemically invalid samples are easy to produce.
Self-Consistent Interpolants Learn Clean Priors From Corrupted Data
Jiequn Han’s talk argues that transport-based generative models should be treated not only as tools for sampling clean data distributions, but as machinery for recovering and adapting those distributions when the usual clean training set is absent. His main proposal, Self-Consistent Stochastic Interpolants, learns a clean prior from corrupted observations by iterating a transport map until the learned distribution, passed through a trusted forward simulator, reproduces the observed data. Han presents the method as a black-box alternative to EM-style inverse generative modeling, with the caveat that simulator mismatch remains a central unresolved risk.
Flow Policies Need New Q-Learning Methods for Online Robot Adaptation
UC Berkeley PhD student Qiyang “Colin” Li argues that the flow-matching and diffusion policies now effective for robotic manipulation expose a weakness in standard Q-learning: they model complex, multimodal action chunks well, but are hard to optimize with the reparameterized actor gradients used in efficient continuous-control RL. He presents two approaches, Flow Q-learning and Q-learning with Adjoint Matching, as ways to make off-policy RL work with these policies while reusing prior robot data. The trade-off, in Li’s account, is between the stability gained by distilling flows into one-step actors and the expressivity preserved by keeping multistep flow policies.
Hamiltonian Flow Maps Learn Larger Molecular Dynamics Steps Without Trajectories
Michael Plainer, Winfried Ripken and Gregor Lied argue that generative models can attack molecular dynamics’ central bottleneck: the gap between femtosecond integration steps and biological processes that unfold over timescales many orders of magnitude longer. In the Microsoft Research seminar, they separate the problem by timescale, using diffusion models to sample equilibrium Boltzmann states and extract force information, while proposing Hamiltonian flow maps for the intermediate regime where simulations need large, stable steps without training on expensive future-state trajectories.
Fixed-Point Bridge Matching Makes Diffusion Sampling Scalable Without Target Data
Lorenz Richter’s seminar argues for a non-Markovian route to diffusion-based sampling when the target distribution is known only through an unnormalized density rather than data. He presents existing Markovian path-space samplers as theoretically flexible but increasingly constrained by trajectory simulation and storage costs, then proposes building reciprocal bridge measures from endpoint couplings and learning their Markovian projection by fixed-point regression. The resulting Bridge Matching Sampler, Richter says, uses a single learned control, accommodates flexible priors and reference processes, and shows improved stability and mode preservation in high-dimensional synthetic and molecular benchmarks, especially with damping.
Denoising Markov Models Generalize Diffusion Through Reverse-Time Generators
Stanford PhD candidate Yinuo Ren argues that diffusion, discrete diffusion, and broader jump-based generative models can be treated as instances of the same problem: choose a forward Markov process that carries data toward a simple reference law, then learn its reverse-time generator. His framework gives conditions under which that reverse generator is explicit up to unknown densities and turns the resulting approximation problem into a path-space KL objective via Doob’s h-transform. The payoff, Ren says, is a principled way to design denoising models beyond Gaussian diffusion, including discrete and Lévy-type dynamics.
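The phrase "explicit up to unknown densities" can be made concrete on a small continuous-time Markov chain. This is a sketch under stated assumptions, not Ren's framework. For a forward chain with rate matrix Q and marginals p_t, the standard time-reversal result gives reverse rates R_t(x, y) = Q(y, x) p_t(y) / p_t(x). In the toy below the marginals are computed directly, whereas in a denoising model the density ratio is the quantity that has to be learned. The three-state chain, the function names, and the Euler discretization are illustrative assumptions.

```python
import numpy as np

def marginals(p0, Q, T, steps):
    """Euler-integrate the forward master equation dp/dt = p Q and keep every p_t."""
    dt = T / steps
    ps = [p0]
    for _ in range(steps):
        ps.append(ps[-1] + dt * ps[-1] @ Q)
    return np.array(ps), dt

def reverse_rates(Q, p):
    """Reverse-time rates R[x, y] = Q[y, x] * p[y] / p[x] for x != y,
    with the diagonal set so that each row sums to zero."""
    R = Q.T * p[None, :] / p[:, None]
    np.fill_diagonal(R, 0.0)
    np.fill_diagonal(R, -R.sum(axis=1))
    return R

# Hypothetical forward process: uniform mixing between three states.
Q = np.array([[-1.0, 0.5, 0.5],
              [0.5, -1.0, 0.5],
              [0.5, 0.5, -1.0]])
p0 = np.array([0.8, 0.15, 0.05])   # stand-in for the "data" distribution
steps = 2000
ps, dt = marginals(p0, Q, T=1.0, steps=steps)

# Run the reverse chain from the noised marginal p_T back toward time 0.
p = ps[-1]
for k in range(steps, 0, -1):
    p = p + dt * p @ reverse_rates(Q, ps[k])
```

After the backward pass, p should be close to p0 up to Euler discretization error: the reverse generator carries the near-uniform law back to the data distribution, and the only ingredient not fixed by the forward process is the ratio p_t(y) / p_t(x).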
Energy-Based Fine-Tuning Improves Accuracy Without RLVR’s Validation-Loss Penalty
Mujin Kwun and Carles Domingo-Enrich present energy-based fine-tuning as a post-training method that replaces next-token imitation or task-specific rewards with sequence-level feature matching. Their argument is that supervised fine-tuning remains efficient but is trained under teacher forcing, while RL with verifiable rewards can improve accuracy without preserving the target completion distribution. EBFT instead samples model rollouts, compares their frozen-model feature embeddings with reference completions, and uses that signal for policy-gradient updates; in the reported coding and translation experiments, it matched or exceeded RLVR accuracy while producing lower validation cross-entropy than both RLVR and SFT.
Split-Flows Make Mapping Entropy Computable for Molecular Coarse-Graining
Tristan Bereau presents Split-Flows, a flow-based method for connecting atomistic and coarse-grained molecular representations by adding explicit noise variables for the degrees of freedom lost under coarse-graining. The argument is that this augmentation turns a many-to-one mapping into a tractable coordinate transform, enabling both generative backmapping and computation of configuration-dependent mapping entropy. Bereau says the approach makes information loss measurable for complex molecular systems, though it depends on a differentiable bijective construction and still faces scaling costs.
Diffusion Models Generate Images Through Critical Instability Windows
Luca Ambrogioni argues that trained diffusion models generate images through brief instability windows rather than uniform step-by-step denoising. In a Microsoft Research generative modeling seminar, he links score dynamics, conditional entropy and statistical-physics phase transitions to show how low-frequency spatial modes soften at critical times, allowing noise to organize into coherent structure. Experiments on patch models, Fashion-MNIST and ImageNet models are presented as evidence that these critical windows govern both pattern formation and the timing of effective guidance.
Energy-Based Fine-Tuning Trains Language Models on Whole Responses
Microsoft Research’s presentation on energy-based fine-tuning argues that language-model post-training can be aimed at whole responses rather than next-token imitation. Carles Domingo-Enrich presents EBFT as a middle path between supervised fine-tuning and reinforcement learning: it samples model completions, compares them with ground-truth answers in a model-derived feature space, and turns that comparison into a policy-gradient update without a separate reward model or verifier. The reported results show gains over SFT on several coding and translation measures, with performance often comparable to RLVR while avoiding explicit correctness rewards.