Shortest-Path Flow Matching with Mixture-Conditioned Bases for OOD Generalization to Unseen Conditions

Abstract

Robust generalization under distribution shift remains a key challenge for conditional generative modeling: conditional flow-based methods often fit the training conditions well but fail to extrapolate to unseen ones. We introduce SP-FM, a shortest-path flow-matching framework that improves out-of-distribution (OOD) generalization by conditioning both the base distribution and the flow field on the condition. Specifically, SP-FM learns a condition-dependent base distribution parameterized as a flexible, learnable mixture, together with a condition-dependent vector field trained via shortest-path flow matching. Conditioning the base allows the model to adapt its starting distribution across conditions, enabling smooth interpolation and more reliable extrapolation beyond the observed training range. We provide theoretical insights into the resulting conditional transport and show how mixture-conditioned bases enhance robustness under shift. Empirically, SP-FM is effective across heterogeneous domains, including predicting responses to unseen perturbations in single-cell transcriptomics and modeling treatment effects in high-content microscopy-based drug screening. Overall, SP-FM provides a simple yet effective plug-in strategy for improving conditional generative modeling and OOD generalization across diverse domains.
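
To make the two conditioned components concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "shortest-path" refers to straight-line interpolation between a base sample and a data sample, which the abstract does not spell out. The Gaussian mixture with condition-dependent weights, the toy linear vector field, and all names (`sample_base`, `vector_field`, `flow_matching_loss`, `weight_matrix`, `params`) are hypothetical choices made for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_base(cond, weight_matrix, means, std, rng):
    """Draw x0 from a Gaussian mixture whose weights depend on the condition.

    Assumption: mixture logits are a linear function of the condition vector.
    """
    logits = [sum(w * c for w, c in zip(row, cond)) for row in weight_matrix]
    probs = softmax(logits)
    k = rng.choices(range(len(means)), weights=probs)[0]
    return [rng.gauss(mu, std) for mu in means[k]]


def vector_field(x, t, cond, params):
    """Toy linear field v(x, t, c) = A x + b t + C c (stand-in for a network)."""
    A, b, C = params
    return [
        sum(a * xj for a, xj in zip(A[i], x))
        + b[i] * t
        + sum(cc * cj for cc, cj in zip(C[i], cond))
        for i in range(len(x))
    ]


def flow_matching_loss(batch, weight_matrix, means, std, params, rng):
    """Mean squared error between v(x_t, t, c) and the straight-line velocity."""
    total = 0.0
    for x1, cond in batch:
        x0 = sample_base(cond, weight_matrix, means, std, rng)
        t = rng.random()
        # Straight-line path from base sample x0 to data sample x1.
        xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
        # Along that path the target velocity is constant: x1 - x0.
        target = [b - a for a, b in zip(x0, x1)]
        pred = vector_field(xt, t, cond, params)
        total += sum((p - u) ** 2 for p, u in zip(pred, target))
    return total / len(batch)


# Illustrative setup: 2-D data, 2-D condition, 3 mixture components.
rng = random.Random(0)
weight_matrix = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # component logits per condition
means = [[0.0, 0.0], [2.0, 2.0], [-2.0, 1.0]]           # component means
params = (
    [[0.0, 0.0], [0.0, 0.0]],  # A
    [0.0, 0.0],                # b
    [[0.0, 0.0], [0.0, 0.0]],  # C
)
batch = [([1.0, 1.5], [1.0, 0.0]), ([-1.0, 0.5], [0.0, 1.0])]
loss = flow_matching_loss(batch, weight_matrix, means, 0.5, params, rng)
```

In a real model the mixture parameters and the vector field would be trained jointly by minimizing this loss with a gradient-based optimizer; the sketch only evaluates the objective once to show how the condition enters both the base sampler and the field.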

Publication
arXiv
Arian Amani
Machine Learning Scientist

I am a Machine Learning Scientist at AI VIVO and a Data Scientist at the Wellcome Sanger Institute. My work is at the intersection of computational biology and drug discovery, where I develop deep generative and foundation models for molecules and cells. I specialize in molecule generation and single-cell perturbation modeling using techniques such as VAEs, diffusion models, Transformers, and flow matching. I’m passionate about building AI methods that accelerate target discovery and therapeutic design.