The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Published 13 Mar 2025 in cs.LG and cs.CV | (2503.10636v3)

Abstract: Minibatch optimal transport coupling straightens paths in unconditional flow matching. This leads to computationally less demanding inference as fewer integration steps and less complex numerical solvers can be employed when numerically solving an ordinary differential equation at test time. However, in the conditional setting, minibatch optimal transport falls short. This is because the default optimal transport mapping disregards conditions, resulting in a conditionally skewed prior distribution during training. In contrast, at test time, we have no access to the skewed prior, and instead sample from the full, unbiased prior distribution. This gap between training and testing leads to a subpar performance. To bridge this gap, we propose conditional optimal transport C^2OT that adds a conditional weighting term in the cost matrix when computing the optimal transport assignment. Experiments demonstrate that this simple fix works with both discrete and continuous conditions in 8gaussians-to-moons, CIFAR-10, ImageNet-32x32, and ImageNet-256x256. Our method performs better overall compared to the existing baselines across different function evaluation budgets. Code is available at https://hkchengrex.github.io/C2OT


Summary

The paper introduces C²OT, a novel method that incorporates condition-aware weighting into optimal transport to address train-test distribution gaps in flow-based generation.
It demonstrates that C²OT achieves straighter flow paths and improved adherence to conditions, yielding better FID and CLIP scores across synthetic and real datasets.
The approach significantly reduces test-time computational overhead by lowering the number of required integration steps while maintaining high generative quality.

Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

In conditional flow-based generative models, combining flow matching (FM) with optimal transport (OT) coupling exposes a limitation that does not arise in the unconditional setting. This essay summarizes the paper "The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation" (2503.10636), focusing on the problem it identifies and the method it proposes to address it.

Introduction to Conditional Generation and Flow Matching

Flow-matching-based conditional generative models generate data guided by specific conditions, such as class labels or textual descriptions. Flow matching (FM) simplifies the training objective and improves sample quality. However, sampling at test time is computationally expensive, because typical ODE solvers require many numerical integration steps (Figure 1).
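As a reference point, the flow matching objective is commonly written with a linear interpolation between a prior sample $x_0$ and a data sample $x_1$. The form below is that standard formulation, stated here as an assumption rather than quoted from the paper; $v_\theta$ (the learned velocity network), $\pi$ (the coupling between prior and data samples), and $c$ (the condition) are notation introduced for this sketch.

```latex
x_t = (1 - t)\,x_0 + t\,x_1, \qquad
\mathcal{L}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; (x_0, x_1, c) \sim \pi}
\left\| v_\theta(x_t, t, c) - (x_1 - x_0) \right\|^2
```

Sampling then integrates $\dot{x}_t = v_\theta(x_t, t, c)$ from $t=0$ to $t=1$. The choice of coupling $\pi$ is the part that minibatch OT changes.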

Figure 1: We visualize the coupling during training and the learned flow from $t=0$ to $t=1$ during testing for FM, OT, and our proposed method.

The study identifies a drawback of using minibatch OT for conditional generation. While minibatch OT straightens flow paths in the unconditional setting, the default OT assignment ignores conditions when coupling prior and data samples. As a result, the prior seen during training is skewed by condition, whereas at test time samples are drawn from the full, unbiased prior. This train-test gap degrades performance.

Conditional Optimal Transport Overview

To address the shortcomings of standard OT in conditional settings, the paper introduces conditional optimal transport (C$^2$OT). By adding a condition-aware weighting term to the cost matrix used to compute the OT assignment, C$^2$OT closes the gap created by condition-agnostic coupling.

Formulation: C$^2$OT restructures the joint distribution so that prior samples and conditions are drawn independently, while data samples stay tied to their conditions and are coupled to prior samples through OT. This preserves the marginal distributions needed for conditional generation at test time.
Implementation: For discrete conditions, C$^2$OT enforces strict condition matching; for continuous conditions, it applies a relaxed penalty. This respects the conditions without giving up the path-straightening benefit of OT.
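The cost construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy batch, and the brute-force search over permutations (practical only for tiny batches) are assumptions made for this example, and a Hungarian-type solver would replace the search at realistic batch sizes. An infinite `weight` models strict matching for discrete conditions; a finite `weight` stands in for a relaxed penalty, whose exact form for continuous conditions is not given in this summary.

```python
import itertools
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def condition_aware_cost(x0, c0, x1, c1, weight=math.inf):
    """cost[i][j] = ||x0_i - x1_j||^2 + weight * [c0_i != c1_j].

    c0 holds conditions attached to the prior samples independently of
    their values (an assumption of this sketch); c1 holds the conditions
    of the data samples. weight=0 recovers condition-agnostic OT.
    """
    cost = []
    for a, ca in zip(x0, c0):
        row = []
        for b, cb in zip(x1, c1):
            penalty = 0.0 if ca == cb else weight
            row.append(sq_dist(a, b) + penalty)
        cost.append(row)
    return cost


def brute_force_assignment(cost):
    """Return perm where prior sample i is paired with data sample perm[i].

    Exhaustive search over permutations; illustration only.
    """
    n = len(cost)
    best_perm, best_total = None, math.inf
    for perm in itertools.permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    if best_perm is None:
        raise ValueError("no feasible assignment: condition counts differ")
    return list(best_perm)


# Toy batch: class 0 lives on the left (x = -2), class 1 on the right (x = +2).
x1 = [(-2.0, 0.0), (-2.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
c1 = [0, 0, 1, 1]
# Prior samples, with conditions attached independently of their position.
x0 = [(-1.0, 0.0), (1.0, 0.0), (-1.0, 1.0), (1.0, 1.0)]
c0 = [0, 1, 1, 0]

plain = brute_force_assignment(condition_aware_cost(x0, c0, x1, c1, weight=0.0))
aware = brute_force_assignment(condition_aware_cost(x0, c0, x1, c1))

# Condition-agnostic OT pairs every left-side prior sample with class 0,
# so the prior seen for each class during training is skewed.
print("plain OT pairs (prior x, data class):",
      [(x0[i][0], c1[j]) for i, j in enumerate(plain)])
# Condition-aware OT only pairs samples whose conditions agree, so each
# class sees prior samples from both sides.
print("aware OT pairs (prior x, data class):",
      [(x0[i][0], c1[j]) for i, j in enumerate(aware)])
```

In this toy batch, the condition-agnostic assignment sends every prior sample with negative x to class 0, which is the skew described earlier; the condition-aware assignment gives each class one prior sample from each side, at the price of longer (less straight) couplings.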

Experimental Validation and Results

Experiments on 8gaussians-to-moons, CIFAR-10, ImageNet-32x32, and ImageNet-256x256 cover both discrete and continuous conditions. C$^2$OT performs better overall than the FM and OT baselines across different function evaluation budgets (Figure 2).

Figure 2: Visual comparisons of $256 \times 256$ images generated by the baselines and our approach with different numbers of sampling steps.

Path Straightening: C$^2$OT yields straighter flow paths than FM, which reduces test-time computation without sacrificing generative quality.
Conditioning Performance: The method adheres to conditions more closely, as reflected in improved FID and CLIP scores across datasets, for both discrete and continuous conditions.
Computation Efficiency: By keeping paths straight through condition-aware OT coupling, C$^2$OT reduces the number of function evaluations (NFE) required, especially with adaptive solvers.
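To see why straighter paths lower the NFE, consider a fixed-step Euler solver, which evaluates the velocity once per step. The sketch below is a toy scalar example with hand-written velocity fields (`straight` and `curved` are hypothetical stand-ins for a learned model, not anything from the paper): a constant velocity is integrated exactly in a single step, while a state-dependent velocity needs many steps to approach the true endpoint.

```python
import math


def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    The velocity is evaluated exactly once per step, so NFE == num_steps.
    """
    x = x0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity(x, t)
    return x


def straight(x, t):
    """Constant velocity: the trajectory is a straight line in (t, x)."""
    return 3.0


def curved(x, t):
    """State-dependent velocity: the exact solution is x0 * exp(t)."""
    return x


exact_straight = 1.0 + 3.0      # endpoint of the straight path from x0 = 1
exact_curved = math.e * 1.0     # endpoint of the curved path from x0 = 1

for nfe in (1, 10, 100):
    err_s = abs(euler_sample(straight, 1.0, nfe) - exact_straight)
    err_c = abs(euler_sample(curved, 1.0, nfe) - exact_curved)
    print(f"NFE={nfe:3d}  straight error={err_s:.4f}  curved error={err_c:.4f}")
```

The same reasoning applies to adaptive solvers: when the velocity changes little along a trajectory, the solver can take larger steps and therefore needs fewer function evaluations.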

Conclusion

The analysis of OT in conditional flow-based generation shows that the choice of coupling strategy directly affects generative performance. C$^2$OT removes the condition-induced skew in the training prior while retaining the straighter paths that OT provides, making conditional generation both more efficient and more accurate. Future research might integrate C$^2$OT with consistency models or extend it to other generative tasks.