Multi-SLERP: Multi-Vector Interpolation
- Multi-SLERP is a geometric interpolation method that extends classical SLERP by blending more than two vectors on the unit hypersphere while preserving angular relationships.
- It employs both iterative pairwise SLERP and tangent-space (Karcher mean) strategies to compute convex combinations under geodesic distance, balancing order sensitivity and numerical stability.
- The technique has practical applications in composed image retrieval, multimodal representation learning, and model merging in reinforcement learning, demonstrating empirical success across these domains.
Multi-SLERP is a family of techniques for geometric interpolation of multiple vectors on the unit hypersphere, generalizing classical Spherical Linear Interpolation (SLERP) to combine more than two endpoints. This concept arises across domains including composed image retrieval, multimodal representation learning, and weight-space model merging in reinforcement learning. Multi-SLERP blends k vectors (“endpoints”) with prescribed weights, ensuring the result remains on the sphere and reflects the intended convex combination under geodesic (angular) distance. Recent work formalizes both sequential pairwise and Riemannian mean constructions, and demonstrates empirical success in tasks such as Zero-Shot Composed Image Retrieval (ZS-CIR) and multi-policy model merging for LLM alignment.
1. Mathematical Foundations of SLERP and Multi-SLERP
Classical SLERP interpolates between two $d$-dimensional, nonzero vectors $u, v$ (often normalized to lie on the unit hypersphere $S^{d-1}$) along the shortest geodesic on the hypersphere. The interpolation at fraction $t \in [0, 1]$ is given by:

$$\mathrm{slerp}(u, v; t) = \frac{\sin((1-t)\theta)}{\sin\theta}\, u + \frac{\sin(t\theta)}{\sin\theta}\, v,$$

where $\theta = \arccos(u \cdot v)$. This method ensures the curve tracks the great circle between the endpoints, maintaining a constant angular rate.
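The formula above can be sketched directly in NumPy (the function name `slerp` and the LERP fallback for near-parallel endpoints are illustrative choices, not from the cited works):

```python
import numpy as np

def slerp(u, v, t, eps=1e-7):
    """Interpolate between unit vectors u and v along their great circle."""
    dot = np.clip(np.dot(u, v), -1.0, 1.0)
    theta = np.arccos(dot)              # angle between the endpoints
    if np.sin(theta) < eps:             # nearly collinear: fall back to LERP
        out = (1.0 - t) * u + t * v     # (antipodal endpoints have no unique geodesic)
        return out / np.linalg.norm(out)
    return (np.sin((1.0 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

# Midpoint of two orthogonal unit vectors lies at 45 degrees between them.
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])
mid = slerp(u, v, 0.5)   # → [0.7071..., 0.7071...]
```

Note that the output stays on the unit sphere for any $t$, unlike linear interpolation, which pulls the result inward.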
Multi-SLERP extends this framework to $k$ endpoints $v_1, \dots, v_k \in S^{d-1}$ with weights $w_1, \dots, w_k$ ($w_i \ge 0$, $\sum_i w_i = 1$). There are two principal strategies:
- Iterative Pairwise SLERP: Sequentially interpolate pairs according to the weights, imposing an order on the endpoints.
- Spherical Weighted Mean via Log/Exp Maps: Compute the weighted Karcher (Fréchet) mean under the sphere’s geometry using tangent space operations and exponential maps.
These approaches differ in their treatment of path-dependence and respect for all weights.
2. Sequential and Riemannian Algorithms for Multi-SLERP
The two principal multi-way spherical blending protocols are:
(a) Iterative Pairwise SLERP:
Begin with $m_1 = v_1$ and cumulative weight $W_1 = w_1$. For each $i = 2, \dots, k$:

$$W_i = W_{i-1} + w_i, \qquad m_i = \mathrm{slerp}\!\left(m_{i-1}, v_i;\; \tfrac{w_i}{W_i}\right).$$

This process has $O(k)$ time complexity and $O(k)$ trigonometric operations, but the result depends on the chosen order of endpoints unless all are collinear (lie on a common great circle).
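A sketch of this iterative reduction, reusing a two-vector `slerp` helper (all names are illustrative):

```python
import numpy as np

def slerp(u, v, t):
    """Two-vector SLERP on the unit sphere (assumes non-antipodal endpoints)."""
    theta = np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))
    if np.sin(theta) < 1e-7:                    # nearly collinear: LERP fallback
        out = (1.0 - t) * u + t * v
        return out / np.linalg.norm(out)
    return (np.sin((1.0 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

def multi_slerp_iterative(vs, ws):
    """Fold endpoints in one at a time; each step interpolates the running
    mean toward the next endpoint by that endpoint's share of the cumulative
    weight, so all k weights are respected in sequence."""
    m, W = vs[0], ws[0]
    for v, w in zip(vs[1:], ws[1:]):
        W += w
        m = slerp(m, v, w / W)
    return m
```

For endpoints on a common great circle the result is order-independent; otherwise permuting `vs` generally changes the output.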
(b) Tangent-Space (Karcher Mean) Spherical Mean:
Select a reference point $p \in S^{d-1}$ (often the normalized weighted Euclidean mean $p = \sum_i w_i v_i / \lVert \sum_i w_i v_i \rVert$). For each $i$, map the endpoint into the tangent space at $p$:

$$u_i = \mathrm{Log}_p(v_i) = \theta_i \, \frac{v_i - (p \cdot v_i)\, p}{\lVert v_i - (p \cdot v_i)\, p \rVert}, \qquad \theta_i = \arccos(p \cdot v_i).$$

Sum the weighted tangent vectors:

$$\bar{u} = \sum_{i=1}^{k} w_i u_i.$$

Project back to the sphere via the exponential map:

$$m = \mathrm{Exp}_p(\bar{u}) = \cos(\lVert \bar{u} \rVert)\, p + \sin(\lVert \bar{u} \rVert)\, \frac{\bar{u}}{\lVert \bar{u} \rVert}.$$

This yields (exactly, upon iterating the update to convergence) the point on $S^{d-1}$ minimizing $\sum_i w_i \, d_{\mathrm{geo}}(m, v_i)^2$, the Karcher (Fréchet) mean with respect to geodesic distance (Jang et al., 2024).
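A one-pass sketch of these three steps (a single log/exp round trip around the normalized Euclidean mean; iterating the update would refine toward the exact Karcher mean):

```python
import numpy as np

def log_map(p, v, eps=1e-9):
    """Log map at p: tangent vector whose length is the geodesic distance to v."""
    dot = np.clip(np.dot(p, v), -1.0, 1.0)
    proj = v - dot * p                  # component of v orthogonal to p
    n = np.linalg.norm(proj)
    return np.zeros_like(p) if n < eps else np.arccos(dot) * proj / n

def exp_map(p, u, eps=1e-9):
    """Exp map at p: move along the geodesic in tangent direction u."""
    n = np.linalg.norm(u)
    return p if n < eps else np.cos(n) * p + np.sin(n) * u / n

def spherical_weighted_mean(vs, ws):
    """One log/exp pass; vs is a (k, d) array of unit rows, ws a (k,) weight vector."""
    p = ws @ vs                         # reference: weighted Euclidean mean...
    p /= np.linalg.norm(p)              # ...projected back to the sphere
    u_bar = (ws[:, None] * np.stack([log_map(p, v) for v in vs])).sum(axis=0)
    return exp_map(p, u_bar)
```

Unlike the iterative pairwise scheme, this construction is symmetric in the endpoints: permuting rows of `vs` (with their weights) leaves the result unchanged.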
3. Multi-SLERP in Vision-Language Retrieval and Compositionality
In ZS-CIR, multi-SLERP facilitates combining an image embedding with multiple text modifications, or multiple images/texts, in a unified vector space. Consider image and text encoders $f_{\mathrm{img}}$ and $f_{\mathrm{txt}}$, yielding normalized embeddings $z_{\mathrm{img}} = f_{\mathrm{img}}(I)/\lVert f_{\mathrm{img}}(I) \rVert$ and $z_{\mathrm{txt}} = f_{\mathrm{txt}}(T)/\lVert f_{\mathrm{txt}}(T) \rVert$. For multiple input modalities or modification commands $T_1, \dots, T_n$, multi-SLERP (using either method above) produces a compositional query embedding for retrieval. Key features:
- No explicit projection or pseudo-token learning required.
- Geometric interpolation leverages only the modality-aligned representation space.
- Allows smooth control over weighting and bias toward any combination of attributes or modalities (Jang et al., 2024).
A plausible implication is that this framework can incorporate fine-grained, multi-attribute user queries in retrieval or generation workflows, e.g., multi-facet search in vision or synthesis of cross-modal content with controlled attribute blending.
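As an illustration of the composition step, with random unit vectors standing in for real modality-aligned (e.g. CLIP-style) encoder outputs; the weights and gallery here are arbitrary placeholders:

```python
import numpy as np

def slerp(u, v, t):
    """Two-vector SLERP on the unit sphere."""
    theta = np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))
    if np.sin(theta) < 1e-7:
        out = (1.0 - t) * u + t * v
        return out / np.linalg.norm(out)
    return (np.sin((1.0 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

rng = np.random.default_rng(0)
unit = lambda x: x / np.linalg.norm(x)

# Stand-ins for encoder outputs: one reference image embedding plus two
# textual modification embeddings, all on the same unit hypersphere.
z_img  = unit(rng.standard_normal(512))
z_txt1 = unit(rng.standard_normal(512))
z_txt2 = unit(rng.standard_normal(512))

# Bias the composed query toward the image; weights sum to 1.
vs, ws = [z_img, z_txt1, z_txt2], [0.6, 0.2, 0.2]
query, W = vs[0], ws[0]
for v, w in zip(vs[1:], ws[1:]):
    W += w
    query = slerp(query, v, w / W)      # iterative pairwise multi-SLERP

# Retrieval: rank gallery embeddings by cosine similarity to the query.
gallery = np.stack([unit(rng.standard_normal(512)) for _ in range(100)])
ranking = np.argsort(-gallery @ query)
```

Adjusting `ws` shifts the query along the sphere between the image and text directions, which is exactly the weighting control described above.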
4. Model Merging in RLHF and Policy Averaging
In policy optimization, especially RLHF for LLMs, multi-SLERP is used to blend task vectors from multiple independently fine-tuned models. Given an initialization $\theta_{\mathrm{init}}$ and fine-tuned policies $\theta_1, \dots, \theta_M$, define task vectors $\delta_m = \theta_m - \theta_{\mathrm{init}}$. For $M$ models, iterative multi-SLERP merging proceeds as:
- Merge one-by-one with mixing coefficient $\lambda_m$ at each stage:

$$\delta^{(1)} = \delta_1, \qquad \delta^{(m)} = \mathrm{slerp}\!\left(\delta^{(m-1)}, \delta_m;\; \lambda_m\right), \quad m = 2, \dots, M,$$

with, e.g., $\lambda_m = 1/m$ so that each policy contributes equal weight.
- In WARP (Ramé et al., 2024), this step forms the basis of stage-2 merging, with subsequent linear interpolation (“LITI”) back to the initialization to traverse the KL–reward Pareto front.
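A toy sketch of this two-stage procedure on small flattened parameter vectors (the $\lambda_m = 1/m$ schedule, the synthetic "policies", and the direct application of the SLERP formula to non-unit task vectors are illustrative simplifications, not the actual WARP implementation):

```python
import numpy as np

def slerp(a, b, t):
    """SLERP formula applied to (possibly non-unit) task vectors."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    if np.sin(theta) < 1e-7:
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)

rng = np.random.default_rng(1)
d, M = 1000, 3
theta_init = rng.standard_normal(d)                       # shared initialization
policies = [theta_init + 0.1 * rng.standard_normal(d) for _ in range(M)]

# Merging stage: fold task vectors together with lambda_m = 1/m.
delta = policies[0] - theta_init
for m in range(2, M + 1):
    delta = slerp(delta, policies[m - 1] - theta_init, 1.0 / m)
theta_merged = theta_init + delta

# LITI stage: interpolate linearly back toward the initialization;
# sweeping eta over [0, 1] traces out a KL-reward trade-off curve.
eta = 0.5
theta_final = (1.0 - eta) * theta_init + eta * theta_merged
```

In practice each `theta` would be the flattened weights of a full model, and `eta` would be chosen by evaluating points along the resulting front.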
Empirical results indicate:
- Reward as a function of the mixing coefficient is convex under SLERP and can exceed all endpoints; merging more policies (larger $M$) monotonically improves the Pareto front after LITI.
- SLERP and LERP (linear interpolation) have complementary strengths; mixing via SLERP covers high-reward, high-KL regions unreachable by linear averaging.
Table: Multi-SLERP Procedures in Key Contexts
| Domain | Endpoints | Multi-SLERP Strategy |
|---|---|---|
| ZS-CIR | Embeddings | Pairwise / Karcher |
| RLHF/WARP | Model weights | Iterative SLERP |
5. Theoretical Properties and Computational Considerations
- Order Sensitivity: Pairwise SLERP is path-dependent if the endpoints are not collinear; the order of combination affects the result unless additional symmetrization is imposed.
- Riemannian Mean Robustness: The log/exp map approach yields the unique minimizer of the weighted sum of squared geodesic distances on $S^{d-1}$ (provided the endpoints lie within a common open hemisphere), but numerically, the selection of the reference point $p$ and endpoints near-antipodal to $p$ can create instability.
- High-dimensional Effects: For large $d$, the hypersphere becomes locally flat (concentration of measure). In this limit, differences between spherical and Euclidean interpolation diminish, but subtle geometric effects persist in fine-grained retrieval and compositionality metrics.
- Computational Cost: Both methods require $O(kd)$ arithmetic; the Riemannian mean method centralizes computation into a single tangent-space projection and exponential map, while iterative SLERP requires $O(k)$ trigonometric function calls.
6. Limitations and Practical Challenges
Multi-SLERP faces several practical challenges:
- Non-uniqueness of the geodesic path when endpoints are antipodal ($\theta = \pi$) implies multiple plausible interpolants, especially in sequential (pairwise) merges.
- Accumulation of numerical errors in iterated SLERP merges may produce deviations from intended weights.
- The choice of merging strategy can bias the output, especially if endpoints are widely separated on the sphere or have highly uneven weights.
- In the tangent-space Karcher mean approach, finding the global Riemannian mean may require iterative algorithms to avoid bias from the initial reference point, especially when the endpoints are far apart on the sphere.
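One standard remedy is to iterate the log/exp update until the weighted tangent average vanishes, i.e., Riemannian gradient descent with unit step (a sketch under the notation of Section 2, not taken from the cited works):

```python
import numpy as np

def log_map(p, v, eps=1e-9):
    """Log map at p: tangent vector whose length is the geodesic distance to v."""
    dot = np.clip(np.dot(p, v), -1.0, 1.0)
    proj = v - dot * p
    n = np.linalg.norm(proj)
    return np.zeros_like(p) if n < eps else np.arccos(dot) * proj / n

def exp_map(p, u, eps=1e-9):
    """Exp map at p: move along the geodesic in tangent direction u."""
    n = np.linalg.norm(u)
    return p if n < eps else np.cos(n) * p + np.sin(n) * u / n

def karcher_mean(vs, ws, max_iter=100, tol=1e-10):
    """Iterate until the weighted tangent average is (near) zero, which is the
    first-order condition for the Karcher mean on the sphere."""
    p = ws @ vs                          # warm start: normalized Euclidean mean
    p /= np.linalg.norm(p)
    for _ in range(max_iter):
        u = (ws[:, None] * np.stack([log_map(p, v) for v in vs])).sum(axis=0)
        if np.linalg.norm(u) < tol:
            break
        p = exp_map(p, u)
    return p
```

For endpoints clustered in a hemisphere this typically converges in a handful of iterations and removes the dependence on the initial reference point.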
7. Applications and Prospects
Multi-SLERP is now established in several machine learning subfields:
- Composed Retrieval: Enabling flexible, multi-attribute queries in text-image models without extra parameterization (Jang et al., 2024).
- Style/Concept Blending: Blending embeddings from multiple sources for generative and retrieval tasks.
- RLHF Model Merging: Progressive policy averaging to improve Pareto fronts for reward–KL tradeoffs in LLMs (Ramé et al., 2024).
- Multi-Objective Merging: Combining models trained under distinct objectives (e.g., length penalty) to mitigate single-objective biases.
Future directions may include more robust, order-independent multi-SLERP algorithms, scalable Riemannian mean computation for massively multi-modal tasks, and theoretical study of concentration phenomena for informed architectural choices. The geometric techniques underlying Multi-SLERP are expected to underpin further advances in compositionality and model fusion for high-dimensional, multimodal spaces.