
Multi-SLERP: Multi-Vector Interpolation

Updated 21 February 2026
  • Multi-SLERP is a geometric interpolation method that extends classical SLERP by blending more than two vectors on the unit hypersphere while preserving angular relationships.
  • It employs both iterative pairwise SLERP and tangent-space (Karcher mean) strategies to compute convex combinations under geodesic distance, balancing order sensitivity and numerical stability.
  • The technique has practical applications in composed image retrieval, multimodal representation learning, and model merging in reinforcement learning, demonstrating empirical success across these domains.

Multi-SLERP is a family of techniques for geometric interpolation of multiple vectors on the unit hypersphere, generalizing classical Spherical Linear Interpolation (SLERP) to combine more than two endpoints. This concept arises across domains including composed image retrieval, multimodal representation learning, and weight-space model merging in reinforcement learning. Multi-SLERP blends k vectors (“endpoints”) with prescribed weights, ensuring the result remains on the sphere and reflects the intended convex combination under geodesic (angular) distance. Recent work formalizes both sequential pairwise and Riemannian mean constructions, and demonstrates empirical success in tasks such as Zero-Shot Composed Image Retrieval (ZS-CIR) and multi-policy model merging for LLM alignment.

1. Mathematical Foundations of SLERP and Multi-SLERP

Classical SLERP interpolates between two d-dimensional, nonzero vectors $u, v \in \mathbb{R}^d$ (often $\ell_2$-normalized to lie on $S^{d-1}$) along the shortest geodesic on the hypersphere. The interpolation at fraction $\tau \in [0,1]$ is given by:

$$\operatorname{Slerp}(u, v; \tau) = \frac{\sin((1-\tau)\theta)}{\sin \theta}\, u + \frac{\sin(\tau\theta)}{\sin \theta}\, v,$$

where $\theta = \arccos(\langle u/\|u\|,\, v/\|v\| \rangle)$. This method ensures the curve tracks the great circle between the endpoints, maintaining a constant angular rate.
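The formula above can be sketched directly in NumPy; the helper below is a minimal illustration, with a linear fallback for nearly parallel inputs where $\sin\theta \approx 0$ would make the formula numerically unstable:

```python
import numpy as np

def slerp(u, v, tau):
    """Spherical linear interpolation between vectors u, v at fraction tau."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    dot = np.clip(np.dot(u, v), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < 1e-8:
        # Nearly parallel endpoints: fall back to (re-normalized) lerp.
        out = (1 - tau) * u + tau * v
        return out / np.linalg.norm(out)
    return (np.sin((1 - tau) * theta) * u + np.sin(tau * theta) * v) / np.sin(theta)
```

For orthogonal unit endpoints such as $(1,0)$ and $(0,1)$, the midpoint `slerp(u, v, 0.5)` lands at $(\sqrt{2}/2, \sqrt{2}/2)$, and every intermediate point has unit norm.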

Multi-SLERP extends this framework to $k > 2$ endpoints $e_1, \ldots, e_k \in S^{d-1}$ with weights $\lambda_1, \ldots, \lambda_k$ ($\lambda_i \ge 0$, $\sum_i \lambda_i = 1$). There are two principal strategies:

  • Iterative Pairwise Slerp: Sequentially interpolate according to the weights, imposing an order on the endpoints.
  • Spherical Weighted Mean via Log/Exp Maps: Compute the weighted Karcher (Fréchet) mean under the sphere’s geometry using tangent space operations and exponential maps.

These approaches differ in their treatment of path-dependence and respect for all weights.

2. Sequential and Riemannian Algorithms for Multi-SLERP

The two principal multi-way spherical blending protocols are:

(a) Iterative Pairwise Slerp:

Begin with $c_1 = e_1$. For each $j = 2, \ldots, k$,

$$c_j = \operatorname{Slerp}\!\left(c_{j-1},\, e_j;\; \lambda_j \Big/ \sum_{i=1}^{j} \lambda_i\right).$$

This process has $O(kd)$ time complexity and $O(k)$ trigonometric operations, but the result depends on the chosen order of endpoints unless all are collinear.
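A compact sketch of this left-to-right fold, assuming unit-norm endpoints and reusing a standard two-point SLERP helper (redefined here so the snippet is self-contained):

```python
import numpy as np

def slerp(u, v, tau):
    """Two-point SLERP with a linear fallback for nearly parallel inputs."""
    dot = np.clip(np.dot(u, v), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < 1e-8:
        out = (1 - tau) * u + tau * v
        return out / np.linalg.norm(out)
    return (np.sin((1 - tau) * theta) * u + np.sin(tau * theta) * v) / np.sin(theta)

def multi_slerp_pairwise(endpoints, weights):
    """Fold k unit endpoints left to right; each new endpoint's weight is
    renormalized against the running total, per c_j = Slerp(c_{j-1}, e_j; .)."""
    c, total = endpoints[0], float(weights[0])
    for e, lam in zip(endpoints[1:], weights[1:]):
        total += lam
        c = slerp(c, e, lam / total)
    return c
```

With two endpoints and weights $(0.5, 0.5)$ the fold reduces exactly to classical `slerp(e1, e2, 0.5)`, which is a quick sanity check on the renormalization.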

(b) Tangent-Space (Karcher) Spherical Mean:

Select a reference point $p \in S^{d-1}$ (often $p = e_1$). For each $e_i$:

$$\theta_i = \arccos(\langle p, e_i \rangle), \quad \log_p(e_i) = \frac{\theta_i}{\sin\theta_i}\,(e_i - \cos\theta_i\, p).$$

Sum the weighted tangent vectors:

$$v_t = \sum_{i=1}^{k} \lambda_i \log_p(e_i).$$

Project back to the sphere:

$$c = \exp_p(v_t) = \cos(\|v_t\|)\, p + \sin(\|v_t\|)\,\frac{v_t}{\|v_t\|}.$$

A single application of this map approximates the point on $S^{d-1}$ minimizing $\sum_i \lambda_i \operatorname{dist}(c, e_i)^2$, the weighted Karcher mean with respect to geodesic (angular) distance; the exact minimizer generally requires iterating the log/exp projection (Jang et al., 2024).
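The three steps above (log map, weighted tangent sum, exp map) translate directly into NumPy; this is a minimal one-step sketch with the reference point defaulting to $p = e_1$:

```python
import numpy as np

def log_map(p, e):
    """Log map on the unit sphere: tangent vector at p pointing toward e."""
    cos_t = np.clip(np.dot(p, e), -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-8:
        return np.zeros_like(p)
    return (theta / np.sin(theta)) * (e - cos_t * p)

def exp_map(p, v):
    """Exp map at p: travel along tangent vector v back onto the sphere."""
    n = np.linalg.norm(v)
    if n < 1e-8:
        return p
    return np.cos(n) * p + np.sin(n) * (v / n)

def spherical_weighted_mean(endpoints, weights, p=None):
    """One-step tangent-space approximation of the weighted Karcher mean."""
    p = endpoints[0] if p is None else p
    vt = sum(w * log_map(p, e) for w, e in zip(weights, endpoints))
    return exp_map(p, vt)
```

For two orthogonal unit endpoints with equal weights, this recovers the SLERP midpoint $(\sqrt{2}/2, \sqrt{2}/2)$, since the single-step mean is exact when only one tangent vector is nonzero.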

3. Multi-SLERP in Vision-Language Retrieval and Compositionality

In ZS-CIR, multi-SLERP facilitates combining an image embedding with multiple text modifications, or multiple images and texts, in a unified vector space. Consider image and text encoders $E_1$, $E_2$ yielding normalized embeddings $v = E_1(x)$ and $w = E_2(t)$. For multiple input modalities or modification commands $e_1, \ldots, e_k$, multi-SLERP (using either method above) produces a compositional embedding for retrieval. Key features:

  • No explicit projection or pseudo-token learning required.
  • Geometric interpolation leverages only the modality-aligned representation space.
  • Allows smooth control over weighting and bias toward any combination of attributes or modalities (Jang et al., 2024).

A plausible implication is that this framework can incorporate fine-grained, multi-attribute user queries in retrieval or generation workflows, e.g., multi-facet search in vision or synthesis of cross-modal content with controlled attribute blending.

4. Model Merging in RLHF and Policy Averaging

In policy optimization, especially RLHF for LLMs, multi-SLERP is used to blend task vectors from multiple independently fine-tuned models. Given initialization $\theta_0$ and fine-tuned policies $\{\theta^m\}_{m=1}^M$, define task vectors $\delta^m = \theta^m - \theta_0$. For $M$ models, iterative multi-SLERP merging proceeds as:

  • Merge $\{\theta^1, \ldots, \theta^M\}$ one-by-one with mixing coefficient $\lambda = 1/M$ at each stage:

$$\operatorname{slerp}_M(\{\theta^m\}_{m=1}^{M}) = \operatorname{slerp}_{\text{layer}}\!\left(\operatorname{slerp}_{M-1}(\{\theta^m\}_{m=1}^{M-1}),\; \theta^M;\; \lambda = 1/M\right),$$

with $\operatorname{slerp}_1(\{\theta^1\}) = \theta^1$.

  • In WARP (Ramé et al., 2024), this step forms the basis of stage-2 merging, with subsequent linear interpolation (“LITI”) back to the initialization to traverse the KL–reward Pareto front.
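A simplified sketch of this merge, treating a model as a dict of NumPy arrays and slerping the flattened task vectors $\delta^m$ layer by layer with $\lambda = 1/M$ per stage, as in the recursion above. The linear interpolation of magnitudes inside `slerp_flat` is an assumption of this sketch (weight tensors are not unit-norm), not a detail taken from the source:

```python
import numpy as np

def slerp_flat(a, b, tau):
    """SLERP on flattened weight tensors: rotate the direction spherically,
    interpolate the magnitude linearly. Assumes a, b are nonzero."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    ua, ub = a / na, b / nb
    dot = np.clip(np.dot(ua.ravel(), ub.ravel()), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < 1e-8:
        direction = (1 - tau) * ua + tau * ub
    else:
        direction = (np.sin((1 - tau) * theta) * ua
                     + np.sin(tau * theta) * ub) / np.sin(theta)
    return ((1 - tau) * na + tau * nb) * direction

def merge_policies(theta0, policies):
    """Fold M fine-tuned policies into one by slerping task vectors
    delta^m = theta^m - theta0 per layer with lambda = 1/M each stage."""
    M = len(policies)
    merged = {k: p - theta0[k] for k, p in policies[0].items()}  # delta^1
    for theta_m in policies[1:]:
        merged = {k: slerp_flat(merged[k], theta_m[k] - theta0[k], 1.0 / M)
                  for k in merged}
    return {k: theta0[k] + d for k, d in merged.items()}
```

The returned weights can then be linearly interpolated back toward $\theta_0$ for the LITI step described next.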

Empirical results indicate:

  • Reward as a function of the mixing coefficient is convex under SLERP and can exceed all endpoints; for $M > 2$, merging more policies monotonically improves the Pareto front after LITI.
  • SLERP and LERP (linear interpolation) have complementary strengths; mixing via SLERP covers high-reward, high-KL regions unreachable by linear averaging.

Table: Multi-SLERP Procedures in Key Contexts

| Domain | Endpoints | Multi-SLERP Strategy |
|---|---|---|
| ZS-CIR | Embeddings | Pairwise / Karcher |
| RLHF/WARP | Model weights | Iterative SLERP |

5. Theoretical Properties and Computational Considerations

  • Order Sensitivity: Pairwise SLERP is path-dependent when the endpoints are not collinear; the order of combination affects the result unless additional symmetrization is imposed.
  • Riemannian Mean Robustness: The log/exp map approach targets the unique minimizer of the weighted sum of squared geodesic distances on $S^{d-1}$, but numerically, the choice of reference point $p$ and endpoints near-antipodal to $p$ can create instability.
  • High-dimensional Effects: For large $d$, the hypersphere becomes locally flat (concentration of measure). In this limit, differences between spherical and Euclidean interpolation diminish, but subtle geometric effects persist in fine-grained retrieval and compositionality metrics.
  • Computational Cost: Both methods require $O(kd)$ arithmetic; the Riemannian mean method centralizes computation into a single tangent-space projection and exponential map, while iterative SLERP requires $O(k)$ trigonometric function calls.
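The order sensitivity noted above is easy to verify numerically; the self-contained demonstration below (redefining the pairwise fold from Section 2) blends three orthogonal endpoints with equal weights in two different orders:

```python
import numpy as np

def slerp(u, v, tau):
    dot = np.clip(np.dot(u, v), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < 1e-8:
        out = (1 - tau) * u + tau * v
        return out / np.linalg.norm(out)
    return (np.sin((1 - tau) * theta) * u + np.sin(tau * theta) * v) / np.sin(theta)

def multi_slerp_pairwise(endpoints, weights):
    c, total = endpoints[0], float(weights[0])
    for e, lam in zip(endpoints[1:], weights[1:]):
        total += lam
        c = slerp(c, e, lam / total)
    return c

e1, e2, e3 = np.eye(3)          # three mutually orthogonal unit endpoints
w = [1 / 3, 1 / 3, 1 / 3]

a = multi_slerp_pairwise([e1, e2, e3], w)   # order (e1, e2, e3)
b = multi_slerp_pairwise([e3, e1, e2], w)   # order (e3, e1, e2)
# Both results lie on the unit sphere, but they are different points:
# the pairwise fold is path-dependent even under equal weights.
```

Here `a` concentrates more mass on the last-merged axis than `b` does, which is exactly the ordering bias that symmetrization or the Karcher-mean construction avoids.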

6. Limitations and Practical Challenges

Multi-SLERP faces several practical challenges:

  • Non-uniqueness of the geodesic path when $k > 2$ implies multiple plausible interpolants, especially in sequential (pairwise) merges.
  • Accumulation of numerical errors in iterated SLERP merges may produce deviations from intended weights.
  • The choice of merging strategy can bias the output, especially if endpoints are widely separated on the sphere or have highly uneven weights.
  • In the tangent-space Karcher mean approach, finding the global Riemannian mean may require iterative algorithms to avoid bias from the initial reference point, especially for actual endpoints that are far apart.

7. Applications and Prospects

Multi-SLERP is now established in several machine learning subfields:

  • Composed Retrieval: Enabling flexible, multi-attribute queries in text-image models without extra parameterization (Jang et al., 2024).
  • Style/Concept Blending: Blending embeddings from multiple sources for generative and retrieval tasks.
  • RLHF Model Merging: Progressive policy averaging to improve Pareto fronts for reward–KL tradeoffs in LLMs (Ramé et al., 2024).
  • Multi-Objective Merging: Combining models trained under distinct objectives (e.g., length penalty) to mitigate single-objective biases.

Future directions may include more robust, order-independent multi-SLERP algorithms, scalable Riemannian mean computation for massively multi-modal tasks, and theoretical study of concentration phenomena for informed architectural choices. The geometric techniques underlying Multi-SLERP are expected to underpin further advances in compositionality and model fusion for high-dimensional, multimodal spaces.
