OverIVA: Overdetermined IVA for Source Separation

Updated 25 January 2026

OverIVA is a method that generalizes IVA to address overdetermined mixtures with more microphones than target sources, modeling nonstationary signals versus Gaussian backgrounds.
It employs block coordinate descent and iterative projection updates to optimize a likelihood-based cost function and achieve efficient source separation.
OverIVA demonstrates robust separation performance, reduced computational complexity, and versatility in both offline and online processing environments.

Overdetermined Independent Vector Analysis (OverIVA) generalizes independent vector analysis (IVA) to the regime where the number of mixture channels (microphones), $M$, exceeds the number of nonstationary target sources, $K$. This approach is designed for convolutive blind source separation in both offline and online settings, enabling joint extraction of $K$ sources from $M$ mixtures in the presence of up to $M-K$ stationary or weakly nonstationary Gaussian background components. By explicitly modeling a low-dimensional non-Gaussian signal subspace against a high-dimensional Gaussian background and exploiting both statistical independence and subspace orthogonality, OverIVA achieves robust, computationally efficient multi-microphone source separation and extraction.

1. Signal and Statistical Model

The OverIVA model operates in the short-time Fourier transform (STFT) domain, with observed mixture vectors $x(f, t) \in \mathbb{C}^M$ at each frequency bin $f$ and time frame $t$. The mixture is decomposed as

$x(f, t) = A_s(f)\,s(f, t) + A_z(f)\,z(f, t),$

where $A_s(f) \in \mathbb{C}^{M \times K}$ mixes the $K$ target sources $s(f, t) \in \mathbb{C}^K$, and $A_z(f) \in \mathbb{C}^{M \times (M-K)}$ mixes $M-K$ Gaussian background signals $z(f, t) \in \mathbb{C}^{M-K}$ (Ikeshita et al., 2020, Scheibler et al., 2019, Brendel et al., 2020, Scheibler et al., 2020).

A full-rank demixing matrix $W(f) \in \mathbb{C}^{M \times M}$ is learned to extract both target sources and background:

$s_k(f, t) = w_k(f)^H x(f, t), \quad k=1, \ldots, K$
$z(f, t) = W_z(f)^H x(f, t), \quad W_z(f) = [w_{K+1}(f), \ldots, w_M(f)]$
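As a minimal sketch of the extraction step above for a single frequency bin, the following NumPy snippet applies a full demixing matrix and splits the output into target and background parts. The mixing matrix `A`, the sizes, and the helper name `demix` are illustrative assumptions; the ideal choice $W^H = A^{-1}$ is used only so that the result can be checked.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, T = 4, 2, 50  # microphones, target sources, frames (illustrative sizes)

# Hypothetical mixing matrix A = [A_s, A_z] and latent signals [s; z] for one bin.
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
latent = rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))
x = A @ latent  # one column per time frame

# Ideal demixing for this toy case: W^H = A^{-1}; the columns of W are w_1, ..., w_M.
W = np.linalg.inv(A).conj().T

def demix(W, x, K):
    """Return (s_hat, z_hat): rows of W^H x for the K targets and the M-K background outputs."""
    y = W.conj().T @ x
    return y[:K], y[K:]

s_hat, z_hat = demix(W, x, K)
```

In practice $W(f)$ is unknown and has to be estimated; the inverse of a known mixing matrix is used here purely as a sanity check.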

Statistical assumptions:

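Target sources are temporally nonstationary and mutually independent, modeled as $s_k(t) \sim \mathcal{CN}(0, \lambda_k(t) I_F)$, where $s_k(t) = [s_k(1, t), \ldots, s_k(F, t)]^\top \in \mathbb{C}^F$ stacks all $F$ frequency bins of source $k$ and $\lambda_k(t)$ is its time-varying variance.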
Background signals are stationary Gaussian: $z(f, t) \sim \mathcal{CN}(0, I_{M-K})$ (Ikeshita et al., 2020, Chen et al., 18 Jan 2026, Scheibler et al., 2019).
Sources and background are mutually independent.

This modeling enables the explicit separation of nonstationary sources of interest from stationary or weakly nonstationary Gaussian environments.
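To make the model concrete, the sketch below draws a synthetic mixture for one frequency bin under the assumptions listed above. The random mixing matrices, the uniform range of the variances $\lambda_k(t)$, and the helper `crandn` are illustrative choices, not taken from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T = 5, 2, 200  # microphones, target sources, frames (illustrative sizes)

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Mixing matrices for one frequency bin (random, purely for illustration).
A_s = crandn(M, K)       # mixes the K target sources
A_z = crandn(M, M - K)   # mixes the M-K background components

# Nonstationary targets: the variance lambda_k(t) changes from frame to frame.
lam = rng.uniform(0.1, 4.0, size=(K, T))
s = np.sqrt(lam) * crandn(K, T)

# Stationary background: unit variance in every frame.
z = crandn(M - K, T)

# x(f, t) = A_s(f) s(f, t) + A_z(f) z(f, t), one column per frame.
x = A_s @ s + A_z @ z
```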

2. Likelihood, Objective, and Bayesian Formulation

The OverIVA objective is derived from maximum likelihood or MAP estimation with Gaussian or super-Gaussian source priors and a Gaussian background model. For nonstationary target sources, the negative log-likelihood (up to irrelevant constants) is:

$J(W, \lambda) = \sum_{k, t} \left[ \frac{||s_k(t)||^2}{\lambda_k(t)} + F \log \lambda_k(t) \right] + \sum_{f, t} ||z(f, t)||^2 - 2T \sum_f \log |\det W(f)|.$
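The cost above can be evaluated directly from the data. The following is a minimal NumPy sketch; the array layout (`X` of shape `(F, T, M)`, `W` of shape `(F, M, M)` with demixing vectors as columns, `lam` of shape `(T, K)`) and the function name `overiva_cost` are assumptions made for illustration.

```python
import numpy as np

def overiva_cost(X, W, lam, K):
    """Evaluate J(W, lambda) for X: (F, T, M), W: (F, M, M), lam: (T, K)."""
    F, T, M = X.shape
    # Y[f, t, k] = w_k(f)^H x(f, t)
    Y = np.einsum('fmk,ftm->ftk', W.conj(), X)
    S, Z = Y[..., :K], Y[..., K:]
    power = np.sum(np.abs(S) ** 2, axis=0)            # ||s_k(t)||^2, shape (T, K)
    source_term = np.sum(power / lam + F * np.log(lam))
    background_term = np.sum(np.abs(Z) ** 2)
    _, logabsdet = np.linalg.slogdet(W)               # log|det W(f)| for each f
    return source_term + background_term - 2 * T * np.sum(logabsdet)

# Toy usage: identity demixing and unit variances.
rng = np.random.default_rng(0)
F, T, M, K = 4, 30, 3, 1
X = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
W = np.tile(np.eye(M, dtype=complex), (F, 1, 1))
lam = np.ones((T, K))
J = overiva_cost(X, W, lam, K)
```

With identity demixing and unit variances, the log terms vanish and the cost is simply the total energy of `X`, which gives a quick consistency check.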

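Minimizing over $\lambda$ gives $\lambda_k(t) = ||s_k(t)||^2 / F$. With $\lambda$ held fixed, the terms that depend on $W$ at a single frequency bin (frequency index omitted, scaled by $1/T$) reduce to: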

$J_W(W) = \sum_{k=1}^K w_k^H G_k w_k + \mathrm{tr}[W_z^H G_z W_z] - 2\log |\det W|,$

where $G_k = \frac{1}{T}\sum_t x(t)x(t)^H/\lambda_k(t)$ and $G_z=\frac{1}{T} \sum_t x(t)x(t)^H$ (Ikeshita et al., 2020).
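The statistics $G_k$ and $G_z$ can be computed for all frequency bins at once. The sketch below assumes the closed-form variance $\lambda_k(t) = ||s_k(t)||^2 / F$ (obtained by setting the derivative of the cost with respect to $\lambda_k(t)$ to zero), the same array layout as before (`X`: `(F, T, M)`, `W`: `(F, M, M)`), and a small floor `eps` that is an added numerical safeguard rather than part of the model.

```python
import numpy as np

def update_variances(X, W, K, eps=1e-10):
    """lambda_k(t) = ||s_k(t)||^2 / F, floored at eps (assumed safeguard). Shape (T, K)."""
    F = X.shape[0]
    S = np.einsum('fmk,ftm->ftk', W[:, :, :K].conj(), X)   # s_k(f, t)
    return np.maximum(np.sum(np.abs(S) ** 2, axis=0) / F, eps)

def weighted_covariances(X, lam):
    """G[f, k] = (1/T) sum_t x x^H / lambda_k(t);  G_z[f] = (1/T) sum_t x x^H."""
    F, T, M = X.shape
    G = np.einsum('ftm,tk,ftn->fkmn', X, 1.0 / lam, X.conj()) / T
    G_z = np.einsum('ftm,ftn->fmn', X, X.conj()) / T
    return G, G_z

# Toy usage with random data and a random demixing matrix per frequency.
rng = np.random.default_rng(0)
F, T, M, K = 6, 40, 4, 2
X = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
W = rng.standard_normal((F, M, M)) + 1j * rng.standard_normal((F, M, M))
lam = update_variances(X, W, K)
G, G_z = weighted_covariances(X, lam)
```

A useful check on such an implementation: with this choice of $\lambda$, the quadratic forms $w_k(f)^H G_k(f) w_k(f)$ summed over frequency equal $F$ for every source $k$.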

In the Bayesian framework, the demixing matrix $W$ can be given priors (e.g., steering vector alignment or spatial regularization). The full MAP cost incorporates both source and background models:

$J(W) = \sum_{k=1}^K \langle -\log p(\underline s_{k,n}) \rangle_n - 2\sum_f \log |\det W_f| + J_{BG}(W) + J_{prior}(W),$

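where $\langle \cdot \rangle_n$ averages over time frames $n$, $\underline s_{k,n}$ is the broadband (all-frequency) vector of source $k$ at frame $n$, $W_f$ is the demixing matrix at frequency $f$, $J_{BG}(W)$ handles the Gaussian background, and $J_{prior}(W)$ encodes any desired spatial prior (Brendel et al., 2020).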

3. Algorithmic Structure: Block Coordinate and Iterative Projection Methods

OverIVA algorithms employ block coordinate descent (BCD) and/or majorization-minimization (MM) strategies with efficient iterative projection (IP) updates.

Demixing updates: Each source demixing vector $w_k(f)$, $k=1,\ldots,K$, is updated by solving the stationarity condition:

$(W^H G_k) w_k = e_k,$

where $e_k$ is the $k$-th canonical basis vector, followed by the normalization $w_k \leftarrow w_k/(w_k^H G_k w_k)^{1/2}$ (Ikeshita et al., 2020, Scheibler et al., 2019, Scheibler et al., 2020).

Background subspace update: The $W_z$ (or $U_f$) block is updated exploiting the background sample covariance:

$W_z \leftarrow \begin{pmatrix} - (W_s^H G_z E_s)^{-1}(W_s^H G_z E_z) \\ I_{M-K} \end{pmatrix},$

with $W_s = [w_1, \ldots, w_K]$ the matrix of source demixing vectors, and $E_s$, $E_z$ the first $K$ and last $M-K$ columns of $I_M$, respectively.

Orthogonality constraint (OC): Some variants (e.g., OverIVA-OC) enforce strict sample covariance orthogonality between source and background subspaces at every iteration.
Specialized updates: For $K=1$, a globally optimal solution is found via a single generalized eigenvector computation

$G_z u = \lambda G_1 u;$

the leading eigenvector provides the optimal demixing (Ikeshita et al., 2020, Scheibler et al., 2020).
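The single-source update can be sketched with NumPy alone by whitening with a Cholesky factor of $G_1$. This sketch assumes that "leading" refers to the largest generalized eigenvalue and that $G_1$ is positive definite; the function name `single_source_update` and the toy data are illustrative.

```python
import numpy as np

def single_source_update(G_1, G_z):
    """Solve G_z u = ev * G_1 u and return (largest ev, its eigenvector u)."""
    L = np.linalg.cholesky(G_1)            # G_1 = L L^H
    L_inv = np.linalg.inv(L)
    C = L_inv @ G_z @ L_inv.conj().T       # whitened problem: C v = ev * v
    C = (C + C.conj().T) / 2               # remove round-off asymmetry
    evals, evecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    u = L_inv.conj().T @ evecs[:, -1]      # map back: u = L^{-H} v
    return evals[-1], u                    # u satisfies u^H G_1 u = 1

# Toy usage: weighted and unweighted covariances from random data.
rng = np.random.default_rng(0)
M, T = 4, 500
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
lam = rng.uniform(0.2, 3.0, size=T)        # stand-in for lambda_1(t)
G_1 = (X / lam) @ X.conj().T / T
G_z = X @ X.conj().T / T
ev_max, w_1 = single_source_update(G_1, G_z)
```

Because the whitened eigenvector has unit norm, the returned vector already satisfies the normalization $w_1^H G_1 w_1 = 1$ used for the other updates.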

Combining these IP and BCD steps yields fast convergence and scaling to scenarios with large $M$.
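One sweep of the source and background updates for a single frequency bin might look as follows. This is a hedged sketch, not the reference implementation of any cited paper: the function names are illustrative, $E_s$ and $E_z$ are taken to be the first $K$ and last $M-K$ columns of the identity, and refreshing the background block after every source update is an assumed ordering.

```python
import numpy as np

def ip_update(W, G_k, k):
    """Solve (W^H G_k) w_k = e_k, normalize, and write the result into column k."""
    M = W.shape[0]
    e_k = np.zeros(M)
    e_k[k] = 1.0
    w = np.linalg.solve(W.conj().T @ G_k, e_k)
    w = w / np.sqrt(np.real(w.conj() @ G_k @ w))   # enforce w_k^H G_k w_k = 1
    W = W.copy()
    W[:, k] = w
    return W

def background_update(W, G_z, K):
    """Set W_z = [-(W_s^H G_z E_s)^{-1} (W_s^H G_z E_z); I_{M-K}]."""
    M = W.shape[0]
    B = W[:, :K].conj().T @ G_z                    # W_s^H G_z, shape (K, M)
    top = -np.linalg.solve(B[:, :K], B[:, K:])     # B E_s = B[:, :K], B E_z = B[:, K:]
    W = W.copy()
    W[:, K:] = np.vstack([top, np.eye(M - K)])
    return W

def sweep(W, G, G_z, K):
    """Update each source vector in turn, refreshing the background block each time."""
    for k in range(K):
        W = ip_update(W, G[k], k)
        W = background_update(W, G_z, K)
    return W

# Toy usage with random statistics.
rng = np.random.default_rng(0)
M, K, T = 5, 2, 400
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
lam = rng.uniform(0.2, 3.0, size=(K, T))           # stand-in for lambda_k(t)
G = np.stack([(X / lam[k]) @ X.conj().T / T for k in range(K)])
G_z = X @ X.conj().T / T
W_new = sweep(np.eye(M, dtype=complex), G, G_z, K)
```

After the sweep, the source and background blocks satisfy $W_s^H G_z W_z = 0$, which is the sample-covariance orthogonality that the background update is designed to restore.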

4. Parameterizations, Complexity, and Acceleration

OverIVA is computationally cheaper than determined IVA (e.g., AuxIVA) because it operates directly in the lower-dimensional source subspace and uses analytic background updates.

Representative complexities (per frequency) (Ikeshita et al., 2020, Scheibler et al., 2019):

OverIVA (IP-1, IP-3, unconstrained): $O(K\,M^3)$; only $K$ demixing vectors are updated sequentially, each at $O(M^3)$ cost, as opposed to all $M$ vectors in conventional IVA/AuxIVA ($O(M \cdot M^3)$).
Single-source (IP-2) variant: $O(M^3)$ per frequency.
Background update: Using explicit orthogonality, background subspace parameters are updated by $K \times K$ solves rather than full $M \times M$ , saving computation especially for $K \ll M$ .

Accelerations:

Caching the Cholesky factorization of $G_z$.
Avoiding full orthogonalization for the background subspace.
Exploiting the Kronecker (bilinear) structure of the demixing vectors in large arrays (Chen et al., 18 Jan 2026).

The bilinear OverIVA (BiIVA) further reduces the number of parameters per demixing vector from $M$ to $M_1+M_2$, with $M = M_1 M_2$, by imposing the Kronecker structure $w_k = w_{k,1} \otimes w_{k,2}$. This yields a parameter saving factor of $M/(M_1+M_2)$ and smaller matrix inversions per update (Chen et al., 18 Jan 2026).
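The Kronecker structure can be illustrated in a few lines of NumPy. The sketch below assumes a $6 \times 6$ factorization of a 36-channel array, chosen only because it matches the parameter counts 36 and 12; the actual array factorization in the cited work may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
M1, M2 = 6, 6          # assumed factor sizes
M = M1 * M2            # total number of channels

# Two short factor vectors instead of one long demixing vector.
w1 = rng.standard_normal(M1) + 1j * rng.standard_normal(M1)
w2 = rng.standard_normal(M2) + 1j * rng.standard_normal(M2)
w = np.kron(w1, w2)    # full demixing vector of length M

x = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Direct evaluation of w^H x (np.vdot conjugates its first argument).
y_full = np.vdot(w, x)

# Same value without forming w: reshape x to (M1, M2) and contract each axis.
y_bilinear = w1.conj() @ x.reshape(M1, M2) @ w2.conj()

n_full, n_bilinear = M, M1 + M2   # free parameters per demixing vector
```

The two evaluations agree because entry $i M_2 + j$ of `np.kron(w1, w2)` equals `w1[i] * w2[j]`, which matches NumPy's row-major reshape of `x`.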

5. Empirical Performance and Practical Trade-Offs

OverIVA demonstrates competitive or superior separation performance compared to determined approaches, particularly as $M$ increases. Representative findings (Ikeshita et al., 2020, Chen et al., 18 Jan 2026, Scheibler et al., 2019, Scheibler et al., 2020):

SDR/RTF: For $K=1$, $L=5$, $M=3,5,7$, OverIVA (IP-2) achieves a signal-to-distortion ratio (SDR) of $5.3$–$8.6$ dB at a real-time factor (RTF) of $0.017$–$0.10$, outperforming AuxIVA and OverIVA-OC/IP-1 both in quality and computational cost.
Multi-source ($K>1$): IP-1/3 reach full AuxIVA separation quality at roughly $K/M$ of the computational cost; empirical separation metrics (SDR, SIR) improve with microphone count and converge rapidly in practical conditions.
Online and large arrays: BiIVA attains a $10$ dB SIR/SDR gain over standard OverIVA for $36$ microphones, despite a reduction in parameter count from $36$ to $12$ per source per frequency (Chen et al., 18 Jan 2026).

Empirical validation utilizes simulated and measured room mixture data, various array geometries, and different background conditions, consistently showing that the OverIVA cost structure and subspace parameterization are robust to reverberation, nonwhite background, and high levels of interferers.

6. Variants, Priors, and Algorithm Selection

Several OverIVA algorithmic variants address different trade-offs and use-cases (Ikeshita et al., 2020, Scheibler et al., 2019, Brendel et al., 2020):

IP-1 (block source-then-background updates): Generic, efficient for $K>1$.
IP-2 (single-source optimal): Use for $K=1$ to achieve best SDR and speed.
IP-3 / OverIVA-OC: Maintains OC at every iteration; matches IP-1 in separation quality but incurs higher computational cost.
Pairwise MM (IP2) updates: Used in advanced OverIVA and JISA-MM; they improve convergence for $K>1$ and enable fast separation under challenging conditions (Scheibler et al., 2020).
Spatial priors / Bayesian OverIVA: Incorporate steering or spatial regularization using quadratic or Euclidean priors, further improving extraction when extra information is available (Brendel et al., 2020).
Bilinear/BiIVA: Parameter-efficient for large $M$, critical in online and massively overdetermined settings (Chen et al., 18 Jan 2026).

Recommended usage: for $K=1$, IP-2; for $K>1$ and $M \gg K$, IP-1 or BiIVA; OverIVA-OC if orthogonality is imposed externally.

7. Impact, Limitations, and Extensions

OverIVA provides a principled framework for source separation in overdetermined mixtures, explicitly modeling the background and reducing computational load by targeting the relevant low-dimensional signal subspace. The background model is suited for stationary and weakly nonstationary interferers but assumes reliable stationarity or weak non-Gaussianity in the background for optimal discrimination.

A key distinction from determined IVA/AuxIVA is computational scaling and robustness in large arrays, where OverIVA exploits spatial diversity while updating only $K$ rather than $M$ demixing vectors per iteration.

Extensions include:

Online and real-time OverIVA/BiIVA: Adaptation for framewise or low-latency processing using exponential forgetting and alternating projection (Chen et al., 18 Jan 2026).
Spatially informed OverIVA: Bayesian framework to integrate source position priors, null constraints, or directivity information (Brendel et al., 2020).
Under- and hybrid-determined settings: Unified cost and update structures allow application where $K<M<Q$ , handling both extraction and separation flexibly.
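For the online extension listed above, exponential forgetting is commonly realized as a recursive update of the weighted covariance. The sketch below shows one such form; the exact recursion, the forgetting factor `alpha`, and the function name are assumptions and may differ from the cited work.

```python
import numpy as np

def recursive_covariance(G_prev, x_t, lam_t, alpha=0.98):
    """One framewise update: G <- alpha * G + (1 - alpha) * x x^H / lambda."""
    return alpha * G_prev + (1.0 - alpha) * np.outer(x_t, x_t.conj()) / lam_t

# Toy usage: process frames one at a time for a single frequency bin.
rng = np.random.default_rng(0)
M, T = 4, 300
X = rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))
lam = rng.uniform(0.5, 2.0, size=T)       # stand-in for the current variance estimates

G_online = np.eye(M, dtype=complex)       # assumed initialization
for t in range(T):
    G_online = recursive_covariance(G_online, X[t], lam[t])
```

A forgetting factor close to 1 averages over many frames and adapts slowly, while a smaller value tracks changes faster at the price of noisier statistics.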

The empirical evidence demonstrates OverIVA’s capacity to approach determined separation quality as $M$ increases, while obtaining significant computational and statistical robustness gains (Chen et al., 18 Jan 2026, Ikeshita et al., 2020, Scheibler et al., 2019, Brendel et al., 2020, Scheibler et al., 2020).