Extended Power Weiszfeld Method

Updated 17 January 2026

Extended Power Weiszfeld is a robust optimization algorithm that minimizes a weighted sum of q-th power distances, generalizing the geometric median and weighted least squares.
It introduces a desingularization subgradient strategy to escape singularities, ensuring well-defined iterative updates and global convergence.
Empirical results, especially in portfolio optimization, show that intermediate q values balance robustness and convergence speed for enhanced performance.

The Extended Power Weiszfeld method is a de-singularity subgradient approach for solving the extended Weber location problem, formulated as the minimization of a sum of $q$-th power distances between a variable point and a collection of weighted data points:

$f(x) = \sum_{i=1}^n w_i \|x - a_i\|^q, \qquad 1 \leq q < 2,\quad x\in\mathbb{R}^d,$

where $w_i > 0$ are positive weights, $a_i\in\mathbb{R}^d$ are data points, and the objective $f$ is strictly convex under non-collinearity assumptions. This formulation generalizes the geometric median ($q=1$) and weighted least squares ($q=2$) problems. Classical iterative solutions, such as the Weiszfeld algorithm, experience a breakdown at singularities (data points) for $q < 2$. The Extended Power Weiszfeld method introduces systematic desingularization through a subgradient-based remedy, ensuring well-defined iterative progress and global convergence, including at data points.

1. Formulation and Singularity of the Extended Weber Problem

The extended Weber location problem seeks the unique minimizer $x_*$ of

$f(x) = \sum_{i=1}^n w_i \|x - a_i\|^q,$

with $1 \leq q < 2$ . For $x \neq a_i$ for all $i$ , the gradient is

$\nabla f(x) = \sum_{i=1}^n q w_i \|x-a_i\|^{q-2} (x - a_i).$

The standard "power–Weiszfeld" update is defined as

$x^{(k+1)} = T_1(x^{(k)}) = \frac{\sum_i w_i \|x^{(k)}-a_i\|^{q-2} a_i}{\sum_i w_i \|x^{(k)}-a_i\|^{q-2}}.$

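For $q < 2$, if an iterate lands exactly on a data point $a_k$ (here $k$ indexes the data point, not the iteration), the weight $\|x - a_k\|^{q-2}$ is a negative power of zero and the update is undefined ("breakdown"). As $x \to a_k$ the same weight grows without bound. This singularity obstructs convergence at and near data points.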
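The formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names (`objective`, `gradient`, `weiszfeld_step`) and the sample data are assumptions made for the example. It also makes the relation $T_1(x) = x - \nabla f(x) / \big(q \sum_i w_i \|x-a_i\|^{q-2}\big)$ explicit, which follows directly from the two formulas.

```python
import numpy as np

def objective(x, a, w, q):
    """f(x) = sum_i w_i * ||x - a_i||^q."""
    r = np.linalg.norm(x - a, axis=1)
    return float(np.sum(w * r ** q))

def gradient(x, a, w, q):
    """Gradient of f at a point x that is not a data point."""
    diff = x - a                              # rows are x - a_i
    r = np.linalg.norm(diff, axis=1)
    return q * np.sum((w * r ** (q - 2))[:, None] * diff, axis=0)

def weiszfeld_step(x, a, w, q):
    """Power-Weiszfeld map T_1; refuses to run at a data point."""
    r = np.linalg.norm(x - a, axis=1)
    if np.any(r == 0.0):
        raise ZeroDivisionError("x coincides with a data point: T_1 is undefined")
    c = w * r ** (q - 2)                      # weights w_i * ||x - a_i||^(q-2)
    return c @ a / c.sum()

# Hypothetical data, chosen only for illustration
a = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
w = np.array([1.0, 2.0, 1.0, 0.5])
q = 1.5
x = np.array([1.0, 1.0])

x_next = weiszfeld_step(x, a, w, q)
scale = q * np.sum(w * np.linalg.norm(x - a, axis=1) ** (q - 2))
x_next_via_gradient = x - gradient(x, a, w, q) / scale   # same point as x_next
```

Calling `weiszfeld_step(a[0], a, w, q)` raises the error, which is the breakdown described above.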

2. The De-Singularity Subgradient Approach

To address the breakdown, a desingularized subgradient strategy is adopted.

Subgradient at a Data Point

At a data point $a_k$, split the objective as $f(x) = w_k \|x - a_k\|^q + D_q(x)$, with $D_q(x) = \sum_{i \neq k} w_i \|x - a_i\|^q$, which is differentiable at $a_k$ when the data points are distinct. Define the $q$-power desingularity subgradient as

$\nabla D_q(a_k) = \sum_{i\neq k} q w_i \|a_k - a_i\|^{q-2} (a_k - a_i).$

For $1 < q < 2$, the subdifferential at $a_k$ is the singleton $\{\nabla D_q(a_k)\}$ ; for $q=1$ it is the classical $\ell_1$ -median subgradient set.

Escape from a Singular Point

If $x^{(k)} = a_k$ and $\|\nabla D_q(a_k)\| > 0$ , move by

$x^{(k+1)} = a_k - \lambda \nabla D_q(a_k),$

with $\lambda$ chosen small enough to ensure strict decrease in $f$ . An initial $\lambda_0$ can be set as

$\lambda_0 = \min\left\{ 1,\; \frac{1}{q} w_k^{-1/(q-1)} \|\nabla D_q(a_k)\|^{-(2-q)/(q-1)} \right\},$

and then backtracked ( $\lambda \leftarrow \rho \lambda$ with $0<\rho<1$ ) until $f$ decreases.
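A minimal sketch of the escape step follows, assuming $1 < q < 2$ (the formula for $\lambda_0$ is not defined at $q = 1$) and distinct data points. The helper names, the backtrack budget `max_backtracks`, and the sample data are assumptions for illustration only.

```python
import numpy as np

def objective(x, a, w, q):
    r = np.linalg.norm(x - a, axis=1)
    return float(np.sum(w * r ** q))

def desingular_gradient(k, a, w, q):
    """Gradient of D_q at a_k: the sum skips i = k, so no zero distance appears."""
    others = np.arange(len(a)) != k
    diff = a[k] - a[others]
    r = np.linalg.norm(diff, axis=1)
    return q * np.sum((w[others] * r ** (q - 2))[:, None] * diff, axis=0)

def escape_step(k, a, w, q, rho=0.1, max_backtracks=60):
    """Move away from a_k along -grad D_q(a_k), shrinking lambda until f decreases."""
    g = desingular_gradient(k, a, w, q)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return a[k].copy(), 0.0               # nothing to escape from
    # Initial step lambda_0 from the text (requires 1 < q < 2)
    lam = min(1.0, (1.0 / q) * w[k] ** (-1.0 / (q - 1))
              * g_norm ** (-(2.0 - q) / (q - 1)))
    f_k = objective(a[k], a, w, q)
    for _ in range(max_backtracks):
        x_new = a[k] - lam * g
        if objective(x_new, a, w, q) < f_k:
            return x_new, lam
        lam *= rho                            # backtrack: lambda <- rho * lambda
    return a[k].copy(), 0.0                   # budget exhausted, stay put

# Hypothetical data, chosen only for illustration
a = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
w = np.array([1.0, 2.0, 1.0, 0.5])
x_new, lam = escape_step(0, a, w, q=1.5)
```

The returned point is no longer a data point, so the regular update $T_1$ can be applied to it on the next iteration.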

Unified Iteration Scheme

The overall method operates as follows:

$x^{(k+1)} = \begin{cases} T_1(x^{(k)}) & x^{(k)} \notin \{a_i\}, \\ a_k - \lambda_k \nabla D_q(a_k) & x^{(k)} = a_k,\; \nabla D_q(a_k) \neq 0, \\ a_k & x^{(k)} = a_k,\; \nabla D_q(a_k) = 0, \end{cases}$

with backtracking ensuring $f(x^{(k+1)}) < f(x^{(k)})$ at each step (Lai et al., 2024).

3. Convergence Properties

Global Convergence

Under the assumptions $w_i > 0$ and non-collinear $\{a_i\}$ , the sequence $\{x^{(k)}\}$ generated by this rule exhibits:

Monotonic decrease: $f(x^{(k+1)}) < f(x^{(k)})$ until termination.
Fixed points: Only possible at the data points $\{a_i\}$ and the unique minimizer $x_*$ .
Any non-optimal data point can be visited at most once; the escape step immediately returns the iterate to the regular region.
Global convergence: $x^{(k)} \to x_*$ from any starting point.

Superlinear Local Convergence at Singular Minima

If the unique minimizer $x_*$ coincides with a data point $a_k$ , then as $x \to a_k$ ,

$\|\nabla D_q(x)\| = O(\|x - a_k\|).$

Thus, near a singular minimizer and for $1 < q < 2$,

$\lim_{k \to \infty} \frac{\|x^{(k+1)} - a_k\|}{\|x^{(k)} - a_k\|} = 0,$

implying a superlinear convergence rate (Lai et al., 2024).
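A short heuristic sketch of why the ratio vanishes, assuming $1 < q < 2$, distinct data points, and iterates that approach $a_k$ through regular points so that $T_1$ applies. The symbol $S_k(x) = \sum_{i \neq k} w_i \|x - a_i\|^{q-2}$ is notation introduced only for this sketch. Splitting the $k$-th term off the update gives

$T_1(x) - a_k = \frac{S_k(x)\,(x - a_k) - \tfrac{1}{q}\nabla D_q(x)}{w_k \|x - a_k\|^{q-2} + S_k(x)}.$

Near $a_k$ the quantity $S_k(x)$ stays bounded, so by the bound on $\|\nabla D_q(x)\|$ above the numerator is $O(\|x - a_k\|)$, while the denominator is at least $w_k \|x - a_k\|^{q-2}$, which diverges as $x \to a_k$. Hence

$\frac{\|T_1(x) - a_k\|}{\|x - a_k\|} = O\big(\|x - a_k\|^{2-q}\big) \to 0 \quad \text{as } x \to a_k.$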

4. Algorithmic Structures

A structured summary of the procedure is as follows:

Scenario	Update Step	Stopping Criterion
$x^{(k)} \notin \{a_i\}$	$x^{(k+1)} = \frac{\sum_i w_i\|x^{(k)}-a_i\|^{q-2} a_i}{\sum_i w_i\|x^{(k)}-a_i\|^{q-2}}$	N/A
$x^{(k)} = a_k$, $\|\nabla D_q(a_k)\|<\varepsilon$	$x_* = a_k$ (optimum found)	$\|\nabla D_q(a_k)\|<\varepsilon$
$x^{(k)} = a_k$, $\|\nabla D_q(a_k)\|\geq\varepsilon$	$x^{(k+1)} = a_k - \lambda_k \nabla D_q(a_k)$ (with backtracking on $\lambda_k$ until $f$ decreases)	N/A

Typical parameters are a backtracking factor $\rho=0.1$ and a tolerance $\varepsilon = 10^{-9}$.
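The three rows of the table can be combined into one loop, sketched below under stated assumptions: $1 < q < 2$, distinct data points, and a stopping rule on the step length in the regular case (the table gives none there, so that rule is an assumption of this sketch). The function name `extended_power_weiszfeld`, the iteration and backtrack budgets, and the sample data are likewise hypothetical.

```python
import numpy as np

def extended_power_weiszfeld(a, w, q, x0, eps=1e-9, rho=0.1, max_iter=1000):
    """Sketch of the three-case iteration for 1 < q < 2 and distinct data points."""
    a, w = np.asarray(a, dtype=float), np.asarray(w, dtype=float)
    x = np.asarray(x0, dtype=float)

    def f(z):
        return float(np.sum(w * np.linalg.norm(z - a, axis=1) ** q))

    for _ in range(max_iter):
        r = np.linalg.norm(x - a, axis=1)
        hit = np.flatnonzero(r == 0.0)
        if hit.size == 0:
            # Row 1: regular point, apply the power-Weiszfeld update.
            c = w * r ** (q - 2)
            x_new = c @ a / c.sum()
            if np.linalg.norm(x_new - x) < eps:   # assumed stopping rule
                return x_new
        else:
            k = int(hit[0])
            others = np.arange(len(a)) != k
            g = q * np.sum((w[others] * r[others] ** (q - 2))[:, None]
                           * (a[k] - a[others]), axis=0)
            g_norm = np.linalg.norm(g)
            if g_norm < eps:
                return a[k].copy()                # Row 2: a_k is optimal
            # Row 3: escape step, backtracking on lambda until f decreases.
            lam = min(1.0, w[k] ** (-1.0 / (q - 1))
                      * g_norm ** (-(2.0 - q) / (q - 1)) / q)
            f_k = f(a[k])
            x_new = a[k] - lam * g
            for _bt in range(60):
                if f(x_new) < f_k:
                    break
                lam *= rho
                x_new = a[k] - lam * g
            else:
                return a[k].copy()                # no decrease within the budget
        x = x_new
    return x

# Hypothetical data, chosen only for illustration
a = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
w = [1.0, 2.0, 1.0, 0.5]
x_from_data = extended_power_weiszfeld(a, w, 1.5, x0=a[0])          # starts on a data point
x_from_far = extended_power_weiszfeld(a, w, 1.5, x0=[10.0, -10.0])  # regular start
```

Starting exactly on a data point exercises the escape branch once. After that the iteration proceeds through regular points, and both starts should reach the same minimizer.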

5. Comparisons across Exponent Choices

$q = 1$ : Reduces to the classical Weiszfeld algorithm ("geometric median") with the Kuhn–Weiszfeld escape step for singularities.
$q = 2$ : Exact minimizer in one step at the weighted Euclidean mean.
$1 < q < 2$: Interpolates between robustness ( $q=1$ ) and speed ( $q=2$ ), achieving faster (superlinear) convergence near singular minima, with an automatically scaled linesearch step.

A plausible implication is that selecting intermediate values $q \approx 1.2$–$1.6$ provides a balance between robustness to outliers and iterative efficiency.
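The two ends of this comparison can be checked numerically. The sketch below is illustrative only: the data set with one distant outlier and the fixed iteration count are assumptions, and the plain update is used without the escape step because the chosen start does not land on a data point. At $q = 2$ the weights $\|x - a_i\|^{q-2}$ all equal 1, so a single update returns the weighted mean from any start; for smaller $q$ the solution is pulled less strongly toward the outlier.

```python
import numpy as np

def weiszfeld_step(x, a, w, q):
    """Power-Weiszfeld update T_1 at a point that is not a data point."""
    r = np.linalg.norm(x - a, axis=1)
    c = w * r ** (q - 2)
    return c @ a / c.sum()

# Hypothetical data: three nearby points and one distant outlier
a = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [50.0, 50.0]])
w = np.ones(4)
x0 = np.array([1.0, 1.0])

weighted_mean = w @ a / w.sum()
one_step_q2 = weiszfeld_step(x0, a, w, 2.0)   # equals the weighted mean

def iterate(q, n_steps=200):
    """Apply T_1 repeatedly from x0 (fixed step count, for illustration)."""
    x = x0.copy()
    for _ in range(n_steps):
        x = weiszfeld_step(x, a, w, q)
    return x

x_q12 = iterate(1.2)   # stays closer to the three nearby points
x_q16 = iterate(1.6)   # drifts further toward the outlier
```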

6. Empirical Evaluation in Portfolio Optimization

In online portfolio selection:

At each trading day, compute the $q$ -median of the last $m$ price-relative vectors via this method and use as forecast in a reversion strategy.
Evaluated on NYSE(N) ( $d=23$ , $T\approx6000$ ) and CSI300 (Chinese index, $d=47$ , $T\approx500$ ), for $m = 5$ or $10$, over $q = 1.1$ to $1.9$.
The algorithm does not stall and escapes singularities efficiently ( $\approx$ 2–3 inner backtrackings).
Iteration count is modest ( $\leq 30$ ), CPU time per median below $1$ ms.
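The empirical convergence factor (denoted $\lambda$ here, unrelated to the escape step size $\lambda$ of Section 2) decreases from $\lambda\approx 0.35$ at $q=1$ to $\lambda\approx 0.06$ at $q=1.9$, so convergence speeds up as $q$ grows.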
Using $q$-median forecasts in reversion portfolios, intermediate $q$ ($\approx 1.3$–$1.6$) often yields better cumulative returns and Sharpe ratios than both $q=1$ and $q=2$ (Lai et al., 2024).

7. Practical Recommendations and Significance

The de-singularity escape step guarantees robustness against iterate stalling and adds negligible computational overhead.
For online learning, machine learning, and optimization applications that generalize the median/minimum norm paradigm, intermediate values of $q$ are recommended for a favorable robustness–convergence trade-off.
Theoretical guarantees include monotonic decrease, avoidance of singularity lock-in, and superlinear local convergence in the most problematic case, where the minimizer coincides with a data point.

The Extended Power Weiszfeld method thus advances both theoretical analysis and practical computation for the extended Weber location problem in the non-quadratic, nonsmooth (but convex) setting of $1 \leq q < 2$ (Lai et al., 2024).


References

Lai et al. (2024). A De-singularity Subgradient Approach for the Extended Weber Location Problem.
