Extended Power Weiszfeld Method
- Extended Power Weiszfeld is a robust optimization algorithm that minimizes q-th power distances, generalizing the geometric median and weighted least squares.
- It introduces a desingularization subgradient strategy to escape singularities, ensuring well-defined iterative updates and global convergence.
- Empirical results, especially in portfolio optimization, show that intermediate q values balance robustness and convergence speed for enhanced performance.
The Extended Power Weiszfeld method is a de-singularity subgradient approach for solving the extended Weber location problem, formulated as the minimization of a sum of $q$-th power distances between a variable point $x \in \mathbb{R}^d$ and a collection of weighted data points:
$$\min_{x \in \mathbb{R}^d} f_q(x) = \sum_{i=1}^{m} w_i \|x - a_i\|^q,$$
where $w_i > 0$ are positive weights, $a_i \in \mathbb{R}^d$ are data points, and the objective is strictly convex under non-collinearity assumptions. This formulation generalizes the geometric median ($q = 1$) and weighted least squares ($q = 2$) problems. Classical iterative solutions, such as the Weiszfeld algorithm, break down at singularities (data points) for $q < 2$. The Extended Power Weiszfeld method introduces systematic desingularization through a subgradient-based remedy, ensuring well-defined iterative progress and global convergence, including at data points.
1. Formulation and Singularity of the Extended Weber Problem
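The extended Weber location problem seeks the unique minimizer $x^*$ of
$$f_q(x) = \sum_{i=1}^{m} w_i \|x - a_i\|^q$$
with $1 \le q < 2$, weights $w_i > 0$, and data points $a_i \in \mathbb{R}^d$. For $x \ne a_i$ for all $i$, the gradient is
$$\nabla f_q(x) = q \sum_{i=1}^{m} w_i \|x - a_i\|^{q-2} (x - a_i).$$
The standard "power–Weiszfeld" update, obtained by setting this gradient to zero and solving for $x$ with the weights $w_i \|x - a_i\|^{q-2}$ held fixed, is defined as
$$x^{(t+1)} = T(x^{(t)}) := \frac{\sum_{i=1}^{m} w_i \|x^{(t)} - a_i\|^{q-2} a_i}{\sum_{i=1}^{m} w_i \|x^{(t)} - a_i\|^{q-2}}.$$
For $q < 2$, if an iterate lands on a data point, $x^{(t)} = a_k$, then $\|x^{(t)} - a_k\|^{q-2} = +\infty$, causing the update to be undefined ("breakdown"). This singularity obstructs convergence at and near data points.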
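The breakdown is easy to reproduce numerically. The following is a minimal pure-Python sketch, not the authors' implementation; the helper names `dist`, `objective`, and `power_weiszfeld_step`, as well as the sample data, are assumptions made for illustration. It evaluates a weighted sum of $q$-th power distances and applies one plain power-Weiszfeld update with no safeguard at data points.

```python
import math

def dist(x, a):
    """Euclidean distance between two points given as lists of floats."""
    return math.sqrt(sum((xj - aj) ** 2 for xj, aj in zip(x, a)))

def objective(x, points, weights, q):
    """Weighted sum of q-th power distances from x to the data points."""
    return sum(w * dist(x, a) ** q for a, w in zip(points, weights))

def power_weiszfeld_step(x, points, weights, q):
    """One unguarded power-Weiszfeld update.

    For q < 2 the coefficient w * d**(q - 2) is undefined when d == 0,
    so Python raises ZeroDivisionError if x coincides with a data point.
    """
    num = [0.0] * len(x)
    den = 0.0
    for a, w in zip(points, weights):
        coef = w * dist(x, a) ** (q - 2)
        den += coef
        num = [nj + coef * aj for nj, aj in zip(num, a)]
    return [nj / den for nj in num]

# Hypothetical example data: three non-collinear points with unit weights.
points = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]
weights = [1.0, 1.0, 1.0]

x_new = power_weiszfeld_step([1.0, 1.0], points, weights, q=1.5)  # regular point: fine
try:
    power_weiszfeld_step(points[0], points, weights, q=1.5)       # data point: breakdown
except ZeroDivisionError:
    print("update undefined at a data point")
```

Away from the data points the step lowers the objective, which is why the difficulty is confined to iterates that land on a data point.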
2. The De-Singularity Subgradient Approach
To address the breakdown, a desingularized subgradient strategy is adopted.
Subgradient at a Data Point
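For $x = a_k$, isolate the $k$-th term as $f_q(x) = w_k \|x - a_k\|^q + f_{q,k}(x)$, with $f_{q,k}(x) = \sum_{i \ne k} w_i \|x - a_i\|^q$. Define the $q$-power desingularity subgradient as
$$D_q(a_k) = \begin{cases} \nabla f_{q,k}(a_k), & 1 < q < 2, \\ \max\left(0,\ 1 - \dfrac{w_k}{\|\nabla f_{1,k}(a_k)\|}\right) \nabla f_{1,k}(a_k), & q = 1, \end{cases} \qquad \nabla f_{q,k}(a_k) = q \sum_{i \ne k} w_i \|a_k - a_i\|^{q-2} (a_k - a_i).$$
For $1 < q < 2$, the subdifferential at $a_k$ is the singleton $\{\nabla f_{q,k}(a_k)\}$, because the isolated term $w_k \|x - a_k\|^q$ is differentiable at $a_k$ with zero gradient; for $q = 1$ it is the classical subgradient set of the geometric median problem, of which $D_1(a_k)$ is the minimum-norm element.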
Escape from a Singular Point
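If $x^{(t)} = a_k$ and $D_q(a_k) \ne 0$, move by
$$x^{(t+1)} = a_k - \lambda D_q(a_k),$$
with $\lambda > 0$ chosen small enough to ensure strict decrease in the objective $f_q$. An initial $\lambda$ can be set from the problem data (Lai et al., 2024, give an explicit expression) and then backtracked ($\lambda \leftarrow \beta \lambda$ with $\beta \in (0, 1)$) until $f_q$ decreases.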
Unified Iteration Scheme
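The overall method operates as follows:
$$x^{(t+1)} = \begin{cases} T(x^{(t)}), & x^{(t)} \notin \{a_1, \dots, a_m\}, \\ a_k - \lambda D_q(a_k), & x^{(t)} = a_k,\ D_q(a_k) \ne 0, \\ a_k \ \text{(stop)}, & x^{(t)} = a_k,\ D_q(a_k) = 0, \end{cases}$$
where $T$ is the power-Weiszfeld update and $D_q$ is the desingularity subgradient, with backtracking on $\lambda$ ensuring $f_q(x^{(t+1)}) < f_q(x^{(t)})$ at each step (Lai et al., 2024).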
3. Convergence Properties
Global Convergence
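Under the assumptions $1 \le q < 2$ and non-collinear data points $a_1, \dots, a_m$, the sequence $\{x^{(t)}\}$ generated by this rule exhibits:
- Monotonic decrease: $f_q(x^{(t+1)}) < f_q(x^{(t)})$ until termination, where $f_q$ is the objective.
- Fixed points: Only possible at the data points $a_i$ and the unique minimizer $x^*$.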
- Any non-optimal data point can be visited at most once; the escape step immediately returns the iterate to the regular region.
- Global convergence: $x^{(t)} \to x^*$ from any starting point $x^{(0)}$.
Superlinear Local Convergence at Singular Minima
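If the unique minimizer $x^*$ coincides with a data point $a_k$, then as $x \to a_k$, the power-Weiszfeld map $T$ satisfies, when $1 < q < 2$,
$$\frac{\|T(x) - a_k\|}{\|x - a_k\|} \to 0.$$
Thus, near a singular minimizer and for $1 < q < 2$,
$$\|x^{(t+1)} - a_k\| = o\left(\|x^{(t)} - a_k\|\right),$$
implying a superlinear convergence rate (Lai et al., 2024).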
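A short derivation makes the rate explicit. It is a sketch in notation assumed here (weights $w_i$, data points $a_i$, power-Weiszfeld map $T$), for distinct data points and $1 < q < 2$, and is not quoted from the cited paper. Subtracting $a_k$ from the update gives

$$T(x) - a_k = \frac{N(x)}{w_k \|x - a_k\|^{q-2} + S(x)}, \qquad N(x) = \sum_{i \ne k} w_i \|x - a_i\|^{q-2} (a_i - a_k), \quad S(x) = \sum_{i \ne k} w_i \|x - a_i\|^{q-2} \ge 0.$$

If $a_k$ is the minimizer, the gradient vanishes there, and since the $k$-th term contributes zero gradient at $a_k$ for $q > 1$, this means $N(a_k) = 0$. Because $N$ is smooth in a neighbourhood of $a_k$, there is a constant $L$ with $\|N(x)\| \le L \|x - a_k\|$, so

$$\|T(x) - a_k\| \le \frac{L \|x - a_k\|}{w_k \|x - a_k\|^{q-2}} = \frac{L}{w_k} \|x - a_k\|^{3-q}.$$

Under these assumptions the local order is therefore at least $3 - q > 1$, and the ratio $\|T(x) - a_k\| / \|x - a_k\|$ tends to zero.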
4. Algorithmic Structures
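A structured summary of the procedure is as follows, with $T$ the power-Weiszfeld update, $D_q(a_k)$ the desingularity subgradient at data point $a_k$, $f_q$ the objective, $\lambda$ the escape step size, and $\varepsilon$ the tolerance:
| Scenario | Update Step | Stopping Criterion |
|---|---|---|
| $x^{(t)} \notin \{a_1, \dots, a_m\}$ | $x^{(t+1)} = T(x^{(t)})$ | $\lVert x^{(t+1)} - x^{(t)} \rVert \le \varepsilon$ |
| $x^{(t)} = a_k$, $D_q(a_k) = 0$ | Stop (optimum found) | $D_q(a_k) = 0$ |
| $x^{(t)} = a_k$, $D_q(a_k) \ne 0$ | $x^{(t+1)} = a_k - \lambda D_q(a_k)$ (with backtracking on $\lambda$ until $f_q$ decreases) | N/A |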
Typical parameters are a backtracking factor $\beta \in (0, 1)$ and a small tolerance $\varepsilon > 0$.
5. Comparisons across Exponent Choices
- $q = 1$: Reduces to the classical Weiszfeld algorithm ("geometric median") with the Kuhn–Weiszfeld escape step for singularities.
- $q = 2$: The exact minimizer, the weighted Euclidean mean, is reached in one step.
- $1 < q < 2$: Interpolates between robustness ($q = 1$) and speed ($q = 2$), achieving faster (superlinear) convergence near singular minima, with an automatically scaled linesearch step.
A plausible implication is that selecting an intermediate exponent, in a range extending up to about $q = 1.6$, provides a balance between robustness to outliers and iterative efficiency.
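The one-step behaviour at $q = 2$ can be checked directly: the exponent $q - 2$ is zero, every distance factor equals one, and the update collapses to the weighted mean regardless of the starting point. The snippet below is a small illustrative check; the function name and the data are assumptions made for this example.

```python
def power_weiszfeld_step(x, points, weights, q):
    """One power-Weiszfeld update with weights w_i * ||x - a_i||**(q - 2)."""
    num = [0.0] * len(x)
    den = 0.0
    for a, w in zip(points, weights):
        d = sum((xj - aj) ** 2 for xj, aj in zip(x, a)) ** 0.5
        coef = w * d ** (q - 2)  # equals w when q == 2, even if d == 0
        den += coef
        num = [nj + coef * aj for nj, aj in zip(num, a)]
    return [nj / den for nj in num]

points = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]
weights = [1.0, 2.0, 1.0]

# Both calls return the weighted mean (1*a1 + 2*a2 + 1*a3) / 4.
print(power_weiszfeld_step([10.0, -7.0], points, weights, q=2))  # [2.0, 0.75]
print(power_weiszfeld_step([0.0, 0.0], points, weights, q=2))    # [2.0, 0.75]
```

For $q < 2$ the same function depends on the current iterate through the distance factors, which is what makes the method iterative and introduces the singularity at data points.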
6. Empirical Evaluation in Portfolio Optimization
In online portfolio selection:
- On each trading day, compute the $q$-th power median of the most recent window of price-relative vectors via this method and use it as the forecast in a reversion strategy.
- Evaluated on NYSE(N) (, ) and CSI300 (Chinese index, , ), for or $10$, over to $1.9$.
- The algorithm does not stall and escapes singularities efficiently (2–3 inner backtrackings).
- Iteration counts are modest, and CPU time per median is below $1$ ms.
- Empirical rate of convergence decreases from at to at .
- Using $q$-th power median forecasts in reversion portfolios, an intermediate $q$ (in a range extending up to about $1.6$) often yields better cumulative returns and Sharpe ratios than either extreme of the tested range (Lai et al., 2024).
7. Practical Recommendations and Significance
- The de-singularity escape step guarantees robustness against iterate stalling and adds negligible computational overhead.
- For online learning, machine learning, and optimization applications that generalize the median/minimum norm paradigm, intermediate values of $q$ are recommended for a good robustness–convergence trade-off.
- Theoretical guarantees include monotonic decrease, avoidance of singularity lock-in, and superlinear local convergence in the most problematic case, a minimizer that coincides with a data point.
The Extended Power Weiszfeld method thus advances both theoretical analysis and practical computation for the extended Weber location problem in the non-quadratic setting $1 \le q < 2$, where the objective is convex but the classical update is singular at data points (Lai et al., 2024).