Fixed-Share Regret Minimization
- The Fixed-Share Regret Minimization Algorithm is a method in online learning that minimizes shifting regret by balancing expert-weight concentration against an adaptive share mechanism.
- It employs an entropic mirror descent approach with convex mixing, delivering rigorous theoretical guarantees for tracking the best expert in non-stationary settings.
- Its simple update rule and efficient parameter tuning facilitate practical applications, including extensions to adaptive models and hidden Markov frameworks.
The Fixed-Share Regret Minimization Algorithm is a foundational approach in online learning within the prediction-with-expert-advice paradigm. It is designed to maintain low regret not only against the globally best expert, but also with respect to a reference that may switch between different experts over time—crucial in non-stationary environments. The method operates by maintaining and updating a weight vector over the set of experts, employing a distinctive "share" mechanism that continuously trades off tracking persistent expert performance against rapid adaptation to changing conditions. Modern analyses recognize Fixed-Share as a specific instantiation of entropic mirror descent with a simple convex-mixing step, yielding regret guarantees for a variety of shifting, adaptive, and discounted loss benchmarks (Koolen et al., 2010, Cesa-Bianchi et al., 2012).
1. Formal Framework and Algorithmic Structure
The Fixed-Share algorithm operates in the classical "prediction with expert advice" framework. Let $\mathcal{X}$ denote the outcome space, and $\{1,\dots,K\}$ the set of experts. On each round $t$, each expert $i$ emits a predictive distribution $f_{i,t}$ over $\mathcal{X}$, while the learner produces a mixture prediction $p_t = \sum_{i=1}^{K} w_{t,i}\, f_{i,t}$, with $w_t$ the current weight vector on the simplex $\Delta_{K-1}$.
Upon observing $x_t$, the learner incurs log-loss $\ell_t = -\log p_t(x_t)$, and each expert suffers $\ell_{i,t} = -\log f_{i,t}(x_t)$. The goal is to maintain low cumulative loss compared to a reference loss, possibly defined by the best partitioned assignment of experts to segments of the data.
The Fixed-Share update operates as follows (for log-loss and share parameter $\alpha \in [0,1]$):
- Initialize $w_1 \leftarrow w$ (a prior, e.g., uniform).
- For $t = 1, \dots, T$:
  - Predict $p_t(\cdot) = \sum_{i} w_{t,i}\, f_{i,t}(\cdot)$.
  - On observing $x_t$, compute unnormalized loss-updated weights $u_{t,i} = w_{t,i}\, f_{i,t}(x_t)$.
  - Normalize $v_{t,i} = u_{t,i} / \sum_{j} u_{t,j}$.
  - Set next weights $w_{t+1,i} = (1-\alpha)\, v_{t,i} + \alpha\, w_i$.
The share parameter $\alpha$ governs the tradeoff:
- Small $\alpha$ promotes concentration on past successful experts.
- Large $\alpha$ enables rapid adaptation to changes via reversion to the prior (Koolen et al., 2010).
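The update steps above can be sketched in a few lines of Python (a minimal illustration; the function name and interface are ours, not from the cited papers):

```python
import numpy as np

def fixed_share_round(w, expert_probs, alpha, prior):
    """One round of Fixed-Share under log-loss.

    w            : current weight vector on the simplex (shape K)
    expert_probs : f_{i,t}(x_t) for each expert i at the observed outcome
    alpha        : share parameter in [0, 1]
    prior        : prior weight vector mixed back in by the share step
    """
    p_x = np.dot(w, expert_probs)            # mixture probability of x_t
    loss = -np.log(p_x)                      # learner's log-loss this round
    u = w * expert_probs                     # loss update (Bayes step)
    v = u / u.sum()                          # normalize
    w_next = (1 - alpha) * v + alpha * prior # share (mixing) step
    return w_next, loss

# Demo: three experts, uniform prior, expert 0 assigns high probability.
w0 = np.full(3, 1.0 / 3.0)
w1, loss = fixed_share_round(w0, np.array([0.9, 0.1, 0.1]), alpha=0.1, prior=w0)
```

Because of the share step, no expert's weight can fall below $\alpha\, w_i$, which is exactly what allows rapid recovery after a switch.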
2. Regret Notions and Theoretical Guarantees
Fixed-Share is motivated by the need to minimize not only regret to the globally best expert in hindsight, but also regret to the best sequence of experts under a bounded number of switches ("shifting regret"), and more generally in scenarios requiring tracking. The shifting, adaptive, and discounted regrets can all be cast as minimization against a time-varying comparator sequence $q_1, \dots, q_T$, with the amount of shifting measured via a total-variation "shift cost" $\sum_{t=2}^{T} \|q_t - q_{t-1}\|_{\mathrm{TV}}$, where $\|\cdot\|_{\mathrm{TV}}$ is the total-variation distance (Cesa-Bianchi et al., 2012).
Regret Bound (Classical Shifting Case)
For a partition of $\{1,\dots,T\}$ into $m$ contiguous segments, with comparator experts $i_1, \dots, i_m$ (one per segment), take the tuned choice $\alpha = (m-1)/(T-1)$. The regret is bounded as
$$R_T \le \ln K + (m-1)\ln(K-1) + (T-1)\, H\!\left(\frac{m-1}{T-1}\right),$$
where $H(x) = -x \ln x - (1-x)\ln(1-x)$ is the binary entropy function (Koolen et al., 2010). For the more general comparator sequences $q_1, \dots, q_T$, the bound admits an explicit switch term (proportional to the total-variation shift cost), a "stay" term, and a loss/penalty term (proportional to the learning rate times the horizon) (Cesa-Bianchi et al., 2012).
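As a concrete sanity check, the tuned shifting-regret bound can be evaluated numerically (assuming the standard Herbster–Warmuth form with $\alpha = (m-1)/(T-1)$; the function names are ours):

```python
import math

def binary_entropy(x):
    """H(x) = -x ln x - (1-x) ln(1-x) in nats, with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def shifting_regret_bound(K, T, m):
    """Tuned Fixed-Share shifting-regret bound: m segments, T rounds, K experts."""
    return (math.log(K)
            + (m - 1) * math.log(K - 1)
            + (T - 1) * binary_entropy((m - 1) / (T - 1)))

# Example: K = 10 experts, T = 1000 rounds, m = 5 segments.
print(shifting_regret_bound(10, 1000, 5))
```

Note that for $m = 1$ (no switches) the bound collapses to the familiar $\ln K$ of static exponential weights, and it grows only logarithmically in $T$ through the entropy term.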
3. Entropic Mirror Descent and Unified Analysis
Recent work establishes that the Fixed-Share algorithm is an instance of entropic mirror descent with an added mixing ("share") step. The general update is parameterized by a learning rate $\eta > 0$ and a mixing parameter $\alpha$:
- After observing the loss vector $\ell_t = (\ell_{1,t}, \dots, \ell_{K,t})$, perform an entropic mirror descent (exponential-weights) update: $v_{t+1,i} = \frac{w_{t,i}\, e^{-\eta \ell_{i,t}}}{\sum_j w_{t,j}\, e^{-\eta \ell_{j,t}}}$.
- Then apply the "share" (mixing) step for Fixed-Share: $w_{t+1,i} = (1-\alpha)\, v_{t+1,i} + \alpha\, w_i$.
Both projection-based and share-based mirror descent achieve near-equivalent regret bounds, explicable via Bregman divergence (specifically KL-divergence) analyses. The proof exploits the telescoping property of KL-divergences, the Pythagorean property in the projection case, and convexity in the mixing case (Cesa-Bianchi et al., 2012).
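For log-loss with $\eta = 1$, the mirror-descent-plus-share view reproduces the direct Fixed-Share weight update exactly, since $e^{-(-\log f)} = f$. A quick numerical check (our own illustration, not from the papers):

```python
import numpy as np

rng = np.random.default_rng(0)
K, alpha = 4, 0.05
prior = np.full(K, 1.0 / K)
w = rng.dirichlet(np.ones(K))          # arbitrary current weights
f = rng.uniform(0.1, 1.0, size=K)      # expert probabilities f_{i,t}(x_t)

# Direct Fixed-Share update (Bayes step + share step).
v_direct = w * f / np.sum(w * f)
w_direct = (1 - alpha) * v_direct + alpha * prior

# Entropic mirror descent with eta = 1 on log-losses l_i = -log f_i,
# followed by the same convex-mixing share step.
losses = -np.log(f)
v_md = w * np.exp(-1.0 * losses)
v_md /= v_md.sum()
w_md = (1 - alpha) * v_md + alpha * prior

print(np.allclose(w_direct, w_md))   # prints True
```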
4. Extensions: Adaptive, Discounted, and Small-Loss Refinements
The Fixed-Share analysis naturally extends to multiple regret notions:
- Adaptive regret: Bounds regret over any interval of given length, enabling robust adaptation to changing environments.
- Discounted regret: Incorporates discount factors on older rounds, relevant in settings where recent performance is more important.
- Small-loss refinements: Using sharper inequalities, the regret guarantee can be improved from scaling with the horizon $T$ to scaling with $L^*$, the cumulative loss of the comparator, which is advantageous when $L^* \ll T$ (Cesa-Bianchi et al., 2012).
Further, by using sparse weight-sharing variants, performance guarantees scale with the support size of comparator sequences, beneficial when only a small subset of experts is ever relevant.
5. Practical Algorithmic Realization and Parameter Tuning
Each round of Fixed-Share requires $O(K)$ operations for the weight update, normalization, and mixing steps. Total time over $T$ rounds is $O(KT)$; space is $O(K)$.
Pseudocode for basic Fixed-Share update:
Input: experts E={1,…,K}, prior w∈Δ_{K−1}, share α∈[0,1]
Initialize w_{1}←w
For t=1,…,T:
1. Receive expert predictions {f_{i,t}(·): i∈E}
2. Predict p_t(x)=∑_{i} w_{t,i} f_{i,t}(x)
3. Observe x_t and incur loss −log p_t(x_t)
4. For each i: u_{t,i}←w_{t,i}·f_{i,t}(x_t)
5. Compute normalization Z_t=∑_i u_{t,i}; v_{t,i}=u_{t,i}/Z_t
6. For each i: w_{t+1,i}←(1−α)·v_{t,i} + α·w_i
End
- For tracking up to $m$ switches in $T$ rounds, set $\alpha = (m-1)/(T-1)$ and tune $\eta$ accordingly (for log-loss, $\eta = 1$).
- For adaptive regret over windows of length $\tau$, set $\alpha$ on the order of $1/\tau$ and $\eta$ on the order of $\sqrt{\ln K/\tau}$.
- Without prior knowledge of $m$ or $T$, employ time-varying or doubling-trick schedules for $\eta$ and $\alpha$ (Cesa-Bianchi et al., 2012).
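The switch-count tuning can be exercised on synthetic data: below, the "good" expert changes at each segment boundary, and Fixed-Share with the tuned $\alpha$ tracks it (the data-generating scheme and all constants are ours, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
K, T, m = 5, 2000, 4                    # experts, rounds, segments
alpha = (m - 1) / (T - 1)               # tuned share parameter
prior = np.full(K, 1.0 / K)
w = prior.copy()

boundaries = [0, 500, 1000, 1500, T]    # segment boundaries
good = [0, 2, 4, 1]                     # good expert in each segment

total_loss, best_loss = 0.0, 0.0
for seg in range(m):
    for t in range(boundaries[seg], boundaries[seg + 1]):
        # Each expert assigns some probability to the observed outcome;
        # the segment's good expert is consistently more accurate.
        f = rng.uniform(0.1, 0.4, size=K)
        f[good[seg]] = rng.uniform(0.7, 0.9)
        total_loss += -np.log(np.dot(w, f))
        best_loss += -np.log(f[good[seg]])
        v = w * f
        v /= v.sum()
        w = (1 - alpha) * v + alpha * prior

regret = total_loss - best_loss
print(regret, regret / T)   # shifting regret stays small relative to T
```

With $\alpha = 0$ the same run incurs much larger regret after each switch, since the weight of the new good expert has decayed to a vanishingly small value.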
6. Extensions: Learning Experts and Hidden Markov Models
When experts themselves are structured predictors, such as hidden Markov models (HMMs), the Fixed-Share framework extends as follows:
- Maintain a joint posterior over experts and their hidden states.
- After the standard loss (forward) update, perform the share/mix operation using a convex combination with a prior state distribution (either the model's initial distribution—"freezing"—or a time-shifted variant—"sleeping").
- Computational complexity per round is $O(S^2)$ per HMM expert, where $S$ is the number of hidden states, matching the cost of standard HMM forward filtering.
- For a segmentation of $1:T$ into $m$ segments and a segmentwise choice of experts/HMMs, the regret bound has the same structure as in the finite-expert case, with the segment loss replaced by the HMM segment likelihood under appropriate resetting (Koolen et al., 2010).
This nested HMM approach (with the freezing and sleeping share variants) allows efficient, scalable learning and tracking even when experts themselves adapt, maintaining the optimality of Fixed-Share's core regret guarantees.
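The freezing variant for a single HMM expert can be sketched as a forward-filtering step followed by the share step toward the initial state distribution (a sketch under our own naming; the real construction operates jointly over all experts and states):

```python
import numpy as np

def hmm_fixed_share_step(phi, A, emit_probs, alpha, phi0):
    """One round of Fixed-Share over the hidden states of one HMM expert
    ("freezing": mix back toward the initial state distribution phi0).

    phi        : current filtered state distribution (shape S)
    A          : transition matrix, A[s, s'] = P(s' | s)
    emit_probs : P(x_t | state s) for the observed outcome (shape S)
    alpha      : share parameter
    phi0       : initial (prior) state distribution
    """
    # Standard HMM forward (filtering) update: reweight by the observation
    # likelihood, normalize, then propagate through the transitions.
    u = phi * emit_probs
    u /= u.sum()
    phi_pred = u @ A
    # Share step: convex mix with the prior state distribution.
    return (1 - alpha) * phi_pred + alpha * phi0

# Demo with S = 2 hidden states.
A = np.array([[0.9, 0.1], [0.2, 0.8]])
phi0 = np.array([0.5, 0.5])
phi1 = hmm_fixed_share_step(phi0, A, np.array([0.8, 0.3]), 0.1, phi0)
```

The per-round cost is dominated by the $u \mathbin{@} A$ product, i.e., $O(S^2)$, so the share step adds only $O(S)$ overhead to ordinary HMM filtering.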
7. Impact, Interpretation, and Comparative Context
The Fixed-Share algorithm provides a principled and efficient mechanism for online tracking in nonstationary environments, consolidating earlier "tracking the best expert" combinatorial analyses into a rigorous, entropic mirror descent framework. Its capacity for low shifting, adaptive, and discounted regret, as well as its seamless extensibility to structured learning experts and more general comparator sequences, positions it as a canonical algorithmic primitive in modern online learning theory.
The equivalence between projection-based and mixing-based (Fixed-Share) mirror descent updates unifies a broad class of regret minimization schemes and clarifies the operational meaning of the share parameter as a tracking–concentration trade-off (Cesa-Bianchi et al., 2012). The approach has influenced refinements including small-loss bounds, support sparsity adaptivity, and adaptive parameterization.
A plausible implication is that Fixed-Share's conceptual simplicity, analytical clarity, and extensibility make it a natural choice for robust sequential prediction in dynamic or abruptly changing environments, whether the experts are fixed, learning, or hidden Markov models.