Miss Rate (MR): Definition & Applications
- Miss Rate (MR) is a metric that quantifies the probability of a missed event, such as cache misses or deadline misses in real-time systems.
- Researchers apply a range of techniques, including exact combinatorial methods and Che's approximation, to compute MR in various domains.
- Optimizing MR is crucial for enhancing system efficiency, balancing trade-offs between energy, latency, and resource allocation in applications like caching and WLAN.
Miss Rate (MR) is a fundamental metric quantifying the proportion of expected retrieval or access events that result in a miss, i.e., the desired item, measurement, or result is absent or not received. MR serves as a performance indicator for diverse systems, including caching mechanisms, networked measurement acquisition, and real-time scheduling. It is typically formalized as the long-run ratio of misses to total expected accesses, with precise mathematical definitions varying across domains but consistently reflecting the probability or fraction of missed events conditioned on expected opportunity.
1. Formal Definitions and Domain-Specific Formulations
In caching, the miss rate for a cache of size $C$ under the independent reference model (IRM) is defined as the steady-state probability that a requested item is not resident in the cache, equivalently the long-run fraction of cache requests incurring a miss:
$$\mathrm{MR}(C) = \lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \mathbf{1}\{\text{request } t \text{ misses}\} = \sum_{i} p_i \, \Pr[\text{item } i \notin \text{cache}],$$
where $p_i$ is the request probability of item $i$.
This complements the hit rate and, for LRU caching, relates to the probability that the stack distance for the requested item exceeds the cache capacity (Berthet, 2016; Berthet, 2017).
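As a concrete illustration of this definition, the long-run miss fraction can be estimated by simulating an LRU cache under the IRM. The sketch below is a minimal illustration; the Zipf-like popularity law, item count, and cache size are arbitrary choices, not taken from the cited works.

```python
import random
from collections import OrderedDict

def lru_miss_rate(probs, capacity, n_requests=200_000, seed=0):
    """Estimate the long-run LRU miss rate under the IRM: each request
    draws item i with probability probs[i], independently of the past."""
    rng = random.Random(seed)
    items = list(range(len(probs)))
    cache = OrderedDict()              # keys kept in recency order (oldest first)
    misses = 0
    for _ in range(n_requests):
        item = rng.choices(items, weights=probs, k=1)[0]
        if item in cache:
            cache.move_to_end(item)    # refresh recency on a hit
        else:
            misses += 1
            cache[item] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used item
    return misses / n_requests

# Zipf-like popularity over 50 items, cache of size 10
weights = [1 / (i + 1) for i in range(50)]
total = sum(weights)
probs = [w / total for w in weights]
print(lru_miss_rate(probs, capacity=10))
```

With the cache as large as the item population, only cold-start misses remain, so the estimate approaches zero as the request count grows.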
For periodic soft real-time systems, the deadline miss rate (DMR) over the first $n$ jobs is given by
$$\mathrm{DMR}_n = \frac{1}{n} \sum_{j=1}^{n} M_j,$$
where $M_j \in \{0, 1\}$ is an indicator for whether job $j$ missed its deadline. The long-run DMR is defined as the almost sure limit $\lim_{n \to \infty} \mathrm{DMR}_n$ (Chen et al., 2024).
In networked measurement domains, such as WLAN-based indoor positioning, MR for received signal strength (RSS) samples is
$$\mathrm{MR} = \frac{N_{\mathrm{exp}} - N_{\mathrm{obs}}}{N_{\mathrm{exp}}},$$
where $N_{\mathrm{exp}}$ is the expected number of samples and $N_{\mathrm{obs}}$ the number actually observed in a window (Schmidt et al., 2019).
In constrained on-device expert caching such as SliceMoE, MR is formulated as the fraction of expert fetches not served from the DRAM cache,
$$\mathrm{MR} = \frac{\text{cache misses}}{\text{total expert fetches}} \le \mathrm{MR}_{\max},$$
and is treated as a constraint in energy-minimization optimization (Choi et al., 15 Dec 2025).
2. Exact and Approximate MR Computation in Caching
In LRU caching, MR admits both exact combinatorial and approximate analytical formulations. King's 1971 formula computes MR via summation over all ordered subsets (permutations) of items: under the IRM, the stationary probability that the cache holds $(j_1, \dots, j_C)$ in recency order is
$$\pi(j_1, \dots, j_C) = \prod_{k=1}^{C} \frac{p_{j_k}}{1 - p_{j_1} - \cdots - p_{j_{k-1}}},$$
so that $\mathrm{MR}(C) = 1 - \sum_{(j_1,\dots,j_C)} \pi(j_1, \dots, j_C)\,(p_{j_1} + \cdots + p_{j_C})$, aggregating products over probabilities and complements according to subset and ordering structure (Berthet, 2016). Flajolet et al. (1992) provide an equivalent integral-generating-function formula, with the correspondence proven by combinatorial means.
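For small item populations, King's summation can be evaluated directly by enumerating all ordered $C$-tuples of distinct items. The sketch below implements one standard statement of the formula (the recency-order stationary distribution of the LRU stack); the example probabilities are arbitrary.

```python
from itertools import permutations

def king_lru_miss_rate(probs, capacity):
    """Exact LRU miss rate under the IRM via King's (1971) summation:
    the stationary probability of cache content (j1, ..., jC) in recency
    order is prod_k p_{jk} / (1 - p_{j1} - ... - p_{j,k-1}); a request
    hits iff it targets one of the cached items."""
    n = len(probs)
    hit = 0.0
    for tup in permutations(range(n), capacity):
        pi = 1.0    # stationary probability of this ordered cache content
        used = 0.0  # probability mass of items already placed in the tuple
        for j in tup:
            pi *= probs[j] / (1.0 - used)
            used += probs[j]
        hit += pi * used  # next request hits with probability `used`
    return 1.0 - hit

print(king_lru_miss_rate([0.5, 0.25, 0.15, 0.1], capacity=2))
```

For capacity 1 the formula collapses to $1 - \sum_i p_i^2$, a useful sanity check, since a size-1 LRU cache hits exactly when a request repeats its predecessor.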
For large-scale or complex access patterns, exact MR computation is combinatorially infeasible. Che's approximation uses the concept of a characteristic time $T_C$: the hit probability of item $i$ is approximated as $h_i \approx 1 - e^{-\lambda_i T_C}$, where $\lambda_i$ is the request rate of item $i$ and $T_C$ is the unique time such that $\sum_i \bigl(1 - e^{-\lambda_i T_C}\bigr) = C$, yielding
$$\mathrm{MR}(C) \approx \sum_i p_i \, e^{-\lambda_i T_C}.$$
Under non-stationary request processes, corrections to Che's approximation are explicitly quantifiable, with asymptotic expansions providing error bounds (Olmos et al., 2015).
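Since the occupancy function $\sum_i (1 - e^{-\lambda_i t})$ is strictly increasing in $t$, the characteristic time can be found by simple bisection. A minimal sketch (rates $\lambda_i = p_i$ and the example distribution are arbitrary choices; the cache size must be smaller than the item count):

```python
import math

def che_miss_rate(probs, capacity, rate=1.0):
    """Che's approximation for LRU under the IRM: solve
    sum_i (1 - exp(-lambda_i * T)) = capacity for the characteristic
    time T via bisection (requires capacity < len(probs)), then return
    sum_i p_i * exp(-lambda_i * T)."""
    lam = [rate * p for p in probs]

    def filled(t):  # expected number of distinct items requested in (0, t]
        return sum(1.0 - math.exp(-l * t) for l in lam)

    lo, hi = 0.0, 1.0
    while filled(hi) < capacity:   # bracket the root
        hi *= 2.0
    for _ in range(100):           # bisection on a monotone function
        mid = 0.5 * (lo + hi)
        if filled(mid) < capacity:
            lo = mid
        else:
            hi = mid
    t_c = 0.5 * (lo + hi)
    return sum(p * math.exp(-l * t_c) for p, l in zip(probs, lam))

print(che_miss_rate([0.5, 0.25, 0.15, 0.1], capacity=2))
```

Even for this four-item example, the approximation lands close to the exact King value; the accuracy of Che-type estimates generally improves with cache size.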
Table 1: Representative Cache MR Formulas
| Formula Type | Expression | Remarks |
|---|---|---|
| King's exact (IRM) | Permutations over all subsets | Exact; cost grows combinatorially with cache size |
| Flajolet's integral | Integral + coefficient extraction (see above) | Equivalent to King by combinatorial lemma |
| Che's approximation | Characteristic time-based | Correctable under non-stationary requests |
| Power-law closed-form (Berthet, 2017) | Asymptotic MR under power-law demand | For Zipf/popularity laws |
3. Analytical Models and Asymptotics
Under IRM and power-law popularity ($p_i \propto i^{-\alpha}$), MR exhibits distinct asymptotic regimes depending on the tail index $\alpha$, the cache fraction, and the request distribution. For heavy-tail demand ($\alpha > 1$), small cache sizes yield suboptimal LRU performance relative to static caching; the Zipf ($\alpha = 1$) and light-tail ($0 < \alpha < 1$) regimes exhibit qualitatively different MR scaling (Berthet, 2017).
When modeling networked measurement or packet arrivals, MR can be interpreted via Poisson or binomial point-process approximations, e.g., Poisson arrivals at rate $\lambda$ over a window of length $\tau$ yield an expected sample count $N_{\mathrm{exp}} = \lambda \tau$ and
$$\mathrm{MR} = 1 - \frac{N_{\mathrm{obs}}}{\lambda \tau},$$
which generalizes to binomial-loss models in which each sample is received independently with success probability $q$, giving $\mathbb{E}[\mathrm{MR}] = 1 - q$ (Schmidt et al., 2019).
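A Monte Carlo sketch of this interpretation, combining Poisson sample generation with binomial loss; the rate, window length, and success probability are illustrative values, not figures from (Schmidt et al., 2019):

```python
import random

def estimate_sample_miss_rate(rate, window, success_prob,
                              n_windows=10_000, seed=0):
    """Estimate the measurement miss rate: in each window of length
    `window`, samples are generated as a Poisson process of intensity
    `rate` (expected count rate*window); each generated sample is
    actually received with probability `success_prob` (binomial loss).
    MR is the fraction of expected samples that never arrive."""
    rng = random.Random(seed)
    expected_total = rate * window * n_windows
    observed_total = 0
    for _ in range(n_windows):
        # draw Poisson arrivals via exponential inter-arrival times
        k, t = 0, rng.expovariate(rate)
        while t < window:
            k += 1
            t += rng.expovariate(rate)
        # each transmitted sample survives the loss channel independently
        observed_total += sum(rng.random() < success_prob for _ in range(k))
    return 1.0 - observed_total / expected_total

mr = estimate_sample_miss_rate(rate=5.0, window=1.0, success_prob=0.9)
print(mr)
```

In expectation the estimate equals $1 - q$ (here 0.1), since the mean observed count per window is $q \lambda \tau$.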
Table 2: MR Asymptotics in LRU Cache under Power-law Demand

| Tail regime | LRU MR behavior |
|---|---|
| Heavy tail ($\alpha > 1$) | Suboptimal relative to static caching at small cache sizes |
| Zipf ($\alpha = 1$) and light tail ($0 < \alpha < 1$) | Qualitatively different MR scaling (Berthet, 2017) |
4. Impact, Measurement, and Optimization
MR directly governs system efficiency, energy usage, and latency. For instance, fast-rate WLAN RSS acquisition reduces MR from 41.7% (normal mode, –80 dBm) to 5.6% (monitor mode), yielding a >7× improvement and accelerating indoor navigation survey duration by an order of magnitude (Schmidt et al., 2019). In SliceMoE inference, DRAM cache allocation subject to a miss-rate constraint enables model designers to minimize energy while bounding Flash retrieval events, i.e., minimizing energy subject to $\mathrm{MR} \le \mathrm{MR}_{\max}$ (Choi et al., 15 Dec 2025). MR can be fine-tuned via dynamic bit-sliced caching, precision routing, and predictive cache warmup.
Deadline miss rate in periodic scheduling is amenable to rigorous analysis using finite-state Markov chains whose stationary distribution encodes the long-run MR, with system parameters (e.g., the dismiss point) allowing designers to interpolate between immediate-dismiss and never-dismiss regimes (Chen et al., 2024).
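The stationary-distribution computation can be sketched on a toy chain. The three-state backlog model, transition matrix, and per-state miss probabilities below are purely illustrative and are not taken from (Chen et al., 2024); the point is only that the long-run DMR is the stationary expectation of the per-state miss probability.

```python
import numpy as np

# Hypothetical 3-state backlog chain (no backlog / light / heavy);
# P[i, j] is the transition probability, miss[i] the probability that a
# job released in state i misses its deadline. Numbers are illustrative.
P = np.array([[0.8, 0.2, 0.0],
              [0.5, 0.3, 0.2],
              [0.1, 0.4, 0.5]])
miss = np.array([0.0, 0.1, 0.6])

# For an irreducible aperiodic chain, every row of P^n converges to the
# stationary distribution; take row 0 of a high matrix power.
pi = np.linalg.matrix_power(P, 200)[0]

# Long-run DMR = stationary expectation of the per-state miss probability.
long_run_dmr = float(pi @ miss)
print(long_run_dmr)
```

For this chain the stationary distribution is roughly (0.659, 0.244, 0.098), giving a long-run DMR near 0.083; changing the dismiss policy would reshape both the transition matrix and the per-state miss probabilities.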
5. Practical Trade-offs, Optimization, and Policy Design
Miss rate acts as a pivotal constraint and objective in system optimization: MR can be minimized at the expense of increased memory footprint, energy, or latency in hardware-constrained environments. In MoE models, bit-sliced caching and calibration-free quantization free up DRAM for storing critical expert slices, directly lowering MR under a fixed capacity (Choi et al., 15 Dec 2025). In general, tuning eviction policies, replacement strategies, and prefetching approaches can mitigate the adverse effects of high miss rates.
In real-time scheduling, the design of the dismissal policy (the dismiss point) quantitatively alters the long-run deadline miss rate and the trade-off between tardy task execution and interference suppression (Chen et al., 2024). In caching, the choice among LRU, static, and adaptive policies imparts distinct MR profiles, with replacement optimization—particularly under skewed or clustered access patterns—yielding significant practical gains.
6. Algorithmic and Implementation Notes
Exact MR formulas are combinatorially expensive; dynamic programming recurrences and integral approximations greatly enhance tractability for realistic system sizes (Berthet, 2016). In performance-sensitive domains, MR estimation via Che-type approximations, Poisson/TTL modeling, and clustering-aware corrections provides robust, scalable solutions (Olmos et al., 2015; Berthet, 2017). Algorithmic best practices involve exploiting monotonicity, set-partitioning, and access frequency statistics to manage or predict MR.
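The monotonicity point can be made concrete: since MR is non-increasing in cache capacity, the smallest capacity meeting a target MR can be found by binary search over cache sizes. The sketch below uses a Che-style estimate as the MR oracle; the popularity law and target are arbitrary choices for illustration.

```python
import math

def che_mr(probs, capacity):
    """Che-style MR estimate (Section 2): bisect for the characteristic
    time, then sum per-item miss probabilities (rates lambda_i = p_i)."""
    def filled(t):
        return sum(1.0 - math.exp(-p * t) for p in probs)
    lo, hi = 0.0, 1.0
    while filled(hi) < capacity:
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if filled(mid) < capacity else (lo, mid)
    t_c = 0.5 * (lo + hi)
    return sum(p * math.exp(-p * t_c) for p in probs)

def min_capacity_for_target(probs, target_mr):
    """Binary search over cache sizes, exploiting that MR is
    non-increasing in capacity."""
    lo, hi = 1, len(probs) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if che_mr(probs, mid) <= target_mr:
            hi = mid      # target met: try smaller caches
        else:
            lo = mid + 1  # target missed: need more capacity
    return lo

# Zipf-like popularity over 100 items; find the smallest cache with MR <= 0.2
weights = [1 / (i + 1) for i in range(100)]
total = sum(weights)
probs = [w / total for w in weights]
print(min_capacity_for_target(probs, target_mr=0.2))
```

The same search pattern works with any monotone MR oracle, including the simulation-based estimators above.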
Table 3: MR Policies and Techniques
| System | MR Definition | Policy/Optimization |
|---|---|---|
| LRU Cache | Fraction of cache misses | Che’s approximation, static |
| WLAN IPS (Schmidt et al., 2019) | Fraction of lost RSS samples | Monitor mode, faster survey |
| MoE Inference | DRAM cache misses | DBSC, AMAT, PCW, constraint |
| Soft RT (Chen et al., 2024) | Fraction of jobs missing deadline | Markov-chain tuning, dismiss point |
7. Conceptual Significance and Directions
Miss rate provides a unifying lens for analyzing retrieval, scheduling, and resource-constrained systems. Its interpretation encompasses classical probability, queueing theory, combinatorial analysis, and stochastic process modeling. MR figures centrally in both theoretical performance bounds and practical user experience, guiding system evolution in domains from processor caches to real-time data acquisition and model serving. Future work extends MR analysis to non-iid, non-stationary, and adversarial access patterns, distributed caching, and energy-aware embedded inference.
The application-specific character of MR, together with its formal equivalence across diverse computational domains, underpins its continued centrality as a design metric and analytical construct.