Discrete Lomax Distribution
- The discrete Lomax distribution is a two-parameter discrete model defined by closed-form pmf, cdf, survival, and hazard functions, capturing overdispersed count data with heavy right tails.
- It effectively models phenomena with extra-Poisson variability, such as accident counts or industrial strike data, thanks to its flexibility in handling a mass at zero and extreme events.
- Parameter estimation is carried out through frequency methods and maximum likelihood, with goodness-of-fit assessed via chi-square and discrete KS tests for reliable inference.
The discrete Lomax distribution is a two-parameter discrete probability model that arises as the “zero-vertex” ($\mu = 0$) specialization of the discrete generalized Pareto distribution. Its pmf, cumulative, survival, quantile, and hazard functions all admit closed-form expressions. The distribution is suited to overdispersed count data, which typically feature a substantial mass at zero and a heavy right tail. Its flexibility and tractability make it effective for modeling phenomena such as the annual number of deaths at road-accident blackspots and count data exhibiting extra-Poisson variability (Prieto et al., 2013, Ghosh et al., 2018).
1. Definition and Fundamental Properties
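The discrete Lomax distribution, denoted here $\mathrm{DLomax}(\alpha, \lambda)$, is defined for $x = 0, 1, 2, \dots$ by its cumulative distribution function (cdf):
$$F(x) = 1 - [1 + \lambda(x+1)]^{-\alpha}.$$
Here $\alpha > 0$ is a shape parameter and $\lambda > 0$ a scale parameter. The pmf follows as the difference:
$$p(x) = F(x) - F(x-1) = (1 + \lambda x)^{-\alpha} - [1 + \lambda(x+1)]^{-\alpha}.$$
The survival function (tail probability) is
$$S(x) = P(X > x) = 1 - F(x) = [1 + \lambda(x+1)]^{-\alpha}.$$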
These closed-form expressions imply computational tractability for evaluation and sampling via inversion.
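As a minimal sketch of how these functions can be evaluated, the Python below assumes the parameterization $F(x) = 1 - [1 + \lambda(x+1)]^{-\alpha}$ with shape `alpha` and scale `lam`. The function names are illustrative and do not come from the cited papers.

```python
import math

def dlomax_sf(x, alpha, lam):
    """Survival function P(X > x) = [1 + lam*(x+1)]^(-alpha); equals 1 for x < 0."""
    if x < 0:
        return 1.0
    return (1.0 + lam * (math.floor(x) + 1)) ** (-alpha)

def dlomax_cdf(x, alpha, lam):
    """Cumulative distribution function F(x) = 1 - P(X > x)."""
    return 1.0 - dlomax_sf(x, alpha, lam)

def dlomax_pmf(x, alpha, lam):
    """P(X = x) = F(x) - F(x-1) for x = 0, 1, 2, ...; zero elsewhere."""
    if x < 0 or x != int(x):
        return 0.0
    return dlomax_sf(x - 1, alpha, lam) - dlomax_sf(x, alpha, lam)

# Illustrative parameter values (not estimates from any dataset).
alpha, lam = 3.0, 0.5
probs = [dlomax_pmf(x, alpha, lam) for x in range(5)]
```

Computing the pmf as a difference of survival values mirrors the cdf-difference definition, so partial sums of `dlomax_pmf` reproduce `dlomax_cdf` up to rounding.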
The discrete Lomax also arises as a special case of the more general discrete gamma-Lomax distribution (DGLD): at a particular setting of the DGLD parameters, its pmf reduces to the cdf-difference form above after reparameterization (Ghosh et al., 2018).
2. Moments and Tail Behavior
The $r$th moment of the discrete Lomax exists for $\alpha > r$ and is expressible via a tail sum:
$$E[X^r] = \sum_{x=1}^{\infty} \left[x^r - (x-1)^r\right] (1 + \lambda x)^{-\alpha}.$$
The mean exists for $\alpha > 1$, and the variance for $\alpha > 2$. No closed form is available, but the convergent sum is practical for numerical computation.
The distribution possesses a heavy right tail. As $x \to \infty$, $P(X > x) \sim (\lambda x)^{-\alpha}$, a power-law decay. This tail behavior enables the model to capture extreme count events.
Overdispersion is inherent: $\mathrm{Var}(X) > E[X]$ for a broad parameter range. This makes the discrete Lomax suitable for data where classical Poisson models severely understate the empirical variance (Prieto et al., 2013, Ghosh et al., 2018).
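The tail-sum representation can be evaluated by simple truncation. The sketch below assumes $E[X^r] = \sum_{x \ge 1} [x^r - (x-1)^r](1+\lambda x)^{-\alpha}$ with shape `alpha` and scale `lam`. The function name and the fixed truncation point are illustrative choices, and convergence becomes slow when `alpha` is only slightly larger than `r`.

```python
def dlomax_raw_moment(r, alpha, lam, n_terms=100_000):
    """Truncated tail sum for E[X^r]; finite only when alpha > r."""
    if alpha <= r:
        raise ValueError("E[X^r] is finite only when alpha > r")
    return sum(
        (x ** r - (x - 1) ** r) * (1.0 + lam * x) ** (-alpha)
        for x in range(1, n_terms + 1)
    )

# Illustrative parameter values (not estimates from any dataset).
alpha, lam = 6.0, 0.3
mean = dlomax_raw_moment(1, alpha, lam)
variance = dlomax_raw_moment(2, alpha, lam) - mean ** 2
overdispersed = variance > mean  # variance-to-mean comparison for this example
```

For these example values the variance exceeds the mean, which is the overdispersion pattern described above.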
3. Quantile and Hazard Functions
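The quantile function (inverse cdf) for level $0 < u < 1$ is given by
$$Q(u) = \left\lceil \frac{(1-u)^{-1/\alpha} - 1}{\lambda} - 1 \right\rceil,$$
the smallest integer $x$ with $F(x) \ge u$.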
This operation facilitates efficient random variate generation via inversion.
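A minimal inversion sampler, assuming the cdf $F(x) = 1 - [1+\lambda(x+1)]^{-\alpha}$ so that the quantile is the smallest integer $x$ with $F(x) \ge u$, might look as follows. The names `dlomax_quantile` and `dlomax_sample` are hypothetical, and floating-point rounding can shift the result by one when $u$ falls almost exactly on a cdf step.

```python
import math
import random

def dlomax_quantile(u, alpha, lam):
    """Smallest integer x >= 0 with F(x) >= u, for 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    t = ((1.0 - u) ** (-1.0 / alpha) - 1.0) / lam - 1.0
    # max() guards against a result of -1 when rounding makes t exactly -1.0.
    return max(0, math.ceil(t))

def dlomax_sample(n, alpha, lam, rng=None):
    """Draw n variates by inversion of uniform random numbers."""
    rng = rng or random.Random()
    draws = []
    while len(draws) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # skip the measure-zero endpoint u == 0
            draws.append(dlomax_quantile(u, alpha, lam))
    return draws

# Illustrative usage with a fixed seed for reproducibility.
draws = dlomax_sample(5, 3.0, 0.5, rng=random.Random(42))
```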
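The discrete hazard function is
$$h(x) = \frac{p(x)}{P(X \ge x)} = 1 - \left(\frac{1 + \lambda x}{1 + \lambda(x+1)}\right)^{\alpha}.$$
This function is strictly decreasing in $x$, reflecting a decreasing failure rate (DFR): the conditional probability that the count equals $x$, given that it is at least $x$, declines as $x$ increases (Prieto et al., 2013, Ghosh et al., 2018).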
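The decreasing hazard can be checked numerically. This sketch assumes the hazard is defined as $h(x) = P(X = x \mid X \ge x) = 1 - [(1+\lambda x)/(1+\lambda(x+1))]^{\alpha}$; the function name is illustrative.

```python
def dlomax_hazard(x, alpha, lam):
    """Discrete hazard P(X = x | X >= x) for integer x >= 0."""
    return 1.0 - ((1.0 + lam * x) / (1.0 + lam * (x + 1))) ** alpha

# Illustrative parameter values; the sequence should fall as x grows.
hazards = [dlomax_hazard(x, 3.0, 0.5) for x in range(6)]
is_decreasing = all(a > b for a, b in zip(hazards, hazards[1:]))
```

The ratio inside the power increases toward 1 as `x` grows, which is why the hazard falls toward 0.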
4. Parameter Estimation
Two inference approaches are common: the frequency method, which matches the observed proportions of zeros and ones, and maximum likelihood estimation (MLE).
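Frequency method: For sample proportions $f_0$ (zero count) and $f_1$ (one count), the equations
$$f_0 = 1 - (1+\lambda)^{-\alpha}, \qquad f_1 = (1+\lambda)^{-\alpha} - (1+2\lambda)^{-\alpha}$$
are solved by first eliminating $\alpha$:
$$\frac{\log(1+2\lambda)}{\log(1+\lambda)} = \frac{\log(1-f_0-f_1)}{\log(1-f_0)}.$$
Numerical root finding for $\lambda$ yields $\hat{\lambda}$, and then
$$\hat{\alpha} = -\frac{\log(1-f_0)}{\log(1+\hat{\lambda})}.$$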
The resulting estimates supply initial values for MLE.
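One way to carry out the frequency method is bisection on the single equation left after eliminating the shape parameter. The sketch assumes $f_0 = 1-(1+\lambda)^{-\alpha}$ and $f_1 = (1+\lambda)^{-\alpha} - (1+2\lambda)^{-\alpha}$, so that $\log(1+2\lambda)/\log(1+\lambda) = \log(1-f_0-f_1)/\log(1-f_0)$. The function name, the search bracket, and the use of bisection are illustrative choices, not taken from the cited papers.

```python
import math

def dlomax_frequency_fit(f0, f1, lo=1e-8, hi=1e8, iters=200):
    """Return (alpha_hat, lam_hat) matching the proportions of zeros and ones."""
    if not (0.0 < f0 < 1.0 and 0.0 < f1 < 1.0 - f0):
        raise ValueError("need 0 < f0 < 1 and 0 < f1 < 1 - f0")
    target = math.log(1.0 - f0 - f1) / math.log(1.0 - f0)

    def g(lam):
        # Decreases from 2 (lam -> 0) toward 1 (lam -> infinity).
        return math.log1p(2.0 * lam) / math.log1p(lam)

    if not g(hi) < target < g(lo):
        raise ValueError("no root in bracket (a solution requires f1 < f0*(1-f0))")
    a, b = math.log(lo), math.log(hi)      # bisect on the log scale
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if g(math.exp(mid)) > target:
            a = mid                        # g too large: root lies at larger lam
        else:
            b = mid
    lam_hat = math.exp(0.5 * (a + b))
    alpha_hat = -math.log(1.0 - f0) / math.log1p(lam_hat)
    return alpha_hat, lam_hat

# Illustrative proportions (hypothetical, not from the cited data).
alpha_start, lam_start = dlomax_frequency_fit(0.70, 0.17)
```

A solution exists only when $f_1 < f_0(1-f_0)$; otherwise the function raises rather than returning a boundary value.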
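Maximum likelihood estimation: For data $x_1, \dots, x_n$, the log-likelihood is
$$\ell(\alpha, \lambda) = \sum_{i=1}^{n} \log\left\{(1 + \lambda x_i)^{-\alpha} - [1 + \lambda(x_i+1)]^{-\alpha}\right\}.$$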
Parameter optimization requires numerical maximization, typically initialized at the frequency-based estimates. The asymptotic normality of MLEs enables standard error computation via the observed information matrix (Prieto et al., 2013, Ghosh et al., 2018).
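The following sketch evaluates the log-likelihood, assuming the pmf $p(x) = (1+\lambda x)^{-\alpha} - [1+\lambda(x+1)]^{-\alpha}$, and maximizes it with a naive compass (coordinate pattern) search over $(\log\alpha, \log\lambda)$. The source does not specify an optimizer, so the compass search is a stand-in chosen to keep the example dependency-free. The function names, step sizes, and data are hypothetical, and no standard errors are computed.

```python
import math
from collections import Counter

def dlomax_loglik(alpha, lam, counts):
    """Log-likelihood for a dict {count value: frequency}."""
    total = 0.0
    for x, n in counts.items():
        p = (1.0 + lam * x) ** (-alpha) - (1.0 + lam * (x + 1)) ** (-alpha)
        if p <= 0.0:              # underflow: treat as zero likelihood
            return -math.inf
        total += n * math.log(p)
    return total

def fit_dlomax_mle(data, alpha0=1.0, lam0=1.0, step=0.5, tol=1e-8, max_iter=5000):
    """Compass search on log-parameters; returns (alpha, lam, loglik)."""
    counts = Counter(data)
    theta = [math.log(alpha0), math.log(lam0)]
    best = dlomax_loglik(alpha0, lam0, counts)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(theta)
                trial[i] += delta
                if abs(trial[i]) > 50.0:   # keep exp() within float range
                    continue
                value = dlomax_loglik(math.exp(trial[0]), math.exp(trial[1]), counts)
                if value > best:           # accept only strict improvements
                    theta, best, improved = trial, value, True
        if not improved:
            step *= 0.5                    # shrink the pattern and retry
    return math.exp(theta[0]), math.exp(theta[1]), best

# Hypothetical count data; in practice the starting values would come
# from the frequency-based estimates.
data = [0] * 70 + [1] * 17 + [2] * 6 + [3] * 3 + [4] * 2 + [6, 9]
alpha_hat, lam_hat, loglik = fit_dlomax_mle(data)
```

Because the likelihood surface is often flat along a ridge where $\alpha\lambda$ is roughly constant, a gradient-based or quasi-Newton optimizer from a numerical library would usually be preferable in practice.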
5. Model Assessment and Goodness-of-Fit
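Goodness-of-fit is conventionally assessed by chi-square or Kolmogorov–Smirnov (KS) tests adapted for discrete data.
- Chi-square test: Data are binned into $k$ cells with observed counts $O_j$ and expected counts $E_j$, and the statistic
$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$
is compared to the $\chi^2$ quantile with $k - 3$ degrees of freedom ($k - 1$ minus the two estimated parameters) under the null model fit.
- Discrete KS test by parametric bootstrap: The empirical cdf $F_n$ is compared to the fitted model cdf $F(\cdot\,; \hat{\alpha}, \hat{\lambda})$ through $D = \sup_x |F_n(x) - F(x; \hat{\alpha}, \hat{\lambda})|$. A large number of synthetic samples are generated from the fitted model, re-fitted, and used to compute a bootstrap p-value as the proportion of simulated $D^*$ exceeding the observed $D$. Rejection occurs for p-values below the chosen significance level, e.g. $0.05$ (Prieto et al., 2013).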
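The two test statistics can be sketched as follows, assuming the cdf $F(x) = 1 - [1+\lambda(x+1)]^{-\alpha}$, chi-square cells $\{0\}, \{1\}, \dots, \{k-2\}$ plus a pooled tail cell $\{x \ge k-1\}$, and $k - 3$ degrees of freedom for two estimated parameters. The function names and the binning rule are illustrative. The p-value steps (chi-square quantile lookup and the bootstrap re-fitting loop) are omitted.

```python
from collections import Counter

def dlomax_cdf(x, alpha, lam):
    """Model cdf F(x) = 1 - [1 + lam*(x+1)]^(-alpha) for integer x >= 0."""
    return 1.0 - (1.0 + lam * (x + 1)) ** (-alpha)

def ks_distance(data, alpha, lam):
    """sup_x |F_n(x) - F(x)|; for step functions on 0..max(data) it suffices
    to check the integer points, since the gap only shrinks beyond max(data)."""
    n, counts = len(data), Counter(data)
    cumulative, distance = 0, 0.0
    for x in range(max(data) + 1):
        cumulative += counts.get(x, 0)
        distance = max(distance, abs(cumulative / n - dlomax_cdf(x, alpha, lam)))
    return distance

def chi_square_stat(data, alpha, lam, n_bins):
    """Pearson statistic with a pooled upper-tail cell; returns (stat, df)."""
    n, counts = len(data), Counter(data)
    stat = 0.0
    for j in range(n_bins):
        if j < n_bins - 1:
            observed = counts.get(j, 0)
            prob = (1.0 + lam * j) ** (-alpha) - (1.0 + lam * (j + 1)) ** (-alpha)
        else:
            observed = sum(c for x, c in counts.items() if x >= j)
            prob = (1.0 + lam * j) ** (-alpha)      # P(X >= j)
        expected = n * prob
        stat += (observed - expected) ** 2 / expected
    return stat, n_bins - 3

# Hypothetical sample and parameter values for illustration.
sample = [0] * 6 + [1] * 2 + [2, 3, 4, 5]
d_stat = ks_distance(sample, 1.0, 1.0)
chi2_stat, dof = chi_square_stat(sample, 1.0, 1.0, n_bins=4)
```

In a full bootstrap test, `ks_distance` would be recomputed on each simulated sample after re-estimating the parameters on that sample.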
6. Applications and Empirical Studies
Prieto et al. (2013) modeled annual counts of deaths at Spanish road-accident blackspots (2003–2007) using the discrete Lomax. Maximum likelihood estimates and standard errors for each year are summarized as follows:
| Year | $\hat{\alpha}$ (s.e.) | $\hat{\lambda}$ (s.e.) |
|---|---|---|
| 2003 | 6.55 (2.07) | 0.314 (0.118) |
| 2004 | 13.86 (9.90) | 0.129 (0.100) |
| 2005 | 5.49 (1.68) | 0.381 (0.144) |
| 2006 | 4.34 (1.16) | 0.536 (0.186) |
| 2007 | 10.83 (5.88) | 0.204 (0.125) |
Goodness-of-fit tests (both chi-square and KS with bootstrap) yielded p-values above the $0.05$ significance level in all years, 2003 included, so the discrete Lomax model was not rejected at conventional levels (Prieto et al., 2013).
Additionally, the discrete Lomax arises as a special case of the discrete gamma-Lomax, which has been fitted to blockades and strike counts in UK industrial data, demonstrating its capacity to model overdispersed, heavy-tailed count processes (Ghosh et al., 2018).
7. Connections to Related Distributions
The discrete Lomax is embedded within the general class of discrete generalized Pareto (DGP) distributions, parameterized by location $\mu$, shape, and scale. Setting $\mu = 0$ yields the Lomax as a two-parameter reduction. It is also a limiting form of the discrete gamma-Lomax, characterized via Poisson–Gamma mixtures discretized by the cdf-difference scheme.
A notable property is the unification of heavy-tailed count models with closed-form expressions for the pmf and likelihood, and with moments available as convergent sums. The decreasing-failure-rate property, which holds in the relevant parameter range of the discrete gamma-Lomax and thus for the discrete Lomax, distinguishes it from other popular models such as the Poisson and negative binomial, especially in settings with many zeros and high dispersion (Ghosh et al., 2018).
The discrete Lomax and its generalizations are readily implemented in likelihood-based inferential frameworks using standard statistical software, and offer interpretable parameters controlling both dispersion and tail heaviness.