
Discrete Generalized Pareto Distribution

Updated 20 January 2026
  • DGPD is a discrete parametric model derived by bin-wise discretization of the continuous Generalized Pareto Distribution, tailored for integer-valued, heavy-tailed data.
  • It employs three key parameters—location, scale, and shape—to flexibly model overdispersed counts, rare-event tails, and extreme phenomena across diverse applications.
  • Applications range from insurance claims and road accident frequencies to environmental extremes, with estimation performed via seed methods, MLE, and bootstrap diagnostics.

A Discrete Generalized Pareto Distribution (DGPD) is a discrete, parametric probability model that arises by bin-wise discretization of the continuous Generalized Pareto Distribution. DGPDs are designed for modeling heavily right-tailed, integer-valued data, especially exceedances over high thresholds in count settings, where ties and the discreteness of the data challenge traditional continuous extreme value methods. The canonical DGPD has three parameters (location, scale, and shape), supporting flexible modeling of overdispersed count phenomena, rare-event tails, and applications in fields such as insurance claim frequencies, environmental extremes, accident counts, and word frequency distributions (Prieto et al., 2013, Aka et al., 24 Jun 2025, Ahmad et al., 2024).

1. Definition, Support, and Genesis

For a random variable X \in \{\mu, \mu+1, \mu+2, \dots\}, the three-parameter DGPD is specified by

F(x) = \Pr(X \le x) = 1 - [1 + \lambda (x - \mu + 1)]^{-\alpha}, \qquad x \geq \mu

where

  • \alpha > 0: shape parameter, controlling tail thickness and moments,
  • \lambda > 0: scale parameter, controlling tail spread,
  • \mu \in \mathbb{N}_0: location (threshold).

The probability mass function (pmf) is given by

p(x) = [1+\lambda(x-\mu)]^{-\alpha} - [1+\lambda(x-\mu+1)]^{-\alpha}, \qquad x=\mu,\mu+1,\dots

This DGPD arises from discretizing the continuous GP survival function, S_{\text{cont}}(x) = [1+\lambda(x-\mu)]^{-\alpha}, by the bin-wise difference, so that p(x) = S_{\text{cont}}(x) - S_{\text{cont}}(x+1) (Prieto et al., 2013, Dzidzornu et al., 2020).
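The bin-wise genesis above translates directly into code. A minimal sketch (function names are the editor's own) of the survival function, pmf, and CDF in the (\alpha, \lambda, \mu) parameterization:

```python
def gp_sf(x, alpha, lam, mu=0):
    # continuous GP survival function S_cont(x) = [1 + lam*(x - mu)]^(-alpha)
    return (1.0 + lam * (x - mu)) ** (-alpha)

def dgpd_pmf(x, alpha, lam, mu=0):
    # bin-wise difference of the continuous survival function: p(x) = S(x) - S(x+1)
    return gp_sf(x, alpha, lam, mu) - gp_sf(x + 1, alpha, lam, mu)

def dgpd_cdf(x, alpha, lam, mu=0):
    # F(x) = 1 - [1 + lam*(x - mu + 1)]^(-alpha)
    return 1.0 - gp_sf(x + 1, alpha, lam, mu)
```

By construction the pmf sums to one over x = \mu, \mu+1, \dots, since the bin-wise differences telescope to S_{\text{cont}}(\mu) = 1.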

For \mu=0, the DGPD simplifies to the two-parameter discrete Lomax law, used for zero-inflated or left-truncated domains.

The DGPD framework generalizes across parameterizations: in statistical extreme value theory, the alternative form

p(k; \sigma, \xi) = \left(1+\xi \tfrac{k}{\sigma}\right)^{-1/\xi} - \left(1+\xi \tfrac{k+1}{\sigma}\right)^{-1/\xi}

for k=0,1,2,\dots, is prevalent; it recovers the geometric law at \xi=0 (Hitz et al., 2017, Aka et al., 24 Jun 2025, Ahmad et al., 2024).
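The geometric limit at \xi = 0 can be checked numerically. A small sketch (editor's own, in the (\sigma, \xi) parameterization above, with the \xi \to 0 exponential limit taken explicitly):

```python
import math

def dgpd_pmf_sx(k, sigma, xi):
    # pmf in the (sigma, xi) parameterization; xi -> 0 handled via the exponential limit
    if abs(xi) < 1e-12:
        return math.exp(-k / sigma) - math.exp(-(k + 1) / sigma)
    return (1 + xi * k / sigma) ** (-1 / xi) - (1 + xi * (k + 1) / sigma) ** (-1 / xi)

# at xi = 0 the pmf is geometric with success probability p = 1 - exp(-1/sigma)
sigma = 2.0
p = 1 - math.exp(-1 / sigma)
for k in range(6):
    assert abs(dgpd_pmf_sx(k, sigma, 0.0) - p * (1 - p) ** k) < 1e-12
```

The identity holds exactly: e^{-k/\sigma} - e^{-(k+1)/\sigma} = (1 - e^{-1/\sigma}) e^{-k/\sigma}, a geometric pmf in k.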

2. Distributional Properties and Functions

DGPDs admit explicit forms for key distributional quantities:

  • Survival function: \bar{F}(x) = \Pr(X \ge x) = [1+\lambda(x-\mu)]^{-\alpha}
  • Hazard function:

r(x) = \frac{p(x)}{\bar{F}(x)} = 1 - \left[\frac{1+\lambda(x-\mu)}{1+\lambda(x-\mu+1)}\right]^\alpha

(strictly decreasing in x) (Prieto et al., 2013).

  • Quantile function: for \gamma \in (0,1),

x_\gamma = \left\lceil \frac{(1-\gamma)^{-1/\alpha}-1}{\lambda} - 1 + \mu \right\rceil

  • Moments: The r-th moment,

E[X^r] = \sum_{x=\mu+1}^{\infty} \frac{x^r - (x-1)^r}{[1+\lambda(x-\mu)]^\alpha}

is finite for \alpha > r (Prieto et al., 2013).
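These closed forms translate directly. A brief sketch (editor's own, specialized to the \mu = 0 discrete Lomax case) of the quantile function and a truncated moment sum:

```python
import math

def dgpd_quantile(gamma, alpha, lam, mu=0):
    # x_gamma = ceil( ((1-gamma)^(-1/alpha) - 1)/lam - 1 + mu )
    return math.ceil(((1 - gamma) ** (-1 / alpha) - 1) / lam - 1 + mu)

def dgpd_moment(r, alpha, lam, n_terms=200_000):
    # E[X^r] = sum_{x>=1} (x^r - (x-1)^r) * [1 + lam*x]^(-alpha), truncated;
    # the series is finite only when alpha > r (mu = 0 case)
    return sum((x ** r - (x - 1) ** r) * (1 + lam * x) ** (-alpha)
               for x in range(1, n_terms))

# e.g. the 0.9-quantile for alpha = 2, lam = 0.5, mu = 0 is 4
```

Since the moment series only converges for \alpha > r, the truncation length should be checked against the tail decay x^{r-1-\alpha} before trusting the sum.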

3. Inference, Estimation, and Diagnostics

Parameter estimation proceeds via two principal routes:

  • μ–(μ+1) frequency (“seed”) method: Empirical frequencies at minimal and next-minimal observed values yield nonlinear equations for scale and shape,

\begin{aligned} \hat p_\mu &= 1 - (1+\lambda)^{-\alpha} \\ \hat p_{\mu+1} &= (1+\lambda)^{-\alpha} - (1+2\lambda)^{-\alpha} \end{aligned}

solved numerically for \lambda, then \alpha (Prieto et al., 2013, Dzidzornu et al., 2020).
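The two seed equations can be reduced to a single root-finding problem: the first gives \alpha = -\ln(1-\hat p_\mu)/\ln(1+\lambda), which is substituted into the second and solved for \lambda. A hypothetical sketch (function names and the bisection bracket are the editor's own):

```python
import math

def seed_estimates(p_mu, p_mu1, lam_lo=1e-6, lam_hi=100.0, tol=1e-10):
    """Solve p_mu = 1-(1+lam)^-a and p_mu1 = (1+lam)^-a - (1+2lam)^-a
    for (alpha, lam) by eliminating alpha and bisecting on lam."""
    def alpha_of(lam):
        # from the first seed equation: (1+lam)^-alpha = 1 - p_mu
        return -math.log(1.0 - p_mu) / math.log(1.0 + lam)

    def f(lam):
        # residual of the second seed equation at (alpha_of(lam), lam)
        return (1.0 - p_mu) - (1.0 + 2.0 * lam) ** (-alpha_of(lam)) - p_mu1

    lo, hi = lam_lo, lam_hi
    if f(lo) * f(hi) > 0:
        raise ValueError("no sign change in bracket; adjust lam_lo/lam_hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    return alpha_of(lam), lam
```

Plugging frequencies generated from known (\alpha, \lambda) back through the solver recovers the parameters, which is a useful sanity check before applying it to empirical frequencies.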

  • Maximum likelihood: the log-likelihood is

\ell(\alpha, \lambda) = \sum_{i=1}^n \ln\left\{[1+\lambda(x_i - \mu)]^{-\alpha} - [1+\lambda(x_i - \mu + 1)]^{-\alpha}\right\}

Numerical optimization is typically used, with the “seed” estimates as starting values (Prieto et al., 2013, Dzidzornu et al., 2020). The MLE is consistent and asymptotically normal under regularity conditions (Hitz et al., 2017).

To assess standard errors, non-parametric bootstrap resampling is commonly applied. Model selection and fit are evaluated using information criteria (AIC, BIC) and tailored goodness-of-fit tests.

Threshold selection (for exceedances modeling) is addressed via mean residual life plots or parameter stability plots of (\widehat{\lambda}, \widehat{\alpha}) versus the threshold (Hitz et al., 2017, Aka et al., 24 Jun 2025).

4. Extensions: Univariate and Multivariate Generalizations

Recent proposals address deficiencies in classical DGPD modeling, primarily regarding threshold selection and coverage of the bulk distribution:

  • Extended DGPDs (Editor’s term): Model entire count distributions by composing the continuous GPD CDF with a bulk-shaping CDF \mathcal{G}(u; \kappa), e.g., power transforms, truncated normal or beta (Ahmad et al., 2024, Ahmad et al., 2022).
  • Zero-Inflated DGPDs: Incorporate excess zeros via a parameter \pi:

P(Y=y) = \begin{cases} \pi + (1-\pi)\mathcal{G}(F_{\mathrm{GPD}}(1)), & y=0 \\ (1-\pi)\left[\mathcal{G}(F_{\mathrm{GPD}}(y+1))-\mathcal{G}(F_{\mathrm{GPD}}(y))\right], & y\ge 1 \end{cases}

for both DEGPD and tail-focused DEGPDs, with associated four-parameter likelihood estimation (Ahmad et al., 2024, Ahmad et al., 2022).
  • Regression Extensions: In DEGPD and ZIDEGPD, parameters may be regressed on covariates using a generalized additive model (GAM) framework, with penalized maximum likelihood estimation to avoid overfitting (Ahmad et al., 2022).
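As one concrete instance, taking the simple power transform \mathcal{G}(u;\kappa) = u^\kappa (one of the bulk-shaping choices mentioned above), the zero-inflated pmf can be sketched as follows (editor's own function names; \xi \ne 0 assumed):

```python
def f_gpd(x, sigma, xi):
    # continuous GPD CDF, xi != 0 case
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)

def zidegpd_pmf(y, sigma, xi, kappa, pi):
    # bulk-shaping CDF: power transform G(u; kappa) = u**kappa
    G = lambda u: u ** kappa
    if y == 0:
        return pi + (1.0 - pi) * G(f_gpd(1.0, sigma, xi))
    return (1.0 - pi) * (G(f_gpd(y + 1.0, sigma, xi)) - G(f_gpd(y, sigma, xi)))
```

The y \ge 1 terms telescope to (1-\pi)(1 - \mathcal{G}(F_{\mathrm{GPD}}(1))), so the pmf sums to one for any valid (\sigma, \xi, \kappa, \pi).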

In the multivariate case, the MDGPD representation encodes dependence via a generator variable \mathbf{T}, with marginals exhibiting geometric–exponential tails, P(G>n) = e^{-n}. Simulation techniques employ generator-difference algorithms and neural Bayes estimators (NBE) for likelihood-free inference in high dimensions (Aka et al., 24 Jun 2025).

5. Empirical Studies and Applications

DGPDs are empirically validated and compared to classical count models in fields with overdispersed and heavy-tailed distributions:

  • Road Accidents: Spanish blackspot data, over 16,000 cases, fitted by DGPD and discrete Lomax. Accident counts (\mu=3) and death counts (\mu=0) yielded MLEs \hat\alpha \sim 3.26–4.04, \hat\lambda \sim 0.22–0.29 (accidents) and \hat\alpha \sim 4.34–13.86, \hat\lambda \sim 0.13–0.54 (deaths). Both DGPD and Lomax provided parsimonious, adequate fits to overdispersed count data, with goodness-of-fit p-values > 0.05 except for accidents in 2003 (Prieto et al., 2013).
  • Non-Life Insurance Claims: Ghanaian claims data modeled annually and in aggregate, with DGPD outperforming negative binomial in AIC/BIC by margins of ~88–1500 (annual/aggregate scales). DGPD showed tight fit: for 2012–2016, \hat\mu = 2.3354, \hat\xi = 0.0017, \hat\sigma = 2.8171, bootstrap SEs all < 2.05 (Dzidzornu et al., 2020).
  • Discrete Extremes in other domains: DGPDs outperform continuous GPD when discrete ties are prevalent, as shown for Poisson, inverse-gamma, word-frequency, word-length, tornado count, and multiple-birth data. DGPD and generalized Zipf distributions (GZD) yield near-identical tail inference for regularly varying discrete phenomena (Hitz et al., 2017).
  • Multivariate Dry Spells: Swiss dry spell lengths at two stations analyzed via MDGPD, with inference via neural Bayes estimators yielding precise marginal and dependence parameter estimates with tight bootstrap CIs (RMSEs: \sigma \sim 0.6, \xi \sim 0.1, \rho \sim 0.04). Model fit validated by bootstrap simulation and marginal QQ plots (Aka et al., 24 Jun 2025).
  • Extended DGPD in Count Data: Applications to upheld insurance complaints, doctor visits (zero inflation), and gaming offenses demonstrate that extended DGPDs with appropriate \mathcal{G} yield unbiased parameter recovery in simulation, superior BIC and goodness-of-fit in real-data, and stabilized tail index estimation at low thresholds where classical DGPD is biased (Ahmad et al., 2024).
| Application Domain | DGPD Variant | Key Model Parameters |
| --- | --- | --- |
| Road accident blackspots | 3-param. DGPD, Lomax | \alpha, \lambda, \mu |
| Non-life insurance claims | 3-param. DGPD | \mu, \xi, \sigma |
| Word frequency/length, births | D-GPD, GZD | \sigma, \xi; regular variation focus |
| Environmental (dry spells) | Multivariate DGPD | per-variable \sigma, \xi, copula generator |
| Zero-inflated/entire count | DEGPD, ZIDEGPD | \beta, \xi, \kappa, \pi, bulk-shaping |

6. Practical Guidance and Theoretical Remarks

DGPDs are recommended whenever:

  • Data are integer-valued with heavy right tails or ties (discrete, overdispersed, rare events)
  • Standard continuous GPD tail modeling is inadequate (likelihood inference, QQ plots signal lack of fit)
  • Applications demand flexible, tail-robust count modeling (insurance, traffic, environmental events)

Extended DGPDs—by incorporating bulk shaping (\mathcal{G}), zero inflation, and regression structure—allow:

  • Modeling of entire discrete distributions without threshold selection,
  • Stabilization of tail parameter inference at moderate/low thresholds,
  • Inclusion of covariate effects in bulk, tail, and zero-inflation parameters, with smooth predictors via GAMs (Ahmad et al., 2024, Ahmad et al., 2022).

Limitations include:

  • The necessity of selecting or estimating shaping functions and generator laws (multivariate dependency),
  • The increased difficulty in tail parameter estimation for small samples,
  • Curse of dimensionality in high-dimensional inference,
  • Absence of simple closed-form expressions for moments/generating functions in extended models; numerical summation is required (Aka et al., 24 Jun 2025, Ahmad et al., 2024).

DGPDs nest the geometric law (\xi=0), Lomax, and negative binomial as limiting cases. Tail flexibility arises primarily from the shape parameter, with scale and bulk-shaping influencing model fit to lower count ranges and excess zeros.

Active developments in DGPD research focus on:

  • Likelihood-free neural inference methods for high-dimensional multivariate DGPDs, leveraging permutation-invariant deep learning architectures for amortized Bayes risk minimization (Aka et al., 24 Jun 2025).
  • Nonparametric estimation of discrete spectral laws and optimal transport for multivariate discrete joint law matching.
  • Bulk–tail transitions and threshold-free modeling through continuous-discrete composition and flexible mixing, as in DEGPD and ZIDEGPD frameworks (Ahmad et al., 2024, Ahmad et al., 2022).
  • Extensions to spatio-temporal modeling of discrete-valued extreme processes.

These directions aim to further robust discrete extreme value analysis, expand the utility of DGPDs across count-data domains, and address statistical uncertainties associated with thresholding and tail estimation.
