Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability
Abstract: We consider the problem of estimating a discrete distribution $p$ with support of size $K$ and provide both upper and lower bounds in KL divergence that hold with high probability. We prove that in the worst case, for any estimator $\widehat{p}$, with probability at least $\delta$, $\text{KL}(p \,\|\, \widehat{p}) \geq C\max\{K, \ln(K)\ln(1/\delta)\}/n$, where $n$ is the sample size and $C > 0$ is a constant. We introduce a computationally efficient estimator $p_{\text{OTB}}$, based on online-to-batch conversion and suffix averaging, and show that with probability at least $1 - \delta$, $\text{KL}(p \,\|\, p_{\text{OTB}}) \leq C(K\log(\log(K)) + \ln(K)\ln(1/\delta))/n$. Furthermore, we show that with sufficiently many observations relative to $\log(1/\delta)$, the maximum likelihood estimator $\bar{p}$ guarantees that with probability at least $1-\delta$, $$ \tfrac{1}{6}\,\chi^2(\bar{p}\,\|\,p) \leq \tfrac{1}{4}\,\chi^2(p\,\|\,\bar{p}) \leq \text{KL}(p\,\|\,\bar{p}) \leq \frac{C(K + \log(1/\delta))}{n}\,, $$ where $\chi^2$ denotes the $\chi^2$-divergence.
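To make the objects in the abstract concrete, the following is a minimal sketch, not the paper's construction: it contrasts the maximum likelihood estimator with an online-to-batch estimator obtained by suffix-averaging the predictions of an online forecaster. The choice of base learner (an add-$\alpha$, Krichevsky-Trofimov style predictor) and the choice of suffix (the last half of the rounds) are assumptions for illustration only and are not specified in the abstract.

```python
# Illustrative sketch (assumed details, not the paper's estimator): compare the MLE
# with an online-to-batch, suffix-averaged estimator on a discrete distribution.
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; returns inf if q = 0 where p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mle(samples, K):
    """Maximum likelihood (empirical frequency) estimate on {0, ..., K-1}."""
    counts = np.bincount(samples, minlength=K).astype(float)
    return counts / counts.sum()

def otb_suffix_average(samples, K, alpha=0.5):
    """Online-to-batch conversion with suffix averaging (illustrative assumptions).

    At each round t the online predictor outputs an add-alpha estimate based on the
    first t observations; the batch estimator averages the predictions made over
    the last half of the rounds.
    """
    n = len(samples)
    counts = np.zeros(K)
    predictions = []
    for t, x in enumerate(samples, start=1):
        counts[x] += 1
        predictions.append((counts + alpha) / (t + alpha * K))
    suffix = predictions[n // 2:]  # keep only the later, more accurate predictions
    return np.mean(suffix, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, n = 20, 500
    p = rng.dirichlet(np.ones(K))      # a random true distribution
    samples = rng.choice(K, size=n, p=p)
    # The MLE can put zero mass on unseen symbols, making KL(p || p_mle) infinite;
    # the smoothed online-to-batch estimator is always strictly positive.
    print("KL(p || p_mle):", kl_divergence(p, mle(samples, K)))
    print("KL(p || p_otb):", kl_divergence(p, otb_suffix_average(samples, K)))
```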