
Assessment of the quality of a prediction

Published 24 Apr 2024 in math.ST, stat.ME, and stat.TH | (2404.15764v5)

Abstract: Shannon defined the mutual information between two variables. We illustrate why the true mutual information between a variable and the predictions made by a prediction algorithm is not a suitable measure of prediction quality, but the apparent Shannon mutual information (ASI) is; indeed it is the unique prediction quality measure with either of two very different lists of desirable properties, as previously shown by de Finetti and other authors. However, estimating the uncertainty of the ASI is a difficult problem, because of long and non-symmetric heavy tails to the distribution of the individual values of $j(x,y)=\log\frac{Q_y(x)}{P(x)}$. We propose a Bayesian modelling method for the distribution of $j(x,y)$, from the posterior distribution of which the uncertainty in the ASI can be inferred. This method is based on Dirichlet-based mixtures of skew-Student distributions. We illustrate its use on data from a Bayesian model for prediction of the recurrence time of prostate cancer. We believe that this approach is generally appropriate for most problems, where it is infeasible to derive the explicit distribution of the samples of $j(x,y)$, though the precise modelling parameters may need adjustment to suit particular cases.
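To make the quantity $j(x,y)$ concrete, here is a minimal Python sketch. It assumes that the ASI is estimated as the sample mean of the individual $j(x,y)$ values; the abstract does not spell out the estimator, so this is an illustrative assumption. It also assumes natural logarithms, and the function names and the four data points are hypothetical rather than taken from the paper. Each entry of `q_preds` stands for $Q_y(x)$, the probability the predictor assigned to the outcome that actually occurred, and each entry of `p_priors` stands for the prior probability $P(x)$ of that outcome.

```python
import math


def j_value(q_pred, p_prior):
    """Return log(Q_y(x) / P(x)) for one case (natural log, so units are nats)."""
    return math.log(q_pred / p_prior)


def apparent_information(q_preds, p_priors):
    """Return (sample mean of j, list of individual j values).

    Assumption: the ASI estimate is taken to be the plain average of j(x, y)
    over the test cases.
    """
    js = [j_value(q, p) for q, p in zip(q_preds, p_priors)]
    return sum(js) / len(js), js


# Hypothetical data: three good predictions and one confident miss.
q_preds = [0.9, 0.8, 0.7, 0.05]
p_priors = [0.5, 0.5, 0.5, 0.5]

asi, js = apparent_information(q_preds, p_priors)
print("individual j values:", js)
print("sample-mean ASI estimate (nats):", asi)
```

In this toy data the single confident miss contributes a large negative $j$ that outweighs the three modest positive values, which gives a small-scale picture of why the long, asymmetric tails of $j(x,y)$ make the uncertainty of the ASI hard to estimate from a simple average.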
