Abstract
We review the Akaike, deviance, and Watanabe-Akaike information criteria from a Bayesian perspective, where the goal is to estimate expected out-of-sample-prediction error using a bias-corrected adjustment of within-sample error. We focus on the choices involved in setting up these measures, and we compare them in three simple examples, one theoretical and two applied. The contribution of this paper is to put all these information criteria into a Bayesian predictive context and to better understand, through small examples, how these methods can apply in practice.



Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Aitkin, M.: Statistical Inference: an Integrated Bayesian/Likelihood Approach. Chapman & Hall, London (2010)
Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Proceedings of the Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973). Reprinted in: Kotz, S. (ed.) Breakthroughs in Statistics, pp. 610–624. Springer, New York (1992)
Ando, T., Tsay, R.: Predictive likelihood for Bayesian model selection and averaging. Int. J. Forecast. 26, 744–763 (2010)
Bernardo, J.M.: Expected information as expected utility. Ann. Stat. 7, 686–690 (1979)
Bernardo, J.M., Smith, A.F.M.: Bayesian Theory. Wiley, New York (1994)
Burman, P.: A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76, 503–514 (1989)
Burman, P., Chow, E., Nolan, D.: A cross-validatory method for dependent data. Biometrika 81, 351–358 (1994)
Burnham, K.P., Anderson, D.R.: Model Selection and Multimodel Inference: a Practical Information Theoretic Approach. Springer, New York (2002)
Celeux, G., Forbes, F., Robert, C., Titterington, D.: Deviance information criteria for missing data models. Bayesian Anal. 1, 651–706 (2006)
DeGroot, M.H.: Optimal Statistical Decisions. McGraw-Hill, New York (1970)
Dempster, A.P.: The direct use of likelihood for significance testing. In: Proceedings of Conference on Foundational Questions in Statistical Inference, Department of Theoretical Statistics: University of Aarhus, pp. 335–352 (1974)
Draper, D.: Model uncertainty yes, discrete model averaging maybe. Stat. Sci. 14, 405–409 (1999)
Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman & Hall, New York (1993)
Geisser, S., Eddy, W.: A predictive approach to model selection. J. Am. Stat. Assoc. 74, 153–160 (1979)
Gelfand, A., Dey, D.: Bayesian model choice: asymptotics and exact calculations. J. R. Stat. Soc. B 56, 501–514 (1994)
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis, 2nd edn. CRC Press, London (2003)
Gelman, A., Meng, X.L., Stern, H.S.: Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Stat. Sin. 6, 733–807 (1996)
Gelman, A., Shalizi, C.: Philosophy and the practice of Bayesian statistics (with discussion). Br. J. Math. Stat. Psychol. 66, 8–80 (2013)
Gneiting, T.: Making and evaluating point forecasts. J. Am. Stat. Assoc. 106, 746–762 (2011)
Gneiting, T., Raftery, A.E.: Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc. 102, 359–378 (2007)
Hibbs, D.: Implications of the ‘bread and peace’ model for the 2008 U.S. presidential election. Public Choice 137, 1–10 (2008)
Hoeting, J., Madigan, D., Raftery, A.E., Volinsky, C.: Bayesian model averaging (with discussion). Stat. Sci. 14, 382–417 (1999)
Jones, H.E., Spiegelhalter, D.J.: Improved probabilistic prediction of healthcare performance indicators using bidirectional smoothing models. J. R. Stat. Soc. A 175, 729–747 (2012)
McCulloch, R.E.: Local model influence. J. Am. Stat. Assoc. 84, 473–478 (1989)
Plummer, M.: Penalized loss functions for Bayesian model comparison. Biostatistics 9, 523–539 (2008)
Ripley, B.D.: Statistical Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge (1996)
Robert, C.P.: Intrinsic losses. Theory Decis. 40, 191–214 (1996)
Rubin, D.B.: Estimation in parallel randomized experiments. J. Educ. Stat. 6, 377–401 (1981)
Rubin, D.B.: Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann. Stat. 12, 1151–1172 (1984)
Schwarz, G.E.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
Shibata, R.: Statistical aspects of model selection. In: Willems, J.C. (ed.) From Data to Model, pp. 215–240. Springer, Berlin (1989)
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A.: Bayesian measures of model complexity and fit (with discussion). J. R. Stat. Soc. B (2002)
Spiegelhalter, D., Thomas, A., Best, N., Gilks, W., Lunn, D.: BUGS: Bayesian inference using Gibbs sampling. MRC Biostatistics Unit, Cambridge, England (1994, 2003). http://www.mrc-bsu.cam.ac.uk/bugs/
Stone, M.: An asymptotic equivalence of choice of model cross-validation and Akaike’s criterion. J. R. Stat. Soc. B 36, 44–47 (1977)
van der Linde, A.: DIC in variable selection. Stat. Neerl. 1, 45–56 (2005)
Vehtari, A., Lampinen, J.: Bayesian model assessment and comparison using cross-validation predictive densities. Neural Comput. 14, 2439–2468 (2002)
Vehtari, A., Ojanen, J.: A survey of Bayesian predictive methods for model assessment, selection and comparison. Stat. Surv. 6, 142–228 (2012)
Watanabe, S.: Algebraic Geometry and Statistical Learning Theory. Cambridge University Press, Cambridge (2009)
Watanabe, S.: Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594 (2010)
Watanabe, S.: A widely applicable Bayesian information criterion. J. Mach. Learn. Res. 14, 867–897 (2013)
Acknowledgements
We thank two reviewers for helpful comments and the National Science Foundation, Institute of Education Sciences, and Academy of Finland (grant 218248) for partial support of this research.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Gelman, A., Hwang, J. & Vehtari, A. Understanding predictive information criteria for Bayesian models. Stat Comput 24, 997–1016 (2014). https://doi.org/10.1007/s11222-013-9416-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-013-9416-2