We discuss tools for the evaluation of probabilistic forecasts and the critique of
statistical models for ordered discrete data. Our proposals include a non-randomized version
of the probability integral transform, marginal calibration diagrams, and proper scoring rules,
such as the predictive deviance. In case studies, we critique count regression models for
patent data, and assess the predictive performance of Bayesian age-period-cohort models for
larynx cancer counts in Germany.
Keywords: Calibration; Forecast verification; Model diagnostics; Predictive deviance; Probability integral transform; Proper scoring rule; Ranked probability score.
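
As a rough illustration of the tools named above, the following sketch computes the logarithmic score, the ranked probability score and a non-randomized probability integral transform (PIT) histogram for count forecasts. The Poisson predictive distribution, the function names and the truncation tolerance are assumptions made for this example only; they are not taken from the case studies.

```python
import math

def poisson_pmf(k, mu):
    # Poisson probability mass, computed on the log scale for stability.
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def poisson_cdf(k, mu):
    # P(X <= k); zero for k < 0, capped at 1 to guard against rounding.
    if k < 0:
        return 0.0
    return min(1.0, sum(poisson_pmf(j, mu) for j in range(k + 1)))

def log_score(x, mu):
    # Logarithmic score: minus the log predictive probability of the observed count.
    return -math.log(poisson_pmf(x, mu))

def ranked_probability_score(x, mu, tol=1e-12, max_terms=100000):
    # Sum over k of (P_k - 1{x <= k})^2, truncated once the upper tail is negligible.
    total, cdf = 0.0, 0.0
    for k in range(max_terms):
        cdf += poisson_pmf(k, mu)
        indicator = 1.0 if x <= k else 0.0
        total += (cdf - indicator) ** 2
        if k >= x and 1.0 - cdf < tol:
            break
    return total

def nonrandomized_pit(u, x, mu):
    # Conditional PIT function F(u | x): 0 below P_{x-1}, 1 above P_x, linear in between.
    lower = poisson_cdf(x - 1, mu)
    upper = poisson_cdf(x, mu)
    if u <= lower:
        return 0.0
    if u >= upper:
        return 1.0
    return (u - lower) / (upper - lower)

def pit_histogram(observations, means, bins=10):
    # Bin heights are differences of the mean PIT function on an equally spaced grid.
    n = len(observations)
    def mean_pit(u):
        return sum(nonrandomized_pit(u, x, mu) for x, mu in zip(observations, means)) / n
    return [mean_pit((j + 1) / bins) - mean_pit(j / bins) for j in range(bins)]

# Hypothetical counts and forecast means, purely for illustration.
counts = [0, 2, 3, 1, 5]
forecast_means = [1.0, 2.5, 3.0, 1.5, 4.0]
heights = pit_histogram(counts, forecast_means)
```

Both scores are negatively oriented here, so smaller values indicate a better forecast, and bin heights close to `1 / bins` would be consistent with a calibrated forecast.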