Local scale invariance and robustness of proper scoring rules

David Bolin, Jonas Wallin

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Averages of proper scoring rules are often used to rank probabilistic forecasts. In many cases, the individual terms in these averages are based on observations and forecasts from different distributions. We show that some of the most popular proper scoring rules, such as the continuous ranked probability score (CRPS), give more importance to observations with large uncertainty, which can lead to unintuitive rankings. To describe this issue, we define the concept of local scale invariance for scoring rules. A new class of generalized proper kernel scoring rules is derived and as a member of this class we propose the scaled CRPS (SCRPS). This new proper scoring rule is locally scale invariant and, therefore, works in the case of varying uncertainty. Like the CRPS, it is computationally available for output from ensemble forecasts, and does not require the ability to evaluate densities of forecasts. We further define robustness of scoring rules, show why this also can be an important concept for average scores unless one is specifically interested in extremes, and derive new proper scoring rules that are robust against outliers. The theoretical findings are illustrated in three different applications from spatial statistics, stochastic volatility models and regression for count data.
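The abstract notes that both the CRPS and the SCRPS can be computed directly from ensemble output, without evaluating forecast densities. The sketch below illustrates this under stated assumptions: it uses the positively oriented kernel form CRPS(P, y) = 0.5 E|X - X'| - E|X - y| and assumes the SCRPS takes the form -E|X - y| / E|X - X'| - 0.5 log E|X - X'|, with both expectations replaced by plain ensemble averages. The function names and the choice of estimator are illustrative and are not taken from this record.

```python
import numpy as np


def _expected_abs_terms(ensemble, y):
    """Ensemble estimates of E|X - y| and E|X - X'| (all pairs, including i == j)."""
    x = np.asarray(ensemble, dtype=float)
    e_xy = np.mean(np.abs(x - y))
    e_xx = np.mean(np.abs(x[:, None] - x[None, :]))
    return e_xy, e_xx


def crps_ensemble(ensemble, y):
    """Positively oriented CRPS estimate (higher is better)."""
    e_xy, e_xx = _expected_abs_terms(ensemble, y)
    return 0.5 * e_xx - e_xy


def scrps_ensemble(ensemble, y):
    """Assumed SCRPS form (higher is better); needs a non-degenerate ensemble."""
    e_xy, e_xx = _expected_abs_terms(ensemble, y)
    return -e_xy / e_xx - 0.5 * np.log(e_xx)


# Hypothetical example: the same forecast situation at two different scales.
rng = np.random.default_rng(0)
ens = rng.normal(size=200)   # ensemble forecast
y = 0.3                      # observation
c = 10.0                     # rescaling factor, e.g. a change of units

crps_small, crps_large = crps_ensemble(ens, y), crps_ensemble(c * ens, c * y)
scrps_small, scrps_large = scrps_ensemble(ens, y), scrps_ensemble(c * ens, c * y)

# The CRPS is multiplied by c, so high-uncertainty cases dominate an average.
print(crps_large / crps_small)
# The SCRPS only shifts by -0.5 * log(c), a term that does not depend on the forecast.
print(scrps_large - scrps_small, -0.5 * np.log(c))
```

Under these assumptions, rescaling a forecast and its observation by c multiplies the CRPS by c, whereas the SCRPS changes only by an additive constant. Differences in SCRPS between competing forecasts are therefore unaffected by the scale, which is the intuition behind using it when uncertainty varies across observations.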
Original language: English (US)
Journal: Statistical Science
Issue number: -1
State: Published - Jan 1 2022

Bibliographical note

KAUST Repository Item: Exported on 2022-10-31
Acknowledgements: The authors would like to acknowledge the Editors and the reviewers, as well as Håvard Rue, Finn Lindgren and Tilmann Gneiting for helpful comments and suggestions that greatly improved the manuscript.

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • General Mathematics

