Practical strategies for generalized extreme value-based regression models for extremes

Daniela Castro-Camilo, Raphaël Huser, Haavard Rue

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

The generalized extreme value (GEV) distribution is the only possible limiting distribution of properly normalized maxima of a sequence of independent and identically distributed random variables. As such, it has been widely applied to approximate the distribution of maxima over blocks. In these applications, GEV properties such as a finite lower endpoint when the shape parameter ξ is positive, or the loss of moments due to the magnitude of ξ, are inherited by the finite-sample maxima distribution. The extent to which these properties are realistic for the data at hand has been widely ignored. Motivated by these overlooked consequences in a regression setting, we here make three contributions. First, we propose a blended GEV (bGEV) distribution, which smoothly combines the left tail of a Gumbel distribution (GEV with ξ = 0) with the right tail of a Fréchet distribution (GEV with ξ > 0). Our resulting distribution therefore has unbounded support. Second, we propose a principled method, called the property-preserving penalized complexity (P³C) prior, to decide a priori on the existence of the first and second moments of the GEV distribution. Third, we propose a reparametrization of the GEV distribution that provides a more natural interpretation of the (possibly covariate-dependent) model parameters, which in turn helps define meaningful priors. We implement the bGEV distribution with the new parameterization and the P³C prior approach in the R-INLA package to make it readily available to users. We illustrate our methods with a simulation study, which reveals that the GEV and bGEV distributions are comparable when estimating the right tail under large-sample settings. Moreover, in some small-sample settings the bGEV fit slightly outperforms the GEV fit. Finally, we conclude with an application to NO₂ pollution levels in California that illustrates the suitability of the new parameterization and the P³C prior approach in the Bayesian framework.
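The two GEV properties named in the abstract, and the idea of blending a Gumbel left tail with a Fréchet right tail, can be illustrated with a minimal Python sketch. It is not the R-INLA implementation, and the abstract does not state the blending formula. The sketch assumes a geometric blend H(x) = F(x)^p(x) · G(x)^(1−p(x)), a Beta(5, 5) CDF as the weight p, and the quantile levels 0.05 and 0.2 delimiting the blending zone. All function names and default values are hypothetical choices made for illustration.

```python
import math


def _exp_neg_exp(z):
    """Return exp(-exp(z)), guarding against overflow for large z."""
    return math.exp(-math.exp(z)) if z < 700.0 else 0.0


def gev_cdf(x, mu, sigma, xi):
    """GEV CDF for xi > 0; it is 0 at or below the lower endpoint mu - sigma/xi."""
    t = 1.0 + xi * (x - mu) / sigma
    return _exp_neg_exp(-math.log(t) / xi) if t > 0.0 else 0.0


def gev_quantile(p, mu, sigma, xi):
    """GEV quantile for xi > 0 and 0 < p < 1."""
    return mu + sigma * ((-math.log(p)) ** (-xi) - 1.0) / xi


def gev_moment_exists(k, xi):
    """The k-th moment of a GEV distribution is finite only if xi < 1/k."""
    return xi < 1.0 / k


def gumbel_cdf(x, mu, sigma):
    """Gumbel CDF (GEV with xi = 0); positive on the whole real line."""
    return _exp_neg_exp(-(x - mu) / sigma)


def beta_weight(u, c1=5, c2=5):
    """Beta(c1, c2) CDF for integer shapes, via the binomial-tail identity."""
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    n = c1 + c2 - 1
    return sum(math.comb(n, j) * u**j * (1.0 - u) ** (n - j)
               for j in range(c1, n + 1))


def blended_cdf(x, mu, sigma, xi, p_a=0.05, p_b=0.2):
    """Assumed blend: Gumbel below a, GEV above b, smooth mix in between."""
    # Blending zone [a, b] is set by two GEV quantiles (assumed levels).
    a = gev_quantile(p_a, mu, sigma, xi)
    b = gev_quantile(p_b, mu, sigma, xi)

    # Choose the Gumbel location and scale so that both CDFs agree at a and b.
    def ell(p):
        return -math.log(-math.log(p))

    g_scale = (b - a) / (ell(p_b) - ell(p_a))
    g_loc = a - g_scale * ell(p_a)
    w = beta_weight((x - a) / (b - a))
    return gev_cdf(x, mu, sigma, xi) ** w * gumbel_cdf(x, g_loc, g_scale) ** (1.0 - w)
```

With μ = 0, σ = 1 and ξ = 0.5, the GEV lower endpoint is −2, so `gev_cdf(-2.5, 0.0, 1.0, 0.5)` is exactly zero, whereas `blended_cdf` at the same point is small but positive. Above the blending zone the two functions coincide, which is the sense in which the right tail is left unchanged.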
Original language: English (US)
Journal: Environmetrics
State: Published - Jun 28 2022

Bibliographical note

KAUST Repository Item: Exported on 2022-06-30
Acknowledged KAUST grant number(s): OSR-CRG2017-3434
Acknowledgements: We thank Sabrina Vettori for providing a simplified version of the pollution data. We acknowledge Lars Holden for the notion of blending at the distribution level, an idea that came to light during a hallway conversation 20 years ago. This publication is partially based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. OSR-CRG2017-3434.

ASJC Scopus subject areas

  • Ecological Modeling
  • Statistics and Probability
