Bayesian Subset Modeling for High-Dimensional Generalized Linear Models

Faming Liang, Qifan Song, Kai Yu

Research output: Contribution to journal › Article › peer-review

55 Scopus citations

Abstract

This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the sure screening and consistency properties of the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, to have higher predictive ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this deterioration is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
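
For readers unfamiliar with the criterion named in the abstract, the following is a minimal sketch of extended BIC (EBIC) model selection, assuming a Gaussian linear model and an exhaustive search over small subsets. It is not the authors' BSR prior or sampler, which this record does not describe. The function names, the `gamma` default, and the `max_size` cap are illustrative assumptions.

```python
import math
from itertools import combinations

import numpy as np


def gaussian_ebic(X, y, subset, gamma=1.0):
    """EBIC of a Gaussian linear model on the columns in `subset`, plus an intercept.

    Uses the common form: -2 log-likelihood + k log(n) + 2 gamma log C(p, k),
    where k is the subset size. The intercept is fitted but not penalized
    (an assumption of this sketch).
    """
    n, p = X.shape
    k = len(subset)
    cols = np.array(subset, dtype=int)
    design = np.column_stack([np.ones(n), X[:, cols]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    # -2 * maximized log-likelihood, up to a constant shared by all subsets
    neg2loglik = n * math.log(rss / n)
    return neg2loglik + k * math.log(n) + 2.0 * gamma * math.log(math.comb(p, k))


def min_ebic_subset(X, y, max_size=2, gamma=1.0):
    """Return the subset (as a tuple of column indices) with the smallest EBIC."""
    p = X.shape[1]
    candidates = [s for k in range(max_size + 1) for s in combinations(range(p), k)]
    return min(candidates, key=lambda s: gaussian_ebic(X, y, s, gamma))
```

Setting `gamma=0` reduces the score to ordinary BIC; larger values penalize model size more heavily as the number of predictors grows. Exhaustive search is only feasible for small `max_size`, which is why high-dimensional methods rely on sampling or screening instead.
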
Original language: English (US)
Pages (from-to): 589-606
Number of pages: 18
Journal: Journal of the American Statistical Association
Volume: 108
Issue number: 502
State: Published - Jun 2013
Externally published: Yes

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): KUS-C1-016-04
Acknowledgements: Faming Liang is Professor, Department of Statistics, Texas A&M University, College Station, TX 77843-3143 (E-mail: fliang@stat.tamu.edu). Qifan Song is Graduate Student, Department of Statistics, Texas A&M University, College Station, TX 77843-3143 (E-mail: qsong@stat.tamu.edu). Kai Yu is Investigator, Division of Cancer Epidemiology & Genetics, National Cancer Institute, Rockville, MD 20892-7335 (E-mail: yuka@mail.nih.gov). Liang's research was partially supported by grants from the National Science Foundation (DMS-1007457 and DMS-1106494) and the award (KUS-C1-016-04) made by King Abdullah University of Science and Technology (KAUST). The authors thank Dr. Chris Hans for sending us the lymph data, and thank the editor, associate editor, and two referees for their constructive comments that have led to significant improvement of this article.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
