TY - JOUR
T1 - Predicting viral infection from high-dimensional biomarker trajectories
AU - Chen, Minhua
AU - Zaas, Aimee
AU - Woods, Christopher
AU - Ginsburg, Geoffrey S.
AU - Lucas, Joseph
AU - Dunson, David
AU - Carin, Lawrence
N1 - Generated from Scopus record by KAUST IRTS on 2021-02-09
PY - 2011/12/1
Y1 - 2011/12/1
AB - There is often interest in predicting an individual's latent health status based on high-dimensional biomarkers that vary over time. Motivated by time-course gene expression array data that we have collected in two influenza challenge studies performed with healthy human volunteers, we develop a novel time-aligned Bayesian dynamic factor analysis methodology. The time course trajectories in the gene expressions are related to a relatively low-dimensional vector of latent factors, which vary dynamically starting at the latent initiation time of infection. Using a nonparametric cure rate model for the latent initiation times, we allow selection of the genes in the viral response pathway, variability among individuals in infection times, and a subset of individuals who are not infected. As we demonstrate using held-out data, this statistical framework allows accurate predictions of infected individuals in advance of the development of clinical symptoms, without labeled data and even when the number of biomarkers vastly exceeds the number of individuals under study. Biological interpretation of several of the inferred pathways (factors) is provided. © 2011 American Statistical Association.
UR - http://www.tandfonline.com/doi/abs/10.1198/jasa.2011.ap10611
UR - http://www.scopus.com/inward/record.url?scp=84862961816&partnerID=8YFLogxK
U2 - 10.1198/jasa.2011.ap10611
DO - 10.1198/jasa.2011.ap10611
M3 - Article
SN - 0162-1459
VL - 106
SP - 1259
EP - 1279
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 496
ER -