TY - JOUR
T1 - Hierarchical Bayesian modeling of topics in time-stamped documents
AU - Pruteanu-Malinici, Iulian
AU - Ren, Lu
AU - Paisley, John
AU - Wang, Eric
AU - Carin, Lawrence
N1 - Generated from Scopus record by KAUST IRTS on 2021-02-09
PY - 2010/4/8
Y1 - 2010/4/8
AB - We consider the problem of inferring and modeling topics in a sequence of documents with known publication dates. The documents at a given time are each characterized by a topic, and the topics are drawn from a mixture model. The proposed model infers the change in the topic mixture weights as a function of time. The details of this general framework may take different forms, depending on the specifics of the model. For the examples considered here, we examine base measures based on independent multinomial-Dirichlet measures for representation of topic-dependent word counts. The form of the hierarchical model allows efficient variational Bayesian inference, of interest for large-scale problems. We demonstrate results and make comparisons to the model when the dynamic character is removed, and also compare to latent Dirichlet allocation (LDA) and Topics over Time (TOT). We consider a database of Neural Information Processing Systems papers as well as the US Presidential State of the Union addresses from 1790 to 2008. © 2010 IEEE.
UR - http://ieeexplore.ieee.org/document/5072227/
UR - http://www.scopus.com/inward/record.url?scp=77951622728&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2009.125
DO - 10.1109/TPAMI.2009.125
M3 - Article
SN - 0162-8828
VL - 32
SP - 996
EP - 1011
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 6
ER -