Multi-task learning for classification with Dirichlet process priors

Ya Xue, Xuejun Liao, Lawrence Carin, Balaji Krishnapuram

Research output: Contribution to journal › Article › peer-review

427 Scopus citations

Abstract

Consider the problem of learning logistic-regression models for multiple classification tasks, where the training data sets for the tasks are not drawn from the same statistical distribution. In such a multi-task learning (MTL) scenario, it is necessary to identify groups of similar tasks that should be learned jointly. Relying on a Dirichlet process (DP) based statistical model to learn the extent of similarity between classification tasks, we develop computationally efficient algorithms for two different forms of the MTL problem. First, we consider a symmetric multi-task learning (SMTL) setting in which classifiers for multiple tasks are learned jointly using a variational Bayesian (VB) algorithm. Second, we consider an asymmetric multi-task learning (AMTL) formulation in which the posterior density of the SMTL model parameters, learned from previous tasks, is used as the prior for a new task; this approach has the significant advantage of not requiring storage and reuse of all data from previous tasks. The AMTL formulation is solved with a simple Markov chain Monte Carlo (MCMC) construction. Experimental results on two real-world MTL problems indicate that the proposed algorithms (a) automatically identify subgroups of related tasks whose training data appear to be drawn from similar distributions, and (b) are more accurate than simpler approaches such as single-task learning, pooling of data across all tasks, and simplified approximations to the DP.
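The abstract describes a DP prior placed over task-specific logistic-regression weights: because a draw G ~ DP(alpha, G0) is almost surely discrete, distinct tasks can draw identical weight vectors w_m ~ G, and this sharing of atoms is what induces the grouping of related tasks. As a minimal illustration of that sharing mechanism only (not the paper's VB or MCMC inference; the concentration parameter, truncation level, and Gaussian base measure below are illustrative assumptions), a truncated stick-breaking draw can be simulated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, truncation):
    """Truncated stick-breaking weights for a draw from DP(alpha, G0)."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

alpha, T, D, M = 1.0, 20, 5, 10   # concentration, truncation level, input dim, number of tasks
pi = stick_breaking(alpha, T)
atoms = rng.normal(0.0, 1.0, size=(T, D))   # candidate classifier weights from the base measure G0

# Each task draws its logistic-regression weights from the discrete measure G;
# tasks that land on the same atom share a classifier, i.e. they form a group.
assignments = rng.choice(T, size=M, p=pi / pi.sum())
task_weights = atoms[assignments]
print("task -> cluster:", assignments)

def predict(w, x):
    return 1.0 / (1.0 + np.exp(-(x @ w)))   # logistic link sigma(w^T x)

x = rng.normal(size=D)
print("P(y=1|x) per task:", np.round(predict(task_weights.T, x), 3))
```

Because several tasks select the same atom with positive probability, the number of effective task groups is inferred from the data rather than fixed in advance, which is the property the paper exploits in both the SMTL and AMTL formulations.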
Original language: English (US)
Pages (from-to): 35-63
Number of pages: 29
Journal: Journal of Machine Learning Research
Volume: 8
State: Published - Jan 1 2007
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2021-02-09
