Multi-Label Learning from Medical Plain Text with Convolutional Residual Models

Xinyuan Zhang, Ricardo Henao, Zhe Gan, Yitong Li, Lawrence Carin

Research output: Contribution to journal › Article › peer-review



Predicting diagnoses from Electronic Health Records (EHRs) is an important medical application of multi-label learning. We propose a convolutional residual model for multi-label classification from doctor notes in EHR data. A given patient may have multiple diagnoses, and therefore multi-label learning is required. We employ a Convolutional Neural Network (CNN) to encode plain text into a fixed-length sentence embedding vector. Since diagnoses are typically correlated, a deep residual network is employed on top of the CNN encoder, to capture label (diagnosis) dependencies and incorporate information directly from the encoded sentence vector. A real EHR dataset is considered, and we compare the proposed model with several well-known baselines, to predict diagnoses based on doctor notes. Experimental results demonstrate the superiority of the proposed convolutional residual model.
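The pipeline described in the abstract (a CNN that encodes a note into a fixed-length vector, followed by residual layers that refine per-diagnosis scores while still seeing the encoded vector) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the dimensions, the random untrained weights, the function names, and the exact wiring of the residual blocks are hypothetical and are not taken from the paper.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only to keep the sketch small.
VOCAB, EMB, WIDTH, NUM_FILTERS, NUM_LABELS, HIDDEN, NUM_BLOCKS = 50, 8, 3, 6, 4, 5, 2

def rand_matrix(rows, cols, scale=0.1):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Untrained random parameters (assumption: a real model would learn these).
EMBEDDINGS = rand_matrix(VOCAB, EMB)
FILTERS = rand_matrix(NUM_FILTERS, WIDTH * EMB)  # one flat weight vector per filter
W_INIT = rand_matrix(NUM_LABELS, NUM_FILTERS)    # sentence vector -> initial label scores
BLOCKS = [
    (rand_matrix(HIDDEN, NUM_LABELS),   # reads the current label scores
     rand_matrix(HIDDEN, NUM_FILTERS),  # reads the encoded sentence vector
     rand_matrix(NUM_LABELS, HIDDEN))   # projects back to label space
    for _ in range(NUM_BLOCKS)
]

def cnn_encode(token_ids):
    """Convolve over token windows, apply ReLU, then max-pool over time.

    The output length equals NUM_FILTERS regardless of the note length.
    """
    vectors = [EMBEDDINGS[t] for t in token_ids]
    while len(vectors) < WIDTH:  # zero-pad notes shorter than one window
        vectors.append([0.0] * EMB)
    pooled = []
    for f in FILTERS:
        best = 0.0  # starting at 0 is equivalent to ReLU before max-pooling
        for i in range(len(vectors) - WIDTH + 1):
            window = [x for vec in vectors[i:i + WIDTH] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        pooled.append(best)
    return pooled

def residual_block(scores, sentence, block):
    """Refine label scores; the skip connection adds the update to the input scores."""
    w_label, w_sentence, w_up = block
    hidden = relu([a + b for a, b in zip(matvec(w_label, scores),
                                         matvec(w_sentence, sentence))])
    update = matvec(w_up, hidden)
    return [s + u for s, u in zip(scores, update)]

def predict(token_ids, threshold=0.5):
    """Return one independent probability per label, plus thresholded decisions."""
    sentence = cnn_encode(token_ids)
    scores = matvec(W_INIT, sentence)
    for block in BLOCKS:
        scores = residual_block(scores, sentence, block)
    probs = [sigmoid(s) for s in scores]
    return probs, [p >= threshold for p in probs]

probs, labels = predict([3, 17, 42, 8, 8, 21])
print(probs, labels)
```

Each label gets its own sigmoid rather than sharing a softmax, which is what allows several diagnoses to be predicted for the same note. Mixing the current label scores inside each block is one simple way to let the prediction for one diagnosis influence another.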
Original language: English (US)
Journal: arXiv preprint
State: Published - Jan 15 2018
Externally published: Yes

Bibliographical note

Machine Learning for Healthcare 2018 spotlight paper


  • stat.ML
  • cs.LG
  • stat.AP

