DDL: Deep Dictionary Learning for Predictive Phenotyping



  • Tianfan Fu
  • Trong Nghia Hoang
  • Cao Xiao
  • Jimeng Sun



Predictive phenotyping is the task of accurately predicting which phenotypes will occur at the next clinical visit from longitudinal Electronic Health Record (EHR) data. Several deep learning (DL) models have demonstrated strong performance in predictive phenotyping. However, these DL-based phenotyping models require access to a large amount of labeled data, which is often expensive to acquire. To address this label-insufficiency challenge, we propose a deep dictionary learning framework (DDL) for phenotyping, which uses unlabeled data as a complementary source of information to generate a better, more succinct data representation. Through extensive experiments on multiple real-world EHR datasets, we demonstrate that DDL can outperform state-of-the-art predictive phenotyping methods on a wide variety of clinical tasks that require patient phenotyping, such as heart failure classification, mortality prediction, and sequential prediction. All empirical results consistently show that unlabeled data can indeed be used to generate a better data representation, which helps improve DDL’s phenotyping performance over existing baseline methods that use only labeled data.
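
To make the core idea concrete, here is a minimal sketch of how unlabeled data can help learn a representation that a small labeled set then reuses. It is not the DDL model from the paper, which is deep. It is a shallow, classical sparse dictionary learning analogue (ISTA sparse coding alternated with a least-squares dictionary update), and the synthetic data, function names, and hyperparameters are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm (shrinks entries toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=100):
    """ISTA for min_A 0.5*||X - A @ D||^2 + lam*||A||_1, with D held fixed."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - step * grad, step * lam)
    return A

def learn_dictionary(X, n_atoms, lam=0.1, n_outer=15, seed=0):
    """Alternate sparse coding with a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_atoms, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(X, D, lam)
        D = np.linalg.lstsq(A, X, rcond=None)[0]
        # Renormalise atoms; the floor guards against unused (all-zero) atoms.
        D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-12)
    return D

# Synthetic stand-in for patient feature vectors (NOT real EHR data).
rng = np.random.default_rng(42)
true_atoms = rng.standard_normal((8, 20))

def sample(n):
    Z = rng.standard_normal((n, 8)) * (rng.random((n, 8)) < 0.3)
    X = Z @ true_atoms + 0.05 * rng.standard_normal((n, 20))
    y = (Z[:, 0] > 0).astype(int)  # hypothetical binary phenotype label
    return X, y

X_unlab, _ = sample(300)   # large pool whose labels are never used
X_lab, y_lab = sample(30)  # small labeled set

# The dictionary is learned from features only, so unlabeled records contribute.
D = learn_dictionary(np.vstack([X_unlab, X_lab]), n_atoms=8)
codes_lab = sparse_codes(X_lab, D)

# Fit a least-squares linear classifier on the sparse codes of the labeled set.
F = np.hstack([codes_lab, np.ones((len(codes_lab), 1))])
w = np.linalg.lstsq(F, 2.0 * y_lab - 1.0, rcond=None)[0]
train_acc = float(np.mean((F @ w > 0).astype(int) == y_lab))
recon_err = float(np.linalg.norm(X_lab - codes_lab @ D))
print(f"reconstruction error: {recon_err:.3f}, training accuracy: {train_acc:.2f}")
```

The design point this illustrates is that the dictionary is fit without looking at any labels, so the 300 unlabeled records shape the representation while only the 30 labeled records are needed to fit the classifier on top of it.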

Please cite our work using the BibTeX below.

@inproceedings{fu2019ddl,
  title     = {DDL: Deep Dictionary Learning for Predictive Phenotyping},
  author    = {Fu, Tianfan and Hoang, Trong Nghia and Xiao, Cao and Sun, Jimeng},
  booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on
               Artificial Intelligence, {IJCAI-19}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  pages     = {5857--5863},
  year      = {2019},
  month     = {7},
  doi       = {10.24963/ijcai.2019/812},
  url       = {},
}