CASTER: An AI framework for preventing adverse reactions to medication

AI in Healthcare


  • Kexin Huang
  • Cao Xiao
  • Nghia Hoang
  • Lucas M. Glass
  • Jimeng Sun




AAAI AI in Healthcare

Every year, more than 1 million people in the United States are hospitalized as a result of adverse drug events, in which a drug affects a person’s biochemistry in a detrimental way. Drugs interact with our bodies, but they also interact with each other, so multiple simultaneous prescriptions increase complexity and therefore the risk of adverse drug events, especially when those prescriptions come from multiple doctors and are poorly coordinated. Adverse drug-drug interactions (DDIs) result in high morbidity and mortality rates, causing human suffering and high medical costs. Accurate and comprehensive knowledge of DDIs, especially during the drug design process, is therefore important to both patients and the pharmaceutical industry. To address this, we have created a new AI tool that can more accurately predict potentially harmful interactions for drugs on the market and for those in the early development phase.

Introducing CASTER for predicting drug interactions

In our paper, CASTER: Predicting Drug Interactions with Chemical Substructure Representation, presented at this year’s Association for the Advancement of Artificial Intelligence (AAAI) conference, we develop a computational framework for DDI prediction called ChemicAl SubstrucTurE Representation (CASTER). CASTER is an end-to-end dictionary learning framework that incorporates a specialized representation for DDI prediction, inspired by the chemical mechanism of drug interactions.

Existing work on DDI prediction has several limitations. For example, many drug pairs that do not share similar interaction behavior can still overlap significantly on irrelevant substructures. Previous works (Ryu, Kim, and Lee 2018; Gomez-Bombarelli et al. 2018; Jaeger, Fulle, and Turk 2018) often generate drug representations using the entire chemical representation, which can bias the learned representations toward irrelevant substructures. This undermines the learned drug similarity and the resulting DDI predictions.

Additionally, some previous methods need external biomedical knowledge for improved performance and cannot be generalized to drugs in the early development phase (Ma et al. 2018; Ferdousi, Safdari, and Omidi 2017; Zhang et al. 2015). Others rely on a small set of labelled training data, which impairs their generalizability to new drugs or DDIs (Ryu, Kim, and Lee 2018; Zhang et al. 2015).

Finally, although deep learning (DL) models perform well in DDI prediction, their predictions are produced by models with a large number of parameters, which makes them hard to interpret (Gomez-Bombarelli et al. 2018; Jaeger, Fulle, and Turk 2018).

How CASTER works

CASTER has three major component steps:

  1. A chemical sequential pattern mining algorithm extracts frequent substructures from a molecular database
  2. A latent feature embedding module represents drugs/drug pairs locally in terms of the extracted frequent substructures, then generates a generalizable embedded representation for them via deep dictionary learning
  3. A prediction module learns a small set of coefficients that measure the relevance of each frequent substructure to the DDI outcome. This is achieved by further projecting the drug pair’s embedded representation onto the subspace defined by the embeddings of those frequent substructures (i.e., the deep dictionary)
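
As a rough illustration of step 1, the sketch below mines frequent substructures from SMILES strings by repeatedly merging the most frequent adjacent token pair, in the style of byte-pair encoding. This post does not describe the mining algorithm at this level of detail, so the merge rule, the character-level tokenization, the `min_count` threshold, the function name, and the toy molecules are all assumptions made for illustration.

```python
from collections import Counter


def mine_frequent_substructures(smiles_list, min_count=2, max_merges=10):
    """Greedy pair merging over SMILES strings (hypothetical stand-in for step 1)."""
    # Start from single characters. Real SMILES tokenization is more careful
    # (for example, "Cl" and bracket atoms should be single tokens).
    corpus = [list(s) for s in smiles_list]
    vocab = []
    for _ in range(max_merges):
        # Count every adjacent token pair across the whole corpus.
        pairs = Counter()
        for tokens in corpus:
            pairs.update(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < min_count:
            break  # nothing left that is frequent enough
        merged = a + b
        vocab.append(merged)
        # Rewrite every sequence so the pair becomes one token.
        new_corpus = []
        for tokens in corpus:
            out, i = [], 0
            while i < len(tokens):
                if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                    out.append(merged)
                    i += 2
                else:
                    out.append(tokens[i])
                    i += 1
            new_corpus.append(out)
        corpus = new_corpus
    return vocab, corpus


# Toy "molecular database" of three SMILES strings.
vocab, segmented = mine_frequent_substructures(["CCO", "CCN", "CCOC"])
print(vocab)      # ['CC', 'CCO']
print(segmented)  # [['CCO'], ['CC', 'N'], ['CCO', 'C']]
```

Each molecule ends up written as a short sequence of frequent substructures plus leftover atoms, which is the kind of input the embedding module in step 2 would consume.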

Figure 1: Illustrating the CASTER workflow

The above computation pipeline generates a specialized representation for drugs that, unlike previous methods, lets the predictive analysis focus on the substructures relevant to an interaction and ignore the rest.

This also allows us to extend the learning beyond the original scope of pure DDI to include any compound pairs with associated chemical information, such as drug-food interactions (DFI), since food compounds are described by the same set of chemical features. This means that unlabeled drug-food pairs can help improve the generalizability of the drug-drug representation, and thus DDI prediction performance.
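
To make this concrete, here is a minimal sketch of encoding a compound pair as a multi-hot vector over a shared substructure vocabulary. The vocabulary, the `encode_pair` helper, and the plain substring-matching rule are hypothetical simplifications; the only point is that a drug-food pair is described by exactly the same features as a drug-drug pair, so it can feed the same representation learner without needing a label.

```python
def encode_pair(smiles_a, smiles_b, vocab):
    """Multi-hot vector: 1 where a vocabulary substructure occurs in either compound."""
    # Plain substring matching on SMILES is a simplification for illustration.
    return [1 if (sub in smiles_a or sub in smiles_b) else 0 for sub in vocab]


# Hypothetical vocabulary of frequent substructures (SMILES fragments).
vocab = ["CC", "C(=O)O", "N", "c1ccccc1"]

aspirin = "CC(=O)Oc1ccccc1C(=O)O"
paracetamol = "CC(=O)Nc1ccc(O)cc1"
ethanol = "CCO"  # stands in for a food or drink compound

drug_drug = encode_pair(aspirin, paracetamol, vocab)
drug_food = encode_pair(aspirin, ethanol, vocab)

print(drug_drug)  # [1, 1, 1, 1]
print(drug_food)  # [1, 1, 0, 1]
```

Both vectors have the same length and meaning, which is what lets unlabeled pairs of either kind share one embedding module.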

Finally, since the method directly produces the relevance of a small number of frequent substructures to the DDI outcome, its predictions are also more interpretable to human practitioners than those of existing methods.
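
The coefficient idea can be illustrated with a small least-squares projection. The sketch below projects a pair embedding onto the span of a few substructure embeddings and reads off one coefficient per substructure. The small ridge penalty `lam`, the toy 3-dimensional embeddings, and the substructure names are assumptions for illustration; in CASTER the embeddings come from the learned deep dictionary.

```python
def solve_linear(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def projection_coefficients(dictionary, z, lam=1e-6):
    """Ridge least squares: minimise ||z - sum_k r_k * b_k||^2 + lam * ||r||^2."""
    k = len(dictionary)
    # Normal equations: (B B^T + lam I) r = B z, with rows of B the basis vectors.
    gram = [[sum(p * q for p, q in zip(dictionary[i], dictionary[j]))
             + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(b, z)) for b in dictionary]
    return solve_linear(gram, rhs)


# Hypothetical substructure embeddings (the "dictionary") and a pair embedding
# constructed as 2.0 * substructure_A + 0.5 * substructure_B.
names = ["substructure_A", "substructure_B"]
dictionary = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
pair_embedding = [2.5, 0.5, 0.0]

coeffs = projection_coefficients(dictionary, pair_embedding)
# Rank substructures by the magnitude of their coefficient.
ranked = sorted(zip(names, coeffs), key=lambda nc: abs(nc[1]), reverse=True)
for name, c in ranked:
    print(f"{name}: {c:.3f}")
```

The coefficients recover approximately 2.0 and 0.5, so a practitioner can read directly which substructure contributes most to the pair's representation.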

One of the major mechanisms of drug interaction is the chemical reaction among only a few functional substructures of a drug’s entire molecular structure (Silverman and Holladay 2014), while the remaining substructures are less relevant.

This observation motivated us to devise a specialized representation that automatically allows the predictive learning to focus only on the most relevant functional substructures which are more likely to be responsible for the interaction.

Experimental Results on DrugBank and BIOSNAP

We evaluate CASTER on two popular public drug databases: DrugBank and BIOSNAP.

Experimental results show that the developed method can leverage unlabeled data to improve prediction performance, and consequently achieves higher accuracy in DDI prediction than existing methods.


Table 1: CASTER provides more accurate DDI prediction than other strong baselines. For each method, the first and second rows report results on the BIOSNAP and DrugBank (DDI) datasets, respectively.


CASTER can leverage unlabelled data to improve DDI prediction when labels are scarce.

CASTER produces interpretable coefficients for understanding drug interaction mechanisms


Case Study: Sildenafil

We also provide a case study to demonstrate that our method indeed generates interpretable predictions. We examine the prediction made by the developed method for the known interaction between sildenafil and nitrate-based drugs. Sildenafil is an effective treatment for erectile dysfunction and pulmonary hypertension (Langtry and Markham 1999). It was developed as a phosphodiesterase-5 (PDE5) inhibitor. In the presence of a PDE5 inhibitor, nitrate (NO₃⁻)-based medications such as isosorbide mononitrate (IM) can cause a dramatic increase in cyclic guanosine monophosphate (Murad 1986), which leads to intense drops in blood pressure that can cause heart attack (Langtry and Markham 1999; Ishikura et al. 2000; Chamsi-Pasha 2001).

Our research showed that the prediction was largely influenced by high coefficients assigned to the nitrate group, which is consistent with the known interaction between sildenafil and nitrate-based drugs described above. This demonstrates the interpretability of our prediction model.

Through this research, we were able to empirically demonstrate that the developed method provides more accurate and interpretable DDI predictions than previous approaches that use generic drug representations. In future work, we plan to extend it to chemical sub-graph embedding and incorporate metric learning for further improvement.


Unlike current methods for checking drug-drug interactions, this new AI tool develops a specialized drug representation to predict the likelihood of adverse reactions between drugs based on their frequent chemical substructures, which are more likely to be responsible for their chemical reactions than other, less frequent (hence, less relevant) substructures. Empirically, this was shown to help the developed method achieve higher accuracy than previous methods in predicting drug-drug interactions.

Please cite our work using the BibTeX below.

title={CASTER: Predicting Drug Interactions with Chemical Substructure Representation},
author={Kexin Huang and Cao Xiao and Trong Nghia Hoang and Lucas M. Glass and Jimeng Sun},