Presentation

Entropy-Based Regularization on Deep Learning Models for Anti-Cancer Drug Response Prediction
Description
This work studies a particular setting for regression problems: tasks with a complex combinatorial data space in which samples can be divided into distinct groups. Anti-cancer drug response prediction is a clear example of this setting, since each sample combines cancer biological features with drug chemical information. Many existing pan-drug, pan-cancer response models treat each combination of drug and cancer as an individual sample. A potential problem with this approach is that the model may be heavily biased toward overrepresented drugs and cancers. Our work develops a method to address this issue by adjusting the model training process in a deep learning framework.

In the drug response prediction field, the performance of pan-cancer, pan-drug models is commonly evaluated on held-out test sets through cross-validation (CV), using metrics such as the coefficient of determination (R²) and the mean squared error (MSE). However, drug response prediction can also be viewed as a multi-objective optimization task that attempts to maximize prediction performance across different drugs and cancers. We treat the performance on each drug as a separate objective and look for a model on the corresponding Pareto front that performs in a balanced way across all compounds. To reach this balanced state, we propose adding an entropy-based regularization term to the training loss. The intuition is straightforward: we minimize the MSE over all data points while encouraging the variability of the drug-specific fitting error to stay as low as possible. To do so, we group the samples in each training batch by drug identity and compute the MSE for each group. We then normalize the group-specific losses and calculate their entropy. The regularization term rewards higher entropy, so minimizing the loss pushes the entropy up. The entropy reaches its maximum when the MSEs of all drugs are equal.
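The regularization described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `entropy_regularized_loss`, the weight `lam`, the stabilizer `eps`, and the exact combined form `MSE - lam * entropy` are all hypothetical choices, since the abstract does not give the formula.

```python
import math
from collections import defaultdict


def entropy_regularized_loss(y_true, y_pred, drug_ids, lam=0.1, eps=1e-12):
    """Return (total_loss, overall_mse, entropy) for one training batch.

    Assumed form (not stated in the abstract): total = MSE - lam * H,
    where H is the entropy of the normalized per-drug MSEs.
    """
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    overall_mse = sum(sq_err) / len(sq_err)

    # Group squared errors by drug identity and compute one MSE per drug.
    by_drug = defaultdict(list)
    for drug, err in zip(drug_ids, sq_err):
        by_drug[drug].append(err)
    group_mse = [sum(errs) / len(errs) for errs in by_drug.values()]

    # Normalize the per-drug MSEs so they sum to (approximately) one.
    total = sum(group_mse) + eps
    probs = [m / total for m in group_mse]

    # Shannon entropy; it is largest when all per-drug MSEs are equal.
    entropy = -sum(p * math.log(p + eps) for p in probs)

    # Subtracting the entropy means minimizing the loss rewards balance.
    return overall_mse - lam * entropy, overall_mse, entropy
```

With two drugs and the same overall MSE, a batch whose error is spread evenly over both drugs gets a lower total loss than a batch whose error is concentrated on one drug. In an actual training loop the same computation would have to be written with differentiable tensor operations so that gradients flow through the entropy term.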

We investigate the effect of the regularization on response modeling using the drug screening dataset of the Cancer Cell Line Encyclopedia (CCLE) and a state-of-the-art drug response prediction model, DeepTTA [1]. We consider two CV strategies for model evaluation: a random split and a drug-blind split. In a random split, the test set can share both cell lines and drugs with the training set, whereas in a drug-blind split no drug appears in both the training and test sets. We perform 10-fold CV and evaluate model performance using R². For the random split, adding the entropy-based regularization yields a marginal improvement, with R² rising from 0.736 without regularization to 0.746 with it (p = 0.130, pairwise t-test). More importantly, we observe a substantial improvement in the more challenging drug-blind setting, where R² increases from -0.128 without regularization to 0.168 with regularization, a statistically significant difference (p = 0.005, pairwise t-test).
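To make the drug-blind strategy concrete, the sketch below assigns whole drugs, rather than individual samples, to folds. It is a minimal illustration and not the authors' splitting code: the function name `drug_blind_folds`, the round-robin assignment, and the `seed` parameter are assumptions, and it presumes there are at least as many distinct drugs as folds.

```python
import random


def drug_blind_folds(drug_ids, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs in which no drug appears on both sides.

    drug_ids[i] is the drug identity of sample i.
    """
    # Shuffle the distinct drugs, then deal them out to folds round-robin.
    drugs = sorted(set(drug_ids))
    random.Random(seed).shuffle(drugs)
    fold_of = {drug: i % n_folds for i, drug in enumerate(drugs)}

    for k in range(n_folds):
        # All samples of a drug land in the same fold as that drug.
        test_idx = [i for i, d in enumerate(drug_ids) if fold_of[d] == k]
        train_idx = [i for i, d in enumerate(drug_ids) if fold_of[d] != k]
        yield train_idx, test_idx
```

A random split, by contrast, would shuffle the sample indices directly, so the same drug could appear in both the training and test sets.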


[1] Jiang, L., Jiang, C., Yu, X., Fu, R., Jin, S., and Liu, X.: ‘DeepTTA: a transformer-based model for predicting cancer drug response’, Briefings in Bioinformatics, 2022, 23, (3), bbac100
Event Type
Workshop
Time
Sunday, 12 November 2023, 5:15pm - 5:30pm MST
Location
506
Tags
Applications
State of the Practice
Registration Categories
W