Abstract
While medication for opioid use disorder (MOUD) is effective for a significant proportion of patients, many return to using opioids during treatment. Understanding which factors lead to successful treatment informs the development of implementation approaches that can improve outcomes. This manuscript and its accompanying website provide an applied introduction to interpretable machine learning for clinical investigators interested in predicting treatment response for people using MOUD.
This study applied the following machine learning (ML) algorithms to predict failure of treatment: K-Nearest Neighbors (KNN), logistic regression with and without regularization, Multivariate Adaptive Regression Splines (MARS), Support Vector Machines, Classification and Regression Trees (CART), Random Forest, Bayesian Additive Regression Trees (BART), Boosted Trees, and Neural Networks. The data came from 2478 individuals who had participated in the three largest pragmatic clinical trials of MOUD.
All models produced Receiver Operating Characteristic Area Under the Curve (ROC AUC) estimates in the range of 0.62 to 0.67 using cross-validation data, and the optimal model, random forest, achieved 0.65 using testing data. Nearly all of the algorithms identified age, intravenous drug use days, study medication, and study site as predictive features. Most algorithms also identified various aspects of smoking. Only the algorithms that detect complex non-linear trends identified details from timeline follow-back. One algorithm, BART, performed well while devaluing all treatment-specific details.
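For readers new to the metric, ROC AUC can be read as the probability that a randomly chosen patient who failed treatment receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The following is a minimal sketch of that definition in plain Python, not the study's own workflow. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of ROC AUC.

    labels: 1 for the positive class (here, illustratively, treatment
            failure) and 0 for the negative class.
    scores: predicted risk for each individual; higher means more likely
            to belong to the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not drawn from the trials described here.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.7, 0.2, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under this reading, a value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which gives context for estimates between 0.62 and 0.67.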
After explaining how to apply, compare, and contrast various ML workflows, this manuscript shows that, although overall performance is similar across the models developed, different algorithms identify different sets of predictive features. Some of these features have not been recognized in previous research as important for predicting treatment outcomes. A companion website introduces clinical investigators to the concepts and implementations this study presents. That site also provides a detailed, annotated blueprint for fully replicating, or even expanding, this work.
• Overall model performance was similar for ML and traditional modeling methods.
• Multiple algorithms identified consistent features predicting failure of treatment.
• Behavioral features like smoking and IV drug use were important predictors.
• Care features like study site and days in detox were important predictors.
• Data and replicable analysis workflows with explanations are provided.