Machine Learning Risk Stratification Approach Using Patient-Reported Outcomes for Forecasting Unplanned Health Care Use and Symptom Burden in Cancer Survivors

Akina Natori; Jerry R. Bonnell; Vasileios Stathias; Sara E. Fleszar-Pavlovic; Mitsunori Ogihara; Andrew Wang; Ravi Vadapalli; Blanca Silvia Noriega Esquives; Tracy E. Crane; Frank J. Penedo

doi:10.1200/CCI-25-00389

Journal article

Peer reviewed

JCO clinical cancer informatics, Vol.10, e2500389

2026-05

DOI: https://doi.org/10.1200/CCI-25-00389

PMID: 42190150

Appears in College Of Engineering - Latest Publications

Abstract

PURPOSE: Effective risk stratification in cancer survivorship requires handling longitudinal data characterized by multimodal inputs, irregular follow-up, and recurrent clinical events. This study evaluated the incremental value of integrating patient-reported outcomes (PROs) with electronic health record (EHR) data and identified optimal windowing strategies for machine learning-based prediction of adverse survivorship outcomes.

PATIENTS AND METHODS: This study used a cohort of 25,592 cancer survivors followed for 36 months. Data from four domains were integrated: baseline measures, treatments, PROs, and health care utilization (emergency room visits and hospitalizations). Two classification models, LASSO and CATBOOST, were applied across modality combinations and five temporal representations of patient history: static early-phase (0-6 months), cumulative history, sliding windows (4- and 12-month), and a most-recent baseline. Performance was evaluated for predicting monthly health care utilization and patient-reported symptom burden using average precision (AP). SHapley Additive exPlanations (SHAP) analysis identified key predictors and characterized their evolving influence.

RESULTS: For health care utilization, CATBOOST models trained on the full multimodal data set with time-windowed predictors achieved strong discrimination (AP = 0.207), outperforming static baselines by 27%. SHAP analyses emphasized dynamic contributions from recent utilization and treatment toxicity. For symptom burden, PRO integration was crucial, nearly doubling clinical-only performance (AP = 0.132 v 0.071), with longer historical context improving characterization of progressive functional decline and symptom severity. Flagging the top 10% of patients by predicted risk captured 51.7% of health care utilizations and 46.7% of symptom burden events.

CONCLUSION: Adverse survivorship risk is dynamic and outcome-specific: acute health care utilization is best predicted by recent clinical momentum, while longitudinal patient-reported trends drive symptom burden. Implementing decoupled, dynamic windows provides a flexible framework for risk stratification and risk prediction beyond standard clinical heuristics, facilitating proactive, precision-based survivorship care.
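The abstract reports two kinds of figures: average precision (AP) and the share of events captured when the top 10% of patients by predicted risk are flagged. The sketch below illustrates how such quantities are commonly computed from binary outcome labels and risk scores. It is a minimal illustration on made-up toy data, not the authors' evaluation code; the function names `average_precision` and `capture_at_top_fraction`, the toy labels and scores, and the choice to count one binary label per patient are all assumptions, and the article may define event capture differently.

```python
def average_precision(y_true, scores):
    """Mean of the precision values at each rank where a positive label occurs.

    Patients are ranked by descending score; ties keep their input order.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def capture_at_top_fraction(y_true, scores, fraction=0.10):
    """Share of all positive labels found among the top `fraction` of scores."""
    n_flagged = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    flagged = order[:n_flagged]
    total_events = sum(y_true)
    if total_events == 0:
        return 0.0
    return sum(y_true[i] for i in flagged) / total_events


# Hypothetical toy data: 20 patients with strictly decreasing risk scores,
# and 4 patients who experienced the outcome (label 1).
scores = [1.0 - 0.05 * i for i in range(20)]
y_true = [0] * 20
for event_index in (0, 2, 7, 15):
    y_true[event_index] = 1

ap = average_precision(y_true, scores)
capture = capture_at_top_fraction(y_true, scores, fraction=0.10)
print(f"AP = {ap:.3f}, capture in top 10% = {capture:.2f}")
```

For context, a scorer that ranks patients at random has an expected AP close to the proportion of positive labels, so AP values are usually read against the prevalence of the outcome rather than against a fixed threshold.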

Details

Title: Machine Learning Risk Stratification Approach Using Patient-Reported Outcomes for Forecasting Unplanned Health Care Use and Symptom Burden in Cancer Survivors
Creators: Akina Natori - Sylvester Comprehensive Cancer Center
Jerry R. Bonnell - University of Miami
Vasileios Stathias - Sylvester Comprehensive Cancer Center
Sara E. Fleszar-Pavlovic - Sylvester Comprehensive Cancer Center
Mitsunori Ogihara - University of Miami
Andrew Wang - Sylvester Comprehensive Cancer Center
Ravi Vadapalli - University of Miami
Blanca Silvia Noriega Esquives - Sylvester Comprehensive Cancer Center
Tracy E. Crane - Sylvester Comprehensive Cancer Center
Frank J. Penedo - University of Miami
Publication Details: JCO clinical cancer informatics, Vol.10, e2500389
Publisher: Wolters Kluwer Health; PHILADELPHIA
Number of pages: 10
Grant note: National Institutes of Health through the University of Miami Sylvester Comprehensive Cancer Center's Cancer Center Support Grant: P30CA240139; Miami Clinical and Translational Science Institute - National Center for Advancing Translational Sciences, National Institutes of Health: UM1TR004556; T32CA251064
Supported in part by the National Institutes of Health through the University of Miami Sylvester Comprehensive Cancer Center's Cancer Center Support Grant P30CA240139 (Stephen Nimer, MD, PI). We also received support from the Miami Clinical and Translational Science Institute, which is supported by the National Center for Advancing Translational Sciences, National Institutes of Health, Award Number UM1TR004556. Blanca S. Noriega Esquives was supported by Grant T32CA251064 (Frank J. Penedo and Erin N. Kobetz, MPIs).
Academic Unit: Miller School of Medicine; College of A&S; CoE - Electrical & Computer Engineering; UMMG Dept of Medicine - Hematology/Oncology; College of Engineering; A&S - Computer Science; A&S - Psychology
Language: English
Resource Type: Journal article
PMID: 42190150
Record Identifier: 991033080268402976