Performance of Statistical and Machine Learning Risk Prediction Models for Surveillance Benefits and Failures in Breast Cancer Survivors

Yu-Ru Su; Diana S M Buist; Janie M Lee; Laura Ichikawa; Diana L Miglioretti; Erin J Aiello Bowles; Karen J Wernli; Karla Kerlikowske; Anna Tosteson; Kathryn P Lowry; Louise M Henderson; Brian L Sprague; Rebecca A Hubbard

doi:10.1158/1055-9965.EPI-22-0677

Cancer Epidemiol Biomarkers Prev. 2023 Apr 3;32(4):561-571. doi: 10.1158/1055-9965.EPI-22-0677.

Authors

Yu-Ru Su¹, Diana S M Buist¹, Janie M Lee², Laura Ichikawa¹, Diana L Miglioretti¹,³, Erin J Aiello Bowles¹, Karen J Wernli¹, Karla Kerlikowske⁴,⁵,⁶, Anna Tosteson⁷, Kathryn P Lowry², Louise M Henderson⁸, Brian L Sprague⁹,¹⁰, Rebecca A Hubbard¹¹

Affiliations

¹ Kaiser Permanente Washington Health Research Institute, Kaiser Permanente WA, Seattle, Washington.
² Department of Radiology, University of Washington and Seattle Cancer Care Alliance, Seattle, Washington.
³ Division of Biostatistics, Department of Public Health Sciences, University of California Davis, Davis, California.
⁴ Department of Medicine, University of California, San Francisco, California.
⁵ Department of Epidemiology and Biostatistics, University of California, San Francisco, California.
⁶ General Internal Medicine Section, Department of Veterans Affairs, University of California, San Francisco, California.
⁷ The Dartmouth Institute for Health Policy and Clinical Practice and Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire.
⁸ Department of Radiology, University of North Carolina, Chapel Hill, North Carolina.
⁹ Department of Surgery, University of Vermont, Burlington, Vermont.
¹⁰ Department of Radiology, University of Vermont, Burlington, Vermont.
¹¹ Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania.

Abstract

Background: Machine learning (ML) approaches facilitate risk prediction model development using high-dimensional predictors and higher-order interactions at the cost of model interpretability and transparency. We compared the relative predictive performance of statistical and ML models to guide modeling strategy selection for surveillance mammography outcomes in women with a personal history of breast cancer (PHBC).

Methods: We cross-validated seven risk prediction models for two surveillance outcomes, failure (breast cancer within 12 months of a negative surveillance mammogram) and benefit (surveillance-detected breast cancer). We included 9,447 mammograms (495 failures, 1,414 benefits, and 7,538 nonevents) from years 1996 to 2017 using 1:4 matched case-control samples of women with PHBC in the Breast Cancer Surveillance Consortium. We assessed the performance of conventional regression, regularized regressions (LASSO and elastic-net), and ML methods (random forests and gradient boosting machines) by evaluating their calibration and, among well-calibrated models, comparing the area under the receiver operating characteristic curve (AUC) and 95% confidence intervals (CI).
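
As a rough illustration of the two evaluation criteria named above, the following Python sketch computes an AUC and a simple calibration summary from predicted risks. It is not the authors' code. The abstract does not state how calibration was assessed, so the observed-to-expected ratio used here is an assumption, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], risks: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen event has a higher
    predicted risk than a randomly chosen nonevent (ties count as 1/2)."""
    events = [r for y, r in zip(labels, risks) if y == 1]
    nonevents = [r for y, r in zip(labels, risks) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")
    wins = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(nonevents))


def observed_expected_ratio(labels: Sequence[int], risks: Sequence[float]) -> float:
    """Assumed calibration summary: observed events divided by the sum of
    predicted risks. A value near 1 suggests good calibration-in-the-large."""
    expected = sum(risks)
    if expected == 0:
        raise ValueError("sum of predicted risks is zero")
    return sum(labels) / expected


# Hypothetical toy data: 1 = event, 0 = nonevent, with made-up predicted risks.
toy_labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
toy_risks = [0.1, 0.2, 0.6, 0.3, 0.4, 0.1, 0.5, 0.7, 0.2, 0.3]

if __name__ == "__main__":
    print("AUC:", auc(toy_labels, toy_risks))
    print("O/E:", observed_expected_ratio(toy_labels, toy_risks))
```

In a cross-validated comparison, these summaries would be computed on held-out predictions for each model. With matched case-control sampling, predicted risks would typically need to account for the sampling design before calibration is judged, which this sketch does not attempt.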

Results: LASSO and elastic-net consistently provided well-calibrated predicted risks for surveillance failure and benefit. The AUCs of LASSO and elastic-net were both 0.63 (95% CI, 0.60-0.66) for surveillance failure and 0.66 (95% CI, 0.64-0.68) for surveillance benefit, the highest among well-calibrated models.

Conclusions: For predicting breast cancer surveillance mammography outcomes, regularized regression outperformed other modeling approaches and balanced the trade-off between model flexibility and interpretability.

Impact: Regularized regression may be preferred for developing risk prediction models in other contexts with rare outcomes, similar training sample sizes, and low-dimensional features.

Publication types

Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Breast
Breast Neoplasms*
Cancer Survivors*
Female
Humans
Machine Learning
Mammography
