Uncertainty in lung cancer stage for survival estimation via set-valued classification

Stat Med. 2022 Aug 30;41(19):3772-3788. doi: 10.1002/sim.9448. Epub 2022 Jun 8.

Abstract

The difficulty in identifying cancer stage in health care claims data has limited oncology quality of care and health outcomes research. We fit prediction algorithms for classifying lung cancer stage into three classes (stages I/II, stage III, and stage IV) using claims data, and then demonstrate a method for incorporating the classification uncertainty in survival estimation. Leveraging set-valued classification and split conformal inference, we show how a fixed algorithm developed in one cohort of data may be deployed in another, while rigorously accounting for uncertainty from the initial classification step. We demonstrate this process using SEER cancer registry data linked with Medicare claims data.
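As a rough illustration of the set-valued classification step, the sketch below applies split conformal inference to three-class stage probabilities. The abstract does not specify the nonconformity score or the fitted algorithm, so the score used here (one minus the predicted probability of the true stage), the function names, and the toy inputs are all assumptions for illustration, not the authors' implementation.

```python
import math

# Stage classes as grouped in the abstract.
STAGES = ("I/II", "III", "IV")


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute a split conformal threshold from a held-out calibration set.

    cal_probs: list of dicts mapping stage -> predicted probability, produced
               by a fixed, already-fitted classifier (hypothetical here).
    cal_labels: list of true stages for the same calibration patients.
    alpha: target miscoverage level (sets aim to cover with prob. >= 1 - alpha).
    """
    # Assumed nonconformity score: 1 - predicted probability of the true stage.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every stage is retained.
        return 1.0
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Return every stage whose score does not exceed the threshold."""
    return {s for s in STAGES if 1.0 - probs[s] <= threshold}
```

A patient whose prediction set contains more than one stage is one for whom the claims data do not separate the stages well at the chosen level; downstream survival estimation can then carry that ambiguity forward rather than committing to a single predicted stage.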

Keywords: cancer; classification; conformal inference; survival analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Aged
  • Humans
  • Insurance Claim Review*
  • Lung Neoplasms*
  • Medicare
  • SEER Program
  • Uncertainty
  • United States / epidemiology