On the uncertainty of individual prediction because of sampling predictors

Stat Med. 2016 May 30;35(12):2016-30. doi: 10.1002/sim.6849. Epub 2015 Dec 28.

Abstract

Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. In current practice, the uncertainty of the prediction is characterized predominantly by subject-sampling variation: prediction models built on hypothetically re-sampled units yield variable predictions for the same unit of interest. It is almost always true that the predictors used to build prediction models are simply a subset of the entirety of factors related to the outcome. Following the frequentist principle, we can also account for the variation because of hypothetically re-sampled predictors used to build the prediction models. This is particularly important in medicine, where a prediction can have important and sometimes life-or-death consequences for a patient's health status. In this article, we discuss some rationale along this line in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set. Copyright © 2015 John Wiley & Sons, Ltd.
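The abstract does not spell out the proposed estimator, so the following is only a rough sketch of the underlying concept, not the authors' method. It assumes a brute-force resampling scheme: refit a logistic model many times, each time bootstrapping subjects and (optionally) drawing a random subset of the available predictors, then take the spread of the predicted probability for one fixed unit. All names (`fit_logistic`, `prediction_spread`, `n_keep`), the simulated data, and the choice of uniform random predictor subsets are assumptions made for illustration; the Cox model case is not covered.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-4):
    """Logistic regression via Newton-Raphson with a tiny ridge term for stability."""
    Xd = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        z = np.clip(Xd @ beta, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-z))
        w = prob * (1.0 - prob)
        grad = Xd.T @ (y - prob) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_prob(beta, x):
    """Predicted probability for a single unit with predictor vector x."""
    z = np.clip(beta[0] + x @ beta[1:], -30, 30)
    return 1.0 / (1.0 + np.exp(-z))

def prediction_spread(X, y, x_new, n_keep, n_rep=200,
                      resample_predictors=True, seed=0):
    """Standard deviation of the prediction for x_new across refitted models.

    Subjects are always bootstrapped. If resample_predictors is True, each
    refit also uses a random subset of n_keep predictors (an illustrative
    assumption); otherwise the first n_keep predictors are used every time.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    preds = np.empty(n_rep)
    for r in range(n_rep):
        rows = rng.integers(0, n, size=n)  # bootstrap sample of subjects
        if resample_predictors:
            cols = rng.choice(p, size=n_keep, replace=False)
        else:
            cols = np.arange(n_keep)
        beta = fit_logistic(X[rows][:, cols], y[rows])
        preds[r] = predict_prob(beta, x_new[cols])
    return preds.std(ddof=1)

# Simulated data (hypothetical): 8 independent predictors, all related to the outcome.
rng = np.random.default_rng(1)
n, p, n_keep = 400, 8, 4
X = rng.normal(size=(n, p))
true_beta = np.full(p, 0.8)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)
x_new = np.array([1.5, -1.5, 1.5, -1.5, 0.0, 0.0, 1.5, 1.5])  # unit of interest

se_subjects = prediction_spread(X, y, x_new, n_keep, resample_predictors=False)
se_both = prediction_spread(X, y, x_new, n_keep, resample_predictors=True)
print(f"spread, subjects only:           {se_subjects:.3f}")
print(f"spread, subjects and predictors: {se_both:.3f}")
```

In a toy setup like this one, the second figure would be expected to exceed the first, since models built on different predictor subsets can disagree substantially about the same unit. How closely such a brute-force spread tracks the standard error estimator proposed in the article is not something this sketch can establish.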

Keywords: conditional distribution; frequentist principle; prediction uncertainty; predictor-sampling variation; subject-sampling variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomedical Research / methods
  • Forecasting
  • Humans
  • Likelihood Functions
  • Models, Statistical
  • Models, Theoretical
  • Probability
  • Proportional Hazards Models
  • Research Design
  • Sampling Studies*
  • Uncertainty*