On the uncertainty of individual prediction because of sampling predictors

Changyu Shen; Xiaochun Li

doi:10.1002/sim.6849

Stat Med. 2016 May 30;35(12):2016-30. doi: 10.1002/sim.6849. Epub 2015 Dec 28.

Authors

Changyu Shen¹, Xiaochun Li¹

Affiliation

¹ Department of Biostatistics, School of Medicine, Richard M. Fairbanks School of Public Health, Indiana University, Indianapolis, IN, 46202, U.S.A.

PMID: 26712471
DOI: 10.1002/sim.6849

Abstract

Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. In current practice, the uncertainty of the prediction is predominantly characterized by subject-sampling variation: prediction models built on hypothetically re-sampled units yield variable predictions for the same unit of interest. It is almost always true, however, that the predictors used to build prediction models are only a subset of all the factors related to the outcome. Following the frequentist principle, we can also account for the variation because of hypothetically re-sampled predictors used to build the prediction models. This is particularly important in medicine, where a prediction can have important and sometimes life-or-death consequences for a patient. In this article, we discuss the rationale for this view in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set. Copyright © 2015 John Wiley & Sons, Ltd.

Keywords: conditional distribution; frequentist principle; prediction uncertainty; predictor-sampling variation; subject-sampling variation.
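
The distinction between subject-sampling and predictor-sampling variation can be illustrated with a toy simulation. The sketch below is not the estimator proposed in the article, whose details are not given in the abstract. It assumes six independent standard normal factors with a common log-odds effect of 0.8, a logistic model fitted by plain gradient ascent, and a hypothetical target unit. All names (`fit_logistic`, `predicted_risk`, `simulate`) and all numeric settings are illustrative assumptions. It compares the spread of the predicted risk for one unit when only subjects are re-sampled with the spread when the predictor subset is re-sampled as well.

```python
import math
import random
import statistics

N_FACTORS = 6                                 # all factors related to the outcome (assumed)
TARGET = [2.0, -2.0, 1.5, -1.5, 1.0, -1.0]    # hypothetical unit of interest


def fit_logistic(X, y, steps=60, lr=1.0):
    """Approximate the logistic MLE (with intercept) by gradient ascent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            resid = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, xj in enumerate(xi):
                grad[j + 1] += resid * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def predicted_risk(w, x):
    """Predicted probability for covariate vector x under weights w."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def simulate(seed=0, n=150, n_used=3, n_rep=25, beta=0.8):
    """Return (SD with subject re-sampling only, SD with subjects and predictors)."""
    rng = random.Random(seed)
    # Training sample: the outcome depends on all six factors.
    X = [[rng.gauss(0, 1) for _ in range(N_FACTORS)] for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-beta * sum(x))) else 0
         for x in X]

    def one_prediction(cols):
        # Bootstrap the subjects, keep only the chosen predictor columns.
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [[X[i][j] for j in cols] for i in idx]
        yb = [y[i] for i in idx]
        w = fit_logistic(Xb, yb)
        return predicted_risk(w, [TARGET[j] for j in cols])

    fixed = list(range(n_used))               # the predictors "we happened to measure"
    subject_only = [one_prediction(fixed) for _ in range(n_rep)]
    both = [one_prediction(rng.sample(range(N_FACTORS), n_used))
            for _ in range(n_rep)]
    return statistics.stdev(subject_only), statistics.stdev(both)


if __name__ == "__main__":
    sd_subject, sd_both = simulate()
    print(f"SD, subjects re-sampled only:        {sd_subject:.3f}")
    print(f"SD, subjects and predictors sampled: {sd_both:.3f}")
```

Under these assumptions, the second standard deviation is expected to be the larger one, because the target unit's predicted risk depends strongly on which three of the six factors enter the model. A standard error based on subject re-sampling alone would miss that component.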

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biomedical Research / methods
Forecasting
Humans
Likelihood Functions
Models, Statistical
Models, Theoretical
Probability
Proportional Hazards Models
Research Design
Sampling Studies*
Uncertainty*