Identification of smoking using Medicare data--a validation study of claims-based algorithms

Pharmacoepidemiol Drug Saf. 2016 Apr;25(4):472-5. doi: 10.1002/pds.3953. Epub 2016 Jan 13.

Abstract

Purpose: This study examined the accuracy of claims-based algorithms to identify smoking against self-reported smoking data.

Methods: Medicare patients enrolled in the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study were identified. For each patient, self-reported smoking status was extracted from the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study, and the date of this measurement was defined as the index date. Two algorithms identified smoking in Medicare claims: (i) using only diagnosis and procedure codes and (ii) using anti-smoking prescriptions in addition to diagnosis and procedure codes. Each algorithm was applied twice: first using only claims from the 365 days before the index date, and then using all available pre-index claims. Considering self-reported smoking status as the gold standard, we calculated specificity, sensitivity, positive predictive value, negative predictive value (NPV), and area under the curve (AUC).
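The two algorithms and two look-back windows described in the Methods can be sketched roughly as below. This is a minimal illustration, not the study's implementation: the abstract does not list the codes or drugs used, so the code sets, the `Claim` record, and the `flag_smoker` function are all hypothetical, and treating the index date itself as outside the pre-index window is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional

# Illustrative placeholders only; the abstract does not give the study's code lists.
SMOKING_DX_PX_CODES = {"305.1", "V15.82", "99406", "99407"}
ANTI_SMOKING_DRUGS = {"varenicline", "nicotine"}


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record: service date, claim kind, and code or drug name."""
    service_date: date
    kind: str  # "dx" (diagnosis), "px" (procedure), or "rx" (pharmacy)
    code: str


def flag_smoker(
    claims: Iterable[Claim],
    index_date: date,
    use_pharmacy: bool,
    lookback_days: Optional[int] = None,
) -> bool:
    """Return True if any qualifying pre-index claim suggests smoking.

    use_pharmacy=False corresponds to algorithm (i), True to algorithm (ii).
    lookback_days=365 restricts to the year before the index date;
    None uses all available pre-index claims.
    """
    window_start = (
        index_date - timedelta(days=lookback_days)
        if lookback_days is not None
        else None
    )
    for claim in claims:
        # Assumption: only claims strictly before the index date count.
        if claim.service_date >= index_date:
            continue
        if window_start is not None and claim.service_date < window_start:
            continue
        if claim.kind in ("dx", "px") and claim.code in SMOKING_DX_PX_CODES:
            return True
        if (
            use_pharmacy
            and claim.kind == "rx"
            and claim.code.lower() in ANTI_SMOKING_DRUGS
        ):
            return True
    return False
```

Under these assumptions, a patient whose only smoking-related diagnosis lies more than a year before the index date is missed by the 365-day version of algorithm (i) but found when all pre-index claims are used, which is one way a longer look-back can raise sensitivity.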

Results: A total of 128 patients were included in this study, of whom 48% reported smoking. The algorithm using only diagnosis and procedure codes had the lowest sensitivity (9.8%, 95%CI 2.4%-17.3%), NPV (54.9%, 95%CI 46.1%-63.9%), and AUC (0.55, 95%CI 0.51-0.59) when applied to the 365-day pre-index period. Incorporating pharmacy claims and using all available pre-index information improved the sensitivity (27.9%, 95%CI 16.6%-39.1%), NPV (60.4%, 95%CI 51.3%-69.5%), and AUC (0.64, 95%CI 0.58-0.70). The specificity and positive predictive value were 100% for all the algorithms tested.
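The accuracy measures reported above all follow from a 2x2 table of algorithm result against self-report. The sketch below shows the arithmetic; the `Confusion` class and `accuracy_metrics` function are hypothetical names. The cell counts in the usage lines are not reported in the abstract: they are back-calculated on the assumption of 61 self-reported smokers and 67 non-smokers (48% of 128) and should be read as illustrative only. For a single binary classifier, the AUC is taken here as the mean of sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 counts with self-reported smoking as the reference standard."""
    tp: int  # algorithm positive, self-reported smoker
    fp: int  # algorithm positive, self-reported non-smoker
    fn: int  # algorithm negative, self-reported smoker
    tn: int  # algorithm negative, self-reported non-smoker


def accuracy_metrics(c: Confusion) -> dict:
    """Compute sensitivity, specificity, PPV, NPV, and single-threshold AUC."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # With one operating point, the ROC area is the mean of the two rates.
    auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "auc": auc,
    }


# Inferred counts (assumption, not reported in the abstract).
codes_only_365d = Confusion(tp=6, fp=0, fn=55, tn=67)
with_pharmacy_all = Confusion(tp=17, fp=0, fn=44, tn=67)

for label, table in [("codes only, 365 days", codes_only_365d),
                     ("with pharmacy, all pre-index", with_pharmacy_all)]:
    print(label, {k: round(v, 3) for k, v in accuracy_metrics(table).items()})
```

These inferred counts reproduce the reported point estimates after rounding. Because no non-smoker is flagged in either table, specificity and PPV are both 100%, while every missed smoker lowers sensitivity and NPV.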

Conclusion: Claims-based algorithms can identify smokers with limited sensitivity but very high specificity. In the absence of other reliable means, use of a claims-based algorithm to identify smoking could be cautiously considered in observational studies.

Keywords: claims-based algorithm; pharmacoepidemiology; smoking; validation.

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Aged
  • Algorithms*
  • Databases, Factual / statistics & numerical data*
  • Female
  • Humans
  • Male
  • Medicare / statistics & numerical data*
  • Middle Aged
  • Predictive Value of Tests
  • Prospective Studies
  • Self Report
  • Sensitivity and Specificity
  • Smoking / epidemiology*
  • United States