Setting performance standards for mannequin-based acute-care scenarios: an examinee-centered approach

John R Boulet; David Murray; Joseph Kras; Julie Woodhouse

doi:10.1097/SIH.0b013e31816e39e2

Simul Healthc. 2008 Summer;3(2):72-81. doi: 10.1097/SIH.0b013e31816e39e2.

Authors

John R Boulet¹, David Murray, Joseph Kras, Julie Woodhouse

Affiliation

¹ Foundation for Advancement of International Medical Education and Research, Philadelphia, Pennsylvania 19104, USA. jboulet@faimer.org

PMID: 19088645

Abstract

Background: In medicine, standard setting methodologies have been developed for both selected-response and performance-based assessments. For simulation-based tasks, research efforts have been directed primarily at assessments that incorporate standardized patients. Mannequin-based evaluations often demand complex, time-sensitive, hierarchically ordered, sequential actions that are difficult to evaluate and score. Moreover, collecting reliable proficiency judgments, necessary to estimate meaningful cut points, can be challenging. The purpose of this investigation was to explore whether expert judgments obtained using an examinee-centered standard setting method that was previously validated for standardized patient-based assessments could be used to set defensible standards for acute-care, mannequin-based scenarios.

Methods: Nineteen physicians were recruited to serve as panelists. For each of 12 simulation scenarios, between 8 and 10 performance samples (audio-video recordings), covering the expected ability continuum, were chosen for review. The performance samples were selected from a previously administered evaluation of postgraduate trainees. Based on a consensus definition of readiness to enter unsupervised practice, the panelists made independent judgments of each performance. For each scenario, the association between the panelists' judgments and the assessment scores was summarized and used to estimate a scenario-specific cut score.
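The abstract does not give the exact statistical procedure, so the following is only an illustrative sketch of one common examinee-centered approach, not the authors' method. It assumes that each performance sample has a numeric scenario score and a proportion of panelists who judged it "ready". It fits an ordinary least-squares line to those pairs and takes the cut score to be the score at which the fitted proportion reaches 0.5. The function names, the 0.5 threshold, and the data are hypothetical.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def estimate_cut_score(scores, prop_ready, threshold=0.5):
    """Regress the proportion of 'ready' judgments on the scenario score
    and return the score at which the fitted line crosses `threshold`.

    Raises ValueError when the slope is not positive, i.e. when the
    panelists' judgments do not track the scores and no meaningful
    cut score can be derived from this fit.
    """
    mx, my = mean(scores), mean(prop_ready)
    sxy = sum((x - mx) * (y - my) for x, y in zip(scores, prop_ready))
    sxx = sum((x - mx) ** 2 for x in scores)
    slope = sxy / sxx
    if slope <= 0:
        raise ValueError("judgments do not increase with scores")
    intercept = my - slope * mx
    return (threshold - intercept) / slope


# Hypothetical data for one scenario: 8 performance samples spanning
# the ability continuum (made-up values, not taken from the study).
scores = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
prop_ready = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 1.0]

r = pearson_r(scores, prop_ready)
cut = estimate_cut_score(scores, prop_ready)
print(f"correlation = {r:.2f}, estimated cut score = {cut:.2f}")
```

In this sketch, a weak or negative correlation would signal that the fit should not be trusted for that scenario. The same check explains why performance samples need to span the ability range: if the scores cluster together, the slope is poorly determined.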

Results: For 9 of the scenarios, there was at least a moderately strong relationship between the panelists' aggregate ratings and the performance scores, thus allowing for estimation of meaningful numeric standards. For the other 3 scenarios, the aggregate decision rules used by the panelists did not correspond with the achievement measures. For scenarios independently rated by split panels, the estimated cut scores were similar.

Conclusions: An examinee-centered approach, using aggregate expert judgments of audio-video performances, was suitable for setting standards on most acute-care, mannequin-based scenarios. It is necessary, however, to have valid scores for the chosen scenarios and to sample performances across the ability spectrum.

MeSH terms

Anesthesia*
Anesthesiology / education*
Curriculum
Data Collection
Educational Measurement
Educational Status
Humans
Manikins*
Outcome Assessment, Health Care
Patient Care*
Patient Simulation*
Physicians
Pilot Projects
Program Evaluation
Psychological Tests
Psychometrics
Quality of Health Care*
Regression Analysis
Surveys and Questionnaires