How Do Programs Measure Resident Performance? A Multi-Institutional Inventory of General Surgery Assessments

J Surg Educ. 2021 Nov-Dec;78(6):e189-e195. doi: 10.1016/j.jsurg.2021.08.024. Epub 2021 Sep 28.

Abstract

Objective: To inventory the assessment tools in use at general surgery residency programs and to evaluate their alignment with the Milestone Competencies.

Design: We conducted an inventory of all assessment tools from a sample of general surgery training programs participating in a multi-center study of resident operative development in the United States. Each instrument was categorized with a data extraction tool designed to identify criteria for effective assessment in competency-based education, and according to which Milestone Competency it evaluated. Tabulations of each category were then analyzed using descriptive statistics. Interviews with program directors and assessment coordinators were conducted to understand each instrument's intended use within its program.

Setting: Multi-institutional review of general surgery assessment programs.

Participants: We identified assessment tools used by 10 general surgery programs during the 2019-2020 academic year. Programs were selected from a cohort already participating in a separate research study of resident operative development in the United States.

Results: We identified 42 unique assessment tools in use. Each program used an average of 7.2 (range 4-13) unique assessment instruments to measure performance, of which only 5 (11.9%) were used by at least 1 other program in our sample. Of all assessments, 59.5% were used monthly or less frequently. The majority (66.7%) of instruments were retrospective global assessments rather than assessments of discrete observed performances. Only 4 (9.5%) instruments had established reliability or validity evidence. Across programs there was also significant variation in the volume of assessment used to evaluate residents: the median total number of evaluations per trainee across all Milestone Competencies was 217 (IQR 78) per year. Patient care was the most frequently evaluated Milestone Competency.

Conclusions: General surgery assessment systems predominantly employ non-standardized global assessment tools that lack reliability or validity evidence. This variability makes it challenging to interpret and compare competency standards across programs. A standardized assessment toolkit with established reliability and validity evidence would allow training programs to measure the competence of their trainees more uniformly and to understand where improvements in our training system can be made.

Keywords: Assessment; Competency measures; Milestones; Surgical education.
