Should we trust our judgments about the proficiency of Motivational Interviewing counselors? A glimpse at the impact of low inter-rater reliability

Motiv Interviewing. 2014;1(3):38-41. doi: 10.5195/mitrip.2014.43.

Abstract

Standardized rating systems are often used to evaluate the proficiency of Motivational Interviewing (MI) counselors. The published inter-rater reliability (degree of coder agreement) of these instruments varies widely across studies: some report MI proficiency scores with only fair inter-rater reliability, and others report scores with excellent reliability. How much can we trust scores with fair versus excellent reliability? Using a Monte Carlo statistical simulation, we compared the impact of fair (0.50) versus excellent (0.90) reliability on the error rates of falsely judging a given counselor as MI proficient or not proficient. We found that improving the inter-rater reliability of any given score from 0.50 to 0.90 would markedly reduce proficiency judgment errors, a reduction that would be critical in some MI evaluation situations. We discuss some practical tradeoffs inherent in various MI evaluation situations and offer suggestions for applying findings from formal MI research to problems faced by real-world MI evaluators, to help them minimize the MI proficiency judgment errors that bear the greatest cost.
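To make the comparison concrete, the following is a minimal illustrative sketch of a Monte Carlo comparison of this kind, not the authors' actual simulation. It assumes a classical test theory model in which an observed score is a true score plus normally distributed rater error, with reliability defined as the ratio of true-score variance to observed-score variance. The function name `simulate_error_rates`, the standardized score scale, the cutoff at the population mean, and the sample size are all hypothetical choices made for illustration.

```python
import random


def simulate_error_rates(reliability, cutoff=0.0, n_counselors=20000, seed=0):
    """Estimate proficiency misjudgment rates for a given reliability.

    Assumed model (not taken from the article): observed = true + error,
    with true ~ N(0, 1) and the error variance chosen so that
    var(true) / var(observed) equals `reliability`.

    Returns (false_positive_rate, false_negative_rate), where
    false positive = judged proficient although the true score is below cutoff,
    false negative = judged not proficient although the true score is at or
    above cutoff.
    """
    rng = random.Random(seed)
    true_sd = 1.0
    # From reliability = true_var / (true_var + error_var)
    error_sd = true_sd * ((1.0 - reliability) / reliability) ** 0.5

    false_pos = false_neg = n_below = n_above = 0
    for _ in range(n_counselors):
        true_score = rng.gauss(0.0, true_sd)
        observed = true_score + rng.gauss(0.0, error_sd)
        if true_score >= cutoff:
            n_above += 1
            if observed < cutoff:
                false_neg += 1
        else:
            n_below += 1
            if observed >= cutoff:
                false_pos += 1
    return false_pos / n_below, false_neg / n_above


if __name__ == "__main__":
    for r in (0.50, 0.90):
        fp, fn = simulate_error_rates(r)
        print(f"reliability={r:.2f}  false positive={fp:.3f}  false negative={fn:.3f}")
```

Under these assumptions, both error rates fall as reliability rises from 0.50 to 0.90. The size of the drop depends on where the cutoff sits relative to the distribution of true scores, so the numbers from this sketch should not be read as the article's results.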

Keywords: Motivational Interviewing Treatment Integrity; counselor proficiency; inter-rater reliability; motivational interviewing; proficiency judgments.