Clinical Advice by Voice Assistants on Postpartum Depression: Cross-Sectional Investigation Using Apple Siri, Amazon Alexa, Google Assistant, and Microsoft Cortana

JMIR Mhealth Uhealth. 2021 Jan 11;9(1):e24045. doi: 10.2196/24045.

Abstract

Background: A voice assistant (VA) is inanimate audio-interfaced software augmented with artificial intelligence, capable of 2-way dialogue, and increasingly used to access health care advice. Postpartum depression (PPD) is a common perinatal mood disorder with an annual estimated cost of $14.2 billion. Only a small percentage of PPD patients seek care due to lack of screening and insufficient knowledge of the disease, and this is, therefore, a prime candidate for a VA-based digital health intervention.

Objective: In order to understand the capability of VAs, our aim was to assess VA responses to PPD questions in terms of accuracy, verbal response, and clinically appropriate advice given.

Methods: This cross-sectional study examined four VAs (Apple Siri, Amazon Alexa, Google Assistant, and Microsoft Cortana) installed on two mobile devices in early 2020. We posed 14 questions to each VA that were retrieved from the American College of Obstetricians and Gynecologists (ACOG) patient-focused Frequently Asked Questions (FAQ) on PPD. We scored the VA responses according to accuracy of speech recognition, presence of a verbal response, and clinically appropriate advice in accordance with ACOG FAQ, which were assessed by two board-certified physicians.

Results: Accurate recognition of the query ranged from 79% to 100%. Verbal response ranged from 36% to 79%. If no verbal response was given, queries were treated like a web search between 33% and 89% of the time. Clinically appropriate advice given by VA ranged from 14% to 29%. We compared the category proportions using the Fisher exact test. No single VA statistically outperformed other VAs in the three performance categories. Additional observations showed that two VAs (Google Assistant and Microsoft Cortana) included advertisements in their responses.

Conclusions: While the best performing VA gave clinically appropriate advice to 29% of the PPD questions, all four VAs taken together achieved 64% clinically appropriate advice. All four VAs performed well in accurately recognizing a PPD query, but no VA achieved even a 30% threshold for providing clinically appropriate PPD information. Technology companies and clinical organizations should partner to improve guidance, screen patients for mental health disorders, and educate patients on potential treatment.

Keywords: conversational agent; mental health; mobile health; postpartum depression; virtual assistant; voice assistant.

MeSH terms

  • Adult
  • Artificial Intelligence
  • Communication*
  • Cross-Sectional Studies
  • Delivery of Health Care / methods*
  • Depression, Postpartum / psychology*
  • Depression, Postpartum / therapy
  • Female
  • Humans
  • Internet
  • Mental Health
  • Smartphone
  • Speech Recognition Software
  • Telemedicine*