Utilizing a multi-class classification approach to detect therapeutic and recreational misuse of opioids on Twitter

Comput Biol Med. 2021 Feb:129:104132. doi: 10.1016/j.compbiomed.2020.104132. Epub 2020 Nov 20.

Abstract

Background: Opioid misuse (OM) is a major health problem in the United States, and can lead to addiction and fatal overdose. We sought to employ natural language processing (NLP) and machine learning to categorize Twitter chatter based on the motive of OM.

Materials and methods: We collected data from Twitter using opioid-related keywords, and manually annotated 6988 tweets into three classes: No-OM, Pain-related-OM, and Recreational-OM. The No-OM class represents tweets indicating no use or misuse, while the Pain-related-OM and Recreational-OM classes represent misuse for pain and for recreation/addiction, respectively. We trained and evaluated multi-class classifiers, and performed term-level k-means clustering to assess whether there were terms closely associated with the three classes.
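As a rough illustration of term-level k-means clustering, the sketch below groups a handful of terms by their vector representations. It is not the authors' pipeline: the term names and two-dimensional vectors are invented placeholders (in practice the vectors would be learned embeddings such as Word2Vec), and the helper `kmeans` is a hypothetical, minimal implementation of Lloyd's algorithm.

```python
import math

def kmeans(points, k, iters=20):
    """Minimal Lloyd's k-means. Centroids start at the first k points,
    which keeps this sketch deterministic (an assumption for illustration;
    practical implementations usually use randomized or k-means++ seeding)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        assign = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# Hypothetical term vectors; names and values are placeholders, not study data.
term_vectors = {
    "pain_a": (0.9, 0.1),
    "rec_a": (0.1, 0.9),
    "none_a": (-0.9, -0.8),
    "pain_b": (0.8, 0.2),
    "rec_b": (0.2, 0.8),
    "none_b": (-0.8, -0.9),
}

terms = list(term_vectors)
labels = kmeans([term_vectors[t] for t in terms], k=3)
clusters = dict(zip(terms, labels))
print(clusters)
```

With well-separated vectors like these, terms that share a prefix end up in the same cluster; real embeddings are higher-dimensional and far noisier, so cluster quality would need to be inspected rather than assumed.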

Results: On a held-out test set of 1677 tweets, a transformer-based classifier (XLNet) achieved the best performance, with F1-scores of 0.71 for the Pain-related-OM class and 0.79 for the Recreational-OM class. Macro- and micro-averaged F1-scores over all classes were 0.82 and 0.92, respectively. Content analysis using clustering revealed distinct clusters of terms associated with each class.
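The gap between macro- and micro-averaged F1 is typical of imbalanced data: macro-averaging weights every class equally, whereas micro-averaging pools all decisions and is therefore dominated by the largest class. The sketch below computes both on a small invented label set; the labels and predictions are toy values chosen for illustration and are not taken from the study, and `f1_scores` is a hypothetical helper.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (per_class_f1, macro_f1, micro_f1) for single-label multi-class data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Macro: unweighted mean of per-class F1.
    macro = sum(per_class.values()) / len(labels)
    # Micro: F1 over pooled counts (equals accuracy for single-label data).
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return per_class, macro, micro

# Toy, imbalanced example: the majority class is predicted well,
# the two minority classes less so.
y_true = ["No-OM"] * 8 + ["Pain"] * 2 + ["Rec"] * 2
y_pred = ["No-OM"] * 8 + ["Pain", "No-OM"] + ["Rec", "Pain"]

per_class, macro, micro = f1_scores(y_true, y_pred)
print(per_class, round(macro, 3), round(micro, 3))
```

In this toy case the micro-average is higher than the macro-average because the well-classified majority class contributes most of the pooled counts, which is the same direction as the difference reported above.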

Discussion: While some past studies have attempted to automatically detect opioid misuse, none have further characterized the motive for misuse. Our multi-class classification approach using XLNet showed promising performance, including in detecting the subtle differences between pain-related and recreation-related misuse. The distinct clustering of class-specific keywords may help conduct targeted data collection, overcoming under-representation of minority classes.

Conclusion: Machine learning can help identify pain-related and recreation-related OM content on Twitter, potentially enabling the study of the characteristics of individuals exhibiting such behavior.

Keywords: Classification; Deep learning; Opioid abuse; Opioid misuse; Pain; Recreational opioid misuse; Twitter; Word2Vec.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Analgesics, Opioid / adverse effects
  • Humans
  • Machine Learning
  • Natural Language Processing
  • Opioid-Related Disorders*
  • Social Media*
  • United States

Substances

  • Analgesics, Opioid