Using Artificial Intelligence to Develop a Lexicon-Based African American Tweet Detection Algorithm to Inform Culturally Sensitive Twitter-Based Social Support Interventions for African American Dementia Caregivers

Stud Health Technol Inform. 2022 Jan 14:289:1-4. doi: 10.3233/SHTI210844.

Abstract

We extracted 3,291,101 Tweets using hashtags associated with African American-related discourse (#BlackTwitter, #BlackLivesMatter, #StayWoke) and 1,382,441 Tweets from a control set (general or no hashtags), posted from September 1 to December 31, 2019, using the Twitter API. We also extracted a literary historical corpus of 14,692 poems and prose writings by African American authors and, as a control, 66,083 items authored by others, including poems, plays, short stories, novels and essays, using a cloud-based machine learning platform (Amazon SageMaker) via ProQuest TDM Studio. Lastly, we combined statistics from log-likelihood and Fisher's exact tests with feature analysis of a batch-trained Naive Bayes classifier to select lexicons of the terms most strongly associated with the target or control texts. The resulting Tweet-derived African American lexicon contains 1,734 unigrams, while the control lexicon contains 2,266 unigrams. This initial version of a lexicon-based African American Tweet detection algorithm, developed using Tweet texts, will be useful to inform culturally sensitive Twitter-based social support interventions for African American dementia caregivers.
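The term-selection step can be illustrated with a minimal sketch of log-likelihood keyness scoring. This is not the authors' implementation: it covers only the log-likelihood statistic (not Fisher's exact test or the Naive Bayes feature analysis), uses one common two-term form of the statistic, and assumes whitespace tokenization and a cutoff of 3.84 (the chi-square critical value for p < 0.05 with one degree of freedom). The function names `log_likelihood` and `select_lexicons` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) keyness of one term.

    a, b: occurrences of the term in the target and control corpora.
    c, d: total token counts of the target and control corpora.
    """
    # Expected counts if the term were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def select_lexicons(target_docs, control_docs, threshold=3.84):
    """Split unigrams into target and control lexicons by keyness.

    The threshold is an assumed cutoff, not a value taken from the paper.
    """
    # Naive lowercase whitespace tokenization (an assumption of this sketch).
    target = Counter(tok for doc in target_docs for tok in doc.lower().split())
    control = Counter(tok for doc in control_docs for tok in doc.lower().split())
    c, d = sum(target.values()), sum(control.values())

    target_lex, control_lex = set(), set()
    for term in set(target) | set(control):
        a, b = target[term], control[term]
        if log_likelihood(a, b, c, d) < threshold:
            continue  # not distinctive enough for either lexicon
        # Assign the term to the corpus where its relative frequency is higher.
        if a / c > b / d:
            target_lex.add(term)
        else:
            control_lex.add(term)
    return target_lex, control_lex


# Toy corpora with placeholder tokens, purely for illustration.
target_lex, control_lex = select_lexicons(
    ["foo foo foo foo bar"] * 4,
    ["baz baz baz baz bar"] * 4,
)
```

In this toy run, `bar` is equally frequent in both corpora and so lands in neither lexicon, while `foo` and `baz` are assigned to the target and control lexicons respectively. A fuller pipeline would combine this score with the other statistics named in the abstract before finalizing the lexicons.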

Keywords: caregiver; dementia; disparity; lexicon; social media; unigram.

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Bayes Theorem
  • Black or African American
  • Caregivers
  • Dementia*
  • Humans
  • Social Media*
  • Social Support