A Prospective Comparison of Evidence Synthesis Search Strategies Developed With and Without Text-Mining Tools [Internet]

Robin A. Paynter; Celia Fiordalisi; Elizabeth Stoeger; Eileen Erinoff; Robin Featherstone; Christiane Voisin; Gaelen P. Adam

Review

Rockville (MD): Agency for Healthcare Research and Quality (US); 2021 Mar. Report No.: 21-EHC008.

PMID: 33755394
Bookshelf ID: NBK568625

Excerpt

Background: In an era of explosive growth in biomedical evidence, improving systematic review (SR) search processes is increasingly critical. Text-mining tools (TMTs) are a potentially powerful resource to improve and streamline search strategy development. Two types of TMTs are of particular interest to searchers: word frequency tools (useful for identifying the most frequently used keyword terms, e.g., PubReMiner) and clustering tools (useful for visualizing common themes, e.g., Carrot2).

Objectives: This study compared the benefits and trade-offs of searches developed with and without TMTs for evidence synthesis products in real-world settings. Specific questions included: (1) Do TMTs decrease the time spent developing search strategies? (2) How do TMTs affect the sensitivity and yield of searches? (3) Do TMTs identify groups of records that can be safely excluded in the search evaluation step? (4) Does the complexity of a systematic review topic affect TMT performance? In addition to quantitative data, we collected librarians’ comments on their experiences using TMTs to explore when and how these new tools may be useful in systematic review search creation.

Methods: In this prospective comparative study, we included seven SR projects and classified each as a simple or complex topic. The project librarian used conventional “usual practice” (UP) methods to create the MEDLINE search strategy, while a paired TMT librarian simultaneously and independently created a search strategy using a variety of TMTs. TMT librarians could choose one or more freely available TMTs from a pre-selected list in each of three categories: (1) keyword/phrase tools: AntConc, PubReMiner; (2) subject term tools: MeSH on Demand, PubReMiner, Yale MeSH Analyzer; and (3) strategy evaluation tools: Carrot2, VOSviewer. We collected results from both MEDLINE searches (with and without TMTs), coded each citation’s origin (UP or TMT), deduplicated the combined set, and then sent the citation library to the review team for screening. When the draft report was submitted, we used the final list of included citations to calculate the sensitivity, precision, and number-needed-to-read for each search (with and without TMTs). Separately, we tracked the time each librarian spent on various aspects of search creation. Simple and complex topics were analyzed separately to provide insight into whether TMTs could be more useful for one type of topic than the other.
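The evaluation step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the conventional definitions (sensitivity = included citations retrieved / all included citations; precision = included citations retrieved / all citations retrieved; number-needed-to-read = 1 / precision), which the abstract does not spell out. The function names, the `SearchMetrics` class, and the citation identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchMetrics:
    sensitivity: float
    precision: float
    number_needed_to_read: float


def merge_with_origin(up_ids, tmt_ids):
    """Deduplicate two result sets, recording which search(es) found each record."""
    origin = {}
    for cid in up_ids:
        origin.setdefault(cid, set()).add("UP")
    for cid in tmt_ids:
        origin.setdefault(cid, set()).add("TMT")
    return origin


def evaluate_search(retrieved, included):
    """Score one search against the final list of included citations.

    Assumed (conventional) definitions, not taken from the report:
      sensitivity = hits / number of included citations
      precision   = hits / number of retrieved citations
      NNR         = 1 / precision
    """
    retrieved, included = set(retrieved), set(included)
    if not retrieved or not included:
        raise ValueError("retrieved and included must both be non-empty")
    hits = len(retrieved & included)
    sensitivity = hits / len(included)
    precision = hits / len(retrieved)
    nnr = 1 / precision if precision > 0 else float("inf")
    return SearchMetrics(sensitivity, precision, nnr)


# Hypothetical citation identifiers, for illustration only.
up_ids = ["a", "b", "c", "d"]    # retrieved by the usual-practice search
tmt_ids = ["c", "d", "e"]        # retrieved by the text-mining search
included = ["a", "c", "e"]       # final included citations after screening

library = merge_with_origin(up_ids, tmt_ids)
up_metrics = evaluate_search(up_ids, included)
tmt_metrics = evaluate_search(tmt_ids, included)

# Included citations found only by the TMT search.
tmt_unique_included = {
    cid for cid, src in library.items() if src == {"TMT"} and cid in included
}
```

Keeping the origin of each record in the merged library is what makes it possible to report, per search, both the metrics and the eligible citations that only one of the two searches found.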

Results: Across all reviews, UP searches appeared to perform better than TMT searches, but because of the small sample size, none of the differences was statistically significant. UP searches were slightly more sensitive (92% [95% confidence interval (CI) 85–99%]) than TMT searches (84.9% [95% CI 74.4–95.4%]). The mean number-needed-to-read was 83 (SD 34) for UP and 90 (SD 68) for TMT. Developing keywords and subject terms generally took less time with TMTs than with UP alone. On average, UP librarians took 12 hours (SD 8) to create a complete search strategy, and TMT librarians took 5 hours (SD 2). TMTs neither affected search evaluation time nor improved identification of exclusion concepts (irrelevant records) that can be safely removed from the search set.

Conclusion: Across all reviews but one, TMT searches were less sensitive than UP searches. For simple SR topics (i.e., single indication–single drug), TMT searches were slightly less sensitive, but reduced time spent in search design. For complex SR topics (e.g., multicomponent interventions), TMT searches were less sensitive than UP searches; nevertheless, in complex reviews, they identified unique eligible citations not found by the UP searches. TMT searches also reduced time spent in search strategy development. For all evidence synthesis types, TMT searches may be more efficient in reviews where comprehensiveness is not paramount, or as an adjunct to UP for evidence syntheses, because they can identify unique includable citations. If TMTs were easier to learn and use, their utility would be increased.

Publication types

Review

Grants and funding

Prepared for: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services, 5600 Fishers Lane, Rockville, MD 20857; www.ahrq.gov
Contract No. 290-2017-00003C
Prepared by: Scientific Resource Center, Portland, OR