Retrieval and classification of dental research articles

W C Bartling; T K Schleyer; S Visweswaran

doi:10.1177/154407370301700126

Adv Dent Res. 2003 Dec:17:115-20. doi: 10.1177/154407370301700126.

Authors

W C Bartling¹, T K Schleyer, S Visweswaran

Affiliation

¹ Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA. wcb@cbmi.upmc.edu

PMID: 15126221
DOI: 10.1177/154407370301700126

Abstract

Retrieving a corpus of literature on a broad topic can be difficult. This study demonstrates a method for retrieving the dental and craniofacial research literature. We manually explored MeSH for dental or craniofacial indexing terms, searched MEDLINE using these terms, and extracted a random sample of references from the resulting set. Sixteen dental research experts, reading only the title and abstract, categorized each article as one of: (1) dental research, (2) dental non-research, (3) non-dental, or (4) not sure. Identify Patient Sets (IPS), a probabilistic text classifier, built models that distinguished dental research articles from all others based on the presence or absence of words or UMLS phrases. These models were applied to a test set with one of three inputs for each article: (1) title and abstract only, (2) MeSH terms only, or (3) both. Using title and abstract only, IPS correctly classified 64% of all dental research articles present in the test set (recall), and 71% of the articles it retrieved as dental research were correctly classified (precision). Including MeSH terms decreased performance. Computer programs that categorize articles from text input may aid retrieval of a broad corpus of literature better than indexing terms or key words alone.
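
The abstract does not say which algorithm IPS uses internally, so the following is only a rough sketch of the general idea: a probabilistic classifier that scores an article by the presence or absence of each vocabulary word, followed by the two measures the abstract reports (the 64% figure reads as recall, the 71% figure as precision). It swaps in a Bernoulli naive Bayes model as a stand-in, not as a description of IPS. The class name `BernoulliNB`, the helpers `tokenize` and `recall_precision`, the labels `dental_research` and `other`, and the toy titles are all hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase words present in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


class BernoulliNB:
    """Stand-in presence/absence classifier (assumption: not the IPS algorithm)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = set().union(*(tokenize(d) for d in docs))
        self.n_docs = Counter(labels)          # documents per class
        self.total = len(docs)
        self.df = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):   # per-class document frequency
            self.df[label].update(tokenize(doc))
        return self

    def log_posterior(self, doc, c):
        present = tokenize(doc)
        score = math.log(self.n_docs[c] / self.total)
        for word in self.vocab:
            # Laplace-smoothed probability that the word appears in class c
            p = (self.df[c][word] + 1) / (self.n_docs[c] + 2)
            score += math.log(p if word in present else 1 - p)
        return score

    def predict(self, doc):
        return max(self.classes, key=lambda c: self.log_posterior(doc, c))


def recall_precision(y_true, y_pred, positive):
    """Recall and precision for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Invented toy titles, for illustration only (not data from the study).
train_docs = [
    "caries prevalence in a randomized trial of fluoride varnish",
    "enamel bond strength study of dental composites",
    "editorial on dental practice management",
    "cardiac surgery outcomes in elderly patients",
]
train_labels = ["dental_research", "dental_research", "other", "other"]

model = BernoulliNB().fit(train_docs, train_labels)
predicted = model.predict("fluoride varnish trial for enamel caries")

# Invented labels showing how the two measures can differ.
y_true = ["dental_research"] * 4 + ["other"] * 2
y_pred = ["dental_research", "dental_research", "other", "other",
          "dental_research", "other"]
recall, precision = recall_precision(y_true, y_pred, "dental_research")
```

Recall here is the share of true dental research articles that the classifier found, while precision is the share of articles labeled dental research that really were. Changing the input passed to `tokenize` (title and abstract, indexing terms, or both) would be one way to imitate the three input conditions the abstract compares.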

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Abstracting and Indexing
Classification
Dental Research* / classification
Humans
Information Storage and Retrieval / methods*
MEDLINE
Periodicals as Topic
Sensitivity and Specificity
Subject Headings
Unified Medical Language System