Working words: real-life lexicon of North American workers

J Occup Environ Med. 2005 Aug;47(8):859-64. doi: 10.1097/01.jom.0000169095.16779.66.

Abstract

Objective: This study describes a new computerized method for analyzing workers' free-text work descriptions.

Methods: Computerized lexical analysis was applied to work descriptions of participants in the Lung Health Study, a smoking-cessation study in persons with early chronic obstructive pulmonary disease. Text was parsed and analyzed as single term roots and pairs of roots commonly occurring together.
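The parsing step described above can be illustrated with a minimal sketch. It is not the study's software: the suffix list, the crude `root` stemmer, the three-letter minimum word length, and the function names are all assumptions made for illustration. It reduces each word to a rough root, then counts single roots and unordered pairs of roots that occur in the same description.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical suffix list for a deliberately crude stemmer (an assumption,
# not the stemming procedure used in the study).
SUFFIXES = ("ing", "ion", "er", "or", "es", "ed", "s")


def root(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(descriptions):
    """Count single roots and co-occurring root pairs per description."""
    singles, pairs = Counter(), Counter()
    for text in descriptions:
        words = re.findall(r"[a-z]+", text.lower())
        # Drop very short words (assumed to be function words) and
        # de-duplicate so each root counts once per description.
        roots = sorted({root(w) for w in words if len(w) > 2})
        singles.update(roots)
        pairs.update(combinations(roots, 2))
    return singles, pairs


if __name__ == "__main__":
    singles, pairs = analyze(["truck driver", "drives a delivery truck"])
    print(singles.most_common(3))
    print(pairs[("driv", "truck")])
```

With the two made-up descriptions in the usage lines, "driver" and "drives" both reduce to the root "driv", so the pair ("driv", "truck") is counted twice. A real analysis would likely use an established stemmer and a stop-word list rather than this suffix table.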

Results: The frequencies of terms reflect the work of a population; our subjects' most frequently used terms included "sale, office, service, business, engine[er], secretary, construct, driv[e], comput[e], teach, truck." Standard classification schemes (NAICS and SOC) and textbooks use terms inconsistent with those of actual workers. Many common empirical terms carry both industry and job information, although traditional coding schemes separate industry and job title.

Conclusions: Formal analyses of language may facilitate communication, identify translation priorities, and allow automated work coding.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Female
  • Humans
  • Interviews as Topic
  • Job Description*
  • Language*
  • Male
  • Natural Language Processing*
  • North America
  • Occupational Medicine / classification
  • Pulmonary Disease, Chronic Obstructive / prevention & control
  • Randomized Controlled Trials as Topic
  • Smoking Cessation
  • Textbooks as Topic
  • User-Computer Interface
  • Vocabulary*
  • Work / classification*