Objectives: To evaluate methods for building data frameworks that support the application of AI to large-scale datasets in women's health studies.
Methods: We developed methods for transforming raw data into a data framework suitable for applying machine learning (ML) and natural language processing (NLP) techniques to predict falls and fractures.
Results: Prediction of falls was higher in women than in men. Information extracted from radiology reports was converted to a matrix suitable for machine learning. For fractures, we applied specialized algorithms to extract snippets containing meaningful terms from dual-energy X-ray absorptiometry (DXA) scans, usable for predicting fracture risk.
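As an illustration of converting report text to a matrix, the following is a minimal sketch and not the study's actual pipeline. It assumes a simple term-count representation; the example reports and the names `tokenize` and `build_term_matrix` are hypothetical and introduced only for illustration.

```python
import re
from collections import Counter

# Hypothetical example reports; these are not data from the study.
reports = [
    "Compression fracture of L1 vertebra. No acute hip fracture.",
    "Low bone density at the femoral neck. No fracture seen.",
]


def tokenize(text):
    """Lowercase a report and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_term_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    # Sorted vocabulary gives a stable column order across runs.
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


vocabulary, matrix = build_term_matrix(reports)
```

Each row of `matrix` corresponds to one report and each column to one term in `vocabulary`, which is the shape most ML libraries expect as input. Plain term counts ignore negation (for example, "No fracture" still counts toward "fracture"), so a practical pipeline would likely need additional NLP steps.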
Discussion: The life cycle of data, from raw to analytic form, includes data governance, cleaning, management, and analysis. To apply AI, data must be prepared carefully to reduce algorithmic bias.
Conclusion: Algorithmic bias is harmful to research that uses AI methods. Building AI-ready data frameworks that improve efficiency can be especially valuable for women's health.