Can Big Data Science Deliver Precision Public Health?

Posted on by Muin J. Khoury, Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia; Michael Engelgau, George A. Mensah, Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, Bethesda, Maryland; David A. Chambers, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland

This blog is a quick summary of our recent paper in Public Health Genomics. Increasingly, a large volume of health-related and non-health-related data from multiple sources is becoming available, with the potential to drive health-related discoveries and implementation. The term “big data” is often used as a buzzword to refer to large data sets that require new data science approaches to manipulation, analysis, interpretation, and integration. Such data include genomic and other biomarkers, sociodemographic, environmental, geographic, and other information. Our ability to improve population health depends to a large extent on collecting and analyzing the best available population-level data on the burden and causes of disease distribution, as well as the level of uptake of evidence-based interventions that can improve health for all.

The emerging abundance of data and its associated predictive analytics can contribute to precision public health by bringing more extensive information, organized by person, place and time, into public health assessment of disease burden, of facilitators and barriers to implementing evidence-based interventions, and of outcome measures.

More Precision Assessment

Place: The use of big data sources could allow a more in-depth analysis of disease burden and implementation gaps and disparities in healthcare systems and population subgroups. For example, using small area analysis, we might be able to uncover pockets of disparities in implementation of health interventions that are often masked in analysis performed on areas such as counties or states.
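As a rough illustration of how aggregation can mask a local gap, the sketch below computes uptake of an intervention at two levels of geography. The counties, tracts, counts, the 50% threshold, and the helper names `uptake_by` and `flag_low_uptake` are all hypothetical and chosen only to show the arithmetic; they do not come from the paper.

```python
# Hypothetical records: (county, census tract, eligible people, people reached).
# All names and numbers are invented for illustration.
records = [
    ("County A", "tract 1", 1000, 800),
    ("County A", "tract 2", 1000, 780),
    ("County A", "tract 3", 200, 60),
    ("County B", "tract 4", 900, 630),
    ("County B", "tract 5", 1100, 770),
]


def uptake_by(rows, key_index):
    """Return uptake (reached / eligible) grouped by the chosen column."""
    totals = {}
    for row in rows:
        key = row[key_index]
        eligible, reached = totals.get(key, (0, 0))
        totals[key] = (eligible + row[2], reached + row[3])
    return {key: reached / eligible for key, (eligible, reached) in totals.items()}


def flag_low_uptake(rates, threshold):
    """List the areas whose uptake falls below the threshold."""
    return sorted(area for area, rate in rates.items() if rate < threshold)


county_uptake = uptake_by(records, 0)  # coarse geography
tract_uptake = uptake_by(records, 1)   # small-area geography

# Assumed cutoff for "low uptake"; a real analysis would justify this choice.
print(flag_low_uptake(county_uptake, 0.5))
print(flag_low_uptake(tract_uptake, 0.5))
```

In this made-up example, both counties sit at or above 70% uptake, so neither is flagged, while the tract-level view flags the one small tract at 30%. Real small-area estimates would also need to account for the instability of rates based on small counts.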

Person: Similarly, in characterizing gaps and disparities in implementation and outcomes, personal characteristics of patients, providers and policy makers can be further refined beyond the use of traditional indicators such as age, gender, and race/ethnicity. Genomic and other biomarkers can stratify disease outcomes and susceptibility into subgroups that reflect the underlying disease heterogeneity and potential response to different types of interventions.

Time: Big data may also improve precision through analysis of repeated measurements of the same variables over time. Personal devices such as sensors, smartphones and other digital devices can measure variability over time for various health indicators such as nutrition, physical activity, and blood pressure.

More Precision Implementation

Place: Implementation studies evaluate delivery of interventions in real-world contexts of health care delivery systems and communities, with the goal of delivering interventions optimally across populations. Tools of predictive analytics and big data can help identify major challenges for implementation, including key barriers and facilitators within the socioecological context, various health and community policies, and delivery strategies within health systems.

Person: To reach subpopulations with unique health conditions, targeted intervention strategies will be needed. For example, a decision support tool was recently developed using a machine-learning algorithm based on structured and unstructured data to help identify individuals with probable familial hypercholesterolemia within electronic health records, large-scale laboratories and claims databases.

Time: Smartphone apps can use big data to allow real-world collection and analysis over time for many evidence-based interventions (e.g., testing adherence to medication use and measuring longer-term outcomes). Apps could serve as a microcosm of a learning system that collects data on person, place and time and uses the patterns detected to adjust an intervention based on its overall pattern of use and effectiveness.

The Role and Challenges of Predictive Analytics

To maximize the benefits of big data in precision public health, robust data science methods are needed for individual studies and to synthesize information across studies. Machine learning and predictive analytic tools are increasingly used in healthcare and population health settings to make sense of the large amount of data, both for assessment and implementation purposes. In principle, predictive analytics can provide novel approaches to analyze disease prediction and forecasting models and to pinpoint key barriers and facilitators to delivery of proven effective interventions.

There are numerous gaps and methodologic limitations that need to be overcome before big data can fulfill the promise of precision public health. For example, issues involving data inaccuracy, missing data, and selective measurement are substantial concerns that can potentially affect predictive modeling results and decision-making. In addition, deficiencies in model calibration can interfere with inferences.
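To make the calibration concern concrete, here is a minimal sketch of one common check: group individuals by predicted risk and compare the average predicted risk in each group with the observed event rate. The predictions, outcomes, number of bins, and the function name `calibration_table` are hypothetical and serve only as an illustration, not as a method described in the paper.

```python
def calibration_table(predicted, observed, n_bins=2):
    """Compare mean predicted risk with observed event rate in equal-width risk bins.

    Returns a list of (mean_predicted, observed_rate, group_size) tuples,
    one per non-empty bin, ordered from lowest to highest predicted risk.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        # A predicted risk of exactly 1.0 goes into the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))

    table = []
    for group in bins:
        if not group:
            continue
        mean_predicted = sum(p for p, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        table.append((mean_predicted, observed_rate, len(group)))
    return table


# Invented model outputs (predicted risk) and outcomes (1 = event occurred).
predicted = [0.1, 0.2, 0.2, 0.3, 0.7, 0.8, 0.8, 0.9]
observed = [0, 0, 0, 1, 0, 1, 0, 1]

for mean_predicted, observed_rate, size in calibration_table(predicted, observed):
    print(f"n={size}  mean predicted={mean_predicted:.2f}  observed={observed_rate:.2f}")
```

In this invented data, the higher-risk group has an average predicted risk of about 0.80 but an observed event rate of 0.50, the kind of overprediction that could misdirect decisions about where to target an intervention.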

Conclusion

In the age of big data, more extensive information by place, person and time is becoming available to measure public health impact and implementation needs. In principle, big data could point to implementation gaps and disparities and accelerate the evaluation of implementation strategies to reach the population groups most in need of interventions. However, major challenges need to be overcome. For precision public health to succeed, further advances in predictive analytics and practical tools for data integration and visualization are needed. As most public health and implementation scientists are not well versed in big data science, it will be crucial to offer robust training and career development at the intersection of big data and public health.


Page last reviewed: April 9, 2024
Page last updated: April 9, 2024