Integrating Predictive Models into Public Health Policy: Forecasting Lead Exposure Risks Across the United States
Abstract
Lead exposure remains one of the most persistent and preventable environmental health risks in the United States, disproportionately affecting low-income and minority communities. Conventional risk assessment tools are largely reactive: they detect contamination only after exposure has already occurred. This paper combines predictive modeling with county-level environmental and census data to identify the communities most susceptible to lead exposure and to support proactive public health policy interventions. Using data from the Environmental Protection Agency (EPA), the U.S. Census Bureau, and the Centers for Disease Control and Prevention (CDC), several machine learning models, including Random Forest and Gradient Boosting, were trained to examine the associations among environmental quality indicators, socioeconomic variables, and demographic factors. The models achieved high predictive accuracy, with housing age, median household income, racial composition, and proximity to industrial discharge sites among the key predictors. Spatial mapping revealed concentrated high-risk clusters in older urban and post-industrial areas, underscoring structural inequalities in environmental protection. The results highlight the potential of data-driven forecasting to inform targeted preventive action, resource allocation, and infrastructure investment, enabling a shift from reactive response to proactive policy. Finally, the study offers a framework for incorporating predictive analytics into national health decision-making systems to support equitable and sustainable strategies for reducing lead exposure throughout the United States.