CS6a: Panel: Community Counts: Data Science for Equity and Public Health

Conference: Women in Statistics and Data Science 2025
11/14/2025: 11:35 AM - 1:00 PM EST
Panel 

Description

From food deserts to foster care systems, data science and statistical innovation, paired with deep community engagement, can illuminate disparities, inform policy, and improve health outcomes. These efforts require not only thoughtful analytic approaches but also a deep understanding of the social context in which data are generated and decisions are made, so that statistical tools are both technically sound and truly responsive to community needs.

This session, sponsored by the Caucus for Women in Statistics and Data Science, highlights how data scientists and statisticians are driving meaningful change through work rooted in real-world questions of access, equity, and community well-being. Alexis Fleming will present an R Shiny dashboard built from complex child welfare data to empower policymakers and stakeholders across Tennessee with accessible, actionable insights. Ashley Mullan will tackle the challenge of misclassified measures of food access in diabetes research, proposing a new maximum likelihood estimator that corrects for bias while preserving efficiency. Madhumita Ghosh Dastidar will share findings from the RAND PHRESH study, a large-scale natural experiment evaluating whether opening a supermarket in an underserved neighborhood improves residents' diets, offering a nuanced look at the long-term effects of community-level health interventions.

Keywords

Food deserts

Foster system

Child welfare

Diabetes

R Shiny

Diet 

Organizer

Sarah Lotspeich, Wake Forest University

Target Audience

Mid-Level

Tracks

Knowledge
Women in Statistics and Data Science 2025

Presentations

We CANS do it: Translating Complex Child Welfare Data into Actionable Insights with R Shiny

Analyzing child welfare assessment data presents unique challenges, particularly when insights must be communicated across diverse geographic levels and to a wide range of stakeholders. Dynamic, interactive tools that let users select variables of interest and explore data through customizable tables and visualizations are therefore essential for those working in the child welfare space. The R package Shiny enables the creation of browser-based, server-hosted applications for data examination and visualization, giving users access to up-to-date data insights with a single click. Given the complexity of integrating multiple datasets, the open-source tool RStudio is an ideal choice for combining and cleaning the data and for developing a tool to present findings from the merged, cleaned dataset.

This presentation will showcase an R Shiny application developed through a collaboration between the Vanderbilt University Center of Excellence for Children in State Custody and the Department of Biostatistics. The application demonstrates analyses of Child and Adolescent Needs and Strengths (CANS) assessment data merged with custody and placement data from across Tennessee, featuring customizable tables, plots, and heat maps at the county, regional, and statewide levels. The session will also cover practical strategies for data cleaning, integration, and visualization, offering insights into both the technical development and applied use of the tool in a real-world policy setting.
 

Speaker

Alexis Fleming

Linking Potentially Misclassified Healthy Food Access to Diabetes Prevalence

Access to healthy food is key to maintaining a healthy lifestyle and can be quantified by the distance to the nearest grocery store. However, calculating this distance forces a trade-off between cost and correctness. Accurate route-based distances following passable roads are cost-prohibitive, while simple straight-line distances ignoring infrastructure and natural barriers are accessible yet error-prone. Categorizing low-access neighborhoods based on these straight-line distances induces misclassification, which introduces bias into standard regression models used to estimate the relationship between disease prevalence and access. Yet fully observing the more accurate, route-based food access measure is often impossible, which creates a missing data problem. We combat this bias and address this missingness with a new maximum likelihood estimator for Poisson regression with a binary, misclassified exposure (access to healthy food within some threshold), where the misclassification may depend on additional error-free covariates. In simulations, we show the consequence of ignoring the misclassification (bias) and how the proposed estimator corrects for it while preserving more statistical efficiency than the complete case analysis (i.e., modeling only fully observed data). Finally, we apply our estimator to model the relationship between census tract diabetes prevalence and access to healthy food in northwestern North Carolina.
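To make the likelihood idea in this abstract concrete, here is a minimal Python sketch of an observed-data log-likelihood for a Poisson outcome with a binary, error-prone exposure. It is not the estimator presented in the talk: it assumes constant classification probabilities (`ppv`, `npv`) rather than ones that depend on covariates, omits any offset, and every name and number in it is hypothetical.

```python
import math


def poisson_logpmf(y, mu):
    """Log of the Poisson probability mass function at count y with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def loglik_row(y, x_star, x_true, beta0, beta1, ppv, npv):
    """Log-likelihood contribution of one unit (e.g., one census tract).

    y      : observed count outcome
    x_star : error-prone binary exposure (always observed)
    x_true : accurate binary exposure, or None when it was not validated
    ppv    : assumed P(X = 1 | X* = 1); npv : assumed P(X = 0 | X* = 0)
    """
    p_x1 = ppv if x_star == 1 else 1.0 - npv
    terms = []
    for x, p_x in ((0, 1.0 - p_x1), (1, p_x1)):
        if x_true is not None and x != x_true:
            continue  # validated units contribute only their observed exposure
        if p_x == 0.0:
            continue  # impossible exposure value under the assumed error model
        mu = math.exp(beta0 + beta1 * x)
        terms.append(math.log(p_x) + poisson_logpmf(y, mu))
    if not terms:
        return float("-inf")
    # Log-sum-exp: unvalidated units sum over both possible true exposures.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical parameter values and data, for illustration only.
ll_unvalidated = loglik_row(y=3, x_star=1, x_true=None,
                            beta0=0.5, beta1=0.4, ppv=0.8, npv=0.9)
ll_validated = loglik_row(y=3, x_star=1, x_true=1,
                          beta0=0.5, beta1=0.4, ppv=0.8, npv=0.9)
print(ll_unvalidated, ll_validated)
```

Summing these contributions over all units and maximizing over the parameters would give a maximum likelihood fit under these simplified assumptions; a complete case analysis would instead drop every unit with `x_true=None`.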
 

Speaker

Ashley Mullan, Vanderbilt University

Community Investments and Diet-related Outcomes: A longitudinal study of residents of two urban neighborhoods

Diet is a social determinant of health. Low-income and racial minority populations are at increased risk for mental illness, chronic disease, higher mortality, and lower life expectancy. They also often have limited access to healthy foods. Evidence suggests that the neighborhood food environment may influence diet and obesity. To promote healthy eating for improved health, the federal government financed the Healthy Food Financing Initiative to incentivize supermarkets to open in neighborhoods lacking access to fresh food (i.e., food deserts).

Funded by NCI, the RAND PHRESH study was designed as the largest natural experiment (n=1,372) to assess whether opening a supermarket in a food desert actually affects healthy eating. We leveraged a longitudinal pre-post design with an "intervention" neighborhood, combining qualitative information with quantitative methods to identify a "counterfactual". We used a community-based participatory research (CBPR) approach to support study success, including enrollment and retention. We conducted a rigorous evaluation using difference-in-differences and instrumental variables methods with the rich data to estimate the "intervention effect" on residents' diet. While we found short-term positive effects, they disappeared in the long term. On the other hand, there were large, sustained effects on secondary outcomes such as perceptions and neighborhood satisfaction. Diet, like all health behaviors, is hard to change. A combination of individual and neighborhood-level interventions may be necessary to bring about sustained dietary changes.
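As a rough illustration of the difference-in-differences logic named in this abstract, the Python sketch below computes the basic two-group, two-period estimate from group means. It is not the PHRESH analysis, which also used instrumental variables and richer data; the function name and all scores are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate from lists of outcomes."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # Subtracting the comparison group's change removes the shared time trend.
    return treated_change - control_change


# Hypothetical diet-quality scores (illustrative values, not study data).
intervention_pre = [48.0, 50.0, 52.0]
intervention_post = [53.0, 55.0, 57.0]
comparison_pre = [47.0, 49.0, 51.0]
comparison_post = [49.0, 51.0, 53.0]

effect = diff_in_diff(intervention_pre, intervention_post,
                      comparison_pre, comparison_post)
print(f"DiD estimate: {effect:.1f}")
```

The estimate is only interpretable as an intervention effect if the two neighborhoods would have followed parallel trends without the intervention, which is why the choice of comparison neighborhood matters.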
 

Speaker

Madhumita Ghosh Dastidar, RAND Corporation