AI, Statistics, and Data Science In Practice

Nancy McMillan Chair
Battelle
 
Nancy McMillan Organizer
Battelle
 
Monday, Aug 4: 10:30 AM - 12:20 PM
0799 
Topic-Contributed Paper Session 
Music City Center 
Room: CC-202A 

Keywords

causal AI

causal inference

active learning

reinforcement learning

marketing

large language model 

Applied

Yes

Main Sponsor

Business and Economic Statistics Section

Co Sponsors

Section on Statistical Consulting
Section on Statistics in Marketing

Presentations

Aligning Large Language Models with Heterogeneous Human Preferences: How Statistics Helps LLMs

Aligning large language models (LLMs) with human preferences is essential for improving generative AI systems. However, the heterogeneity of human feedback—due to varying contexts, expertise, and individual preferences—presents significant challenges in reward learning. This talk presents a dual active learning framework for reinforcement learning from human feedback (RLHF), which efficiently selects both conversations and teachers based on a D-optimal design. This strategy improves the reward learning process by minimizing generalized variance and optimizing the use of available feedback. Through theoretical analysis and extensive experiments, we demonstrate that our methods achieve superior alignment of LLMs with diverse human preferences. 
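
As a rough illustration of the D-optimal idea mentioned in the abstract, the sketch below greedily picks items that most increase the determinant of an information matrix, which is equivalent to shrinking the generalized variance. It is not the speaker's dual selection method: it assumes a plain linear model in which every candidate (for example, a conversation-teacher pair) is summarized by a feature vector, and the names `greedy_d_optimal`, `candidates`, `budget`, and `ridge` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def quad_form(a_inv: Matrix, x: Sequence[float]) -> float:
    """Return x^T A^{-1} x, the determinant gain factor minus one."""
    d = len(x)
    return sum(x[i] * a_inv[i][j] * x[j] for i in range(d) for j in range(d))


def sherman_morrison(a_inv: Matrix, x: Sequence[float]) -> Matrix:
    """Return (A + x x^T)^{-1} given the symmetric inverse A^{-1}."""
    d = len(x)
    u = [sum(a_inv[i][j] * x[j] for j in range(d)) for i in range(d)]  # A^{-1} x
    denom = 1.0 + sum(x[i] * u[i] for i in range(d))
    return [[a_inv[i][j] - u[i] * u[j] / denom for j in range(d)] for i in range(d)]


def greedy_d_optimal(
    candidates: Sequence[Sequence[float]], budget: int, ridge: float = 1e-3
) -> List[int]:
    """Greedily choose `budget` candidate indices to maximize det(ridge*I + sum x x^T).

    By the matrix determinant lemma, det(A + x x^T) = det(A) * (1 + x^T A^{-1} x),
    so each step selects the candidate with the largest x^T A^{-1} x.
    """
    d = len(candidates[0])
    # Start from a small ridge term so the information matrix is invertible.
    a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    chosen: List[int] = []
    remaining = set(range(len(candidates)))
    for _ in range(min(budget, len(candidates))):
        best = max(sorted(remaining), key=lambda k: quad_form(a_inv, candidates[k]))
        chosen.append(best)
        remaining.remove(best)
        a_inv = sherman_morrison(a_inv, candidates[best])
    return chosen


if __name__ == "__main__":
    # Toy feature vectors (illustrative only, not from the talk).
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
    print(greedy_d_optimal(toy, budget=2))
```

The greedy rule favors candidates that point in directions the current design has not yet covered, which is why near-duplicate feature vectors are rarely selected together.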

Keywords

Large language models

Optimal design

Reinforcement learning from human feedback 

Speaker

Will Wei Sun, Purdue University

Generative AI in action

We explore the architectural foundations and practical applications of generative AI, with particular emphasis on the transformer-based models that have reshaped natural language processing. We begin by examining the fundamental principles of transformer architectures, highlighting the key innovations in self-attention mechanisms and parallel processing that enable richer contextual understanding. The discussion then turns to concrete applications, specifically information extraction and conversational systems, where we analyze the shift from traditional rule-based approaches to neural architectures. Through detailed technical exposition, we demonstrate how generative AI has simplified complex NLP pipelines while significantly improving performance metrics. The presentation culminates in a live demonstration showcasing these capabilities in action. This comprehensive overview bridges theoretical frameworks. 
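
The self-attention mechanism referred to in the abstract can be sketched as single-head scaled dot-product attention. This is a minimal, dependency-free illustration rather than anything from the talk: it omits learned projections, masking, and multiple heads, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(
    queries: Sequence[Sequence[float]],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> Matrix:
    """Compute softmax(Q K^T / sqrt(d_k)) V for one attention head."""
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    output: Matrix = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output row is a weighted average of the value rows.
        row = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        output.append(row)
    return output


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [1.0, 0.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(scaled_dot_product_attention(q, k, v))
```

Because every query row is handled independently of the others, the loop over queries can run in parallel, which is the property the abstract alludes to.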

Speaker

Yijun Wei

An Overview of Causal AI in Business Practices

I will first discuss why we should care about causal inference or causal AI and then provide an overview of three major schools of thought and their key approaches from the disciplines of Statistics, Epidemiology, Computer Science, and Economics/Econometrics. The techniques developed in these disciplines have been applied in a wide range of areas, including medical sciences, economics, political science, education, and business analytics. Their real-world successes have been recognized with a Turing Award (2011) and Nobel Prizes in Economics (2019, 2021). Extension to uplift modeling will also be covered in this seminar. Various real-life applications based on different approaches will be used for illustration. 

Speaker

Victor Lo, Fidelity Investments

Marketing Funnel Causal Inference in Financial Services

In marketing optimization, the marketing funnel or buyer's journey is a common heuristic through which marketing departments analyze their audiences. This work explores whether awareness, familiarity, and satisfaction, measured through survey responses, are causal mediators of the relationship between exogenous factors like market share and marketing spend, and an outcome of interest like sales. An example with open-source U.S. government banking data is presented to illustrate the Bayesian causal mediation analysis. 
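
To make the mediation idea concrete, the sketch below decomposes a total effect into direct and indirect parts. It deliberately swaps in the simpler frequentist product-of-coefficients approach for linear models, not the Bayesian analysis described in the abstract. The variable roles (`x` as marketing spend, `m` as a survey-based awareness score, `y` as sales) and the toy numbers are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence


def _centered(values: Sequence[float]) -> List[float]:
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_mediation(
    x: Sequence[float], m: Sequence[float], y: Sequence[float]
) -> Dict[str, float]:
    """Product-of-coefficients mediation for one exposure and one mediator.

    Fits m ~ x (slope a) and y ~ x + m (slopes direct, b) by least squares,
    then reports indirect = a * b and total = direct + indirect.
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)
    det = sxx * smm - sxm * sxm
    if sxx == 0 or det == 0:
        raise ValueError("x is constant or x and m are perfectly collinear")
    a = sxm / sxx                          # effect of x on the mediator
    direct = (smm * sxy - sxm * smy) / det  # effect of x on y, holding m fixed
    b = (sxx * smy - sxm * sxy) / det       # effect of m on y, holding x fixed
    indirect = a * b
    return {"a": a, "b": b, "direct": direct,
            "indirect": indirect, "total": direct + indirect}


if __name__ == "__main__":
    # Synthetic toy data: m = 2x + noise, y = 0.5x + 3m (no outcome noise).
    spend = [0.0, 1.0, 2.0, 3.0]
    awareness = [1.0, 1.0, 3.0, 7.0]
    sales = [3.0, 3.5, 10.0, 22.5]
    print(linear_mediation(spend, awareness, sales))
```

The decomposition has a causal reading only under strong assumptions, such as no unmeasured confounding of the exposure-mediator and mediator-outcome relationships; the code itself only computes regression coefficients.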

Speaker

Robert Carnell, Huntington

COVID-19 Focused Cost-benefit Analysis of Public Health Emergency Preparedness and Crisis Response Programs

Background
The United States (US) Centers for Disease Control and Prevention (CDC) Division of State and Local Readiness (DSLR) plays a crucial role in supporting state, local, and territorial governments through the Public Health Emergency Preparedness (PHEP) cooperative agreement program. During the COVID-19 pandemic, CDC's DSLR extended additional financial support to bolster response efforts through the Public Health Crisis Response (PHCR) cooperative agreement. We used data on PHEP and PHCR program implementation from within the population of funded recipients and external measures of COVID-19 response effectiveness to assess the cost and benefit of the PHEP and the PHCR programs on the COVID-19 response through a cost-benefit analysis.

Methods
Annual workplans and progress reports provided significant components of the program implementation information from both PHEP and PHCR; natural language processing (NLP) was used to create structured program implementation features. The relationship between recipient-reported PHEP and PHCR implementation features (activities and outputs planned or achieved vs. not planned or achieved) and externally measured outcomes that represent an effective response to the COVID-19 pandemic was assessed using path analysis and lasso regression models. Outcomes assessed included time to implement control measures, availability of COVID-19 therapeutics, COVID-19 tests and vaccines administered, and hospital bed availability. The benefits associated with specific implementation decisions, such as funding allocation decisions and planned activities and outputs, were estimated for statistically significant relationships.

Results
Activities and outputs were associated with faster non-essential business closures, earlier implementation of mask mandates, more frequent reporting to the public, administering more COVID-19 tests, and maintaining a larger availability of hospital beds and COVID-19 therapeutics during surges. Additionally, funding allocations to four of the six preparedness capability domain areas (countermeasures and mitigation, incident management, information management, and surge management) were associated with the ability to administer more COVID-19 tests and vaccines and maintain increased hospital bed availability during peak surges.

Conclusions
PHEP and PHCR funding had measurable positive effects on recipients' ability to respond to the COVID-19 pandemic effectively. Ongoing efforts in specific areas of public health emergency preparedness will improve future responses to COVID-19-like events.
 

Keywords

public health emergency preparedness

cost-benefit analysis

program evaluation

COVID-19

PHEP 

Speaker

Nancy McMillan, Battelle