Evaluating Demographic Fairness in Prompt-Based Behavioral Inference Using LLMs: A Smart Charging Case Study

Matthew Dean, Co-Author
University of California, Irvine

Wenlong Jin, Co-Author
University of California, Irvine

Chenyu Yuan, Speaker
University of California, Irvine
 
Sunday, Aug 3: 4:30 PM - 4:55 PM
Invited Paper Session 
Music City Center 

Description

While established econometric approaches use latent variables to model attitudes and improve model fit, we propose a prompt-based framework that uses large language models (LLMs) to provide additional insight into the complex reasoning processes surrounding smart charging adoption. Our approach analyzes structured survey profiles to infer behavioral reasoning about smart charging interest and topically relevant attitudes (e.g., privacy, cost, and trust). We evaluate three prompting strategies (zero-shot, chain-of-thought, and self-consistency) and assess the fairness of LLM outputs across demographic attributes such as race, income, and age, using demographic parity, total variation distance, equalized odds, and equality of opportunity. Early findings indicate that while LLMs produce more neutral outputs than human survey responses at the population level, certain prompting strategies can amplify subgroup disparities. These results caution against equating output moderation with fairness and highlight the need for multi-level equity diagnostics in human-centered predictive modeling for energy policy. Ongoing work explores causal prompt design and uncertainty-aware inference to improve interpretability and policy sensitivity.
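To make the fairness criteria named above concrete, the following is a minimal sketch of how demographic parity, total variation distance, and equalized odds could be computed over LLM-inferred responses grouped by a demographic attribute. The variable names, label values, and toy data are hypothetical illustrations, not the authors' actual pipeline or data.

```python
# Hypothetical sketch of subgroup fairness diagnostics; not the authors' code.
from collections import Counter

def demographic_parity_gap(preds, groups, positive="interested"):
    """Largest difference in P(pred == positive) across groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rates[g] = sum(preds[i] == positive for i in idx) / len(idx)
    return max(rates.values()) - min(rates.values()), rates

def total_variation_distance(preds_a, preds_b):
    """TVD between two empirical distributions over prediction labels."""
    labels = set(preds_a) | set(preds_b)
    pa, pb = Counter(preds_a), Counter(preds_b)
    return 0.5 * sum(abs(pa[l] / len(preds_a) - pb[l] / len(preds_b))
                     for l in labels)

def equalized_odds_gaps(preds, truth, groups, positive="interested"):
    """Per-group true/false positive rates; returns (TPR gap, FPR gap)."""
    def rate(idx, cond):
        sel = [i for i in idx if cond(i)]
        return (sum(preds[i] == positive for i in sel) / len(sel)
                if sel else 0.0)
    tprs, fprs = {}, {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        tprs[g] = rate(idx, lambda i: truth[i] == positive)
        fprs[g] = rate(idx, lambda i: truth[i] != positive)
    return (max(tprs.values()) - min(tprs.values()),
            max(fprs.values()) - min(fprs.values()))

# Toy usage: LLM-inferred interest vs. survey responses, by income group.
preds  = ["interested", "neutral", "interested", "neutral", "interested"]
truth  = ["interested", "interested", "neutral", "neutral", "interested"]
income = ["low", "low", "high", "high", "high"]

gap, rates = demographic_parity_gap(preds, income)
print("demographic parity gap:", gap, rates)
print("TVD (low vs. high):",
      total_variation_distance(
          [p for p, g in zip(preds, income) if g == "low"],
          [p for p, g in zip(preds, income) if g == "high"]))
print("equalized odds gaps (TPR, FPR):",
      equalized_odds_gaps(preds, truth, income))
```

Equality of opportunity follows the same pattern as equalized odds but compares only the true-positive rates across groups.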