Monday, Aug 4: 10:30 AM - 12:20 PM
0112
Invited Paper Session
Music City Center
Room: CC-202B
Applied
Yes
Main Sponsor
Section on Statistics in Marketing
Presentations
Researchers often wish to understand how an event affected individual behavior (e.g., customer spending over time); oftentimes, such events affect the entire population of interest simultaneously, leaving no "control group" for comparison (e.g., the COVID-19 pandemic, a viral marketing campaign, or a national regulatory change).
In such settings, the researcher can infer causal effects by forecasting the counterfactual baseline (i.e., what would have happened without the event) based on pre-event trends. These inferences depend on accurate forecasts of baseline behavior. In the customer base setting, researchers can observe multiple "cohorts" of customers who made their first purchase at different times. Exploiting regularity in behavior across cohorts, data from older cohorts can be used to help predict the behavior of younger cohorts, enabling good forecasts of the baseline.
Recent work has proposed methods for such settings that rely on the assumption that different cohorts follow parallel time trends. In this work, we develop an alternative method that relaxes the parallel trends assumption. Specifically, we propose a hierarchical Bayesian model that uses nonparametric Gaussian processes to model each cohort's spending over time while pooling information across cohorts based on data-driven inferences of cohort similarity. We benchmark our approach against prior methods and show that it can achieve superior validity and precision of causal estimates, particularly when the parallel trends assumption is not exactly satisfied.
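The core counterfactual-forecasting idea can be illustrated with a minimal sketch: fit a model to a cohort's pre-event spending, forecast the post-event baseline, and take the gap between observed and forecast spending as the causal effect. This is not the authors' hierarchical Bayesian model (there is no cross-cohort pooling here); it uses a single off-the-shelf Gaussian process on simulated data purely to show the mechanics, and all variable names and numbers are illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
weeks = np.arange(26, dtype=float).reshape(-1, 1)
pre, post = weeks[:20], weeks[20:]

# Simulated weekly spend for one cohort: flat baseline near 10,
# plus a +3 lift after the event at week 20.
y_pre = 10 + 0.1 * rng.standard_normal(20)
y_post_observed = 13 + 0.1 * rng.standard_normal(6)

# GP fit on pre-event data only; the event never enters the training set.
gp = GaussianProcessRegressor(kernel=RBF(5.0) + WhiteKernel(0.01),
                              normalize_y=True, random_state=0)
gp.fit(pre, y_pre)

# Counterfactual baseline forecast, with uncertainty, for post-event weeks.
baseline, sd = gp.predict(post, return_std=True)

# Estimated causal effect: observed spend minus forecast baseline.
effect = y_post_observed - baseline
```

The posterior standard deviation `sd` grows with forecast horizon, which is exactly the baseline-forecast uncertainty the abstract's hierarchical pooling across cohorts is designed to reduce.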
This paper aims to uncover market structure, with a focus on complementary and substitutable relationships, within a large set of products. While understanding market structure has played a crucial role in designing new products, repositioning existing products, and planning marketing actions such as pricing, the extant literature has mostly focused on learning market structure for a small subset of products or at an aggregated level (e.g., brand, category). We seek to overcome this limitation by using a modern graph representation learning technique, the Variational Graph Auto-Encoder (VGAE). Specifically, we extend the VGAE, which has primarily been used to learn synergistic and antagonistic effects among a large set of molecules in Computational Biology, to learn complementary and substitutable relationships among a large set of products.
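A VGAE's forward pass is compact enough to sketch in a few lines: a graph-convolutional encoder maps an adjacency matrix and node features to a mean and log-standard-deviation per node, a latent embedding is sampled via the reparameterization trick, and an inner-product decoder scores every product pair as a link probability. The sketch below is a single untrained forward pass on a toy product graph in plain numpy (the paper's extension to complement/substitute relationship types is not reproduced); all weights and the toy graph are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy product co-purchase graph: 6 products, adjacency A, identity features X.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1
X = np.eye(6)

# Symmetrically normalized adjacency with self-loops (standard GCN propagation).
A_hat = A + np.eye(6)
deg = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(deg, deg))

# One shared GCN layer, then separate heads for the mean and log-std of z.
W0 = 0.1 * rng.standard_normal((6, 8))
W_mu = 0.1 * rng.standard_normal((8, 2))
W_ls = 0.1 * rng.standard_normal((8, 2))

h = np.maximum(A_norm @ X @ W0, 0)                 # ReLU GCN layer
mu = A_norm @ h @ W_mu
log_std = A_norm @ h @ W_ls
z = mu + np.exp(log_std) * rng.standard_normal(mu.shape)  # reparameterization

# Inner-product decoder: predicted link probability for each product pair.
probs = 1.0 / (1.0 + np.exp(-(z @ z.T)))
```

Training would maximize the usual VGAE objective (reconstruction likelihood of observed edges minus a KL penalty on the latent embeddings); the learned `z` then places related products near each other in latent space.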
Keywords
Market structure
Graph representation learning
Speaker
Mingyung Kim, Fisher College of Business, the Ohio State University
Credit scores are integral to financial inclusion and credit access for consumers. Companies model credit risk and use credit scores to evaluate individual consumers' creditworthiness across diverse sectors including banking, lending, insurance, utilities, and rentals. Despite this widespread application, a substantial segment of the population, including minorities, young adults, recent immigrants, and those in lower-income neighborhoods, remains "unscorable" or "credit invisible" due to insufficient or non-existent credit history.

We introduce a novel application of deep kernel learning within a Gaussian process regression framework to increase the number of scorable consumers and broaden credit access. This methodology is motivated by the need to statistically rationalize missingness in the credit history data collected by credit rating agencies, a critical issue that traditional credit scoring models fail to accommodate effectively. We apply our method to a comprehensive dataset encompassing 600,000 U.S. consumers and over 3,000 credit history report attributes. We undertake a counterfactual analysis on the welfare implications of earlier scoring for individuals deemed conventionally unscorable.

Our findings challenge prevailing assumptions of a stark transition in the accuracy and precision of credit risk models when a consumer transitions from unscorable to scorable. Our results reveal a more nuanced reality: the transition at this "boundary of scorability" is smoother than commonly perceived, suggesting that current credit scoring practices might be overly conservative. Our findings offer the potential for earlier and more inclusive scoring while maintaining the fidelity of credit risk models, benefiting consumers currently unscorable by conventional methods, as well as firms seeking to serve these consumers.
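Deep kernel learning composes a neural feature extractor with a standard GP kernel, so the kernel operates on a learned representation rather than on raw attributes. The sketch below shows that composition on simulated data: inputs pass through a small network `phi` before a standard RBF Gaussian process is fit on the output. Note the simplifications: in genuine deep kernel learning the network is trained jointly with the GP hyperparameters by maximizing the marginal likelihood, whereas here `phi` is a fixed, randomly initialized stand-in, and the data are synthetic, not credit-bureau attributes.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

# Toy stand-in for credit-history attributes (80 consumers x 5 features)
# and a continuous risk outcome to be modeled.
X = rng.standard_normal((80, 5))
y = X @ np.array([0.5, -0.3, 0.2, 0.0, 0.1]) + 0.05 * rng.standard_normal(80)

# "Deep kernel": push inputs through a (here untrained) two-layer network phi,
# then place an ordinary RBF GP on the learned representation, i.e.
# k(x, x') = RBF(phi(x), phi(x')).
W1 = 0.3 * rng.standard_normal((5, 16))
W2 = 0.3 * rng.standard_normal((16, 4))
phi = lambda X: np.tanh(np.maximum(X @ W1, 0) @ W2)

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(0.01),
                              normalize_y=True, random_state=0)
gp.fit(phi(X[:60]), y[:60])

# Posterior mean and standard deviation for held-out consumers; the sd is
# what lets a scoring model express how confident it is near sparse histories.
mean, sd = gp.predict(phi(X[60:]), return_std=True)
```

The posterior uncertainty `sd` is the feature relevant to the abstract's "boundary of scorability": rather than a hard scorable/unscorable cutoff, the GP degrades gracefully as a consumer's history thins.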
Keywords
credit scoring
missing data
Gaussian process
Deep kernel learning
Speaker
Longxiu Tian, University of North Carolina Kenan-Flagler Business School
Digital marketers are struggling to measure campaign effectiveness due to the loss of customer-level tracking, rendering multi-touch attribution models obsolete. Moreover, constantly running experiments may be a costly alternative if effectiveness changes over time. As a consequence, firms have turned to using classic measurement tools like media mix models, which have always been built on potentially endogenous aggregate measures of campaign spend and performance.
We propose a Unified Marketing Measurement (UMM) framework that allows us to measure time-varying marketing effectiveness. Methodologically, we use a modern Bayesian nonparametrics framework that fuses the (available) experiments with aggregate media-mix data and leverages the exogenous variation in experiments to de-bias a media mix model. Using Gaussian processes, our model smoothly regularizes ad effectiveness over time, allowing the experiments to separate marketing effectiveness from the correlation between sales and ad spending near each experiment.
Our modeling framework also provides uncertainty quantification on ad effectiveness, which can be leveraged to determine if further experiments are needed. Using a series of simulations, we show the conditions for properly inferring ad effectiveness over time. We further show that endogeneity bias in observational data induces higher posterior uncertainty on the effectiveness and structural correlation estimates, which does not decrease with more observational data. This means we can use posterior uncertainty quantification to diagnose when additional experiments are needed.
Keywords
Marketing Mix Models
Probabilistic Machine Learning
Bayesian nonparametrics
Experiments