Probabilistic Machine Learning in Marketing

Chair: Hortense Fong
Organizer: Hortense Fong
 
Monday, Aug 4: 10:30 AM - 12:20 PM
Session 0112, Invited Paper Session
Music City Center, Room CC-202B

Applied: Yes

Main Sponsor

Section on Statistics in Marketing

Presentations

A Bayesian Approach to Inferring the Effects of Events Using Cohorted Data

Researchers often wish to understand how an event affected individual behavior (e.g., customer spending over time). Often, such events affect the entire population of interest simultaneously, leaving no "control group" for comparison (e.g., the COVID-19 pandemic, a viral marketing campaign, or a national regulatory change).
In such settings, the researcher can infer causal effects by forecasting the counterfactual baseline (i.e., what would have happened without the event) from pre-event trends. These inferences depend on accurate forecasts of baseline behavior. In the customer base setting, researchers observe multiple "cohorts" of customers who made their first purchase at different times. By exploiting regularity in behavior across cohorts, data from older cohorts can be used to help predict the behavior of younger cohorts, enabling good forecasts of the baseline.
Recent work has proposed methods for such settings that rely on the assumption that different cohorts follow parallel time trends. In this work, we develop an alternative method that relaxes the parallel trends assumption. Specifically, we propose a hierarchical Bayesian model that uses nonparametric Gaussian processes to model each cohort's spending over time while pooling information across cohorts based on data-driven inferences of cohort similarity. We benchmark our approach against prior methods and show that it can achieve superior validity and precision of causal estimates, particularly when the parallel trends assumption is not exactly satisfied.
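
To make the baseline-forecasting logic concrete, here is a minimal single-cohort sketch of the idea: fit a Gaussian process to pre-event spending and treat its post-event forecast as the counterfactual. It uses scikit-learn rather than the paper's hierarchical Bayesian model, and the data, event week, and kernel choice are all hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

# Hypothetical data: one cohort's weekly spend, with an event at week 30
rng = np.random.default_rng(0)
weeks = np.arange(52, dtype=float).reshape(-1, 1)
spend = 10 + 2 * np.sin(weeks.ravel() / 6) + rng.normal(0, 0.3, 52)
event = 30
spend[event:] -= 1.5  # simulated post-event drop in spending

# GP fit on pre-event data only; the kernel choice is illustrative
kernel = ConstantKernel() * RBF(length_scale=5.0) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(weeks[:event], spend[:event])

# Counterfactual baseline forecast and implied causal effect
baseline, sd = gp.predict(weeks[event:], return_std=True)
effect = spend[event:] - baseline
print(f"average post-event effect: {effect.mean():.2f}")
```

The paper's contribution is what this sketch omits: a hierarchy that pools the per-cohort GPs according to inferred cohort similarity, so that older cohorts sharpen the baseline forecasts for younger ones.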

Co-Author(s)

Shin Oblander, University of British Columbia
Leyao Tan, University of British Columbia

Speaker

Shin Oblander, University of British Columbia

Graph Representation Learning for Inferring Market Structure

This paper aims to uncover market structure, with a focus on complementary and substitutable relationships, within a large set of products. While understanding market structure plays a crucial role in designing new products, repositioning existing products, and planning marketing actions such as pricing, the extant literature has mostly focused on learning market structure for a small subset of products or at an aggregated level (e.g., brand, category). We seek to overcome this limitation by using a modern graph representation learning technique, the Variational Graph Auto-Encoder (VGAE). Specifically, we extend the VGAE, which has primarily been used to learn synergistic and antagonistic effects among large sets of molecules in computational biology, to learn complementary and substitutable relationships among a large set of products.
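
As a rough illustration of the underlying machinery, the sketch below trains a vanilla VGAE with PyTorch Geometric to embed products from an observed relationship graph. The product features, edges, and dimensions are hypothetical, and the paper's extension, distinguishing complementary from substitutable links, would require a richer (e.g., edge-typed) decoder than the inner-product one used here.

```python
import torch
from torch_geometric.nn import GCNConv, VGAE

class Encoder(torch.nn.Module):
    """Two-layer GCN encoder returning the mean and log-std of node embeddings."""
    def __init__(self, in_channels, hidden, latent):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden)
        self.conv_mu = GCNConv(hidden, latent)
        self.conv_logstd = GCNConv(hidden, latent)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index).relu()
        return self.conv_mu(h, edge_index), self.conv_logstd(h, edge_index)

# Hypothetical inputs: product features and an observed relationship edge list
num_products, num_features = 100, 16
x = torch.randn(num_products, num_features)
edge_index = torch.randint(0, num_products, (2, 400))  # toy random edges

model = VGAE(Encoder(num_features, 32, 8))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    optimizer.zero_grad()
    z = model.encode(x, edge_index)
    # Reconstruction of observed links plus KL regularization
    loss = model.recon_loss(z, edge_index) + model.kl_loss() / num_products
    loss.backward()
    optimizer.step()

# Decoded edge probabilities can be read as relationship strengths
probs = model.decoder.forward_all(z, sigmoid=True)
```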

Keywords

Market structure

Graph representation learning 

Speaker

Mingyung Kim, Fisher College of Business, The Ohio State University

Thin But Not Forgotten: Deep Kernel Learning for Credit Risk Modeling with High-Dimensional Missingness

Credit scores are integral to financial inclusion and credit access for consumers. Companies model credit risk and use credit scores to evaluate individual consumers' creditworthiness across diverse sectors including banking, lending, insurance, utilities, and rentals. Despite this widespread application, a substantial segment of the population, including minorities, young adults, recent immigrants, and those in lower-income neighborhoods, remains 'unscorable' or 'credit invisible' due to insufficient or non-existent credit history. We introduce a novel application of deep kernel learning within a Gaussian process regression framework to increase the number of scorable consumers and broaden credit access. This methodology is motivated by the need to statistically rationalize missingness in the credit history data collected by credit rating agencies, a critical issue that traditional credit scoring models fail to accommodate effectively. We apply our method to a comprehensive dataset encompassing 600,000 U.S. consumers and over 3,000 credit history report attributes. We undertake a counterfactual analysis of the welfare implications of earlier scoring for individuals deemed conventionally unscorable. Our findings challenge prevailing assumptions of a stark transition in the accuracy and precision of credit risk models when a consumer moves from unscorable to scorable. Our results reveal a more nuanced reality: the transition at this 'boundary of scorability' is smoother than commonly perceived, suggesting that current credit scoring practices might be overly conservative. Our findings offer the potential for earlier and more inclusive scoring while maintaining the fidelity of credit risk models, benefiting consumers who are currently unscorable by conventional methods, as well as firms seeking to serve these consumers.
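
For readers unfamiliar with deep kernel learning, the following GPyTorch sketch shows the basic construction: a neural network maps (possibly incomplete) credit attributes into a low-dimensional space on which a GP kernel operates. The data, network architecture, and missing-indicator handling are hypothetical stand-ins, not the authors' specification.

```python
import torch
import gpytorch

class DKLCreditModel(gpytorch.models.ExactGP):
    """GP regression on features learned by a neural network (deep kernel)."""
    def __init__(self, train_x, train_y, likelihood, latent_dim=2):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = torch.nn.Sequential(
            torch.nn.Linear(train_x.size(-1), 64), torch.nn.ReLU(),
            torch.nn.Linear(64, latent_dim),
        )
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=latent_dim))

    def forward(self, x):
        z = self.feature_extractor(x)  # kernel operates on learned features
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z))

# Hypothetical data: credit attributes with missing-indicator columns appended,
# so the network can learn from the missingness pattern itself
n, d = 500, 20
train_x = torch.randn(n, d)
mask = torch.rand(n, d // 2) < 0.3  # ~30% missing in half the columns
train_x[:, : d // 2] = train_x[:, : d // 2].masked_fill(mask, 0.0)  # zero-impute
train_x = torch.cat([train_x, mask.float()], dim=1)  # indicators carry missingness
train_y = torch.randn(n)  # stand-in for a credit risk outcome

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = DKLCreditModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
likelihood.train()
for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()
```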

Keywords

Credit scoring

Missing data

Gaussian process

Deep kernel learning

Speaker

Longxiu Tian, University of North Carolina Kenan-Flagler Business School

Unified Marketing Measurement: How to Fuse Experimental Data with Marketing Mix Data?

Digital marketers are struggling to measure campaign effectiveness due to the loss of customer-level tracking, which has rendered multi-touch attribution models obsolete. Moreover, constantly running experiments can be a costly alternative if effectiveness changes over time. As a consequence, firms have turned to classic measurement tools like media mix models, which have always been built on potentially endogenous aggregate measures of campaign spend and performance.

We propose a Unified Marketing Measurement (UMM) framework that allows us to measure time-varying marketing effectiveness. Methodologically, we use a modern Bayesian nonparametrics framework that fuses the (available) experiments with aggregate media mix data and leverages the exogenous variation in experiments to de-bias a media mix model. Using Gaussian processes, our model smoothly regularizes ad effectiveness over time, allowing the experiments to separate marketing effectiveness from the correlation between sales and ad spending near each experiment.

Our modeling framework also provides uncertainty quantification on ad effectiveness, which can be leveraged to determine whether further experiments are needed. Using a series of simulations, we show the conditions under which ad effectiveness over time can be properly inferred. We further show that endogeneity bias in observational data induces higher posterior uncertainty on the effectiveness and structural correlation estimates, and that this uncertainty does not decrease with more observational data. This means posterior uncertainty quantification can be used to diagnose when additional experiments are needed.
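
A stylized version of this fusion can be written down in a few lines of NumPyro: a GP prior lets ad effectiveness drift over time, the aggregate media mix likelihood ties it to sales, and experimental lift estimates anchor it at the experiment periods. This is a simplified sketch on simulated data; it omits, among other things, the paper's explicit modeling of the structural correlation that drives endogeneity bias.

```python
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def rbf_cov(t, var, ls, jitter=1e-4):
    """Squared-exponential covariance over time points."""
    d2 = (t[:, None] - t[None, :]) ** 2
    return var * jnp.exp(-0.5 * d2 / ls**2) + jitter * jnp.eye(len(t))

def umm_model(t, spend, sales, exp_idx, exp_lift, exp_se):
    # GP prior: ad effectiveness beta_t varies smoothly over time
    var = numpyro.sample("var", dist.HalfNormal(1.0))
    ls = numpyro.sample("ls", dist.HalfNormal(10.0))
    beta = numpyro.sample("beta", dist.MultivariateNormal(
        jnp.zeros(len(t)), covariance_matrix=rbf_cov(t, var, ls)))
    a = numpyro.sample("intercept", dist.Normal(0.0, 5.0))
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.0))
    # Observational media mix likelihood (spend may be endogenous)
    numpyro.sample("sales", dist.Normal(a + beta * spend, sigma), obs=sales)
    # Experimental lift estimates anchor beta at the experiment periods
    numpyro.sample("lift", dist.Normal(beta[exp_idx], exp_se), obs=exp_lift)

# Hypothetical weekly data with two incrementality experiments
T = 52
t = jnp.arange(T, dtype=jnp.float32)
spend = jnp.abs(random.normal(random.PRNGKey(0), (T,)))
true_beta = 0.5 + 0.3 * jnp.sin(t / 8.0)
sales = 1.0 + true_beta * spend + 0.1 * random.normal(random.PRNGKey(1), (T,))
exp_idx, exp_se = jnp.array([10, 40]), jnp.array([0.05, 0.05])
exp_lift = true_beta[exp_idx]

mcmc = MCMC(NUTS(umm_model), num_warmup=500, num_samples=500)
mcmc.run(random.PRNGKey(2), t, spend, sales, exp_idx, exp_lift, exp_se)
```

The posterior spread of "beta" away from the experiment weeks illustrates the diagnostic role of uncertainty described above: where it stays wide, another experiment would be informative.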

Keywords

Marketing mix models

Probabilistic machine learning

Bayesian nonparametrics

Experiments

Co-Author(s)

Ryan Dew, University of Pennsylvania, Wharton School
Nicolas Padilla, London Business School

Speaker

Nicolas Padilla, London Business School