08 Data Science in Practice - Hacks to Make Things Work

Tim Hesterberg Speaker
Instacart
 
Sunday, Aug 4: 8:30 PM - 9:25 PM
Invited Posters 
Oregon Convention Center 
Instacart is a small but complicated company with a four-sided marketplace (customers, shoppers, retailers, and advertisers), which makes for interesting data science problems. We run many experiments to test improvements, and we want to run them faster. We reduce variance using covariate adjustment, which corrects for differences between control and treatment groups in covariates such as pre-period metrics. However, there are difficulties in practice. We can't run a regression on the full dataset; instead, we combine regression on a subset with updates on the full data, with minimal loss of variance reduction. Ratio metrics (e.g. sum(w_i y_i) / sum(w_i)) present additional difficulties: weighted regression is inconsistent, while performing separate linear regressions for the numerator and denominator is consistent but gives poor variance reduction. A multiplicative+additive model is efficient and consistent, but it has no closed-form solution and is ill-conditioned, requiring a creative implementation.
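
To make the covariate-adjustment idea concrete, here is a minimal sketch in Python. It is not the implementation behind the poster: the abstract does not say how the subset regression and the full-data update are combined, so this version assumes the simplest reading, in which a single slope is fitted on a random subset and then applied to per-arm means computed over all rows. The function name covariate_adjusted_effect, its parameters, and the simulated data are hypothetical, and ratio metrics are not covered.

```python
import numpy as np


def covariate_adjusted_effect(y, x, treated, subset_size=None, seed=0):
    """Return (raw, adjusted, theta) for a two-arm experiment.

    y: outcome metric, x: pre-period covariate, treated: boolean arm flag.
    All names here are illustrative assumptions, not the author's API.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    n = y.size

    # Assumed shortcut: fit the slope on a random subset of rows only.
    if subset_size is None or subset_size >= n:
        idx = np.arange(n)
    else:
        idx = np.random.default_rng(seed).choice(n, size=subset_size, replace=False)
    xs, ys, ts = x[idx], y[idx], treated[idx]

    # Pooled within-arm slope: center x and y by arm so that the
    # treatment effect itself does not leak into theta.
    xc = xs - np.where(ts, xs[ts].mean(), xs[~ts].mean())
    yc = ys - np.where(ts, ys[ts].mean(), ys[~ts].mean())
    theta = float(xc @ yc / (xc @ xc))

    # The full-data step needs only per-arm means of y and x.
    raw = y[treated].mean() - y[~treated].mean()
    adjusted = raw - theta * (x[treated].mean() - x[~treated].mean())
    return float(raw), float(adjusted), theta


# Simulated example: the outcome depends strongly on the pre-period covariate.
rng = np.random.default_rng(42)
n = 20_000
x = rng.normal(size=n)
treated = rng.random(n) < 0.5
y = 2.0 * x + 0.1 * treated + rng.normal(scale=0.5, size=n)

raw, adjusted, theta = covariate_adjusted_effect(y, x, treated, subset_size=2_000)
print(f"raw={raw:.4f} adjusted={adjusted:.4f} theta={theta:.3f}")
```

Because the adjustment only subtracts theta times the between-arm difference in covariate means, a slightly noisy theta from the subset costs little: the adjusted estimate stays unbiased under randomization for any fixed theta, and only the amount of variance reduction depends on how close theta is to the full-data slope.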