13 - Fast resampling methods for massive Generalized Linear Models

Conference: Women in Statistics and Data Science 2022
10/07/2022: 2:30 PM - 4:00 PM CDT
Speed 
Room: Grand Ballroom Salon G 

Description

Residual bootstrap is a widely used method in linear regression for assessing the quality of relevant estimators. Moulton & Zeger (1990) extended the idea of residual bootstrap to the class of Generalized Linear Models (GLMs), a wider class that includes the linear regression model along with other commonly used models such as logistic, Poisson and probit regression. However, as massive datasets become more common, ordinary residual bootstrap techniques are becoming computationally demanding and hence less feasible. Some computationally efficient alternatives to the bootstrap exist in the literature, such as the 'm out of n bootstrap' by Bickel et al. (2012), the 'Bag of Little Bootstraps' by Kleiner et al. (2014) and the 'Subsampled Double Bootstrap' by Sengupta et al. (2016). However, residual bootstrap is not yet known to have direct extensions to these methods.

In our work, we introduce a Subsampled Residual Bootstrap (SRB) strategy applicable to GLMs, which is much more computationally efficient than residual bootstrap and hence more feasible under a stringent time budget. We establish the consistency of SRB estimators under mild assumptions. Finally, we demonstrate the computational advantages of our method through numerical simulations.
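
For readers unfamiliar with the baseline, the following is a minimal sketch of the ordinary residual bootstrap in the linear regression case only. It is not the SRB procedure, whose details are not given in this abstract. The function name `residual_bootstrap_se` and its parameters are hypothetical, chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def residual_bootstrap_se(X, y, n_boot=500, seed=0):
    """Illustrative residual-bootstrap standard errors for least-squares coefficients.

    X is an (n, p) design matrix (include a column of ones for an intercept),
    y is a length-n response vector. Names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Fit the model once on the full data and keep fitted values and residuals.
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta_hat
    resid = y - fitted
    resid = resid - resid.mean()  # centre the residuals before resampling

    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        # Resample residuals with replacement, rebuild a response, and refit.
        y_star = fitted + rng.choice(resid, size=n, replace=True)
        boot[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]

    # The spread of the refitted coefficients estimates their standard errors.
    return beta_hat, boot.std(axis=0, ddof=1)
```

Each of the `n_boot` replicates refits the model on all n observations, which is the cost that becomes demanding for massive datasets.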

Keywords

Residual bootstrap

Subsampling

Computational efficiency

Big data

Generalized linear model 

Presenting Author

Indrila Ganguly, North Carolina State University

First Author

Indrila Ganguly, North Carolina State University

CoAuthor

Srijan Sengupta, North Carolina State University

Target Audience

Mid-Level

Tracks

Knowledge
Women in Statistics and Data Science 2022