PACE: Privacy Aware Collaborative Estimation for Heterogeneous GLMs

Srijan Sengupta, Co-Author
North Carolina State University

Aritra Mitra, Co-Author
North Carolina State University

Bhaskar Ray, First Author and Presenting Author
Thursday, Aug 7: 9:35 AM - 9:50 AM
2182 
Contributed Papers 
Music City Center 
When sensitive data are collected across multiple sites, restrictions on data sharing can hinder statistical estimation and inference. The seminal paper on Federated Learning proposed Federated Averaging (FedAvg) to perform maximum likelihood estimation without pooling raw data. However, FedAvg and related parameter-estimation algorithms can yield erroneous estimates or fail to converge when the model is heterogeneous across sites. We propose a novel parameter-estimation method for a broad class of Generalized Linear Models in which the sites form clusters: sites within a cluster draw data from the same distribution, while the true parameter values may differ across clusters. The method accounts for the uncertainty in both the local maximum likelihood estimator and the optimization-algorithm iterates, and leverages established concentration inequalities to provide non-asymptotic risk bounds. We classify sites via a hypothesis-test-type procedure based on one-shot estimates and use the resulting inference to perform decentralized collaborative estimation, improving upon local estimation with high probability. We also prove the asymptotic accuracy of the clustering algorithm and the consistency of the resulting estimates. We validate our results with simulation studies.
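To make the pipeline described above concrete, the following is a minimal, illustrative Python sketch of the general idea: each site fits a local GLM (here, logistic regression) by maximum likelihood, sites are grouped by comparing their one-shot estimates against a threshold tau (a crude stand-in for the paper's hypothesis-test-type classification; tau, the helper names, and the weighted-average aggregation are all assumptions for illustration, not the PACE algorithm itself).

```python
import numpy as np
from scipy.optimize import minimize

def local_mle(X, y):
    """One-shot local estimate: logistic-regression MLE at a single site."""
    def neg_loglik(beta):
        eta = X @ beta
        return np.sum(np.log1p(np.exp(eta)) - y * eta)
    return minimize(neg_loglik, np.zeros(X.shape[1]), method="BFGS").x

def cluster_sites(estimates, tau):
    """Group sites whose local estimates lie within tau of each other.
    (Simplified stand-in for a hypothesis-test-type classification.)"""
    m = len(estimates)
    labels = -np.ones(m, dtype=int)
    next_label = 0
    for i in range(m):
        if labels[i] >= 0:
            continue
        labels[i] = next_label
        for j in range(i + 1, m):
            if labels[j] < 0 and np.linalg.norm(estimates[i] - estimates[j]) <= tau:
                labels[j] = next_label
        next_label += 1
    return labels

def collaborative_estimates(estimates, sample_sizes, labels):
    """Within each inferred cluster, combine the local estimates by a
    sample-size-weighted average (a simple one-shot aggregation rule)."""
    combined = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        weights = sample_sizes[idx] / sample_sizes[idx].sum()
        combined[c] = np.average([estimates[i] for i in idx], axis=0, weights=weights)
    return combined

# Toy simulation: six sites, two underlying parameter clusters.
rng = np.random.default_rng(0)
beta_true = {0: np.array([1.0, -2.0]), 1: np.array([-1.5, 0.5])}
data, sizes = [], []
for site in range(6):
    b = beta_true[site % 2]
    X = rng.normal(size=(500, 2))
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ b)))
    data.append((X, y))
    sizes.append(500)

betas = [local_mle(X, y) for X, y in data]
labels = cluster_sites(betas, tau=0.5)
print(labels)
print(collaborative_estimates(betas, np.array(sizes), labels))
```

In this toy setup, only the local estimates (not the raw data) leave each site, which is the sense in which collaboration here is privacy-aware; the actual method's clustering rule and risk guarantees are given in the paper.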

Keywords

Federated Learning

Privacy

Heterogeneity

Generalized Linear Models

Maximum Likelihood Estimation

Non-asymptotic risk bound 

Main Sponsor

Section on Statistical Learning and Data Science