Synthetic Data with Heterogeneous Differential Privacy

Fang Liu Co-Author
University of Notre Dame
 
Gina Mannino First Author
University of Notre Dame
 
Gina Mannino Presenting Author
University of Notre Dame
 
Tuesday, Aug 5: 9:35 AM - 9:50 AM
0988 
Contributed Papers 
Music City Center 
Differential privacy (DP) offers rigorous privacy guarantees, but it is often applied with a single uniform privacy level across an entire dataset, neglecting user preferences and differences in attribute sensitivity. We propose a framework that incorporates both of these granularities to improve the privacy-utility trade-off in DP synthetic data. We introduce multi-dimensional heterogeneous DP (HDP), which combines user-dependent and attribute-dependent HDP guarantees, along with a privacy budget allocation policy. We propose and compare a synthetic data generation framework for combining user groups with diverse privacy needs and attributes with different levels of sensitivity. Additionally, we develop a SoftMax weighting technique that, at small sample sizes, downweights the contribution of highly perturbed privacy groups and borrows information from less perturbed groups to improve the utility of the final synthetic data. We run extensive simulation studies and apply the proposed framework to a real-world dataset. The results demonstrate improved utility with heterogeneous DP over uniform DP for synthetic data generation.
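
As a rough illustration of the SoftMax weighting idea described in the abstract, the sketch below combines per-group sanitized estimates with softmax weights. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the function names, the use of each group's privacy budget epsilon as the softmax score, the temperature parameter, and the example numbers are all hypothetical, and the sketch ignores sample size.

```python
import math


def softmax_weights(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_group_estimates(estimates, epsilons, temperature=1.0):
    """Weighted average of per-group sanitized estimates.

    Assumption (not from the abstract): each group's score is its privacy
    budget epsilon, so groups with a smaller epsilon (more perturbation)
    receive less weight in the combined estimate.
    """
    weights = softmax_weights(epsilons, temperature)
    combined = sum(w * est for w, est in zip(weights, estimates))
    return combined, weights


# Hypothetical example: three privacy groups estimating the same quantity.
estimates = [0.9, 0.55, 0.5]  # made-up sanitized per-group estimates
epsilons = [0.1, 1.0, 5.0]    # made-up per-group privacy budgets
combined, weights = combine_group_estimates(estimates, epsilons)
```

In this sketch the group with the smallest epsilon contributes least to the combined estimate; a larger temperature would flatten the weights toward a simple average.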

Keywords

Differential privacy

synthetic data

Bayesian

personalized DP

attribute DP

heterogeneous DP

privacy-utility trade-off

Main Sponsor

Privacy and Confidentiality Interest Group