A Deep Learning Framework for Statistical Disclosure Control

Patrick Tendick First Author
Federal Reserve
 
Patrick Tendick Presenting Author
Federal Reserve
 
Tuesday, Aug 5: 9:20 AM - 9:35 AM
1507 
Contributed Papers 
Music City Center 

Description

Statistical disclosure control (SDC) seeks to prevent data intended for legitimate analyses from being used to obtain sensitive information about individuals. We introduce a new approach, neural network SDC (NN-SDC), that uses deep learning to preserve privacy. We focus on microdata (records corresponding to individuals), but the techniques presented may also apply to aggregate data and information retrieval. Existing SDC methods, which are primarily intended for numeric or categorical data, include adding noise, data swapping, and microaggregation. However, machine learning and AI often require attributes such as text and images, to which existing methods may not apply. In addition, releasing data typically involves multiple goals, including providing useful data and protecting privacy.
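
For readers unfamiliar with the existing methods named above, the following is a minimal sketch of noise addition, data swapping, and univariate microaggregation applied to a single numeric attribute. It is illustrative only: the function names, parameters, and the toy `incomes` data are assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import mean


def add_noise(values, scale, rng):
    """Noise addition: perturb each value with zero-mean Gaussian noise."""
    return [v + rng.gauss(0.0, scale) for v in values]


def swap_values(values, fraction, rng):
    """Data swapping: exchange values between randomly chosen pairs of records.

    `fraction` (between 0 and 1) is the approximate share of records involved.
    """
    out = list(values)
    n_pairs = int(len(out) * fraction / 2)
    idx = rng.sample(range(len(out)), 2 * n_pairs)
    for i, j in zip(idx[::2], idx[1::2]):
        out[i], out[j] = out[j], out[i]
    return out


def microaggregate(values, k):
    """Univariate microaggregation: sort, form groups of at least k records,
    and replace each value with its group mean."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # fold a short tail into the final group
            end = n
        group = order[start:end]
        group_mean = mean(values[i] for i in group)
        for i in group:
            out[i] = group_mean
        start = end
    return out


# Hypothetical toy attribute; not real data.
rng = random.Random(0)
incomes = [21, 35, 50, 52, 90, 300]

noisy = add_noise(incomes, scale=5.0, rng=rng)
swapped = swap_values(incomes, fraction=1.0, rng=rng)
grouped = microaggregate(incomes, k=3)
```

Each function trades off utility and privacy through a single knob (`scale`, `fraction`, or `k`), and each assumes a numeric attribute, which illustrates why such methods do not carry over directly to text or images.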

NN-SDC first trains a model, then uses that model to produce anonymized data. The training process can account for multiple goals, including privacy protection and utility of the data. NN-SDC can incorporate existing methods while having the potential to preserve confidentiality in new ways. We argue that NN-SDC generalizes existing approaches and is at least as effective.
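
One way to picture a training process that accounts for several goals is a weighted objective. The abstract does not state the actual formulation, so every symbol below is an assumption made for illustration: $g_\theta$ is a hypothetical anonymizing network with parameters $\theta$, $X$ is the original microdata, $\mathcal{L}_{\text{utility}}$ penalizes loss of analytic usefulness, $\mathcal{L}_{\text{privacy}}$ penalizes disclosure risk, and $\lambda \ge 0$ sets the trade-off between them.

```latex
\theta^{*} = \arg\min_{\theta} \;
  \mathcal{L}_{\text{utility}}\bigl(X,\, g_{\theta}(X)\bigr)
  + \lambda \, \mathcal{L}_{\text{privacy}}\bigl(X,\, g_{\theta}(X)\bigr),
\qquad
\tilde{X} = g_{\theta^{*}}(X)
```

Under this reading, training corresponds to finding $\theta^{*}$, and the second step applies the trained model to obtain the anonymized data $\tilde{X}$.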

Keywords

Statistical disclosure control

Deep learning

Microdata

Machine learning

AI

Differential privacy 

Main Sponsor

Privacy and Confidentiality Interest Group