Evaluating the Disclosure Risk and Analytic Utility of Synthetic Data in a Municipal Health Survey
Abstract Number:
3545
Submission Type:
Contributed Abstract
Contributed Abstract Type:
Paper
Participants:
Stephen Immerwahr (1), Wen Qin Deng (1), Jingchen Hu (2), Tashema Bholanath (1), Fangtao He (1), Nneka Lundy De La Cruz (1)
Institutions:
(1) NYC Department of Health and Mental Hygiene, Long Island City, NY, (2) Vassar College, N/A
Co-Author(s):
Fangtao He
NYC Department of Health and Mental Hygiene
Abstract Text:
Releasing public-use micro-level data files from health surveys holds immense value for science and health policy. However, even after personally identifying information is removed, the privacy of survey respondents may still be compromised. Using a large, population-representative NYC health survey (n=10,271), we identified high-risk observations based on population estimates for combinations of key variables. We compared three approaches to mitigating the risk of re-identification (suppression, synthesis using Classification and Regression Trees, and synthesis via Bayesian models) and assessed their impact on both disclosure risk and loss of utility in the resulting protected data. While both synthesis methods resulted in slightly higher disclosure risk than suppression, the synthetic datasets preserved a higher level of utility. We will discuss our proposed solutions to avoid over-protecting the data and potentially obscuring estimates for underserved and vulnerable groups, and share our experiences with data curators in advancing disclosure risk controls and data sharing in public health.
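As a rough illustration of the risk-identification step, the sketch below sums survey weights within each combination of key variables to estimate a population count, then flags records whose combination falls below a cutoff. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function name `flag_high_risk`, the variable names, the toy records, and the threshold of 100 are all hypothetical.

```python
from collections import defaultdict


def flag_high_risk(records, key_vars, weight_var="weight", threshold=100.0):
    """Return one boolean per record: True when the estimated population
    count for that record's key-variable combination is below `threshold`.

    The threshold value is an illustrative assumption, not a published rule.
    """
    # Estimate the population size of each key-variable cell
    # by summing the survey weights of the records in that cell.
    cell_totals = defaultdict(float)
    for rec in records:
        cell = tuple(rec[v] for v in key_vars)
        cell_totals[cell] += rec[weight_var]

    # A record is flagged when its cell's estimated count is small.
    return [
        cell_totals[tuple(rec[v] for v in key_vars)] < threshold
        for rec in records
    ]


# Toy, made-up records (not survey data).
sample = [
    {"age_group": "18-24", "borough": "Bronx", "weight": 400.0},
    {"age_group": "18-24", "borough": "Bronx", "weight": 350.0},
    {"age_group": "65+", "borough": "Staten Island", "weight": 60.0},
]
flags = flag_high_risk(sample, ["age_group", "borough"])
print(flags)
```

In this toy input only the third record is flagged, because its cell has an estimated population of 60, below the assumed cutoff. Flagged records would then be candidates for suppression or synthesis.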
Keywords:
Health Surveys|Data Privacy Risk|Synthetic Data|Survey Research Methods|Government Statistics
Sponsors:
Survey Research Methods Section
Tracks:
Privacy and Confidentiality Methods