Utilizing Variational Autoencoders to Shift Individual Level Data Towards Summary Level Statistics
Abstract Number:
1669
Submission Type:
Contributed Abstract
Contributed Abstract Type:
Poster
Participants:
Sarah Milligan (1), Janice Weinberg (2), Fatema Shafie Khorassani (1)
Institutions:
(1) N/A, N/A, (2) Boston Univ School of Public Health, N/A
Abstract Text:
Obtaining individual-level data (ILD) from published studies is a persistent hurdle for researchers, yet many analyses depend on it, including surrogate outcome validation and subgroup analyses. Generative modeling can produce synthetic data that reflect the underlying properties of existing ILD. In particular, extending Variational Autoencoders (VAEs) to tabular data opens new possibilities for accelerating research. We present an application of VAEs, implemented in R, that gives researchers a simple way to leverage an available set of ILD and that handles variables of mixed types (binary, categorical, normal, etc.). Although ILD are often difficult to obtain, summary-level information is more readily available. We therefore propose an extension of VAEs that shifts the underlying distribution of the data towards summary-level statistics. This extension produces multiple sets of ILD under different prior information. The resulting shifted ILD can be considered a trustworthy representation of the data underlying a published paper. By extending the VAE framework to tabular data and allowing for a distribution shift, exploratory research becomes plausible without direct access to ILD.
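The abstract does not state how the shift towards summary statistics is implemented, so the following is only a minimal sketch of one way such an objective could look. It combines a standard VAE loss for mixed column types (Bernoulli likelihood for binary columns, unit-variance Gaussian for normal columns, and the closed-form KL term against a standard normal prior) with a squared penalty that pulls the synthetic column means towards published means. The penalty, the `weight` parameter, and all function names are assumptions made for illustration. They are not the authors' R implementation, and categorical columns are omitted for brevity.

```python
import math

def gaussian_kl(mu, log_var):
    """KL(N(mu, exp(log_var)) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def mixed_recon_loss(row, decoded, col_types):
    """Negative log-likelihood of one row, with one likelihood per column type."""
    loss = 0.0
    for x, out, kind in zip(row, decoded, col_types):
        if kind == "binary":
            # Bernoulli likelihood; clip the probability away from 0 and 1.
            p = min(max(out, 1e-7), 1.0 - 1e-7)
            loss -= x * math.log(p) + (1 - x) * math.log(1.0 - p)
        elif kind == "normal":
            # Unit-variance Gaussian, dropping the additive constant.
            loss += 0.5 * (x - out) ** 2
        else:
            raise ValueError(f"unsupported column type: {kind}")
    return loss

def summary_penalty(synthetic_rows, target_means):
    """Hypothetical shift term: squared gap between synthetic and published means."""
    n = len(synthetic_rows)
    penalty = 0.0
    for j, target in enumerate(target_means):
        col_mean = sum(r[j] for r in synthetic_rows) / n
        penalty += (col_mean - target) ** 2
    return penalty

def shifted_objective(rows, decoded_rows, mus, log_vars,
                      col_types, target_means, weight=1.0):
    """Average per-row VAE loss plus the weighted summary-statistic penalty.

    decoded_rows stands in for the decoder output (assumed here, not trained).
    """
    vae_loss = sum(
        mixed_recon_loss(r, d, col_types) + gaussian_kl(m, lv)
        for r, d, m, lv in zip(rows, decoded_rows, mus, log_vars)
    ) / len(rows)
    return vae_loss + weight * summary_penalty(decoded_rows, target_means)
```

In this sketch, varying `target_means` or `weight` would yield different synthetic data sets, which is one possible reading of "different prior information" in the abstract. A real implementation would compute `decoded_rows`, `mus`, and `log_vars` from trained encoder and decoder networks and minimize the objective by gradient descent.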
Keywords:
Variational Autoencoders | Synthetic Data | Distribution Shift | Machine Learning | Generative Modeling | Summary Level Data
Sponsors:
Section on Statistical Learning and Data Science
Tracks:
Machine Learning
Can this be considered for alternate subtype?
No
Are you interested in volunteering to serve as a session chair?
No
I have read and understand that JSM participants must abide by the Participant Guidelines.
Yes
I understand that JSM participants must register and pay the appropriate registration fee by June 3, 2025. The registration fee is non-refundable.
I understand