Utilizing Variational Autoencoders to Shift Individual Level Data Towards Summary Level Statistics

Abstract Number:

1669 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Poster 

Participants:

Sarah Milligan (1), Janice Weinberg (2), Fatema Shafie Khorassani (1)

Institutions:

(1) N/A, N/A, (2) Boston Univ School of Public Health, N/A

Co-Author(s):

Janice Weinberg  
Boston Univ School of Public Health
Fatema Shafie Khorassani  
N/A

First Author:

Sarah Milligan  
N/A

Presenting Author:

Sarah Milligan  
N/A

Abstract Text:

Obtaining individual level data (ILD) from published literature is a hurdle for researchers, yet many analyses depend on it, including surrogate outcome validation and subgroup analyses. Generative modeling can produce synthetic data that reflects the underlying properties of existing ILD. In particular, extending Variational Autoencoders (VAEs) to tabular data opens new possibilities for accelerating research. This application of VAEs, implemented in R, gives researchers a simple method for leveraging an existing set of ILD, and it handles a mixture of variable distributions (binary, categorical, normal, etc.). While ILD may be difficult to access, summary level information is more readily available. We propose an extension of VAEs that shifts the underlying distribution of the data towards summary level statistics. This extension produces multiple sets of ILD under different prior information. The resulting shifted ILD can be considered a trustworthy representation of a published paper's data. By extending the VAE framework to tabular data and allowing for a distribution shift, exploratory research without direct ILD access becomes plausible.
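
For readers unfamiliar with the setup, the following is a minimal sketch of how such an objective could be written. The first two lines are the standard VAE evidence lower bound with a per-column likelihood for mixed-type tabular data. The third line is only an assumption about one possible way to encode a shift towards summary level statistics: the abstract does not state the mechanism, and the penalty weight \lambda, the statistic function T, the published summary s^{*}, and the model-implied summary \hat{s}_{\theta} are hypothetical symbols introduced here for illustration.

```latex
% Standard VAE objective (ELBO) for one record x with latent variable z
\mathcal{L}(\theta,\phi;x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - \mathrm{KL}\bigl(q_\phi(z \mid x)\,\|\,p(z)\bigr)

% Mixed-type tabular data: the decoder factorizes over the J columns, with each
% p_\theta^{(j)} chosen to match the column type (Bernoulli, categorical, normal, ...)
\log p_\theta(x \mid z) = \sum_{j=1}^{J} \log p_\theta^{(j)}(x_j \mid z)

% ASSUMPTION (not from the abstract): a penalized objective that pulls the
% summary statistics of generated data towards published values s^{*}
\max_{\theta,\phi}\;
  \sum_{i=1}^{n} \mathcal{L}(\theta,\phi;x_i)
  - \lambda\,\bigl\lVert \hat{s}_\theta - s^{*} \bigr\rVert_2^2,
\qquad
\hat{s}_\theta = \mathbb{E}_{z \sim p(z)}\,\mathbb{E}_{x \sim p_\theta(x \mid z)}\bigl[T(x)\bigr]
```

Under this illustrative form, \lambda = 0 recovers an ordinary VAE fitted to the available ILD, and larger values of \lambda trade fidelity to that ILD for agreement with the published summaries. The abstract's reference to "different prior information" suggests the authors' actual approach may instead act through the prior, so this should be read as one possible formulation rather than the method presented.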

Keywords:

Variational Autoencoders | Synthetic Data | Distribution Shift | Machine Learning | Generative Modeling | Summary Level Data

Sponsors:

Section on Statistical Learning and Data Science

Tracks:

Machine Learning

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 3, 2025. The registration fee is non-refundable.

I understand