Enhancing Iterative Proportional Fitting for Efficient and Scalable Synthetic Population Generation

Amy Wagler Co-Author
University of Texas At El Paso
 
William Agyapong First Author
University of Texas At El Paso
 
William Agyapong Presenting Author
University of Texas At El Paso
 
Wednesday, Aug 6: 2:35 PM - 2:50 PM
2239 
Contributed Papers 
Music City Center 
The Iterative Proportional Fitting (IPF) algorithm is widely used for survey weighting and synthetic population generation. While efficient in low-dimensional settings, IPF struggles with zero cells in sparse contingency tables and becomes computationally infeasible as dimensionality increases. To address these challenges, we propose a block-wise IPF framework that partitions variables into smaller groups of correlated features and applies IPF independently within each group. Simulation studies and real-world synthetic population experiments demonstrate that this approach significantly improves computational efficiency and scalability in high-dimensional settings, while maintaining a reasonable fit to the marginal distributions and preserving inter-variable dependencies at a level comparable to standard IPF. Furthermore, we introduce a hybrid framework that integrates IPF-synthesized data with generative models such as Bayesian networks and tabular variational autoencoders (TVAEs). This approach ensures accurate marginal fitting while enhancing realism and diversity in synthetic populations. Our contributions improve upon standard IPF and generative models, advancing synthetic population modeling.
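
As an illustration of the idea described in the abstract, the following is a minimal sketch of classical two-way IPF and of applying it independently to separate variable blocks. It is not the authors' implementation: the function names ipf_2d and blockwise_ipf, the restriction to two-way tables, the tolerance, and the rule of leaving all-zero rows or columns unscaled are assumptions made for this example. The abstract does not say how variables are grouped or how block-level results are combined, so the sketch simply returns one fitted table per block.

```python
import numpy as np


def ipf_2d(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Classical two-way IPF: rescale a seed table until its row and
    column sums match the target marginals (hypothetical helper)."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)

    for _ in range(max_iter):
        # Row step: scale each row so it sums to its target.
        row_sums = table.sum(axis=1)
        # A row of zeros cannot be rescaled (the zero-cell issue);
        # assumption for this sketch: leave such rows untouched.
        row_factors = np.divide(row_targets, row_sums,
                                out=np.ones_like(row_sums),
                                where=row_sums > 0)
        table *= row_factors[:, None]

        # Column step: scale each column so it sums to its target.
        col_sums = table.sum(axis=0)
        col_factors = np.divide(col_targets, col_sums,
                                out=np.ones_like(col_sums),
                                where=col_sums > 0)
        table *= col_factors[None, :]

        # Columns now match exactly; stop once rows match as well.
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            break
    return table


def blockwise_ipf(blocks, **kwargs):
    """Fit each block independently (assumed reading of 'block-wise').

    `blocks` is a list of (seed, row_targets, col_targets) tuples,
    one per variable group. Returns one fitted table per block.
    """
    return [ipf_2d(seed, rows, cols, **kwargs) for seed, rows, cols in blocks]


if __name__ == "__main__":
    # Made-up seed counts and marginals, for illustration only.
    fitted = ipf_2d([[1, 2], [3, 4]], [40, 60], [50, 50])
    print(fitted.sum(axis=1), fitted.sum(axis=0))
```

Fitting several small tables in place of one table over all variables keeps the number of cells from growing multiplicatively with each added variable, which is the scalability argument the abstract makes. Dependencies between variables in different blocks are not modeled in this sketch.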

Keywords

Iterative Proportional Fitting (IPF), block-wise IPF, synthetic population generation, high-dimensional data, contingency tables, marginal constraints, scalability

Zero-cell issues, computational efficiency, survey weighting, generative models, Bayesian networks, tabular variational autoencoders (TVAEs)

Main Sponsor

Survey Research Methods Section