Abstract Number:
1669
Submission Type:
Topic-Contributed Paper Session
Participants:
Luca Sartore (1), David Matteson (3), Valbona Bejleri (4), Piaomu Liu (2), Luca Sartore (1), Ivy Zhang (5), Johannes Bleher (6), Aliaksandr Hubin (7)
Institutions:
(1) National Institute of Statistical Sciences, (2) Bentley University, (3) Cornell University & National Institute of Statistical Sciences, (4) United States Department of Agriculture – National Agricultural Statistics Service, (5) N/A, (6) University of Hohenheim, (7) University of Oslo
Chair:
Co-Organizer:
David Matteson
Cornell University & National Institute of Statistical Sciences
Discussant:
Valbona Bejleri
United States Department of Agriculture – National Agricultural Statistics Service
Session Organizer:
Speaker(s):
Session Description:
Traditional regression approaches are not suitable for analyzing high-dimensional data sets. Recent advances in big-data analytics have enabled the sparse selection of informative variables, improving both the interpretability and the predictive accuracy of models for high-dimensional data. However, several challenges in high-dimensional spaces remain unaddressed in the statistical literature. From a frequentist perspective, for example, model selection and its properties have not been fully studied in capture-recapture contexts or for data from heterogeneous domains. From a Bayesian perspective, approaches to modeling high-dimensional data sets focus on stochastic variable selection, adaptive shrinkage, or model averaging. Yet current state-of-the-art Bayesian methods are not fully equipped to simultaneously handle hierarchical population structures, heteroscedastic designs, various missing data mechanisms, and different levels of missingness. Addressing these challenges requires new methods that are more computationally efficient than existing techniques. Such innovations are crucial for advances in fields such as econometrics, healthcare, and the social sciences. Overall, this session presents diverse perspectives to advance high-dimensional analytics, providing reliable and effective alternatives for statistical practitioners.
Luca Sartore from the National Institute of Statistical Sciences will begin the session with an advanced variable selection method designed for the US Census of Agriculture. He will highlight iterative approaches for the initialization and successive optimization of model parameters in high-dimensional settings. Ivy Yuexin Zhang from Stanford University will present a delta-invariant method for feature selection, addressing the challenges of retrieving a stable signal in high-dimensional heterogeneous domains. Johannes Bleher from the University of Hohenheim will discuss a probabilistic procedure for variable selection when missing covariate data are handled through multiple imputation. He will evaluate his procedure through a Monte Carlo study under several missing data mechanisms and demonstrate its application using survey data. Aliaksandr Hubin from the University of Oslo will introduce the concept of active paths for accurately identifying true covariates in high-dimensional non-linear systems. He will offer a novel perspective on a sparse representation of latent binary Bayesian neural networks to identify over-parameterized models. Finally, Valbona Bejleri from the United States Department of Agriculture's National Agricultural Statistics Service will conclude the session as a discussant. She will summarize the innovations in high-dimensional methods, highlighting future research directions and opportunities for collaboration among statisticians from various backgrounds.
Sponsors:
Biometrics Section 2
Government Statistics Section 3
Section on Statistical Computing 1
Theme:
Communities in Action: Advancing Society
Yes
Applied
Yes
Estimated Audience Size
Large (150-275)
I have read and understand that JSM participants must abide by the Participant Guidelines.
Yes
I understand and have communicated to my proposed speakers that JSM participants must register and pay the appropriate registration fee by June 1, 2026. The registration fee is nonrefundable.
I understand