Mind your zeros: accurate p-value approximation in permutation testing

Abstract Number:

3424 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Speed 

Participants:

Stefanie Peschel (1), Martin Depner (2), Erika von Mutius (2), Anne-Laure Boulesteix (3), Christian L. Müller (4)

Institutions:

(1) Department of Statistics, LMU München, Munich, Germany, (2) Institute of Asthma and Allergy Prevention, Helmholtz Munich, Neuherberg, Germany, (3) Institute for Medical Information Processing, Biometry and Epidemiology, LMU München, Munich, Germany, (4) Institute of Computational Biology, Helmholtz Munich, Neuherberg, Germany

Co-Author(s):

Martin Depner  
Institute of Asthma and Allergy Prevention, Helmholtz Munich
Erika von Mutius  
Institute of Asthma and Allergy Prevention, Helmholtz Munich
Anne-Laure Boulesteix  
Institute for Medical Information Processing, Biometry and Epidemiology, LMU München
Christian L. Müller  
Institute of Computational Biology, Helmholtz Munich

First Author:

Stefanie Peschel  
Department of Statistics, LMU München

Presenting Author:

Stefanie Peschel  
Ludwig-Maximilians-Universität München

Abstract Text:

Permutation procedures are essential for hypothesis testing when the distributional assumptions for the test statistic are violated or its null distribution is unknown, but they are challenging when only a limited number of permutations is available, as in complex biomedical studies. The resulting empirical p-values may be exactly zero, which makes multiple testing adjustment problematic, or too large to remain significant after adjustment. A common heuristic is to approximate extreme p-values by fitting a Generalized Pareto Distribution (GPD) to the tail of the permutation distribution of the test statistic. In practice, however, a negative estimated shape parameter gives the fitted GPD a finite upper endpoint, so an observed test statistic beyond that endpoint again receives a p-value of zero. To address this issue, we present a comprehensive workflow for accurate permutation p-value approximation that fits a constrained GPD and strictly avoids zero p-values. We also propose new methods for determining an optimal GPD threshold and for correcting for multiple testing. In extensive simulations, our approach achieves considerably higher accuracy than existing methods. The computational framework will be available as the open-source R package "permAprox".
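
The zero p-value problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow and does not reproduce the constrained GPD fit. All function names are hypothetical. The method-of-moments fit and the midpoint threshold are simple stand-ins chosen for brevity. The "permutation statistics" are synthetic draws from a bounded distribution, used only to force a negative shape estimate.

```python
import math
import random
import statistics


def empirical_pvalue(t_obs, perm_stats):
    """Share of permutation statistics >= the observed one (can be exactly 0)."""
    return sum(t >= t_obs for t in perm_stats) / len(perm_stats)


def gpd_survival(y, shape, scale):
    """P(Y > y) for a GPD-distributed exceedance y >= 0."""
    if abs(shape) < 1e-12:
        return math.exp(-y / scale)  # exponential limit for shape -> 0
    base = 1.0 + shape * y / scale
    if base <= 0.0:
        # Only reachable for shape < 0: y lies beyond the finite upper
        # endpoint scale / |shape|, so the tail probability is exactly zero.
        return 0.0
    return base ** (-1.0 / shape)


def fit_gpd_moments(exceedances):
    """Method-of-moments GPD fit (an assumption; other estimators exist)."""
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)  # (shape, scale)


def gpd_tail_pvalue(t_obs, perm_stats, n_exceed):
    """Unconstrained GPD tail approximation of a permutation p-value."""
    ordered = sorted(perm_stats)
    # Threshold: midpoint between the largest non-exceedance and the
    # smallest exceedance (a simple choice, assumed for this sketch).
    threshold = 0.5 * (ordered[-n_exceed - 1] + ordered[-n_exceed])
    exceedances = [t - threshold for t in ordered[-n_exceed:]]
    shape, scale = fit_gpd_moments(exceedances)
    tail_fraction = n_exceed / len(perm_stats)
    p = tail_fraction * gpd_survival(t_obs - threshold, shape, scale)
    return p, shape, scale


# Synthetic stand-in for permutation statistics: bounded on [0, 1),
# so the fitted tail has a negative shape parameter.
rng = random.Random(0)
perm_stats = [rng.random() for _ in range(1000)]
t_obs = 2.0  # observed statistic far beyond every permutation statistic

p_emp = empirical_pvalue(t_obs, perm_stats)
p_gpd, shape, scale = gpd_tail_pvalue(t_obs, perm_stats, n_exceed=100)
print("empirical p-value:", p_emp)
print("estimated GPD shape:", shape)
print("GPD-approximated p-value:", p_gpd)
```

In this example the observed statistic exceeds both every permutation statistic and the finite upper endpoint of the fitted GPD, so the empirical and the GPD-approximated p-values are both zero. This is the situation the constrained fit is meant to rule out.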

Keywords:

Non-parametric hypothesis testing|Permutation test|Generalized Pareto Distribution (GPD)|Multiple testing correction|Differential abundance and differential association testing in microbiome studies|R package

Sponsors:

Section on Nonparametric Statistics

Tracks:

Nonparametric testing

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is non-refundable.

I understand