Statistical methods for handling missing data

Linda Harrison Chair
Harvard TH Chan School of Public Health
 
Sunday, Aug 4: 2:00 PM - 3:50 PM
5005 
Contributed Papers 
Oregon Convention Center 
Room: CC-E144 

Main Sponsor

Biopharmaceutical Section

Presentations

A Closer Look at Control-Based Imputation for Active Arm Dropouts in Randomized Clinical Trials

We focus on two-arm (active, control) randomized clinical trials where the primary estimand is defined using the treatment policy strategy to handle intercurrent events. In such settings, valid statistical estimation and inference are straightforward when follow-up is complete for all subjects regardless of their adherence to assigned treatment. However, some subjects often drop out of the study before their endpoint can be assessed, resulting in missing data. This problem is often tackled using mixed-effects model for repeated measures (MMRM) analyses. An alternative is to use control-based imputation (CBI) methods, which impute missing data in the active arm using data from the control arm. The imputation can be done either separately for each active arm dropout (e.g., jump-to-reference; J2R) or at a mean level for the pool of active arm dropouts (control-based mean imputation; CBMI). We use simulations to compare the performance of MMRM, J2R, CBMI and other approaches (including using a common worse-rank score for all dropouts) when data are either missing not at random (MNAR) in both arms, or MNAR in the active arm but missing completely at random (MCAR) in the control arm.
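
As a rough illustration of the mean-level idea behind CBMI described in the abstract above, the following Python sketch computes a point estimate only. It is not the authors' implementation: it assumes a single continuous endpoint, no covariates, and fully observed control-arm data, and the function names `cbmi_active_mean` and `cbmi_treatment_effect` are hypothetical.

```python
from statistics import mean


def cbmi_active_mean(active_observed, n_active_dropouts, control_observed):
    """Active-arm mean when every active-arm dropout is assigned the control mean.

    Sketch assumptions (not from the abstract): one continuous endpoint,
    no covariate adjustment, and no missing data in the control arm.
    """
    if not active_observed or not control_observed:
        raise ValueError("both arms need at least one observed outcome")
    n_observed = len(active_observed)
    n_total = n_observed + n_active_dropouts
    control_mean = mean(control_observed)
    # Weighted average of the active completer mean and the control mean.
    return (n_observed * mean(active_observed)
            + n_active_dropouts * control_mean) / n_total


def cbmi_treatment_effect(active_observed, n_active_dropouts, control_observed):
    """Difference in means (active minus control) after mean-level imputation."""
    return (cbmi_active_mean(active_observed, n_active_dropouts, control_observed)
            - mean(control_observed))


# Three active completers, one active dropout, three control subjects.
effect = cbmi_treatment_effect([4.0, 6.0, 5.0], 1, [2.0, 3.0, 1.0])
```

Under these assumptions the estimate equals the completer difference in means scaled by the proportion of active-arm subjects who completed, which is why the dropouts pull the estimated effect toward zero. Variance estimation, which is the harder part in practice, is not covered by this sketch.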

Keywords

dropout

missing not at random

control-based imputation 

View Abstract 3697

Co-Author(s)

Fang Liu, Merck
Devan Mehrotra, Merck & Co., Inc.

First Author

Naimin Jing, Merck & Co.

Presenting Author

Naimin Jing, Merck & Co.

A simulation study of multiple imputation methods applied to missing covariates in meta-regression

Meta-analysis is a statistical method for quantitatively synthesizing primary study evidence. Meta-regression is used to assess whether study-level predictors can explain part of the heterogeneity in the results of primary studies. Missing data can arise when primary studies do not report candidate predictors for meta-regression, but there is insufficient research on how to handle this problem. The fact that most meta-regressions use a single covariate may reflect that increasing the number of covariates makes missing data more difficult to deal with. Multiple imputation is a well-established and practical method for missing data, but its use in the context of missing covariates, and the use of weights in the imputation model, should be carefully considered. In this presentation, we will discuss how to deal with missing covariates in meta-regression using multiple imputation methods. In particular, the algorithms and models of the imputation methods, including weighted regression, will be compared under various missingness mechanisms via simulation studies.

Keywords

meta-analysis

meta-regression

missing data

multiple imputation 

View Abstract 2463

Co-Author

Takayuki Abe, Kyoto Women’s University, School of Data Science

First Author

Shintaro Hirano

Presenting Author

Shintaro Hirano

Addressing Missing Responses and Categorical Covariates in Binary Regression Modeling: An Integrate

Binary regression, a key technique in applied statistics, often encounters missing values in practice. Complete case analysis (CC), which excludes subjects with missing values, is commonly used, particularly with large samples. However, it is well known that CC can lead to biased estimates with small or medium-sized datasets. Existing methods for handling missing data typically focus on either missing covariates or missing responses, but not both simultaneously. In biomedical research and other real-world applications, missing values commonly occur in both the response variable and the covariates. In this presentation, we propose a method that handles missing data in both the response and the covariates. Our method assumes that missing covariate data are missing at random (MAR) and that missing responses are nonignorable. Additionally, we propose a bias correction method based on Firth (1993) for fitting models with small samples. The proposed methods offer a comprehensive approach to addressing missing data in binary regression, and we demonstrate their effectiveness in both simulated scenarios and practical applications.

Keywords

Missing data

Likelihood

Binary regression 

View Abstract 2952

Co-Author(s)

Douglas Nychka, Colorado School of Mines
Soutir Bandyopadhyay, Colorado School of Mines

First Author

Vivek Pradhan

Presenting Author

Vivek Pradhan

Comparative evaluation of imputation methods for incomplete longitudinal data in clinical trials

Incomplete longitudinal data are commonly encountered in clinical trials. To accurately handle intercurrent events and precisely characterize treatment effects, it is crucial to employ imputation methods that align with the estimand of interest. The magnitude of bias and variance in analysis outcomes depends not only on the chosen imputation methods but also on various factors, including the missing mechanism. Accordingly, conducting a comprehensive evaluation of the impact of imputation methods within a particular estimand framework is essential. This evaluation should involve numerical experiments across diverse simulation scenarios. For practical applicability, it is equally important to compare frequentist and Bayesian versions of imputation methods. In consideration of the treatment policy strategy, our study presents a simulation-based comparison of popular multiple imputation methods, including retrieved-dropout and control-based-mean imputations. This comparison encompasses different analysis models and scenarios of treatment discontinuation. 

Keywords

Estimand

Intercurrent events

Treatment policy strategy

ICH E9 (R1)

Multiple imputation

Missing not at random 

View Abstract 2573

Co-Author

Morshed Alam

First Author

Myeongjong Kang, Merck & Co.

Presenting Author

Myeongjong Kang, Merck & Co.

Distributional Imputation for Control-Based Sensitivity Analyses of Recurrent Events Data

Longitudinal clinical trials for which recurrent events endpoints are of interest are commonly subject to missing event data. Primary analyses in such trials are often performed assuming events are missing at random. Control-based imputation methods are advantageous for performing necessary sensitivity analyses in superiority trials to assess robustness of primary analysis conclusions to missing data assumptions. Multiple imputation (MI) is a popular approach for control-based imputation of recurrent events, but Rubin's variance estimator is often biased for the true sampling variability of the treatment effect estimator in the control-based setting. The nonparametric bootstrap is a common approach to overcome this issue, but can be computationally intensive. We propose distributional imputation (DI) with a corresponding wild bootstrap variance estimation procedure for control-based sensitivity analyses of recurrent events. In simulations, DI produced more reasonable standard error estimates than MI with the standard variance estimator and provided gains in computational efficiency over MI with a nonparametric bootstrap. 
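
For readers unfamiliar with the variance estimator the abstract above refers to, here is a minimal Python sketch of Rubin's standard combining rules for multiple imputation. It shows only the conventional estimator, not the proposed distributional imputation or wild bootstrap procedure; the function name `rubin_pool` is hypothetical.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: point estimate from each of the m imputed datasets
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


pooled, total = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In control-based settings the total variance from this rule can overstate the true sampling variability of the treatment effect estimator, which is the motivation the abstract gives for bootstrap-based alternatives.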

Keywords

recurrent events

missing data

control-based imputation

distributional imputation

multiple imputation

sensitivity analyses 

View Abstract 3721

Co-Author

Shu Yang, North Carolina State University, Department of Statistics

First Author

Sarah Riegel Fairfax, North Carolina State University

Presenting Author

Sarah Riegel Fairfax, North Carolina State University

Performing a meta-analysis when study-level standard deviations are missing

People with heart failure with preserved ejection fraction often experience a reduced or poor quality of life (QOL). Regular exercise therapy may improve their QOL. To date, 6 randomized clinical trials (RCTs) have evaluated this hypothesis; however, all 6 had small to moderate sample sizes. Thus, we conducted a meta-analysis of the 6 RCTs with relevant data. A meta-analysis requires that each trial provide an estimated treatment effect and an estimated standard deviation (SD) for that treatment effect. All 6 RCTs provided a treatment effect estimate, but 4 of them did not provide an SD. We discuss how to impute an SD when it is missing. We also discuss when it is reasonable to use a fixed effects meta-analysis as opposed to a random effects meta-analysis and why a random effects meta-analysis may sometimes lead to a non-applicable conclusion. Finally, we make recommendations for reporting study-level data to improve future meta-analyses.
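
To make the fixed effects versus random effects distinction in the abstract above concrete, the following Python sketch pools study-level estimates by inverse-variance weighting and, for the random effects version, adds a DerSimonian-Laird estimate of between-study variance. This is one common textbook approach chosen for illustration; the abstract does not say which estimators the authors used, the function names are hypothetical, and the sketch does not address how to impute a missing SD.

```python
import math


def _pool(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


def _dl_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = _pool(effects, variances)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(effects) - 1)) / c)


def fixed_effect(effects, std_errors):
    """Fixed effects pooled estimate and standard error."""
    if len(effects) < 2 or len(effects) != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    return _pool(effects, [s ** 2 for s in std_errors])


def random_effects(effects, std_errors):
    """Random effects pooled estimate and standard error (DerSimonian-Laird)."""
    if len(effects) < 2 or len(effects) != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    variances = [s ** 2 for s in std_errors]
    tau2 = _dl_tau2(effects, variances)
    # Each study's variance is inflated by the between-study variance.
    return _pool(effects, [v + tau2 for v in variances])
```

With two hypothetical studies reporting effects 1.0 and 3.0, each with standard error 1.0, both functions return a pooled estimate of 2.0, but the random effects standard error is larger because the between-study variance is added to every study's variance.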

Keywords

Meta-analysis

missing standard deviations

imputation

heart failure 

View Abstract 3193

Co-Author(s)

Kathryn Flynn, Medical College of Wisconsin
Steven Keteyian, Henry Ford Health
Dalane Kitzman, Wake Forest University School of Medicine
Vandana Sachdev, National Heart, Lung, and Blood Institute

First Author

Eric Leifer, National Heart, Lung, and Blood Institute

Presenting Author

Eric Leifer, National Heart, Lung, and Blood Institute

PrincipalR: Principal Stratification, Made Easy!

The ICH E9(R1) Addendum provides different strategies for addressing intercurrent events (ICEs: events post treatment initiation that may affect the interpretation or existence of clinical outcomes of interest) when defining an estimand, that is, when describing the targeted treatment effect. The Principal Stratification (PS) strategy, mentioned in the ICH E9(R1) Addendum, is an appropriate approach to define a causal estimand, classifying subjects according to their potential occurrence of ICEs across treatment groups. Unfortunately, implementations of Principal Stratification have not proliferated in the pharmaceutical scientific community. To resolve this, we introduce PrincipalR: an R Shiny app for Principal Stratification using varying models. Users can choose among frequentist approaches (a multiple imputation method, following Michael O'Kelly's implementation of Estimating Principal Strata in the DIA Missing Data Working Group; Principal Score Weighting (Ding & Lu, 2015); and Adherers Average Causal Effect (Qu et al., 2019)) and a Bayesian approach (Wang et al., 2022), assess covariate distributions across observed ICEs, and assess the sensitivity of causal assumptions (Wang et al., 2022).

Keywords

Causal Inference

Principal Stratification

Rshiny

Pharmaceutical

Intercurrent Events

ICH 

View Abstract 2911

First Author

Ahmad Hakeem Abdul Wahab, Janssen

Presenting Author

Ahmad Hakeem Abdul Wahab, Janssen