Assessment of Methods for Handling Missing Data Using NIDA Clinical Trials Studies on Substance Use

Amy Hahn Co-Author
The Emmes Company
 
Ashley Vena Co-Author
The Emmes Company
 
Abigail Matthews Co-Author
The Emmes Company
 
Kathryn Hefner Co-Author
The Emmes Company
 
Michael Otterstatter First Author
The Emmes Company
 
Michael Otterstatter Presenting Author
The Emmes Company
 
Tuesday, Aug 5: 10:35 AM - 10:50 AM
2359 
Contributed Papers 
Music City Center 
Missing data are inevitable in clinical trials and may bias analyses. Here we describe an analysis of missing data in 8 trials of substance use disorder (SUD) from the National Institute on Drug Abuse (NIDA). Rates and patterns of missingness in longitudinal urine drug screen (UDS) results were compared, and predictors of missingness were assessed. Replicate datasets were synthesized using classification and regression trees and then either analyzed directly with maximum likelihood (ML) or first processed with multiple imputation (MI). Missingness in UDS was 33% overall (15-52% per study); 28% of participants had no missing values, while some had up to 90% missing. Most (83%) participants had only intermittent missingness (p<0.001), but dropouts occurred in all studies. Missingness was more common in females (p=0.042) and younger participants (p<0.001). Based on the synthetic data, MI and ML gave similar results, although ML required fewer assumptions and was more efficient overall. We show that although missing outcome data arise through both random and non-random mechanisms, consistent predictors of missingness exist, and ML is an efficient approach for handling missing values.
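
As an illustration of the distinction between intermittent missingness and dropout drawn above, the following is a minimal sketch and not the authors' code. It assumes each participant's UDS results are stored in scheduled-visit order, with None marking a missed screen, and it assumes a simple working definition: a participant counts as a "dropout" if the final scheduled visit is missing, and as "intermittent" if values are missing but the final visit was observed. The function names and the example records are hypothetical and were made up for this sketch.

```python
from typing import Optional, Sequence

# One participant's UDS results in scheduled-visit order:
# 1 = positive, 0 = negative, None = missing (assumed encoding).
Visits = Sequence[Optional[int]]


def missing_rate(visits: Visits) -> float:
    """Fraction of scheduled visits with no UDS result."""
    if not visits:
        raise ValueError("participant has no scheduled visits")
    return sum(v is None for v in visits) / len(visits)


def classify_pattern(visits: Visits) -> str:
    """Label a participant as 'complete', 'intermittent', or 'dropout'.

    Assumed definitions (not taken from the abstract):
      complete     - no missing visits
      dropout      - the final scheduled visit is missing
      intermittent - some visits missing, but the final visit was observed
    """
    if not visits:
        raise ValueError("participant has no scheduled visits")
    if all(v is not None for v in visits):
        return "complete"
    if visits[-1] is None:
        return "dropout"
    return "intermittent"


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    participants = {
        "P01": [1, 0, 0, 1],
        "P02": [1, None, 0, 1],
        "P03": [1, 0, None, None],
    }
    for pid, visits in participants.items():
        print(pid, f"{missing_rate(visits):.2f}", classify_pattern(visits))
```

Other definitions of dropout are possible (for example, requiring a minimum number of consecutive missed visits at the end of follow-up), and the choice would change how participants are split between the two categories.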

Keywords

missing data

maximum likelihood

clinical trials

substance use disorder

imputation 

Main Sponsor

Section on Statistics in Epidemiology