Monday, Aug 4: 2:00 PM - 3:50 PM
0424
Invited Paper Session
Music City Center
Room: CC-104A
Applied
Yes
Main Sponsor
Committee of Presidents of Statistical Societies
Co Sponsors
ENAR
Presentations
Traditionally, geospatial analysis has relied on statistical models that explicitly model spatial correlations in the data. Recently, machine learning algorithms such as neural networks and random forests have been increasingly used in geospatial analysis. However, most machine learning algorithms cannot directly encode spatial correlations. There is limited understanding of the consequences of ignoring spatial correlations in machine learning algorithms applied to geospatial data, despite this practice becoming increasingly common. We show empirically and theoretically that ignoring spatial correlations reduces the accuracy of machine learning algorithms for geospatial data. We then propose well-principled machine learning algorithms for geospatial data that explicitly model the spatial correlation, as in traditional geostatistics. The basic principle is guided by how ordinary least squares (OLS) extends to generalized least squares (GLS) for linear models to explicitly account for data covariance. We demonstrate how the same extension can be made for random forests and neural networks, presenting the RF-GLS and NN-GLS algorithms. We provide extensive theoretical and empirical support for the methods and show that they fare better than naïve or brute-force approaches to using machine learning algorithms for spatially correlated data. We present the software packages RandomForestsGLS and geospaNN, which implement these methods.
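The OLS-to-GLS step mentioned in the abstract can be illustrated for the linear-model case only. The sketch below is not RF-GLS or NN-GLS and does not use the RandomForestsGLS or geospaNN APIs. It assumes a known exponential spatial covariance with made-up parameters (`sigma2`, `phi`, `nugget`) and simulated data, and all function names are hypothetical. GLS replaces the OLS loss (y - Xb)'(y - Xb) with (y - Xb)' Sigma^{-1} (y - Xb), which is equivalent to running OLS on data whitened by the Cholesky factor of Sigma.

```python
import numpy as np

def exponential_cov(coords, sigma2=1.0, phi=3.0, nugget=0.1):
    """Exponential spatial covariance plus a nugget (assumed, illustrative parameters)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d) + nugget * np.eye(len(coords))

def ols(X, y):
    """Ordinary least squares: minimizes (y - Xb)'(y - Xb)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def gls(X, y, Sigma):
    """Generalized least squares: minimizes (y - Xb)' Sigma^{-1} (y - Xb).

    With Sigma = L L', solving L z = v whitens v, and OLS on the
    whitened data gives the GLS estimate.
    """
    L = np.linalg.cholesky(Sigma)
    X_white = np.linalg.solve(L, X)
    y_white = np.linalg.solve(L, y)
    return np.linalg.lstsq(X_white, y_white, rcond=None)[0]

# Simulated example: spatially correlated errors on random locations.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
Sigma = exponential_cov(coords)
y = X @ beta_true + np.linalg.cholesky(Sigma) @ rng.normal(size=n)

beta_ols = ols(X, y)         # ignores the spatial covariance
beta_gls = gls(X, y, Sigma)  # accounts for it through Sigma
print("OLS:", beta_ols)
print("GLS:", beta_gls)
```

When Sigma is the identity matrix the two estimators coincide, which is one way to see GLS as a covariance-aware generalization of OLS.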
Keywords
Neural networks
Geospatial data
Machine learning
Random forests
Gaussian processes
Spatial statistics
The 21st Century Cures Act, enacted in 2016, empowers the FDA to accelerate the development of new treatments by utilizing real-world data (RWD) and evidence. As a result, parallel randomized clinical trials (RCTs) and RWD are becoming increasingly available for evaluating treatment outcomes. Integrating heterogeneous data sources presents a unique opportunity to address clinical questions that cannot be answered by any single data source alone. This talk will explore various objectives and methodologies for conducting integrative analyses of data from RCTs and RWD. By combining the strengths of both RCTs and RWD, researchers can improve the generalizability of RCT findings using the broader representativeness of RWD, increase the efficiency and statistical power of treatment effect evaluations by incorporating comparable RWD, and assess long-term safety and efficacy by utilizing extended real-world follow-up data. Specifically, we will discuss newly developed strategies to mitigate biases and optimize treatment evaluation in hybrid clinical trials with external controls, including approaches such as test-then-pool, selective borrowing, and conformal prediction.
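As a rough illustration of the test-then-pool idea named in the abstract, the sketch below first tests whether external controls look compatible with the trial's own controls and pools them only if the test does not reject. This is a deliberately simplified version under stated assumptions: continuous outcomes, a Welch t-test as the compatibility check, a difference in means as the treatment effect, and no covariate adjustment. The function name `ttp_effect`, the threshold `alpha`, and the simulated data are hypothetical and are not taken from the talk.

```python
import numpy as np
from scipy import stats

def ttp_effect(trial_treated, trial_control, external_control, alpha=0.1):
    """Simplified test-then-pool estimate of a mean difference.

    Step 1: Welch t-test comparing trial controls with external controls.
    Step 2: if the test does not reject at level alpha, pool both control
            groups; otherwise use the trial controls only.
    Returns (estimated effect, whether pooling happened).
    """
    p_value = stats.ttest_ind(trial_control, external_control, equal_var=False).pvalue
    pooled = bool(p_value >= alpha)
    if pooled:
        controls = np.concatenate([trial_control, external_control])
    else:
        controls = np.asarray(trial_control)
    effect = float(np.mean(trial_treated) - np.mean(controls))
    return effect, pooled

# Simulated example with made-up numbers.
rng = np.random.default_rng(1)
treated = rng.normal(loc=1.0, size=50)
control = rng.normal(loc=0.0, size=50)
external_shifted = rng.normal(loc=5.0, size=50)  # clearly incompatible controls

print(ttp_effect(treated, control, external_shifted))
```

The all-or-nothing pooling decision is what approaches such as selective borrowing aim to refine; this sketch does not attempt that.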
Keywords
Real-world data
Real-world evidence
Hybrid trial designs
Speaker
Shu Yang, North Carolina State University, Department of Statistics
The opioid epidemic remains a major public health crisis. Although evidence-based treatments for opioid use disorder (OUD) exist, most people with OUD do not receive treatment. Pragmatic trial designs have therefore been proposed to evaluate interventions designed to increase OUD treatment within entire clinics or health systems by leveraging electronic health records (EHR) and other real-world data sources. In this talk, we present case studies that illustrate key challenges of using real-world data to evaluate intervention effects. These include post-randomization selection bias, which arises when the intervention affects the diagnosis of OUD in the EHR, and observational outcome assessment processes, in which follow-up times from EHRs are irregularly spaced and may be intervention- or outcome-dependent. We clarify which estimands are being estimated in these settings, present simulation studies that evaluate the performance of methods addressing these challenges, and highlight novel statistical methods that have been developed and are being implemented in these case studies to provide robust evidence on intervention effects and improve outcomes for people with OUD.
Keywords
Pragmatic trials
Opioid use disorder
Real-world data
Electronic health records