Poster Session II

Conference: ASA Biopharmaceutical Section Regulatory-Industry Statistics Workshop 2024
09/27/2024: 9:45 AM - 10:30 AM EDT
Posters 
Room: White Oak 

Presentations

P01 A Comprehensive Review and Shiny Application on the Matching-Adjusted Indirect Comparison (MAIC)

Population-adjusted indirect comparison (PAIC) is an increasingly used technique for estimating the comparative effectiveness of different treatments in health technology assessments when head-to-head trials are unavailable. Three commonly used PAIC methods are matching-adjusted indirect comparison (MAIC), simulated treatment comparison (STC), and multilevel network meta-regression (ML-NMR). MAIC enables researchers to achieve balanced covariate distributions across two independent trials when individual participant data (IPD) are available for only one trial. In this paper, we provide a comprehensive review of MAIC methods, including their theoretical derivation, implicit assumptions, and connection to calibration estimation (CE) in survey sampling. We discuss the nuances between anchored and unanchored MAIC, as well as their required assumptions. Furthermore, we implement various MAIC methods in a user-friendly R Shiny application, Shiny-MAIC. To our knowledge, it is the first Shiny application that implements various MAIC methods. The Shiny-MAIC application offers a choice between anchored and unanchored MAIC, a choice among different types of covariates and outcomes, and two variance estimators: bootstrap and robust standard errors. An example with simulated data is provided to demonstrate the utility of the Shiny-MAIC application, enabling a user-friendly approach to conducting MAIC for healthcare decision-making. Shiny-MAIC is freely available at: https://ziren.shinyapps.io/Shiny_MAIC/. 
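
For readers who want to experiment with the core computation, the sketch below estimates MAIC weights by the standard method-of-moments approach (minimizing the convex objective Q(a) = sum(exp(X_c %*% a)), where X_c holds the IPD covariates centered at the aggregate-data means). The simulated data, variable names, and covariate set are illustrative assumptions and are not taken from the Shiny-MAIC application.

## Hedged sketch of MAIC weight estimation via method of moments
set.seed(1)
n   <- 300
ipd <- data.frame(age = rnorm(n, 60, 8), female = rbinom(n, 1, 0.45))   # IPD trial
agd_means <- c(age = 64, female = 0.55)            # published means from the comparator trial

X_c   <- sweep(as.matrix(ipd), 2, agd_means)       # center IPD covariates at AgD means
Q     <- function(a) sum(exp(X_c %*% a))           # convex objective
gradQ <- function(a) as.vector(t(X_c) %*% exp(X_c %*% a))
fit   <- optim(rep(0, ncol(X_c)), Q, gr = gradQ, method = "BFGS")

w <- as.vector(exp(X_c %*% fit$par))               # MAIC weights
colSums(w * as.matrix(ipd)) / sum(w)               # weighted IPD means now match agd_means
sum(w)^2 / sum(w^2)                                # effective sample size after weighting

The weights would then be carried into a weighted outcome model (for an anchored comparison, a weighted estimate of the within-trial contrast).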

Presenting Author

Ziren Jiang, University of Minnesota

CoAuthor(s)

Joseph Cappelleri, Pfizer Inc
Margaret Gamalo-Siebers, Pfizer
Yong Chen, University of Pennsylvania, Perelman School of Medicine
Neal Thomas, Pfizer
Haitao Chu, Pfizer

P02 A frequentist approach to dynamic borrowing

There has been growing interest in leveraging external control data to augment randomized control group data in clinical trials and enable more informative decision making. In recent years, the quality and availability of real-world data used as external controls have improved steadily. However, information borrowing by directly pooling such external controls with randomized controls may lead to biased estimates of the treatment effect. Dynamic borrowing methods under the Bayesian framework have been proposed to better control the false positive error. However, the numerical computation and, especially, the parameter tuning of these Bayesian dynamic borrowing methods remain a challenge in practice. In this paper, we present a frequentist interpretation of a Bayesian commensurate prior borrowing approach and describe intrinsic challenges associated with this method from the perspective of optimization. Motivated by this observation, we propose a new dynamic borrowing approach using the adaptive lasso. The treatment effect estimate derived from this method follows a known asymptotic distribution, which can be used to construct confidence intervals and conduct hypothesis tests. The finite sample performance of the method is evaluated through extensive Monte Carlo simulations under different settings. We observed highly competitive performance of the adaptive lasso compared to Bayesian approaches. Methods for selecting tuning parameters are also thoroughly discussed based on results from numerical studies and an illustrative example. 
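
To illustrate the general adaptive-lasso borrowing idea (not necessarily the authors' exact estimator), the toy sketch below penalizes only the drift between randomized and external controls while leaving the treatment effect unpenalized; a drift coefficient shrunk to zero corresponds to full pooling of the external controls. The simulated data, model, and tuning choice are illustrative assumptions.

## Toy sketch: adaptive lasso on the external-control drift
library(glmnet)
set.seed(2)
n_t <- 100; n_c <- 50; n_e <- 200                       # treated, randomized control, external control
y   <- c(rnorm(n_t, 1.0), rnorm(n_c, 0.0), rnorm(n_e, 0.3))   # 0.3 = drift in the external controls
trt <- c(rep(1, n_t), rep(0, n_c + n_e))
ext <- c(rep(0, n_t + n_c), rep(1, n_e))
X   <- cbind(trt = trt, ext = ext)

gamma_init <- coef(lm(y ~ trt + ext))["ext"]            # unpenalized initial drift estimate
fit <- cv.glmnet(X, y, standardize = FALSE,
                 penalty.factor = c(0, 1 / abs(gamma_init)))  # adaptive penalty on the drift only
coef(fit, s = "lambda.min")                             # a zero drift estimate means full borrowing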

Presenting Author

Jiangeng Huang, AbbVie

CoAuthor(s)

Ray Lin, Genentech, Inc.
Jiawen Zhu, Genentech, Inc.
Lu Tian, Stanford University

P03 A Seamless Phase II/III Design with Dose Optimization (SDDO) for Oncology Drug Development

The US FDA's Project Optimus initiative, which emphasizes dose optimization prior to marketing approval, represents a pivotal shift in oncology drug development. It has prompted a rethinking of how conventional pivotal trial designs can be modified to incorporate a dose optimization component. Aligned with this initiative, we propose a novel Seamless Phase II/III Design with Dose Optimization (SDDO) framework. The proposed design starts with dose optimization in a randomized setting, leading to an interim analysis focused on optimal dose selection, trial continuation decisions, and sample size re-estimation (SSR). Based on the decision at the interim analysis, patient enrollment continues for the selected dose arm and the control arm, and the significance of the treatment effect is determined at the final analysis. The SDDO framework offers increased flexibility and cost-efficiency through sample size adjustment, while stringently controlling the Type I error. The proposed design also facilitates both Accelerated Approval (AA) and regular approval in a "one-trial" approach. Extensive simulation studies confirm that our design reliably identifies the optimal dosage and makes preferable decisions with a reduced sample size while retaining statistical power. 

Presenting Author

Yuhan Liu, The University of Chicago

CoAuthor(s)

Yiding Zhang
Gu Mi, Sanofi
Ji Lin, Sanofi

P04 A simulation study to assess how diagnostic test accuracy affects clinical utility study outcomes for MRD tests

Minimal Residual Disease (MRD) is strongly related to cancer recurrence. Early detection of MRD can potentially be useful in providing treatment (e.g., adjuvant chemotherapy) to patients before other forms of clinical evidence (e.g., imaging) become available. The clinical utility of MRD tests as a predictive biomarker has been evaluated in recent or ongoing prospective clinical trials, in which postoperative cancer patients with a positive MRD test result are randomized into an adjuvant treatment arm and a control arm. Clinical outcomes such as disease-free survival are compared between treatment arms. The observed efficacy of adjuvant treatment in such trials is primarily driven by three classes of parameters: the sensitivity and specificity of the MRD assay, the effectiveness of treatment, and study cohort characteristics such as sample size and recurrence rate.

To evaluate the impact of these parameters on the clinical trial outcome, we performed a simulation study across the parameter space defined by total cohort size (100 - 1000), recurrence rate (30% - 70%), sensitivity (60% - 95%), specificity (80% - 100%), and treatment efficacy hazard ratio (1.5 - 10). For each combination of parameters, 50 simulated clinical trials were created. In each simulated trial, the primary endpoint was the statistical significance of the hazard ratio between the treated and untreated arms. Unsurprisingly, the results demonstrate that the most important factor is the efficacy of treatment in the MRD population, followed by the number of patients with detected MRD. Results show that high sensitivity values can offset lower sample sizes and MRD rates, as expected, although high sensitivity is not always sufficient for consistent success. High specificity is an important factor as well, especially when the baseline hazard ratio is close to one. Mistakenly enrolling MRD-negative patients can muddle the results dramatically, as MRD-negative patients are less likely to experience events than patients with MRD, regardless of treatment.

We believe this simulation analysis can help clinical study planners estimate MRD-guided clinical trial success probability and determine sample size based on MRD test accuracy and study cohort characteristics. The analysis may also help assay developers determine product requirements. This simulation framework can potentially be applied to general biomarker-driven clinical trials. 
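
The sketch below reproduces the spirit of one cell of such a simulation grid; the parameter values and modeling shortcuts (exponential event times, a tenfold lower hazard for MRD-negative patients) are illustrative assumptions rather than the authors' exact settings.

## One illustrative cell of an MRD-guided trial simulation
library(survival)
set.seed(3)
sim_trial <- function(n = 500, recur_rate = 0.5, sens = 0.8, spec = 0.95,
                      hr_trt = 2, base_hazard = 0.02, fu = 36) {
  recur    <- rbinom(n, 1, recur_rate)                         # true MRD status
  test_pos <- rbinom(n, 1, ifelse(recur == 1, sens, 1 - spec)) # imperfect MRD assay
  d    <- data.frame(recur, test_pos)[test_pos == 1, ]         # enroll MRD test-positive patients only
  m    <- nrow(d)
  arm  <- rbinom(m, 1, 0.5)                                    # 1 = adjuvant treatment
  haz  <- ifelse(d$recur == 1, base_hazard, base_hazard / 10)  # false positives rarely recur
  haz  <- ifelse(arm == 1 & d$recur == 1, haz / hr_trt, haz)   # treatment acts on true MRD+ patients
  time  <- rexp(m, haz)
  event <- as.integer(time <= fu)
  time  <- pmin(time, fu)
  lr <- survdiff(Surv(time, event) ~ arm)                      # log-rank test
  pchisq(lr$chisq, df = 1, lower.tail = FALSE) < 0.05
}
mean(replicate(50, sim_trial()))    # empirical power for this parameter combination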

Presenting Author

Russell Petry, Foundation Medicine Inc.

CoAuthor

Chang Xu

P05 A two-stage approach to assess treatment effects in clinical trial populations enriched with risk predictions

Background

A randomized controlled trial is a common clinical trial design used to evaluate a treatment effect by comparing outcomes between one group of subjects under the treatment of interest (treatment arm) and another under placebo or routine treatments (control arm). Typically, the trial population is selected from subjects with the target disease identified based on certain criteria. In practice, however, the selection process can be challenging for several reasons. For example, the stages or severity levels of a disease may be insufficient or inadequate for identifying the patients who would benefit from the treatment in a study. In addition, to achieve the desired sample size, the selection criteria are often broadened to include more subjects, which may cause the risk of disease in the trial population to be lower than expected. Consequently, the treatment may show lower effectiveness in the trial population because less contrast is observed in the comparison between the two arms. In this study, we propose a two-stage approach in which we first build a risk prediction model using control arm data and then enrich the risk in the trial population by weighting both arms with the risk predictions when assessing the treatment effect.

Methods

Simulated randomized clinical trial (RCT) data with two arms (treatment and control) are generated based on real-world observational time-to-event data with treatment and non-treatment groups. To simulate the randomization in an RCT, inverse probability weighting (IPW) estimating the average treatment effect in the treated (ATT) is calculated in the observational data and applied to balance the distributions of baseline risk factors such that standardized mean differences (SMD) are less than 0.1 between the two arms. The total sample size is simulated at 1,000 (815 in the treatment arm and 185 in the control arm) based on the ratio observed in the real-world data. Follow-up stops at the first adverse event (AE) or at censoring, up to the end of study at 90 days. The simulated RCT is designed to evaluate whether the treatment provides a protective effect against AE incidence compared with the control arm.
In the first stage, a risk prediction model is built by fitting Cox models with AE incidence as the outcome and baseline risk factors as the predictors in the control arm data. Per trial design, the control arm represents the basic risk in the study; thus, the individual predicted value calculated from the risk prediction model can be used to evaluate whether a patient is under relatively higher or lower basic risk conditional on the baseline risk factors. In the simulated RCT, a total of nine baseline risk factors are measured, including age, BMI, heart rate, systolic arterial blood pressure (SBP), elevated troponin, saturation of peripheral oxygen (SpO2), disease severity, disease history, and cancer history. To achieve a parsimonious risk prediction model with fair classification effectiveness, we adopted forward selection with an inclusion threshold defined as a concordance improvement of at least 1%.

In the second stage, predicted values of basic risk for subjects in both arms are calculated based on the risk prediction model built in the first stage. In this study, predicted values are calculated in one of three forms: 1) the linear predictor (LP), where the values are shifted to be all positive by adding an offset constant, 2) the risk score (RS), which is the exponential of the LP, and 3) the inverse Mills ratio of the minus LP (IMR). For all forms, higher values indicate higher basic risk of AE incidence. Sampling-weighted Cox models are then fit, incorporating the three forms of risk predictions as weights to assess the treatment effect while adjusting for potential confounding from all baseline risk factors as covariates. When subjects predicted to have higher risk are weighted more than those with lower risk in the model, the treatment effect is assessed in a pseudo trial population with enriched basic risk.
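
A minimal sketch of these two stages is given below, assuming a data frame dat with columns time, event, arm, and the selected risk factors (all names are illustrative); the IMR weight uses one common definition of the inverse Mills ratio, which may differ in detail from the authors' implementation.

## Stage 1: risk model from the control arm; Stage 2: risk-prediction weights
library(survival)
risk_fit <- coxph(Surv(time, event) ~ heart_rate + sbp + troponin + dz_history,
                  data = subset(dat, arm == "control"))
lp    <- predict(risk_fit, newdata = dat, type = "lp")   # predicted basic risk (log scale)
lp_w  <- lp - min(lp) + 1e-4                             # LP, shifted to be positive
rs_w  <- exp(lp)                                         # RS
imr_w <- dnorm(lp) / pnorm(-lp)                          # IMR of -LP: phi(lp) / (1 - Phi(lp))
rescale <- function(w) w * nrow(dat) / sum(w)            # constrain weights to sum to n

fit_imr <- coxph(Surv(time, event) ~ arm + heart_rate + sbp + troponin + dz_history,
                 data = dat, weights = rescale(imr_w), robust = TRUE)
summary(fit_imr)                                         # weighted treatment-effect estimate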

Results

Among the 1,000 subjects in the simulated RCT, for continuous variables, the baseline mean (standard deviation, SD) age is 61.3 (14.8) years, BMI 34.6 (8.8), heart rate 105.9 (19.3) bpm, SBP 137.4 (23.2) mmHg, and SpO2 93.6 (5.5)%; for dichotomous variables, the frequency (percentage) of elevated (versus non-elevated) troponin is 790 (79%), high (versus non-high) disease severity 76 (7.6%), positive (versus negative) disease history 109 (10.9%), and positive (versus negative) cancer history 208 (20.8%). SMDs of these distributions between the two arms are all less than 0.1. The overall 90-day cumulative AE incidence based on Kaplan-Meier (KM) estimates is 10.6% (treatment versus control: 9.6% versus 15.1%).

After the model selection process, the risk prediction model retains four predictors: heart rate, disease history, SBP, and elevated troponin. The model concordance is 72.7%, suggesting fair-to-good classification performance. Predicted values are calculated for subjects in both arms based on the risk prediction model. After constraining the sum to the sample size, all three forms of risk predictions have means of 1, and the SDs are 0.27, 0.76, and 0.49 for LP, RS, and IMR, respectively. Comparing the distributions of the three forms, LP is the most conservative, with a range from 0.0004 to 1.75, and appears symmetric around 1. RS is the most aggressive, increasing linearly from 0.04 to 1 and then exponentially up to 5.76. IMR appears to be a compromise between LP and RS, with a range from 0.01 to 2.65. After weighting with the risk predictions, weighted KM estimators show that the cumulative AE incidences are increased and the differences in KM curves between the two arms also appear larger compared with the unweighted data. The 90-day cumulative AE incidences using LP, RS, and IMR are 11.2% (treatment versus control: 10.2% versus 15.7%), 13% (11.8% versus 19%), and 12% (10.8% versus 17.9%), respectively.
Prior to weighting with risk predictions, the hazard ratio (HR) of treatment is 0.63 (p = 0.072) in a Cox model after adjusting for confounding effects of all baseline risk factors. After weighting using sampling-weighted Cox models, the HRs of treatment are 0.63 (p = 0.07), 0.59 (p = 0.079), and 0.58 (p = 0.044) using weights of LP, RS, and IMR, respectively. Based on these results, a significant treatment effect is detected when weighting by IMR; the interpretation is that subjects under treatment are significantly protected from AE incidence, with a 42% lower risk compared with those without the treatment.

Discussion

Our results based on the simulated RCT show that the two-stage approach can help enrich the basic risk in trial populations and thus sharpen the treatment effect estimated from the comparison between the two arms. However, some important assumptions and limitations are required for the two-stage approach to provide robust results. First, the basic risk in the treatment arm is assumed to be the same as the risk that would be observed in that arm if the treatment had no effect (or an effect equivalent to placebo or routine treatments). If there were unmeasured factors that drove certain subjects to be included in one arm over the other, or if the treatment itself introduced risk independent of the basic risk, then the risk predictions would fail to properly enrich the risk for both arms. Second, a sufficient sample size may be needed in the control arm to build a reliable risk prediction model. Third, a parsimonious risk prediction model is preferred to avoid overfitting: if the risk prediction model fits the control arm data too closely, it may not generalize well to the treatment arm. In this study, we simulated an RCT based on real-world data as an application example to present and support the approach. In the example, IMR appears to be the best choice of weights, which could be due to its compromise between the other two forms. Note that the value of this approach is not to enhance power in general but to enrich the basic risk in the trial population based on the risk predictions learned from the control arm data. The treatment effect and its significance will not be enhanced if the treatment does not reduce the risk associated with the important risk factors. A more comprehensive simulation study may be needed to fully examine the performance of this approach under various scenarios.

This two-stage approach is simple to apply and requires no additional data beyond the trial population already collected in a study. The approach can be particularly useful when the treatment effect looks promising but fails to reach significance, possibly because the underlying basic risk in the trial population is lower than expected. After enriching the basic risk, the treatment effect may be enhanced and reach statistical significance. 

Presenting Author

Peter Wilson, Inari Medical

CoAuthor(s)

Yu-Hsiang Shu, Inari Medical
Yu-Chen Su

P06 An Objective Review of 3+3, i3+3, and BOIN Designs for Phase I Dose-Finding Trials

Phase I dose-finding trials are essential for determining the safety and maximum tolerated dose (MTD) of new drugs. Statistical designs for phase I trials can be broadly categorized as algorithm-based (e.g., 3+3 and i3+3), model-based (e.g., CRM), and model-assisted designs (e.g., mTPI, mTPI-2/Keyboard, BOIN), each rooted in distinct theoretical frameworks and implementation strategies. The i3+3 design and model-assisted designs all use the "up-and-down" decision rules denoted as E (escalate to the next higher dose), S (stay at the current dose), and D (de-escalate to the next lower dose) for dose finding. Some recent reviews in the literature aim to compare various designs, mostly based on comparing operating characteristics (OCs) from simulated clinical trials, which can be influenced by the specific assumptions and settings used in the simulations.

We suggest comparing simple designs such as 3+3 with model-assisted designs directly via decision tables in addition to the OCs. We show the pros and cons of three popular and simple designs: 3+3, BOIN, and i3+3. We aim to impartially evaluate the three designs using a fair comparison process in which the choices of simulation parameters are determined by a third party. We developed an open-source R package, "FIND," accompanied by demonstration examples to facilitate the comparison. This review seeks to offer readers comprehensive and unbiased insights into current mainstream dose-finding designs, along with practical tools for implementation, thereby contributing to the advancement of clinical trial methodology in drug development. 
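
As a concrete illustration of the decision-table idea, the sketch below computes BOIN's escalation/de-escalation boundaries from the published formulas (Liu and Yuan, 2015) and prints the resulting up-and-down decisions; phi1 = 0.6*phi and phi2 = 1.4*phi are the commonly used defaults, and the FIND package mentioned above is not used here.

## BOIN boundaries and a simple E/S/D decision table
boin_boundaries <- function(phi, phi1 = 0.6 * phi, phi2 = 1.4 * phi) {
  lambda_e <- log((1 - phi1) / (1 - phi)) /
              log(phi * (1 - phi1) / (phi1 * (1 - phi)))
  lambda_d <- log((1 - phi) / (1 - phi2)) /
              log(phi2 * (1 - phi) / (phi * (1 - phi2)))
  c(lambda_e = lambda_e, lambda_d = lambda_d)
}
b <- boin_boundaries(phi = 0.30)          # approximately 0.236 and 0.358
for (n in c(3, 6, 9, 12)) {               # patients treated at the current dose
  dlt <- 0:n
  decision <- ifelse(dlt / n <= b["lambda_e"], "E",
                     ifelse(dlt / n >= b["lambda_d"], "D", "S"))
  cat("n =", n, ":", paste(dlt, decision, sep = ":", collapse = "  "), "\n")
}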

Presenting Author

Yunxuan Zhang, University of Chicago

CoAuthor

Yuan Ji, The University of Chicago

P07 Bayesian Dynamic Borrowing for Nonparametric Survival Analysis

In certain circumstances (pediatrics, rare diseases, early phase), it is becoming increasingly common to use Bayesian Dynamic Borrowing (BDB) to incorporate prior information from one or more data sources into the design of clinical trials or the analysis of clinical trial data. Covariate-adjusted BDB methods have been proposed and accepted for analyses based on an objective response rate (ORR) endpoint. However, BDB for survival analysis is less well studied when progression-free survival (PFS) is the key endpoint. Existing work on BDB for survival endpoints has typically focused on parametric survival models, such as the Weibull proportional hazards model. When the proportional hazards assumption is violated, the results from such models are most likely biased.

We propose a formal nonparametric Bayesian survival analysis that utilizes a Dirichlet Process Mixture Model (DPMM). This approach extends the idea of placing a prior distribution on a parameter in a parametric model to placing a prior distribution on the CDF that governs the time-to-event distribution. DPMMs are easily implemented in Stan through their finite mixture approximation. We observed a reduction in bias while providing root mean squared error (rMSE) comparable to other methods. In addition, the use of DPMMs offers a flexible alternative, allowing nonparametric survival analysis with or without dynamic borrowing from historical data. 

Presenting Author

Yuelin Lu, Baylor University

CoAuthor

Matthew Psioda, GSK

P08 Bayesian Meta-Analysis of Predictive Biomarker Studies using Aggregate Data and Individual Participant Data

Predictive biomarkers are instrumental in forecasting therapeutic outcomes. For instance, PD-L1 has been recognized as a key predictor of success in immunotherapy, where patients with high PD-L1 expression are more likely to respond to immune checkpoint inhibitors. The predictive ability of biomarkers is often assessed across multiple studies, but meta-analysis is challenging due to the variability in cut-points used to dichotomize biomarkers into categorical groups in these studies. Multivariate meta-analysis can help synthesize evidence from such studies, elucidating the relationship between predictive biomarkers and treatment effects. Typically, this analysis uses aggregate data (AD), which may lack sufficient data points. Accessing individual participant data (IPD) can enhance synthesis across studies, but researchers may face challenges in obtaining it. Our goal is to develop Bayesian modeling strategies that use both IPD and AD for more efficient data synthesis, aimed at estimating predictive effects for time-to-event outcomes within a nonlinear dose-response context, specifically using the four-parameter log-logistic model. We propose a one-stage Bayesian meta-analysis model that integrates IPD and AD within a unified synthesis model, allowing both data types to contribute to the estimation of all key model parameters (i.e., the minimum, maximum, ED50, and slope). Simulations are conducted to assess the performance of the proposed approach, suggesting that it improves the accuracy of the estimated predictive effects compared to the conventional AD-only approach. 
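
For reference, one common parameterization of the four-parameter log-logistic dose-response curve with the parameters named above (the authors' exact parameterization may differ) is

$$
f(d) \;=\; E_{\min} + \frac{E_{\max} - E_{\min}}{1 + \left(\mathrm{ED}_{50}/d\right)^{\delta}},
$$

where $E_{\min}$ and $E_{\max}$ are the minimum and maximum effects, $\mathrm{ED}_{50}$ is the dose giving the half-maximal effect, and $\delta$ is the slope parameter.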

Presenting Author

Wayne Wu, University of Texas, MD Anderson Cancer Center

CoAuthor

J. Jack Lee, University of Texas, MD Anderson Cancer Center

P09 Bayesian Simulation-Guided Designs for Adaptive Clinical Trials: Potential Synergies of Open Source Code and Statistical Software Tools.

Multi-arm multi-stage (MAMS) trials represent an efficient approach to clinical trial design. This design type allows for testing multiple treatment arms simultaneously within one protocol, assigning patients to the most promising arms in an adaptive manner, all while controlling the type I error. Key to this approach are the choice of multiplicity comparison procedures (MCPs) and the choice of treatment selection rules. In this case study, we focused on assessing multiple treatment selection rules, including posterior probabilities and other Bayesian approaches, using custom R code integrated in commercial statistical software to optimize a MAMS study design. Leveraging the computing capabilities of commercial software alongside the flexibility of R code allowed us to assess a variety of treatment selection rules efficiently and comprehensively. Software-native selection algorithms furthered our optimization aims by offering optimized design candidates for comparison. Our simulation-based approach enhanced the probability of success by comparing, side by side, different novel treatment selection rules and choosing the best-fitting rule for the study at hand. We believe that combining custom code with statistical software offers a comprehensive approach for complex study designs. 

Presenting Author

Haripria Ramesh Babu

CoAuthor

Haripria Ramesh Babu

P10 Bayesian two-step procedures to estimate longitudinal hazard ratio changes for immuno-oncology drugs

Given the non-proportional hazards (NPH) problems in immuno-oncology, alternatives to the unweighted log-rank test (LRT) have been actively discussed. Although some methods can be more powerful than the LRT in NPH situations, sizing for these alternatives requires correct specification of the NPH structure. Almost no phase 3 program for immuno-oncology products has sufficient confidence about the longitudinal changes in the hazard ratio (HR). Hence, the use of alternative testing methods can yield under- or over-powered studies under mis-specified NPH structures.
This work is motivated by an actual phase 3 trial evaluating a PD-L1 inhibitor. At the design stage, the trial sponsor had historical datasets from past completed phase 3 trials that evaluated similar immune checkpoint inhibitors in similar patient populations. We propose a Bayesian two-step approach for estimating longitudinal changes in the HR using these available historical datasets.
This approach can estimate the number of change-points in a data-dependent manner using a reversible jump Markov chain Monte Carlo algorithm [1]. We present several simulation results and show how the analysis results can be used when designing a current phase 3 trial.

References
[1] Green PJ. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;82(4):711-732. 
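
One common way to formalize the longitudinal HR model described above (the authors' exact specification may differ) is a piecewise-constant hazard ratio with unknown change-points,

$$
\frac{\lambda_1(t)}{\lambda_0(t)} = \exp(\beta_j), \qquad t \in (\tau_{j-1}, \tau_j], \quad j = 1, \dots, J+1,
$$

where $0 = \tau_0 < \tau_1 < \dots < \tau_J < \tau_{J+1} = \infty$, and both the number of change-points $J$ and their locations are treated as unknown and sampled by reversible jump MCMC.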

Presenting Author

Riku Kajikawa, Biostatistics Division, National Cancer Center

CoAuthor

Shogo Nomura, University of Tokyo

P11 Clarifying the role of the Mantel-Haenszel Risk Difference Estimator in Randomized Clinical Trials

The Cochran-Mantel-Haenszel (CMH) risk difference estimator is widely used for binary outcomes in randomized clinical trials. This estimator computes a weighted average of stratum-specific risk differences and traditionally requires the stringent assumption of a homogeneous risk difference across strata. In our study, we relax this assumption and demonstrate that the CMH risk difference estimator consistently estimates the average treatment effect. Furthermore, we rigorously study its properties under two asymptotic frameworks: one characterized by a small number of large strata and the other by a large number of small strata. Additionally, a unified robust variance estimator that improves over the popular Greenland and Sato variance estimators is proposed, and we prove that it is applicable across both asymptotic scenarios. Our findings are thoroughly validated through simulations and real data applications. 
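
For concreteness, the small sketch below computes the estimator discussed here as a weighted average of stratum-specific risk differences with the usual Mantel-Haenszel weights n1k*n0k/Nk; the three-stratum counts are made up for illustration.

## Mantel-Haenszel risk difference from per-stratum counts
mh_rd <- function(e1, n1, e0, n0) {        # e = events, n = arm sizes, one entry per stratum
  w  <- n1 * n0 / (n1 + n0)                # Mantel-Haenszel weights
  rd <- e1 / n1 - e0 / n0                  # stratum-specific risk differences
  sum(w * rd) / sum(w)
}
mh_rd(e1 = c(12, 8, 20), n1 = c(50, 40, 80),
      e0 = c(18, 14, 30), n0 = c(55, 45, 75))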

Presenting Author

Xiaoyu Qiu

CoAuthor(s)

Jaehwan Yi, University of Washington
Jinqiu Wang, Newark Academy
Yanyao Yi, Eli Lilly and Company
Ting Ye, University of Washington

P12 Driving Efficiency of Clinical Drug Supply Chain Management in Adaptive Clinical Trials

With increasing interest in adaptive clinical trial designs, challenges arise in drug supply chain management that may offset the benefits of adaptive designs. It is therefore necessary to develop an optimization tool to facilitate decision making and the analysis of drug supply chain planning. The challenges include uncertainty in the maximum drug supply needed, shifting supply requirements, and the need for rapid availability of new supply at decision points. In this research project, statistical simulations are designed to optimize the pre-study medication supply strategy and to monitor ongoing drug supply using real-time data collected as the study progresses. A particle swarm algorithm is applied for the optimization, with feature extraction implemented to reduce dimensionality and save computational cost. 

Presenting Author

Jincheng Pang

CoAuthor(s)

Hong Yan, Servier Pharmaceuticals
Zhaowei Hua, Servier Pharmaceuticals, Inc.

P13 Global test for heterogeneous patient populations in rare disease clinical trials

In drug development, multiple efficacy endpoints may be used to assess diseases with multiple clinical manifestations. A drug may affect multiple disease aspects or outcomes, and usually we are not sure which aspect is more likely to show a drug effect before conducting the trial. Many statistical methods, including global tests, have been used to evaluate treatment effects on multiple endpoints. For rare diseases with small to very small patient populations, there are additional challenges in analyzing the data. The patient population often has a heterogeneous clinical presentation, and different participants may benefit from the drug in different disease aspects. A drug that shows a strong effect on endpoint 1 may have no or a small effect on endpoint 2 for the same participant, while for a different participant the same drug may have the reversed effect (i.e., a strong effect on endpoint 2 but no or a small effect on endpoint 1), even though all the endpoints are important to the rare disease patient population. To overcome the challenge of heterogeneous drug effects on different endpoints, we propose several stratified global test methods, which are natural extensions of existing global test approaches, including O'Brien's ordinary least squares (OLS) method and the multi-domain responder index (MDRI) method. The proposed methods address the heterogeneous patient population and the small sample sizes in rare diseases. Simulations of hypothetical trials are conducted to compare the type I error and power between the existing and proposed methods. 
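
As background for the extensions described above, a common simple implementation of the unstratified O'Brien OLS global test standardizes each endpoint, sums the standardized values per subject, and compares the composite between arms; the simulated two-endpoint data below are purely illustrative.

## Unstratified O'Brien OLS global test (simple composite version)
set.seed(4)
n   <- 40
trt <- rep(0:1, each = n)
Y   <- cbind(endpt1 = rnorm(2 * n, mean = 0.4 * trt),
             endpt2 = rnorm(2 * n, mean = 0.3 * trt))
Z <- scale(Y)                                   # standardize each endpoint
composite <- rowSums(Z)                         # per-subject composite score
t.test(composite ~ trt, alternative = "less")   # one-sided: treatment improves both endpoints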

Presenting Author

Zhixing Xu, Sanofi

CoAuthor(s)

Qingcong Yuan, Sanofi
Mengjie Yao
Qi Zhang, Sanofi
Yingwen Dong, Sanofi
Hui Quan, Sanofi

P14 Interim Analysis in Sequential Multiple Assignment Randomized Trials for Time-to-event Outcomes

Sequential Multiple Assignment Randomized Trials (SMARTs) have been conducted to mimic the actual treatment processes experienced by physicians and patients in clinical settings and to inform the comparative effectiveness of dynamic treatment regimes (DTRs). In a SMART design, patients are involved in multiple stages of treatment, and the treatment assignment is adapted over time based on the patient's characteristics such as disease status and treatment history. In this work, we develop and evaluate statistically valid interim monitoring approaches to allow for early termination of SMART trials for efficacy with respect to time-to-event outcomes. The development is nontrivial. First, in comparing estimated event rates from different DTRs, log-rank statistics need to be carefully weighted to account for overlapping treatment paths. At a given time point, we can then test the null hypothesis of no difference among all DTRs based on a weighted log-rank chi-square statistic. With multiple stages, the number of DTRs is much larger than the number of treatments involved in a typical randomized trial, resulting in many parameters to estimate for the covariance matrix of the weighted log-rank statistics. More challengingly, for interim monitoring, we need to quantify how the log-rank statistics at two different time points are correlated, and each component of the covariance matrix depends on a mixture of event processes that can jump at multiple time points due to the nature of multiple assignments. Efficacy boundaries at multiple interim analyses can then be established using the Pocock and O'Brien-Fleming (OBF) boundaries. We run extensive simulations to evaluate and compare type I error and power for our proposed weighted log-rank chi-square statistic for DTRs under different boundary specifications. The methods are demonstrated by analyzing a neuroblastoma trial. 

Presenting Author

Zi Wang, University of Pittsburgh

CoAuthor(s)

Yu Cheng, University of Pittsburgh
Abdus Wahed, University of Rochester

P15 IRT Implementation Considerations for Bayesian Response-Adaptive Randomization in Platform Trials

Platform Trials are a class of Master Protocols, which are perpetual and allow for planned treatment adaptations to occur across the trial's duration. These adaptations include introducing new treatments; opening, closing, pausing, or re-opening existing treatments; and applying allocation ratio adjustments. Platform Trials often include multiple subgroups (e.g., sub-protocols, sub-studies, sub-populations) with independent randomization schemes, each with its own treatment inclusion/exclusion rules and ratio weights.

Due to the complexity of randomization in Platform Trials, an Interactive Response Technology (IRT, also known as Randomization and Trial Management) system is utilized. There are several considerations to account for in the Platform Trial IRT's randomization implementation to ensure efficient adaptations. These considerations involve how to manage adaptations for existing treatments, perform allocation ratio adjustments, add new treatments, configure an adaptable randomization scheme, and handle subgroup / site / subject eligibility.

Bayesian Response-Adaptive Randomization (BRAR) is a type of randomization that may be used within Platform Trials. BRAR is a probabilistic randomization methodology where treatment assignment probabilities are updated continuously based on accumulating subject response data. In addition to IRT considerations specific to Platform Trials, there are considerations unique to BRAR. These include the approach to updating treatment assignment probabilities (e.g., user-entered interface or data integration of BRAR algorithm), structure of random numbers used for treatment assignment (e.g., unrestricted, restricted), and handling of eligibility. This poster will describe the IRT implementation considerations for Platform Trials with BRAR, include examples of approaches, and provide guidance for making decisions for the IRT's design. 
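
To make the BRAR updating step concrete, the sketch below shows one illustrative update cycle for binary responses: Beta posteriors per arm, a Monte Carlo estimate of the probability that each arm is best, and allocation probabilities proportional to that probability raised to a tuning power (a common stabilization choice). The counts and the tuning power are hypothetical and are not tied to any particular IRT implementation.

## One illustrative BRAR update for three arms with binary responses
set.seed(5)
responses <- c(8, 12, 15); enrolled <- c(30, 30, 30)               # current data by arm
draws <- sapply(seq_along(responses), function(k)
  rbeta(10000, 1 + responses[k], 1 + enrolled[k] - responses[k]))  # Beta(1, 1) priors
p_best <- colMeans(draws == apply(draws, 1, max))                  # P(arm k has the highest rate)
c_pow  <- 0.5                                                      # tuning power
alloc  <- p_best^c_pow / sum(p_best^c_pow)                         # next allocation probabilities
round(alloc, 3)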

Presenting Author

Jennifer Ross, Almac Group

CoAuthor(s)

Kevin Venner, Almac Group
Noelle Sassany, Almac Clinical Technologies

P16 A novel K-step ahead Multiple Endpoint Anomaly Detection through Bayesian Latent Class Modeling

In clinical trials, ensuring the quality and validity of data for downstream analysis and results is paramount, thus necessitating thorough data monitoring. This typically involves employing edit checks and manual queries during data collection. Edit checks consist of straightforward schemes programmed into relational databases, though they lack the capacity to assess data intelligently. In contrast, manual queries are initiated by data managers who manually scrutinize the collected data, identifying discrepancies needing clarification or correction. Manual queries pose significant challenges, particularly when dealing with large-scale data in late-phase clinical trials. Moreover, they are reactive rather than predictive, meaning they address issues after they arise rather than preemptively preventing errors. In this paper, we propose a joint model for multiple endpoints, focusing on primary and key secondary measures, using a Bayesian latent class approach. This model incorporates adjustments for risk monitoring factors, enabling proactive, k-step-ahead detection of conflicting or anomalous patterns within the data.
Furthermore, we develop individualized dynamic predictions at consecutive time-points to identify potential anomalous values based on observed data. This analysis can be integrated into electronic data capture systems to provide objective alerts to clinicians and patients. We will present simulation results and demonstrate the effectiveness of this approach with real-world data. 

Presenting Author

Yuxi Zhao, Pfizer

CoAuthor

Margaret Gamalo, Pfizer

P17 Missing Data in Clinical Trials: Responses Missing Not at Random (MNAR) in a Regression Setup

Randomization provides a fair comparison between treatment and control groups, balancing out the distributions of known and unknown factors among the participants. With missing data, we tend to lose this advantage and end up with biased results. Approaches to analyzing data with a large number of missing values tend to be ad hoc and variable. When dealing with missing data, missing at random and missing completely at random are the more popular assumptions and are easier to handle than missing not at random.
We propose approaches for scenarios in which response values are missing not at random. If a response value is missing not at random, then the probability that the response value is missing depends on the response value itself, and an unbiased estimate of the response value is not possible using the observed data alone. For example, quality-of-life response values may be missing because patients become too sick to participate, and such values cannot be estimated without bias.
In this work, we show how additional samples can be used to estimate the response values more efficiently. This study shows how external data borrowing techniques can be used to serve this purpose, and how many extra samples are needed to match the performance of an 'oracle' estimator (a hypothetical estimator with no missing data). 

Presenting Author

Dipnil Chakraborty, Bristol Myers Squibb

P18 Multi-arm and multi-stage superiority clinical trial design for negative binomial count data

Recurrent events or count data frequently serve as primary endpoints to assess treatment effects across diverse disease domains. Recently, statistical methodologies for group sequential designs and adaptive designs with count data, including sample size re-estimation, have been discussed in the literature. However, a multi-arm and multi-stage design incorporating both sample size re-estimation and arm dropping has not been explored for count data. In this research, motivated by the need to design a pediatric trial, a two-stage seamless phase 2/3 study design with count data as the primary endpoint is discussed. In stage 1, participants will be randomized to low dose, high dose, and placebo. When the pre-planned first part of the stage 1 subjects (N11) has completed a shorter treatment duration D1, an interim analysis will be conducted for both sample size re-estimation and dose selection. After the dose is selected and the final updated sample size is determined based on the interim analysis results and the desired conditional power, stage 2 enrollment will begin by randomizing stage 2 participants to the selected dose and placebo. To avoid an enrollment gap between the interim analysis cutoff date and the start of stage 2 enrollment of N2 participants, additional stage 1 participants (N12) can be enrolled. All stage 1 and stage 2 participants (N11+N12+N2) will complete a longer treatment duration D2 (where D2>D1), and their data will be combined for the final analysis. The multiplicity adjustment approach accounting for dose selection and sample size re-estimation in the final analysis comparing the selected dose versus placebo for the primary efficacy endpoint, using data from both stages, will be discussed. Simulation studies will be presented to illustrate the operating characteristics under different enrollment assumptions, dose-response curves, and interim analysis timings. 

Presenting Author

Qi Zhang, Sanofi

CoAuthor(s)

Qingcong Yuan, Sanofi
Liyong Cui, University of Illinois at Chicago
Zhiying Qiu, Sanofi US
Yingwen Dong, Sanofi
Hui Quan, Sanofi

P19 PK/PD-Assisted Dose Optimization in Multiple-Arm Dose Expansion Early Phase Clinical Trials

Bayesian hierarchical models have emerged as a powerful tool in dose expansion studies, especially when there are multiple arms or multiple indications. The goal of this work is to expand our research to guide dose optimization (>= 2 dose levels) in the dose expansion phase after first-in-human (FIH) dose escalation, using PK, PD, efficacy, and toxicity data. The project targets three objectives: a reasonable false positive rate, optimized efficacy, and more manageable toxicity.

To align with FDA's current thinking on recommended dose optimization strategies in clinical trials, the development of statistical methods for multiple-arm dose optimization is critical. Industry has developed the Clinical Utility Score (CUS) approach to include PK/PD, efficacy, and safety information in dose selection, and both Pfizer and BMS have well-established methods in this field. However, PK/PD information introduces uncertainty into the dose-exposure relationship, and a single score cannot represent the variation in real-world dose-induced safety and efficacy profiles.

This project focuses on a Bayesian utility function (Berry, 2002) to implement the dose optimization process. It provides more information for dose selection at a feasible, small sample size, addressing a practical and urgent need. We also plan to compare different approaches, including the two-stage Clinical Utility Score and Bayesian hierarchical models (HM), to guide the choice of the right dose across multiple arms based on efficacy, safety, and PK/PD information.

We have derived and simulated a Bayesian HM with a Dirichlet process to choose the admissible doses with respect to the false-go rate under an equal sample size scenario. The approach will be expanded to combine prior information about dose-response relationships, safety, and efficacy with real-time patient data from an ongoing clinical trial. This framework allows for continual learning and adaptation as the trial progresses.

In the Bayesian utility function, we treat all efficacy and safety outcomes as ordinal data (e.g., complete response, partial response, minimal response). A two-dimensional proportional odds logistic regression will be used to model efficacy and safety in the trial, and log-transformed PK/PD will be assumed to follow a normal distribution (Lin, 2023) with a patient-specific random effect. The approach aims to optimize dose selection quickly and efficiently by identifying the optimal dose with fewer patients, reducing the time and resources needed for dose expansion studies.

Once the posterior distribution of the utility function is available, we will deploy the decision-making procedure, and the operating characteristics (OC) of this approach will be studied through simulation.

This research project has multiple objectives:
1. Develop a Bayesian utility function that is feasible for small sample sizes, addressing a practical and urgent need, and perform Bayesian dose optimization simulations, especially for the decision-making boundary value in dose expansion, with a reasonable OC false positive rate.
2. Assess the performance of the Bayesian utility function (with ≥ 2 dose cohorts) in the dose expansion stage, as a computational tool that iteratively updates the dose selection process and can provide a comprehensive sample size justification.
3. Evaluate the new methods in a BMS blood disorder study (e.g., sickle cell disease), in which PK/PD information will be available simultaneously with efficacy and safety data. 

Presenting Author

Wencong Chen, Bristol Myers Squibb

CoAuthor

Yiming Cheng, Bristol Myers Squibb

P20 Propensity Score Weighted Restricted Mean Survival Time (RMST) Model for Marginal Causal Effect in Observational Data

The Restricted Mean Survival Time (RMST) is a preferred marginal effect measure in causal survival inference since it provides valid causal interpretations and is robust to nonproportional hazards. Propensity score adjustments are popular methods to adjust for observed confounding and usually include stratification, matching, and weighting. In comparison to propensity score stratification and matching, propensity score weighting has the potential to yield a doubly robust estimator under semiparametric theory. Furthermore, the weighting procedure can be seamlessly integrated with regression models, enabling convenient adjustment for baseline covariates.

In this paper, we propose a new estimation method that uses the RMST difference as a marginal causal effect measurement and incorporates propensity score weighting adjustment to address observed confounding. The general idea is to construct an augmented inverse probability of treatment weighting (IPTW) estimator based on estimating equations. We consider both the propensity score model and outcome model in the estimating equations, while adjusting for censoring using inverse probability of censoring weighting (IPCW).

The proposed propensity score weighted RMST estimation strategy yields a doubly robust causal effect estimator with adjustment for measured confounding bias, baseline covariates, and censoring. It exhibits robust performance in our simulation evaluation, accompanied by technical proof of its doubly robust property. To enhance practical understanding, we also apply the proposed method to examine the causal effect of Direct Oral Anticoagulants (DOACs) compared to Warfarin in reducing the risk of cardiovascular events. 
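
A conceptual sketch of the weighting step (not the full augmented, doubly robust estimator described above) is given below: propensity-score IPTW weights feed a weighted Kaplan-Meier fit, and the RMST difference is the area between the weighted curves up to a truncation time tau. The data frame dat and its column names are assumed for illustration.

## IPTW-weighted Kaplan-Meier and marginal RMST difference (conceptual sketch)
library(survival)
ps <- glm(treat ~ age + sex + comorbidity, family = binomial, data = dat)$fitted
w  <- ifelse(dat$treat == 1, 1 / ps, 1 / (1 - ps))        # IPTW weights
km <- survfit(Surv(time, event) ~ treat, data = dat, weights = w)

rmst_from_km <- function(time, surv, tau) {               # area under a right-continuous step curve
  keep  <- time <= tau
  t_pts <- c(0, time[keep], tau)
  s_pts <- c(1, surv[keep])
  sum(diff(t_pts) * s_pts)
}
s0 <- summary(km[1]); s1 <- summary(km[2])                # control / treated strata
rmst_from_km(s1$time, s1$surv, tau = 365) -
  rmst_from_km(s0$time, s0$surv, tau = 365)               # marginal RMST difference at 1 year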

Presenting Author

Zihan Lin, Bristol Myers Squibb (BMS)

CoAuthor(s)

Ai Ni, The Ohio State University
Bo Lu, The Ohio State University
Macarius Donneyong, The Ohio State University

P21 Robust covariate adjustment for randomized clinical trials when covariates are subject to missingness

In randomized clinical trials, the primary goal is often to estimate the treatment effect. Robust covariate adjustment is a preferred statistical method since it improves efficiency and is robust to model misspecification; however, it is still underutilized in practice. One practical challenge is missing covariates. Though missing covariates have been studied extensively, most of the existing work focuses on the relationship between the outcome and covariates, with little on robust covariate adjustment for estimating the treatment effect when covariates are missing. In this article, we recognize that the usual robust covariate adjustment can be directly generalized to the scenario where covariates are missing, under the additional assumption that missingness is independent of treatment assignment. We also propose three different implementation strategies to handle the increased dimensionality of working models caused by missingness. Simulations and a data application demonstrate the performance of the proposed strategies. Practical recommendations are presented in the discussion. 

Presenting Author

Jiaheng Xie

P22 ROMI: A Randomized Two-Stage Basket Trial Design to Optimize Doses for Multiple Indications

Optimizing doses for multiple indications is challenging. The pooled approach of finding a single optimal biological dose (OBD) for all indications ignores that dose-response or dose-toxicity curves may differ between indications, resulting in varying OBDs. Conversely, indication-specific dose optimization often requires a large sample size. To address this challenge, we propose a Randomized two-stage basket trial design that Optimizes doses in Multiple Indications (ROMI). In stage 1, for each indication, response and toxicity are evaluated for a high dose, which may be a previously obtained MTD, with a rule that stops accrual to indications where the high dose is unsafe or ineffective. Indications not terminated proceed to stage 2, where patients are randomized between the high dose and a specified lower dose. A latent-cluster Bayesian hierarchical model is employed to borrow information between indications, while considering the potential heterogeneity of OBD across indications. Indication-specific utilities are used to quantify response-toxicity trade-offs. At the end of stage 2, for each indication with at least one acceptable dose, the dose with highest posterior mean utility is selected as optimal. Two versions of ROMI are presented, one using only stage 2 data for dose optimization and the other optimizing doses using data from both stages. Simulations show that both versions have desirable operating characteristics compared to designs that either ignore indications or optimize dose independently for each indication. 

Presenting Author

Shuqi Wang, The University of Texas MD Anderson Cancer Center

CoAuthor(s)

Peter Thall, University of Texas, MD Anderson Cancer Center
Kentaro Takeda, Astellas Pharma Global Development, Inc.
Ying Yuan, University of Texas, MD Anderson Cancer Center

P23 Sample Size Determination for Cardiodynamic ECG Assessment Using the Concentration-QTc Analysis Method

Concentration-QTc (C-QTc) analysis was accepted as an alternative to the by-time-point analysis with the intersection-union test (IUT) as the primary basis for decisions to classify the QT risk of a drug in the ICH E14 Q&As (R3) in December 2015. Since then, this analysis method has been widely applied by industry because it significantly reduces the sample size required to achieve the same power as the IUT. It has the advantage of using the paired PK-ECG data from subjects across all dose levels and time points for the evaluation, whereas the IUT is evaluated by dose and time point separately. There is still no standard method to determine the sample size for a C-QTc analysis to exclude a small effect on the QTc interval. Clario has developed a simple method to determine the sample size for different study designs using the C-QTc analysis and has applied it to hundreds of studies.

At a 1-sided 5% significance level, an underlying effect of the study drug (or true mean difference in change-from-baseline QTcF [ΔQTcF] between the study drug and placebo) of 3 msec, a standard deviation (SD) of the ΔQTcF of 8 msec for both the study drug and placebo treatment groups, and the correlation (ρ) between ΔQTcF (study drug) and ΔQTcF (placebo) of zero are assumed. To achieve 90% power to exclude that the study drug causes ≥ 10 msec QTc effect at clinically relevant plasma levels, as shown by the upper bound of the 2-sided 90% confidence interval (CI) of the model-estimated QTc effect (placebo-corrected change-from-baseline QTcF: ∆∆QTcF) at the observed geometric mean Cmax of the study drug, a sample size of at least 24 evaluable subjects with continuous ECG data from all treatment periods is required for a crossover study. Considering a dropout rate of 15%-20%, a total of 28-30 subjects should be enrolled. This power is estimated using a paired-sample t-test at one time point. By using a highly precise ECG method, such as Early Precision QT Analysis (EPQT) at Clario, the SD of ΔQTc will be substantially lower and the sample size can be further reduced with maintained power.

Applying similar assumptions to a parallel study with nested crossover design or to a parallel TQT study, a sample size of 24 evaluable subjects per group will provide 91% power. For a standard single ascending dose (SAD) study, a sample size of 64 evaluable subjects (48 taking the study drug and 16 taking placebo) will provide 91% power. These powers are estimated using a two-sample t-test at one time point.

This calculation is conservative, since it does not take into account any gain in precision due to the use of all data from each subject across all time points in the linear mixed-effects model. 
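
The crossover and parallel sample-size statements above can be checked with base R under the stated assumptions (true effect 3 msec, SD of ΔQTcF 8 msec per treatment, zero correlation between periods, one-sided 5% alpha); this is only a back-of-the-envelope check of the single-time-point approximation described in the abstract.

## Crossover: paired t-test, SD of within-subject difference = sqrt(8^2 + 8^2)
power.t.test(delta = 10 - 3, sd = sqrt(8^2 + 8^2),
             sig.level = 0.05, power = 0.90,
             type = "paired", alternative = "one.sided")    # n is about 23.5, round up to 24

## Parallel groups: two-sample t-test with 24 evaluable subjects per arm
power.t.test(n = 24, delta = 10 - 3, sd = 8, sig.level = 0.05,
             type = "two.sample", alternative = "one.sided") # power is about 0.91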

Presenting Author

Hongqi Xue, Clario

CoAuthor(s)

Georg Ferber
Ellen Freebern, Clario
Borje Darpo, Clario

P24 Seamless Phase II/III Design With Treatment Selection Using Surrogate Endpoint for Early Interim Decision

Seamless phase II/III designs are attractive trial design options to clinical development teams when there is a strong desire to expedite the development process, to lower operational burden, and to reduce overall costs. The most common seamless design involves selecting promising treatment group(s) out of multiple treatment groups at the end of the first stage and confirming the selected treatment group(s) at the end of the second stage, by inferentially combining data of both stages in the final analysis. To further speed up the treatment selection, one can consider using an endpoint strongly correlated with the primary endpoint to make an early selection so as to plan the design and to begin enrollment for the second stage even before completion of the first stage. The correlation size and timing of selection must be carefully examined and calibrated to enable efficient use of the seamless design. In the context of a 2-stage seamless phase II/III design with early selection of one promising treatment group based on a correlated endpoint, we will investigate three potential analytical methods (group sequential, p-value combination, and adaptive Dunnett methods) for final analysis of the primary endpoint with respect to both Type I error control and statistical power via simulations. We will also characterize the impact of different correlation sizes and timing on treatment selection. We will discuss and compare the versatility of these three analytical methods. This discussion may have wide applications across different therapeutic areas.  
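
As a pointer to how the p-value combination method referenced above operates, the sketch below applies the inverse-normal combination of stage-wise one-sided p-values with prespecified weights (equal information weights are assumed here for illustration); in a design with treatment selection, the stage 1 p-value would additionally need a multiplicity adjustment, e.g., via a closed testing or Dunnett-type procedure.

## Inverse-normal combination of stage-wise one-sided p-values
combine_inverse_normal <- function(p1, p2, w1 = sqrt(0.5), w2 = sqrt(0.5)) {
  stopifnot(abs(w1^2 + w2^2 - 1) < 1e-8)        # weights must be prespecified at the design stage
  z <- w1 * qnorm(1 - p1) + w2 * qnorm(1 - p2)
  c(z_combined = z, p_combined = 1 - pnorm(z))
}
combine_inverse_normal(p1 = 0.04, p2 = 0.01)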

Presenting Author

Yu Xia, AbbVie

CoAuthor(s)

Joseph Wu, Pfizer
Cunshan Wang, Pfizer
Margaret Gamalo-Siebers, Pfizer

P25 Smooth Skygrid: Bayesian coalescent-based inference of population dynamics

Coalescent-based inference methods are essential for estimating population genetic parameters directly from gene sequence data under a variety of scenarios. In the last two decades, there have been several nonparametric expansions of the coalescent model for more flexible treatment of demographic changes. The Bayesian Skygrid model is currently the most popular nonparametric coalescent model; it discretizes continuous effective population size changes over an array of predefined time epochs. The effective population size in an epoch is constant and represented by a single parameter. Therefore, the change points of the effective population size parameters introduce discontinuities with respect to time and cause difficulties in the application of dynamic-integration-based samplers such as the Hamiltonian Monte Carlo method. In this poster, we introduce the original Skygrid coalescent prior, demonstrate the aforementioned discontinuities, introduce our preliminary thoughts on solving them with a new smoothed version of the Skygrid coalescent prior, and demonstrate the application on viruses such as West Nile Virus. 

Presenting Author

Yuwei Bao, Tulane University

CoAuthor

Xiang Ji, Tulane University

P26 Systemic and Behavioral Determinants of Bilateral Cataract Formation: Insights from the CATT Study

Cataracts stand as a primary cause of visual impairment and blindness globally, with prevailing research often emphasizing unilateral development while overlooking the bilateral dynamics and the systemic factors that may influence cataract formation across both eyes. This study aims to delve into the bilateral nature of cataract development, placing a spotlight on the roles played by demographic traits, behavioral patterns, and comorbid health conditions. Through the meticulous analysis of the Comparison of Age-Related Macular Degeneration Treatments Trials (CATT) dataset, we aim to elucidate the systemic and lifestyle elements contributing to the occurrence of cataracts in both eyes.

Adopting a random effects model tailored for analyzing clustered bivariate binary outcomes, the study scrutinized CATT data to detect clustering patterns in cataract development and assess the influence of various risk factors while accounting for the inherent correlation between eyes. This methodological approach facilitated a nuanced exploration of the bilateral development of cataracts.

The investigation unveiled a significant clustering of cataract occurrences within individuals, indicating a strong bilateral linkage in their development. Smoking emerged as a critical behavioral risk factor, particularly affecting the right eye, albeit with notable variability that underscores the intricate nature of its influence. Concurrently, other variables such as obesity and skin dryness were identified as contributors to increased cataract risk, displaying variability in their impact across both eyes.

In conclusion, this study highlights the necessity of a bilateral perspective in evaluating and managing cataract risk, underscoring the significance of systemic health and lifestyle choices in the bilateral progression of cataracts. The insights garnered advocate for an integrated approach to cataract prevention and management, calling for targeted public health interventions and lifestyle adjustments to alleviate the global cataract burden. The findings also prompt further investigation with larger, more heterogeneous cohorts to deepen our understanding of these associations and uncover additional systemic factors that influence cataract development. 

Presenting Author

Edmund Ameyaw, Howard University

P27 The flaw of averages: Bayes factors as posterior means of the likelihood ratio

As an alternative to the Frequentist p-value, the Bayes factor (or ratio of marginal likelihoods) has been regarded as one of the primary tools for Bayesian hypothesis testing. In recent years, several researchers have begun to re-analyze results from prominent medical journals, as well as from trials for FDA-approved drugs, to show that Bayes factors often give divergent conclusions from those of p-values. In this poster, we investigate the claim that Bayes factors are straightforward to interpret as directly quantifying the relative strength of evidence. In particular, we show that for nested hypotheses with consistent priors, the Bayes factor for the null over the alternative hypothesis is the posterior mean of the likelihood ratio. By re-analyzing 39 results previously published in the New England Journal of Medicine, we demonstrate how the posterior distribution of the likelihood ratio can be computed and visualized, providing useful information beyond the posterior mean alone. 
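
For the point-null special case (a hedged illustration of the identity the poster studies; the poster's nested-hypothesis statement is more general), with $H_0: \theta = \theta_0$ nested in $H_1$ with prior $\pi(\theta)$, the identity follows directly:

$$
\mathrm{E}_{\pi(\theta \mid x)}\!\left[\frac{f(x \mid \theta_0)}{f(x \mid \theta)}\right]
= \int \frac{f(x \mid \theta_0)}{f(x \mid \theta)} \cdot \frac{f(x \mid \theta)\,\pi(\theta)}{m_1(x)}\, d\theta
= \frac{f(x \mid \theta_0)}{m_1(x)} = \mathrm{BF}_{01},
$$

where $m_1(x) = \int f(x \mid \theta)\,\pi(\theta)\,d\theta$ is the marginal likelihood under $H_1$; that is, the Bayes factor for the null is the posterior mean, under the alternative, of the likelihood ratio $f(x \mid \theta_0)/f(x \mid \theta)$.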

Presenting Author

Charles Liu

CoAuthor(s)

Ron Yu, Gilead Sciences
Murray Aitkin, University of Melbourne

P28 Update Efficacy and Futility Bounds at Time of Analysis in Group Sequential Designs

At the time of the analyses of a group sequential design, there are scenarios in which the efficacy/futility bounds need to be updated. This can occur when (1) the observed number of events differs from the planned number, (2) additional analyses are requested or skipped, or (3) the total alpha is changed due to an updated multiplicity graph. In such cases, the timing of the analyses deviates from the originally planned timing. While the literature covers the updating of boundaries under proportional hazards (PH) in the gsDesign package by Keaven Anderson, this poster focuses on updating the timing during analyses under non-proportional hazards (NPH). This topic is of significant interest to pharmaceutical practitioners, who frequently observe delayed treatment effects. 
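
The standard spending-function machinery that any such update builds on can be sketched in a few lines: given the information fractions actually observed at the analyses, the cumulative alpha spent is recomputed and the efficacy bounds are re-solved from the joint normal distribution of the sequential test statistics. The Lan-DeMets O'Brien-Fleming spending function and the illustrative information fractions below are assumptions, and gsDesign provides equivalent built-in support; the NPH-specific handling of timing, which is the subject of this poster, is not shown.

## Recompute one-sided efficacy bounds at observed information fractions
library(mvtnorm)
update_bounds <- function(info_frac, alpha = 0.025) {
  spend <- function(t) 2 * (1 - pnorm(qnorm(1 - alpha / 2) / sqrt(t)))  # LD-OBF spending
  a_cum <- spend(pmin(info_frac, 1))
  k <- length(info_frac)
  b <- numeric(k)
  b[1] <- qnorm(1 - a_cum[1])
  for (j in 2:k) {
    corr <- outer(info_frac[1:j], info_frac[1:j],
                  function(s, t) sqrt(pmin(s, t) / pmax(s, t)))
    cross <- function(bj) {                     # P(no earlier crossing, cross at look j)
      pmvnorm(upper = c(b[1:(j - 1)], Inf), corr = corr)[1] -
        pmvnorm(upper = c(b[1:(j - 1)], bj), corr = corr)[1]
    }
    b[j] <- uniroot(function(x) cross(x) - (a_cum[j] - a_cum[j - 1]), c(0.5, 6))$root
  }
  data.frame(info_frac = info_frac, cum_alpha = a_cum, z_bound = b)
}
update_bounds(c(0.45, 0.72, 1.00))   # e.g., 45% and 72% of events observed at the interims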

Presenting Author

Yujie Zhao, Merck & Co., Inc.

CoAuthor

Keaven Anderson, Merck & Co., Inc.

P29 Variable Selection and Prediction for Longitudinal Data Using Bayesian Transfer Learning

The analysis of contemporary longitudinal health-science data can involve high-dimensional measurements, such as time-course gene expression data or brain images collected during different scanning sessions, on a small number of patients. The estimation of such high-dimensional longitudinal models with limited sample sizes in a target study population poses substantial challenges and can lead to highly variable parameter estimates, poorly calibrated and unstable predictions, and low power in the identification of pivotal covariates that predict the outcomes. In many cases, multiple additional source datasets from related, but not necessarily identical, populations are available, and some of these source datasets can be substantially larger than the target longitudinal dataset. In such instances, it becomes natural to borrow information from source datasets with similar covariate-outcome relations to improve the precision of the parameter estimates in the target data.
Transfer learning seeks to adeptly borrow information from different source data cohorts across different data settings. We propose a novel Bayesian transfer learning model for longitudinal data using Bayesian mixed effects models. To enhance the accuracy of the parameter estimates and enable data-adaptive information borrowing, we leverage Bayesian mixture models for the discrepancies between fixed effects regression coefficients in source and target studies and for the means of the observation-specific time-trajectories and covariate effects. We define our Bayesian mixture models with the aim of minimizing the transfer of information from source studies that would introduce large bias (i.e., those with large discrepancies). Extensive simulation studies and real-data applications show that, compared with several alternative data analysis approaches, our Bayesian transfer learning model substantially improves the precision of the parameter estimates in the target study and reduces the risk of bias when one or several source datasets are generated under population-specific longitudinal models that differ substantially from the target mixed-effects longitudinal model. 

Presenting Author

Jialing Liu, University of Minnesota

CoAuthor

Steffen Ventz, University of Minnesota