Tuesday, Aug 5: 8:30 AM - 10:20 AM
0583
Topic-Contributed Paper Session
Music City Center
Room: CC-101B
This session represents a collaboration among members of five Centers for AIDS Research (CFARs) across the United States and focuses on modern statistical methods in HIV research. Talks will present innovative statistical methods, including machine learning, causal inference, and Bayesian approaches, motivated by and applied to real-world challenges in HIV research.
HIV
Applied
Yes
Main Sponsor
Biometrics Section
Co-Sponsors
ENAR
Section on Statistics in Epidemiology
Presentations
HIV drug resistance is most commonly assessed by means of genotyping. However, these sequence-based predictions can be discordant with those of phenotype-based assays, which are often considered the gold-standard measure for resistance. Owing to cost and accessibility constraints, phenotyping is infeasible in many resource-limited settings with the highest prevalence of drug resistance. The publicly available genotype-phenotype data at the Stanford HIV Drug Resistance Database offer a unique opportunity to understand which mutations contribute to this discordance and potentially improve the predictive accuracy of the genotypic algorithm.
We have developed a statistical procedure that uses phenotype information to adjust the weights of the genotype-based prediction algorithm, with the goal of reducing discordance between genotype- and phenotype-based drug resistance predictions. For each medication, we first model the relationship between phenotypic and genotypic scores via semiparametric regression, removing the variance in the phenotypic score attributable to the genotypic score. We then regress the residuals from this model on the mutation-indicator matrix to quantify each mutation's contribution to unexplained phenotypic variance.
Statistical challenges include censoring at the upper bound of the phenotype score and properly accounting for the false discovery rate when identifying mutations that drive unexplained variation. We address these by implementing a Bayesian mixture model in the first step to allow for the probability of right-censoring, followed by a Bayesian regression in the second step that enables model selection via examination of posterior probabilities for each mutation coefficient.
We illustrate using data from the Stanford HIV Drug Resistance Database. Current work involves comparing adjusted scores with unadjusted scores in their ability to predict clinical endpoint data.
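To make the two-step structure concrete, the sketch below pairs a nonparametric (lowess) fit of phenotype on genotype score with a Bayesian ridge regression of the residuals on a simulated mutation-indicator matrix. All data and variable names are hypothetical, and the sketch deliberately omits the Bayesian mixture handling of right-censored phenotypes and the posterior-probability-based mutation selection described in the abstract.

```python
# Minimal sketch of the two-step recalibration idea (hypothetical, simulated data;
# the censored-mixture and posterior-selection details of the actual method are omitted).
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)

n_samples, n_mutations = 500, 30
X_mut = rng.binomial(1, 0.1, size=(n_samples, n_mutations))   # mutation-indicator matrix
geno_score = X_mut @ rng.uniform(0, 15, n_mutations)          # genotypic resistance score
pheno_score = 0.8 * geno_score + rng.normal(0, 5, n_samples)  # phenotypic fold-change score

# Step 1: semiparametric fit of phenotype on genotype score (lowess as a stand-in);
# residuals capture phenotypic variance not explained by the genotypic score.
fitted = lowess(pheno_score, geno_score, frac=0.5, return_sorted=False)
residuals = pheno_score - fitted

# Step 2: Bayesian regression of residuals on the mutation indicators; mutations with
# coefficients clearly away from zero are candidates for reweighting in the genotypic algorithm.
model = BayesianRidge().fit(X_mut, residuals)
for j in np.argsort(-np.abs(model.coef_))[:5]:
    print(f"mutation {j}: residual effect {model.coef_[j]:.2f}")
```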
Keywords
recalibrated genotype measures
Cytometry data, including flow cytometry and mass cytometry, are now standard in numerous immunological studies, such as HIV vaccine trials. These data enable the monitoring of an individual's peripheral immune status over time, providing detailed insights into immune cells and their role in clinical outcomes. However, traditional analyses relying on summary statistics, such as cell subset proportions and mean fluorescence intensity, may overlook critical single-cell information. To address this limitation, we introduce cytoGPNet, a novel approach that harnesses extensive cytometry data to predict individual-level outcomes. cytoGPNet is designed to address four key challenges: (1) accommodating varying numbers of cells per sample; (2) analyzing the longitudinal cytometry data to understand temporal relationships; (3) maintaining robustness under the constraints of limited individual samples in HIV vaccine trials; and (4) ensuring interpretability to facilitate biomarker identification. We apply cytoGPNet to data from four diverse studies, each with unique characteristics. Despite these differences, cytoGPNet consistently outperforms other popular methods in terms of prediction accuracy. Moreover, cytoGPNet provides interpretable results at multiple levels, offering valuable insights.
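As a hedged illustration of challenge (1), the sketch below uses a simple permutation-invariant (Deep Sets-style) encoder that embeds each cell and mean-pools across cells, so samples with different numbers of cells map to a fixed-length representation. This is only one generic way to accommodate variable cell counts and is not a description of the actual cytoGPNet architecture.

```python
# Minimal sketch of handling variable cell counts per sample via permutation-invariant
# pooling (illustrative only; not the cytoGPNet model).
import torch
import torch.nn as nn

class CellSetEncoder(nn.Module):
    def __init__(self, n_markers: int, hidden: int = 32):
        super().__init__()
        self.cell_net = nn.Sequential(nn.Linear(n_markers, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden))
        self.head = nn.Linear(hidden, 1)  # individual-level outcome score

    def forward(self, cells: torch.Tensor) -> torch.Tensor:
        # cells: (n_cells, n_markers); n_cells may differ across samples
        pooled = self.cell_net(cells).mean(dim=0)  # permutation-invariant pooling
        return self.head(pooled)

# Samples with different numbers of cells yield the same fixed-size output.
enc = CellSetEncoder(n_markers=10)
print(enc(torch.randn(5000, 10)).shape, enc(torch.randn(1200, 10)).shape)
```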
Keywords
Outcome Prediction Accuracy
Co-Author
Lynn Lin, Duke University
Speaker
Jingxuan Zhang, Duke University School of Medicine, Dept. of Biostatistics & Bioinformatics
Increasingly, randomized clinical trials of new agents for HIV pre-exposure prophylaxis (PrEP) compare against standard-of-care regimens without a placebo group. In the absence of a placebo group, the efficacy of these new PrEP regimens cannot be estimated directly. To remedy this, we propose a Bayesian modeling approach to estimate the counterfactual background HIV incidence (bHIV) in the context of randomized, double-blind, double-dummy, noninferiority trials comparing novel interventions with the standard-of-care daily oral regimen of TDF/FTC for prevention of HIV infection in at-risk populations, where tenofovir diphosphate levels in dried blood spots (DBS) are assessed using case-cohort sampling. We construct a Poisson-based likelihood, incorporating DBS drug level data from case-cohort samples and imputing expected time in adherence categories for participants not selected in the case-cohort based on individual-level characteristics. Our model uses priors based on the adherence-efficacy relationship for TDF/FTC to back-calculate the counterfactual bHIV and uses Markov chain Monte Carlo to estimate the comparative efficacy of the PrEP agents against bHIV.
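The sketch below illustrates the back-calculation idea in PyMC: person-time is split across adherence categories, illustrative priors stand in for the TDF/FTC adherence-efficacy relationship, and the counterfactual background incidence is recovered from a Poisson likelihood on observed infections. All numbers, priors, and category definitions are placeholders rather than the trial's actual inputs, and the case-cohort imputation step is not reproduced.

```python
# Minimal sketch of back-calculating a counterfactual background incidence (bHIV)
# from adherence-stratified person-time; all values are illustrative placeholders.
import numpy as np
import pymc as pm

# Hypothetical person-years in low / medium / high adherence categories (from DBS),
# and the total observed HIV infections on TDF/FTC.
person_years = np.array([200.0, 500.0, 1300.0])
infections_observed = 6

with pm.Model() as model:
    # Prior efficacy of TDF/FTC within each adherence category (illustrative Betas
    # standing in for the published adherence-efficacy relationship).
    efficacy = pm.Beta("efficacy", alpha=[2, 8, 30], beta=[8, 4, 2], shape=3)
    # Counterfactual background incidence per 100 person-years.
    bhiv = pm.Gamma("bhiv_per_100py", alpha=2.0, beta=0.5)
    # Expected infections = sum over categories of background rate x (1 - efficacy) x person-time.
    expected = pm.math.sum(bhiv / 100.0 * (1.0 - efficacy) * person_years)
    pm.Poisson("infections", mu=expected, observed=infections_observed)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)

print(idata.posterior["bhiv_per_100py"].mean())
```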
Many interventions are both beneficial to start and harmful to stop. For example, data from a recent trial ("Adaptive Strategies for Preventing and Treating Lapses of Retention in HIV Care" [ADAPT-R]; NCT02338739) showed that conditional cash transfers (CCTs) for HIV care adherence were, on average, beneficial to initiate but harmful to discontinue. Traditionally, the decision of whether to deploy such an intervention in a time-limited way depends on whether, on average, the benefits of starting it outweigh the harms of stopping it. We propose a novel causal estimand that provides a more nuanced understanding of the effects of such treatments, in particular how response to an earlier treatment (e.g., treatment initiation) modifies the effect of a later treatment (e.g., treatment discontinuation), thus learning whether there are effects among the (un)affected. Specifically, we consider a marginal structural working model summarizing how the average effect of a later treatment varies as a function of the (estimated) conditional average effect of an earlier treatment. We allow for estimation of this conditional average treatment effect using machine learning, such that the causal estimand is a data-adaptive parameter. We show how a sequentially randomized design can be used to identify this causal estimand, and we describe a targeted maximum likelihood estimator for the resulting statistical estimand, with influence curve-based inference. Throughout, we use ADAPT-R as an illustrative example, showing that discontinuation of CCTs was most harmful among those who benefited most from initiating them.
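A simplified sketch of the estimand's structure follows: the conditional average effect of the earlier treatment is estimated with machine learning, and a working model then summarizes how the effect of the later treatment varies with that estimate. The sketch uses simulated data, generic learners, and a plug-in least-squares fit under sequential randomization; it is not the targeted maximum likelihood estimator with influence curve-based inference described in the abstract.

```python
# Minimal sketch: ML-estimated CATE of an earlier treatment (A1) as an effect modifier
# for a later treatment (A2), on simulated data (not the paper's TMLE procedure).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 4000
W = rng.normal(size=(n, 3))                       # baseline covariates
A1 = rng.binomial(1, 0.5, n)                      # first-stage randomization (e.g., start CCT)
A2 = rng.binomial(1, 0.5, n)                      # second-stage randomization (e.g., stop CCT)
cate1 = 0.5 + 0.5 * W[:, 0]                       # true benefit of starting
Y = 0.2 + cate1 * A1 - (0.3 + 0.4 * W[:, 0]) * A2 * A1 + rng.normal(0, 1, n)

# Step 1: T-learner for the conditional average effect of A1 given W.
m1 = GradientBoostingRegressor().fit(W[A1 == 1], Y[A1 == 1])
m0 = GradientBoostingRegressor().fit(W[A1 == 0], Y[A1 == 0])
cate_hat = m1.predict(W) - m0.predict(W)

# Step 2: working marginal structural model among those who started (A1 == 1):
# regress Y on A2, the estimated CATE, and their interaction; the interaction summarizes
# how the effect of stopping varies with the estimated benefit of starting.
idx = A1 == 1
X = np.column_stack([np.ones(idx.sum()), A2[idx], cate_hat[idx], A2[idx] * cate_hat[idx]])
beta, *_ = np.linalg.lstsq(X, Y[idx], rcond=None)
print(f"effect of stopping when estimated CATE of starting = 0: {beta[1]:.2f}")
print(f"change in that effect per unit of estimated CATE:      {beta[3]:.2f}")
```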
Keywords
sequential multiple assignment randomized trial
conditional average treatment effect
data-adaptive parameter
marginal structural model
targeted maximum likelihood estimation
Speaker
Lina Montoya, University of North Carolina at Chapel Hill