Monday, Aug 5: 2:00 PM - 3:50 PM
6024
Contributed Posters
Oregon Convention Center
Room: CC-Hall CD
Main Sponsor
Biometrics Section
Presentations
In many clinical contexts, there are several competing definitions for a single clinical diagnosis. This creates difficulties for synthesizing published research results. When using existing methods for meta-analysis, researchers must either ignore the different definitions or split the analyses by definition. We propose a model that not only enables meta-analysis across different diagnostic thresholds but also leverages overlapping regions to indirectly estimate parameters.
We propose a likelihood approach for meta-analysis of aggregate results that parameterizes the observed data as mixtures over regions that partition a latent diagnostic variable. We consider difficulties with parameter identifiability and propose a solution using auxiliary data. We illustrate our approach with a worked example that estimates the risk of cardiovascular disease among individuals with prediabetes under two commonly used definitions. We present extensive simulation studies that assess the bias and coverage of our approach in relatively small samples. This approach is a first step toward a more general model for meta-analysis in the face of varying clinical definitions.
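A minimal sketch of the kind of mixture likelihood described, in notation of our own rather than the authors' (the latent variable, regions, and reported groupings below are illustrative assumptions): if the latent diagnostic variable has density f(x; \theta) and the union of all cut points across definitions partitions its scale into regions R_1, ..., R_K with masses

    \pi_k(\theta) = \int_{R_k} f(x; \theta)\, dx,

then a study s reporting counts n_{sj} for groups formed as unions of regions, indexed by sets S_{sj}, contributes a term

    L_s(\theta) \propto \prod_j \Big( \sum_{k \in S_{sj}} \pi_k(\theta) \Big)^{n_{sj}},

and the regions shared between definitions are what allow parameters of non-overlapping regions to be estimated indirectly.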
Keywords
mixture model
meta-analysis
prediabetes
cardiovascular disease
Abstracts
Adaptive enrichment designs have become increasingly important in modern clinical trials and drug discovery, enhancing resource efficiency and potentially accelerating scientific discovery. Compared with traditional randomized controlled trials, adaptive enrichment designs allow flexible revision of enrollment criteria based on interim data, focusing future enrollment on subpopulations that show positive responses. However, traditional adaptive enrichment designs potentially compromise statistical power and face challenging statistical inference due to the dependency of enrollment on previous selections. In this work, we propose a unified adaptive enrichment framework offering two primary benefits. First, our design allows not only the revision of enrollment criteria but also the adjustment of the treatment allocation strategy. Second, our framework integrates design and inference seamlessly, ensuring valid statistical analysis throughout the process. Through theoretical investigations and simulation studies, we demonstrate the effectiveness of the framework in maintaining type I error rates and enhancing statistical power.
Keywords
Randomized trial
Adaptive design
Subgroup selection
Abstracts
With advances in translational and biomedical sciences and the availability of sophisticated biotechnology, biomarkers continue to be important prognostic factors for disease risk, providing insights into mechanisms of treatment effectiveness and disease progression. Evaluating their effectiveness and predictive utility is often complicated by censoring due to failure of the medical instrument to measure a biological marker that falls below a certain threshold, typically referred to as the limit of detection (LOD). Values that fall below the detection limit are considered noise and unreliable, and are thus missing and left censored. Ignoring left censoring in the analysis causes bias, inefficiency, and inaccurate estimates. Model: We propose a new approach for joint modeling of a covariate whose values are left censored due to falling below the level of detection of the measured biomarker. The left-censoring process is jointly modeled with longitudinal measures that have informative right censoring and a discrete survival process. Interest is in assessing whether the biomarker with below-detection-level values, along with other covariates of interest, is associated with the survival outcome.
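As a standard illustration of how a below-LOD value enters a likelihood (our notation; the authors' joint model with the longitudinal and discrete survival processes is more elaborate): if the biomarker W_i is modeled as N(\mu_i, \sigma^2), then

    L_i = \frac{1}{\sigma}\,\phi\!\Big(\frac{w_i - \mu_i}{\sigma}\Big) \ \text{ if } w_i \text{ is observed}, \qquad
    L_i = \Phi\!\Big(\frac{\mathrm{LOD} - \mu_i}{\sigma}\Big) \ \text{ if } w_i < \mathrm{LOD},

so a left-censored covariate contributes a cumulative probability rather than a density, and in a joint model this factor multiplies the contributions of the other processes.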
Keywords
Biomarkers
Discrete Survival
Left Censoring
Limit of Detection (LOD)
Longitudinal Model
Abstracts
Vaccine efficacy is defined as the relative reduction in disease risk, 1 - (Rv/Rp), where Rv and Rp are the incidence rates of the disease of interest in the vaccine and placebo groups, respectively. A conditional exact method proposed by Chan and Bohidar (1998) is often used to estimate vaccine efficacy and its confidence interval. In this poster, we compare the conditional exact method with two alternative approaches, Poisson regression and the modified Poisson regression proposed by Zou (2004), using trial data with and without follow-up time adjustment. The dengue epidemiologic distribution across Brazil will be provided via a QR code.
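As a sketch of the Poisson-regression route to vaccine efficacy with follow-up time adjustment (the counts, person-years, and column names below are hypothetical, not trial data; Zou's modified Poisson for binary outcomes would add a robust variance):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical aggregate data: one row per trial arm.
    df = pd.DataFrame({
        "vaccine": [1, 0],           # 1 = vaccine arm, 0 = placebo arm
        "cases":   [12, 48],         # hypothetical case counts
        "pyears":  [1010.0, 990.0],  # hypothetical person-years of follow-up
    })

    # Poisson regression with a log person-time offset adjusts for follow-up.
    X = sm.add_constant(df["vaccine"])
    fit = sm.GLM(df["cases"], X, family=sm.families.Poisson(),
                 offset=np.log(df["pyears"])).fit()

    rate_ratio = np.exp(fit.params["vaccine"])  # estimates Rv / Rp
    print(f"Estimated VE = {1.0 - rate_ratio:.3f}")

A confidence interval for VE follows by exponentiating the Wald limits of the coefficient and subtracting from one; the Chan-Bohidar method instead conditions on the total case count across arms.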
Keywords
Vaccine efficacy
conditional exact method
Poisson regression
modified Poisson regression
Abstracts
In summer, cattle body temperature is affected by a combination of weather and environmental variables, such as air temperature, soil surface temperature, temperature-humidity index, relative humidity, wind speed, and incoming and outgoing short- and long-wave radiation. Rising summer temperatures put animals, specifically farm cattle, at risk of prolonged thermal stress. Various methods have been used to model cattle body temperature, including but not limited to multiple regression with correlated errors and transfer function methods. However, these models are not suited to revealing the components with known structures that jointly affect the dynamics of cattle body temperature, such as a local linear trend and seasonality. The objectives of this study are twofold: first, to implement Bayesian Structural Time Series (BSTS) methods as a better alternative for modeling and forecasting the dynamics of core body temperature in heat-stressed animals, and to compare the results with classical time series methods; second, to detect thermal stress in animals by decomposing the observed body temperature into its components.
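A classical maximum-likelihood analogue of the structural decomposition named in the abstract (local linear trend plus seasonality) can be sketched with statsmodels; the Bayesian BSTS fit places priors on the same components. The series below is simulated, not the cattle data:

    import numpy as np
    import statsmodels.api as sm

    # Simulated hourly core body temperature (deg C) over two weeks.
    rng = np.random.default_rng(0)
    t = np.arange(24 * 14)
    y = 38.6 + 0.3 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.1, t.size)

    # Local linear trend + daily seasonal component (period 24 hours).
    mod = sm.tsa.UnobservedComponents(y, level="local linear trend", seasonal=24)
    res = mod.fit(disp=False)

    trend = res.level.smoothed      # smoothed trend component
    season = res.seasonal.smoothed  # smoothed seasonal component
    print(res.summary())

Departures of the observed series from the trend-plus-seasonal reconstruction are one natural place to look for thermal stress episodes.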
Keywords
Bayesian Structural Time Series (BSTS)
Heat Stress
Environmental Variables
Hysteresis
Abstracts
Clinicians are increasingly interested in discovering computational biomarkers from short-term longitudinal 'omics data sets. Existing methods in the high-dimensional setting use penalized regression and do not offer uncertainty quantification. This work focuses on Bayesian high-dimensional regression and variable selection for longitudinal 'omics datasets, which can quantify uncertainty and control false discoveries.
We adopt both empirical Bayes and hierarchical Bayes principles for hyperparameter selection. Our Bayesian methods use a Markov chain Monte Carlo (MCMC) approach and a novel Expectation-Maximization (EM) algorithm for posterior inference. We conduct extensive numerical experiments on simulated data to compare our method against existing frequentist alternatives. We also illustrate our method on a pulmonary tuberculosis (TB) study consisting of observations at four time points for 15 subjects, each with measured sputum mycobacterial load.
Keywords
Disease Progression
EM algorithm
Feature selection
Mixed Models
Mixture model
Uncertainty Quantification
Abstracts
This poster details a retrospective cohort study conducted at a single center, analyzing 33 patients (11 receiving Bethanechol treatment, 22 controls) hospitalized from 2017 to 2022. The study examines the effect of Bethanechol on individuals with Tracheobronchomalacia (TBM) and bronchopulmonary dysplasia. A case-control match (1:2 ratio) factored in gestational age, sex, TBM and bronchopulmonary dysplasia severity, and respiratory support. The primary focus was Bethanechol's impact on the Pulmonary Severity Score. Unique to this study, Bethanechol administration dates varied over a 105-day period for each case. The poster will display various statistical approaches used to navigate the study's complexities, with a particular emphasis on advanced visualization techniques that clarify the treatment's time-dependent effects. A comparison and discussion of different statistical methods will also be featured.
Keywords
case control
mixed model
longitudinal
data visualization
clinical study
time-dependent
Abstracts
When one has a dichotomous outcome, logistic regression continues to be the most widely used model. This model uses the logit link to relate the outcome to a set of explanatory variables and permits the estimation of various functions of log odds, including odds ratios. In the literature, one often sees odds interpreted as though they were probabilities. There are many concerning issues with such interpretations; for example, the odds ratio is further from the null than the comparable risk ratio. It is well known that the logit link is the canonical link, but recent research is enabling the use of non-canonical links such as the log link. With the log link, one obtains the so-called log-binomial model, which permits the direct estimation of log probabilities and risk ratios. We explore here the situations where the estimates of odds ratios and risk ratios from these two models are close and those where the estimates are meaningfully different, providing insight into the choice of link. We extend these results to ordinal outcomes, again comparing the logit link and the log link.
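A quick numeric illustration of the claim that the odds ratio lies further from the null than the risk ratio, and that the two nearly coincide for rare outcomes (the probabilities are invented):

    # Common outcome: hypothetical risks of 0.40 (exposed) vs 0.25 (unexposed).
    p1, p0 = 0.40, 0.25
    rr = p1 / p0                              # risk ratio = 1.60
    orr = (p1 * (1 - p0)) / (p0 * (1 - p1))   # odds ratio = 2.00, further from 1

    # Rare outcome: 0.02 vs 0.01.
    q1, q0 = 0.02, 0.01
    rr_rare = q1 / q0                             # 2.000
    orr_rare = (q1 * (1 - q0)) / (q0 * (1 - q1))  # ~2.021, nearly the same
    print(rr, orr, rr_rare, orr_rare)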
Keywords
logit link
log link
logistic regression
log-binomial model
dichotomous outcomes
ordinal outcomes
Abstracts
In modern biomedical research, there has been a growing demand for accurate predictions at the subject level. In many of these situations, data are collected as longitudinal curves and display distinct individual characteristics, so prediction methods built on functional mixed effects models (FMEM) are useful. We develop a classified functional mixed model prediction (CFMMP) method, which adapts classified mixed model prediction (CMMP) to the framework of FMEM. We explore the performance of CFMMP against functional regression prediction through simulation studies, along with the consistency property of CFMMP estimators. Applications of CFMMP are illustrated using real-world examples, including data from a hormone study of menstrual cycles and from diffusion tensor imaging.
Keywords
Classification
CMMP
functional mixed effects model
mean squared prediction error
Abstracts
The log-rank test is widely used in two-arm clinical trials to compare survival distributions between groups. Designing clinical trials that use the log-rank test requires power calculations under the alternative hypothesis, and various distributional approximations for the log-rank statistic under the alternative have been proposed for power determination. Bernstein and Lagakos (1978) use an exponential MLE test for power calculation. Schoenfeld's (1981) method depends on a normal approximation with variance derived under local alternatives. Luo et al. (2019) and Yung and Liu (2020) derive a different asymptotic variance that also requires local alternatives. In practice, a local alternative assumption may be unreasonable for large effects and modest sample sizes, so practical guidance is needed for selecting a power calculation method. We conduct a comprehensive simulation study comparing power calculation methods for the log-rank test and compare the underlying distributional assumptions about the mean and variance. This work provides guidance for practitioners and highlights the need for deriving the distribution of the log-rank test under general alternatives.
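For orientation, Schoenfeld's local-alternative approximation yields the familiar required-number-of-events formula; a small sketch with illustrative design parameters of our own choosing:

    from math import ceil, log
    from scipy.stats import norm

    alpha, power = 0.05, 0.80
    hr = 0.7  # hypothesized hazard ratio
    p = 0.5   # proportion randomized to the treatment arm

    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    # Events required: d = (z_a + z_b)^2 / (p * (1 - p) * log(hr)^2)
    d = (z_a + z_b) ** 2 / (p * (1 - p) * log(hr) ** 2)
    print(ceil(d))  # about 247 events

Approximations of this type are exactly what the simulation study stress-tests when the effect is large and the sample size modest.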
Keywords
log-rank test
asymptotic theory
clinical trial design
survival analysis
Abstracts
Background: Care coordination (CC) for Veterans with inpatient admissions in the community is an important element of the MISSION Act of 2018, considered vital for reducing poor outcomes due to fragmented care. Few studies have evaluated early impacts of CC. Methods: We compared several ways to extract CC activity from the VA Corporate Data Warehouse and created a matched cohort using propensity scores for the probability of receiving CC. We used generalized linear models with GEE to account for clustered data and estimated the risks of 30-day readmission, 30-day ED visits, and 90-day and 1-year mortality. We also developed facility-level models by aggregating patient-level data, accounting for facility complexity and the percentage of community admissions involving CC. Results: Patient-level models estimated that CC was associated with higher risk for each outcome; for example, those assigned 'complex' coordination had 28% greater risk of 30-day readmission (RR 1.28, 95% CI 1.26-1.30) and 21% greater risk of 90-day mortality (RR 1.21, 95% CI 1.17-1.30). Facility-level models indicated that increased CC activity was not associated with a significant risk change for any outcome.
Keywords
VA care coordination
MISSION Act
Healthcare outcomes research
Effectiveness research
Abstracts
For a confidence interval for a parameter of the binomial distribution, the coverage probability is a variable function of the parameter. The confidence coefficient, defined as the infimum of the coverage probabilities, is an important property of the confidence interval. However, the exact confidence coefficients and average coverage probabilities of intervals involving two independent binomial distributions have not been accurately derived in the literature. In this study, we propose methodologies for calculating the exact confidence coefficients and average coverage probabilities of confidence intervals for a difference of binomial proportions. Using these methodologies, we illustrate the performance of existing intervals and provide recommendations.
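To illustrate the object being computed (a brute-force check, not the authors' methodology): for any interval procedure, the exact coverage at a given (p1, p2) is a finite sum of binomial probabilities, and the confidence coefficient is the infimum of that surface. A sketch for the simple Wald interval:

    import numpy as np
    from scipy.stats import binom, norm

    def wald_coverage(n1, n2, p1, p2, alpha=0.05):
        """Exact coverage probability of the Wald interval for p1 - p2."""
        z = norm.ppf(1 - alpha / 2)
        x1 = np.arange(n1 + 1)[:, None]  # all outcomes for sample 1
        x2 = np.arange(n2 + 1)[None, :]  # all outcomes for sample 2
        ph1, ph2 = x1 / n1, x2 / n2
        se = np.sqrt(ph1 * (1 - ph1) / n1 + ph2 * (1 - ph2) / n2)
        diff = ph1 - ph2
        covered = (diff - z * se <= p1 - p2) & (p1 - p2 <= diff + z * se)
        weight = binom.pmf(x1, n1, p1) * binom.pmf(x2, n2, p2)
        return float((weight * covered).sum())

    # Coverage varies over (p1, p2); the confidence coefficient is the infimum
    # over the whole unit square. Here we only scan the diagonal p1 = p2.
    grid = np.linspace(0.01, 0.99, 99)
    print(min(wald_coverage(20, 20, p, p) for p in grid))  # well below 0.95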
Keywords
Binomial distribution
Confidence coefficient
Confidence interval
Coverage probability
Difference of proportions
Risk difference
Abstracts
Alzheimer's dementia (AD) is of increasing concern as populations attain longer and longer life spans. Prediction of conversion to AD from a cognitively normal state remains difficult and is generally poorly understood. We used state space models, specifically a factor multivariate local linear trend model, to identify latent factors of cognitive function derived from a standard battery of neuropsychological tests. Using National Alzheimer's Coordinating Center data, we performed two separate structured factor analyses: one in individuals who ultimately converted to dementia and one in individuals who did not. There was substantially higher correlation between cognitive domains in those who transitioned to dementia (range: 0.329-0.863) than in those who did not (range: 0.087-0.202). These findings suggest a more uniform underlying cognitive process in dementia converters than in non-converters, since the domains remain relatively distinct in the latter group. Next, we plan to jointly model the longitudinal factor scores with a time-to-dementia outcome in patients who are cognitively normal, aiming to predict risk of dementia conversion at 1, 2, and 3 years post-cognitive testing.
Keywords
Alzheimer's disease
Factor analysis
State-space models
Joint models
Abstracts
Vaping (e-cigarette smoking) has been on the rise among middle and high school students for the last two decades. We want to investigate what leads a pupil to take up vaping. To address this question, we use the published data set NYTS (National Youth Tobacco Survey), spanning several years of surveys. More importantly, we focus on pupils who smoke e-cigarettes exclusively. We use a logistic regression model to answer our questions.
On October 12, 2021, the FDA (Food and Drug Administration) approved Vuse Solo, an e-cigarette device, for sale along with its cartridges and refills. Vuse Solo is manufactured by RJ Reynolds Vaping Co., which the FDA also permitted to advertise its products. The approval was a blow to tobacco-control efforts, but one benefit was cited: vaping might help cigarette smokers wean themselves off smoking. We want to identify pupils who smoke cigarettes and investigate what proportion of these pupils use e-cigarettes. We also examine e-cigarette trends over several years.
Keywords
E-cigarettes
Middle and High Schools
Nicotine
Cigarettes
Logistic Regression
Abstracts
Gaussian graphical models, essential for depicting relationships among variables via conditional independence, face challenges in high-dimensional spaces where sparse associations are common. Traditional methods struggle with stability, leading to the adoption of sparsity-enhancing techniques. Unlike penalization-based frequentist approaches, our proposed Bayesian method focuses on efficiency and scalability by leveraging parallelizable Bayesian neighborhood regressions. Our method introduces a horseshoe shrinkage prior for sparsity and an innovative variable selection process that uses marginal likelihoods to rank predictors. This strategy not only streamlines the estimation of complex relationships but also ensures computational efficiency. By synthesizing regression coefficients into coherent graph and partial correlation matrix estimates, our approach facilitates robust inference. Evaluated through FDR and TPR metrics, it demonstrates superior performance in diverse applications, notably in analyzing gene expression in triple-negative breast cancer, showcasing its applicability and effectiveness in real-world scenarios.
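For reference, the horseshoe prior in a neighborhood regression takes the standard form below (our notation; the authors' exact hierarchy and selection rule may differ). Each node X_j is regressed on the remaining nodes X_{-j}:

    X_j = X_{-j}\,\beta^{(j)} + \varepsilon_j, \qquad
    \beta^{(j)}_k \mid \lambda_k, \tau \sim N(0, \lambda_k^2 \tau^2), \qquad
    \lambda_k \sim C^{+}(0, 1), \quad \tau \sim C^{+}(0, 1),

with nonzero coefficients mapped to edges of the graph; the regressions for different nodes are independent given the data, which is what makes the scheme parallelizable.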
Keywords
Bayesian
Gaussian graphical models
Horseshoe prior
Sparse graph estimation
Abstracts
In oral-health epidemiological studies, protocols using partial-mouth periodontal examination (PMPE), where pocket depth and tooth attachment are not assessed at all potential measurement sites, can reduce research costs and participant response burden. But without considering the PMPE structure, simple data summaries tend to underestimate the extent and severity of periodontal disease. Viewing the PMPE structure as inducing a missing-data problem, we outline methods for estimating periodontal disease prevalence using multiple imputation. Specifically, we apply Centers for Disease Control/American Academy of Periodontology (CDC-AAP) periodontal-disease criteria to data from newly recruited methamphetamine users who received either partial or full-mouth periodontal examinations, making use of a sample with similar background characteristics from the National Health and Nutrition Examination Survey (NHANES), where participants all had full-mouth examinations. Estimates that did not account for PMPE data collection were biased downward, while the proposed strategy succeeded in mitigating bias in prevalence estimates, underscoring the utility of the multiple-imputation framework.
Keywords
Dentistry
Oral health
Missing data
Epidemiology
Periodontitis
Public health
Abstracts
Co-Author(s)
Danielle LaVine, University of California, Los Angeles
Thomas Belin, University of California-Los Angeles
Vivek Shetty, Section of Oral & Maxillofacial Surgery, Department of Biomedical Engineering, University of California, Los Angeles
First Author
Lauren Harrell, Google
Presenting Author
Danielle LaVine, University of California, Los Angeles
Ordinal longitudinal data on patient health status have been widely collected as an outcome in COVID-19 clinical trials. However, published analyses commonly simplify the outcome by neglecting either the ordinal or the longitudinal component. Examples include time-to-event analysis based on reaching a particular ordinal state and analysis of ordinal outcomes at a single timepoint. We instead advocate the ordinal transition model (OTM), an extension of the proportional odds model to longitudinal outcomes via transition modeling, because it leverages the full information in the outcomes. We conducted a comprehensive simulation study to assess the power and statistical efficiency of the OTM compared to simpler methods. Our simulations include scenarios where the assumptions of the OTM are satisfied as well as those where they are violated. For a representative example where the assumptions were satisfied, power increased from 0.43 using the time-to-event model to 0.84 using the OTM. We also present an R package for conducting simulation-based power calculations to enable the design of clinical trials using OTMs.
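A first-order proportional-odds transition model has the general form (our notation, for orientation):

    \mathrm{logit}\, P(Y_{it} \le k \mid Y_{i,t-1}, X_i)
        = \alpha_k + \gamma_{Y_{i,t-1}} + X_i^{\top}\beta, \qquad k = 1, \dots, K - 1,

where the previous state enters as a covariate, so every observed transition contributes to the likelihood rather than a single summary of the trajectory.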
Keywords
Ordinal longitudinal data
Statistical efficiency
Power
Transition models
Ordinal models
Simulation
Abstracts
The composition of the intestinal microbiome has a significant impact on children's health, starting from the prenatal period. Transformations in the microbiome during infancy have been associated with the development of chronic illnesses such as asthma and inflammatory bowel disease. However, scientific investigation of the gut microbiome is complicated by certain aspects of the data, such as compositionality and zero-inflation. Furthermore, causal discovery and causal inference in this space, without resorting to data transformations that alter mathematical properties of the underlying geometry, is a developing area of research. In this work we develop novel, statistically sound methodological tools that recover causal relations and networks among bacteria in the developing gut microbiome without distorting the underlying geometry, and we expand this framework to allow for variation over time.
Keywords
Compositional Data
Longitudinal Data
Zero-Inflation
Aitchison Geometry
Graphical Models
Abstracts
The Long Life Family Study (LLFS) enrolled 5,089 individuals from 593 two-generation families selected from the top 1% of the Family Longevity Selection Score. LLFS families, on average, exhibited superior aging outcomes, though with notable variation among pedigrees. The heritability of key healthy aging indicators, both short- and long-term, underscores a genetic influence on protection against aging. This project introduces a robust methodology to analyze longitudinal changes in cognitive function, accounting for genetic relatedness and other correlated biomarkers. We adopt a nonparametric hierarchical functional model to address the familial structure inherent in LLFS. Departing from conventional approaches that take time from baseline as the longitudinal index, this model uses age as the natural temporal variable, offering advantages in handling limited observations and facilitating the integration of data from diverse studies. This innovative approach enhances the understanding of cognitive aspects of exceptional longevity within the LLFS cohort by pooling shared information across subjects in a family, even when fewer than three observations per subject are available.
Keywords
Hierarchical functional model
Long-life family study
Functional Principal Component analysis
Generalized additive model
Abstracts
Dynamic treatment regimes have become increasingly popular because they allow the personalization of treatment based on patient characteristics and medical history. Selecting the optimal dynamic treatment regime hinges on identifying the best marginal structural model. Assessing the goodness-of-fit of different marginal structural models commonly involves the risk associated with each model. However, calculating the risk of a marginal structural model involves unobserved potential outcomes under a treatment regime, prompting the frequent use of inverse probability weighting. While inverse probability weighting is easy to implement, it is inefficient and yields biased estimates if the propensity model is incorrectly specified. To overcome these limitations, we propose a multiply robust estimator, which is efficient and remains unbiased even with misspecified propensity score models. Despite these advantages, the multiply robust estimator is challenging to compute in practice because of the substantial number of nuisance parameters. To overcome this issue, we propose a procedure to enhance the inverse probability weighted estimator.
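For context, the IPW estimate of the risk of a candidate marginal structural model m under a regime d takes the familiar form below, written for a single decision point for simplicity (our notation):

    \hat{R}_{\mathrm{IPW}}(m) = \frac{1}{n} \sum_{i=1}^{n}
        \frac{\mathbb{1}\{A_i = d(X_i)\}}{\hat{\pi}(A_i \mid X_i)}
        \big(Y_i - m(X_i)\big)^2,

which is unbiased only if the propensity model \hat{\pi} is correctly specified; a multiply robust construction combines several candidate nuisance models so that correct specification of any one of them suffices.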
Keywords
Marginal Structural Model
Model Selection
Inverse Probability Weighted Estimator
Undersmoothing
Causal Inference
Dynamic Treatment Regimes
Abstracts
The concordance statistic is an index for evaluating the discriminant performance of a model, first proposed in logistic regression and now frequently used in survival analysis. For survival data, concordance statistics reduce to Mann-Whitney-type statistics in scenarios without censoring, but censoring is usually an issue. Furthermore, a normal approximation has commonly been used to obtain confidence intervals for the estimator, but it may not work well in small samples. In this study, we propose a new method for constructing confidence intervals by considering these statistics within the framework of the stress-strength model, a measure used in reliability engineering that considers two random variables, "stress" and "strength", and estimates the probability of failure, that is, the probability that stress surpasses strength. Its advantages include the ability to calculate probabilities directly related to the scientific questions of interest, as well as flexible modeling and estimation. A performance evaluation by simulation will be presented in the poster.
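The connection can be stated compactly (our notation): for risk scores \eta and survival times T, the concordance statistic is

    C = P(\eta_i > \eta_j \mid T_i < T_j),

the probability that the subject who fails earlier carries the higher score. This has exactly the stress-strength form R = P(X > Y) for suitably defined "stress" and "strength" variables, so interval methods developed for R in reliability engineering can be carried over.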
Keywords
Survival analysis
concordance statistics
stress strength model
Abstracts
With the widespread application of RNA-seq across different sequencing platforms in biomedical research in recent years, a systematic evaluation of RNA-seq data quality is crucial and timely. The Sequencing Quality Control (SEQC) project is a large-scale community effort to assess the performance of RNA-seq technology across different platforms and multiple laboratories: reference RNA samples with multiple replicates were sequenced at twelve laboratories using three sequencing platforms. Independently of the SEQC project's own analyses, we performed a comprehensive analysis of its RNA-seq data to assess sequencing reproducibility across platforms, sequencing sites, sample replicates, and flow cells. Employing graphical tools and statistical models, our systematic analysis supports the distinctive conclusion that reproducibility across platforms and sequencing sites is not acceptable, while reproducibility across sample replicates and flow cells is acceptable.
Keywords
SEQC
Reproducibility
Abstracts
First Author
Lianbo Yu, The Ohio State University
Presenting Author
Lianbo Yu, The Ohio State University
The increasing demand for drug efficacy research underscores the importance of addressing the hypothesis testing problem for both primary and secondary endpoints during the design phase of clinical trials. This necessitates an adjustment for multiplicity, particularly with hierarchical endpoints, a setting known as multiple endpoint testing. In group sequential designs, however, the statistical procedure for controlling the overall type I error rate across multiple hypotheses becomes complex. While the existing literature on gatekeeping procedures and alpha splitting effectively controls the family-wise error rate (FWER) and ensures sufficient power for testing primary endpoints, it falls short of providing adequate power for testing the secondary endpoint. This article introduces a model-based method for comparing a treatment's primary and secondary endpoints with those of a control. Additionally, a flexible approach for selecting the critical boundary of the secondary endpoint is developed to enhance the power of the corresponding hypothesis test. Simulation results demonstrate that the proposed model-based method provides adequate power with smaller samples while strictly controlling the FWER.
Keywords
Hierarchical Endpoints
Group Sequential Design
FWER Control
Power Analysis
Abstracts
Spatial transcriptomics has attracted significant interest since 2020 due to its ability to pair gene expression information with spatial data. This spatial information reveals hidden tissue structures and biological functions. Numerous studies have focused on detecting spatial domains by effectively combining spatial and gene expression data. However, due to the intricate nature of spatial domains, many existing methods fall short, often limited by their focus on smaller neighboring areas. Here we introduce Spatial T-SNE, which also takes the cell type proportions of the spatial domain into account. Our method uniquely differentiates between spatial domains with varying cell type proportions, employing an iterative updating algorithm. We test the performance of Spatial T-SNE against several popular spatial domain detection methods on three published datasets. The results demonstrate that Spatial T-SNE more accurately reflects annotated spatial patterns, highlighting its effectiveness in spatial transcriptomic analysis.
Keywords
Spatial Transcriptomics
Spatial Domain Detection
T-SNE
Abstracts