Latest Research Topics in High Dimensional Regression or Survival Analysis

Inkoo Lee Chair
University of Georgia
 
Thursday, Aug 7: 8:30 AM - 10:20 AM
4210 
Contributed Papers 
Music City Center 
Room: CC-202B 

Main Sponsor

Biometrics Section

Presentations

A new class of distributions on the positive real line transformable to a Normal distribution

Let X be a random variable with support on the positive real line. The log-normal distribution is an example of a distribution that a transformation carries to a Normal distribution: if X is log-normal, then ln(X) is normally distributed. We want to develop a class of three-parameter distributions on the positive real line that can be transformed into a Normal distribution. The transformation we consider is the Box-Cox transformation. It has been shown that no Box-Cox transformation of X can be normally distributed. By modifying the Box-Cox transformation slightly, we show that our new class of distributions is transformable into a Normal distribution. In addition, we examine several properties of the new class of distributions algebraically and graphically.
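One elementary way to see why a power transform with a nonzero exponent cannot give an exactly Normal variable is that its range is bounded on one side. The sketch below assumes the standard one-parameter Box-Cox form (x^lam - 1)/lam, which the abstract does not write out; the function name box_cox is illustrative, and the authors' modified transformation is not reproduced here.

```python
import numpy as np

def box_cox(x, lam):
    """Standard Box-Cox transform of strictly positive x (assumed form).

    lam == 0 is the limiting case and returns the natural log.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(x)
    return (x**lam - 1.0) / lam

# Values spanning many orders of magnitude on the positive real line.
x = np.array([1e-12, 0.5, 1.0, 2.0, 1e6])

# lam > 0: the transformed values never fall below -1/lam.
y_pos = box_cox(x, 0.5)
print("lam = 0.5, lower bound -2:", y_pos.min())

# lam < 0: the transformed values never rise above -1/lam.
y_neg = box_cox(x, -0.5)
print("lam = -0.5, upper bound 2:", y_neg.max())

# lam = 0: the log transform is unbounded in both directions.
y_log = box_cox(x, 0.0)
print("lam = 0, range:", y_log.min(), y_log.max())
```

A Normal variable takes values on the whole real line, so a transform whose image is a half-line can at best be approximately Normal. Only the log case (lam = 0) maps the positive real line onto the whole real line.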

Keywords

Box-Cox transformations

Log normal distribution

Survival Analysis 

Co-Author(s)

Marepalli Rao, University of Cincinnati
Zhaochong Yu

First Author

Nisha Sheshashayee

Presenting Author

Nisha Sheshashayee

Dynamic Risk-Adjusted Survival Time Monitoring for Medical Performance Surveillance

Effective monitoring of medical performance is crucial for improving healthcare quality. By identifying deteriorating performance early, prospective monitoring systems enable prompt investigations and timely corrective actions, ultimately reducing complications and mortality rates. Given this importance, post-treatment outcomes, such as survival times, are typically collected over time, leading to continuous data streams. Many existing methods for monitoring survival times focus on detecting proportional increases in hazard rates, which limits their ability to identify a broader range of performance changes, including non-proportional increases and changes in the relationships between survival times and risk factors. To address this gap, we develop a dynamic risk-adjusted survival time monitoring method for medical performance surveillance. Its key feature is the use of a newly proposed dynamic Cox model, which allows both the baseline hazard and the regression coefficients to vary over time, providing an accurate representation of the temporal dynamics in medical processes. Both theoretical and numerical studies demonstrate the effectiveness of our method in practice. 
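For orientation, the standard Cox model and one generic way to write a version whose baseline hazard and coefficients change over time are sketched below. The index s (for example, the calendar time at which a patient is treated) and the notation are assumptions made for illustration; the abstract does not give the authors' exact specification.

```latex
% Standard Cox proportional hazards model for survival time t and risk factors x
h(t \mid x) = h_0(t)\,\exp\{x^\top \beta\}

% Generic time-varying version (notation assumed): both pieces depend on s
h(t \mid x, s) = h_0(t; s)\,\exp\{x^\top \beta(s)\}
```

In the first form, a performance change can only scale the hazard proportionally. In the second, a change can alter the shape of the baseline hazard or the effect of individual risk factors.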

Keywords

Medical performance

Survival times

Monitoring

Dynamic Cox model

Risk adjustment

Healthcare quality 

Co-Author

Kai Yang, Medical College of Wisconsin

First Author

Haoran Teng, Medical College of Wisconsin

Presenting Author

Haoran Teng, Medical College of Wisconsin

Likelihood-based Inference under Non-Convex Boundary Constraints

Likelihood-based inference under non-convex constraints on model parameters has become increasingly common in biomedical research. In this paper, we establish large-sample properties of the maximum likelihood estimator when the true parameter value lies at the boundary of a non-convex parameter space. We further derive the asymptotic distribution of the likelihood ratio test statistic under non-convex constraints on model parameters. A general Monte Carlo procedure for generating the limiting distribution is provided. The theoretical results are demonstrated by five examples in Anderson's stereotype logistic regression model, genetic association studies, gene-environment interaction tests, cost-constrained linear regression, and fairness-constrained linear regression. 
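As a toy illustration of simulating a limiting likelihood ratio distribution at a non-convex boundary, the sketch below uses the classical representation of the limit as a difference of squared distances from a standard Normal vector to the null and alternative cones. The setup is assumed for illustration and is not the authors' procedure: two parameters, identity Fisher information, null hypothesis theta = 0, and an alternative cone made of the two non-negative coordinate half-axes, whose union is non-convex.

```python
import numpy as np

def simulate_lrt_limit(n_draws=200_000, seed=0):
    """Draw from the limiting LRT distribution in the toy setup.

    The limit is ||Z||^2 minus the squared distance from Z to the cone.
    Projecting Z onto the half-axis along e_j removes max(Z_j, 0)^2 from
    the squared norm, and the nearest point of the union lies on the
    half-axis with the larger reduction.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_draws, 2))
    z_plus = np.clip(z, 0.0, None)      # positive parts of each coordinate
    return np.max(z_plus, axis=1) ** 2  # reduction from the best half-axis

draws = simulate_lrt_limit()

# Monte Carlo critical value for a 5% level test.
crit = np.quantile(draws, 0.95)
print("simulated 95% critical value:", crit)

# Point mass at zero: both coordinates non-positive.
print("fraction of draws equal to 0:", np.mean(draws == 0.0))
```

In this toy case the limit has the closed form P(T <= t) = Phi(sqrt(t))^2, where Phi is the standard Normal distribution function, which gives a check on the simulation. With a general Fisher information matrix or a more complicated cone, the projection step usually has no closed form and has to be computed numerically for each draw.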

Keywords

Likelihood ratio test

Metric projection

Non-standard condition 

Co-Author(s)

Zhisheng Ye, National University of Singapore
Yong Chen, University of Pennsylvania, Perelman School of Medicine

First Author

Jinyang Wang

Presenting Author

Jinyang Wang

Mediation analysis of the effect of depression on Alzheimer's disease risk in older adults

Depression and Alzheimer's disease (AD) are both highly prevalent among older adults, yet the causal relationship between them remains underexplored. Using datasets from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, we examine whether geriatric depression has a significant causal effect on the risk of AD and investigate the mediating role of key biological and clinical mediators. To estimate these causal effects consistently, we control for ultra-high-dimensional potential confounders, including DNA methylation levels, applying a ball correlation-based method for confounder selection within the mediation analysis. To ensure robustness against model misspecification, we adopt a robust mediation analysis framework. Our findings indicate a significant positive causal effect of geriatric depression on AD risk. Based on these insights, new prevention and treatment strategies for geriatric depression and Alzheimer's disease can be proposed by targeting the identified confounders and mediators.

Keywords

mediation analysis

geriatric depression

Alzheimer's disease

causal inference

DNA methylation 

Co-Author(s)

Yubai Yuan, Pennsylvania State University
Fei Xue, Purdue University
Kecheng Wei, Fudan University
Jin Zhou, UCLA
Annie Qu, University of California At Irvine

First Author

Yuexia Zhang, The University of Texas at San Antonio

Presenting Author

Yuexia Zhang, The University of Texas at San Antonio

Statistical Modeling Challenges in Large-scale Population Database: United States Renal Data System

The United States Renal Data System (USRDS), funded by the National Institute of Diabetes and Digestive and Kidney Diseases, is a national data system that collects, analyzes, and disseminates information on chronic kidney disease (CKD) and end-stage kidney disease (ESKD) in the United States (usrds.org). It includes data on nearly all patients on dialysis in the US. In this talk, we will discuss several challenges in modeling CKD and ESKD patient outcomes: 1) profiling health-care providers; 2) joint modeling, including multivariate joint modeling of longitudinal, recurrent, and terminal outcomes and spatiotemporal modeling of patient outcomes, including longitudinal hospitalization and mortality. We will present several frequentist and Bayesian approaches to addressing large data size and high-dimensional parameters associated with modeling spatial effects and/or parametrization of time-varying/dynamic effects of risk factors on patient outcomes. The discussion will highlight opportunities and open challenges in modeling patient outcomes using the USRDS database.

Keywords

Joint modeling

High-dimensional parameters

Time-varying coefficients

Large population database

End-stage kidney disease

Chronic kidney disease 

Co-Author

Damla Senturk, University of California-Los Angeles

First Author

Danh Nguyen, University of California-Irvine

Presenting Author

Danh Nguyen, University of California-Irvine

Variable Selection in Functional Linear Cox Model (FLCM) with Multiple Functional and Scalar Covariates

As biomedical studies increasingly gather complex, high-dimensional physiological data, effective variable selection methods are essential to manage this complexity and enhance accuracy in survival models. We propose a flexible penalized variable selection method for a functional Cox model with multiple functional and scalar covariates, utilizing the group minimax concave penalty (MCP), which automatically integrates smoothness into the estimation of functional coefficients. Additionally, we introduce a novel framework for selecting smoothing parameters within the Extended Bayesian Information Criterion (EBIC), distinguished by a new method for calculating degrees of freedom. Through a simulation study, we demonstrate the method's ability to perform accurate variable selection and parameter estimation. The method is applied to National Health and Nutrition Examination Survey (NHANES) data, identifying the key temporally varying distributional features of physical activity and demographic predictors related to all-cause mortality. This analysis sheds light on the intricate relationship between physical activity and all-cause mortality among older US adults.
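For readers unfamiliar with the penalty named above, a minimal sketch of the textbook minimax concave penalty and an unweighted group version follows. It shows only the penalty value, not the authors' estimator: the smoothness component and the EBIC-based tuning described in the abstract are not reproduced, the function names are illustrative, and group-size weights (which many implementations include) are omitted as a simplifying assumption.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty evaluated at |t|.

    Behaves like the lasso penalty lam * |t| near zero, then flattens to
    the constant gamma * lam**2 / 2 once |t| exceeds gamma * lam, so
    large coefficients are not shrunk further.
    """
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= gamma * lam,
                    lam * t - t**2 / (2.0 * gamma),
                    0.5 * gamma * lam**2)

def group_mcp(beta_groups, lam, gamma):
    """Sum of MCP values applied to the Euclidean norm of each group.

    Each group could hold, for example, the basis coefficients of one
    functional covariate, so a whole covariate is kept or dropped together.
    """
    return float(sum(mcp(np.linalg.norm(b), lam, gamma) for b in beta_groups))

# Two groups: one entirely zero (deselected), one with norm 5.
groups = [np.array([0.0, 0.0]), np.array([3.0, 4.0])]
penalty = group_mcp(groups, lam=1.0, gamma=3.0)
print("group MCP value:", penalty)
```

Because the penalty acts on group norms, a group contributes nothing only when all of its coefficients are exactly zero, which is what makes selection operate at the level of whole covariates.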

Keywords

Functional data analysis

Survival analysis

Variable selection

NHANES 

Co-Author(s)

Stella Self, University of South Carolina
Yichao Wu, University of Illinois At Chicago
Jiajia Zhang, University of South Carolina
Rahul Ghosal, Johns Hopkins University

First Author

Yuanzhen Yue, University of South Carolina

Presenting Author

Yuanzhen Yue, University of South Carolina