Monday, Aug 4: 8:30 AM - 10:20 AM
0586
Topic-Contributed Paper Session
Music City Center
Room: CC-101C
Applied: Yes
Main Sponsor
International Chinese Statistical Association
Co-Sponsors
International Statistical Institute
Section on Statistical Learning and Data Science
Presentations
Cancer patients often suffer from other disease conditions as well. For more effective management and treatment, it is crucial to understand the "big picture". Human disease network (HDN) analysis provides an effective way of describing the interrelationships among diseases. The goal of this study is to mine the SEER-Medicare data and construct HDNs for major cancer types among the elderly. For network construction, we adopt penalized deep neural networks (pDNNs). DNNs can be more flexible than regression-based and other analyses, and penalization can effectively distinguish important disease interconnections from noise. As a "byproduct", we establish the asymptotic properties of pDNNs. The constructed cancer HDNs (cHDNs) are carefully analyzed in terms of node, module, and network properties.
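To make the construction concrete, below is a minimal sketch of one way an L1 penalty can select disease interconnections in a neural network; it is illustrative only and is not the authors' pDNN. The data, layer sizes, and penalty weight lam are all hypothetical.

```python
# Sketch: neighborhood selection for a disease network via an L1-penalized MLP.
# For each disease j, predict its indicator from the remaining diseases;
# first-layer weights shrunk to ~0 suggest noise, large ones candidate edges.
import torch
import torch.nn as nn

def fit_penalized_dnn(X, j, lam=0.05, epochs=200, lr=1e-2):
    """Fit an L1-penalized MLP predicting disease j from the other diseases."""
    n, p = X.shape
    mask = [k for k in range(p) if k != j]
    X_in, y = X[:, mask], X[:, j]
    model = nn.Sequential(nn.Linear(p - 1, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = bce(model(X_in).squeeze(-1), y)
        # L1 penalty on input-layer weights drives unimportant links toward 0
        loss = loss + lam * model[0].weight.abs().sum()
        loss.backward()
        opt.step()
    # Column-wise weight magnitudes: large values flag candidate disease links
    return model[0].weight.detach().abs().sum(dim=0)

# Toy comorbidity indicators: n patients x p diseases (0/1)
torch.manual_seed(0)
X = (torch.rand(500, 6) > 0.7).float()
scores = fit_penalized_dnn(X, j=0)
print(scores)  # thresholding these gives edges incident to disease 0
```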
Keywords
human disease network
deep learning
SEER-Medicare
cancer
Neyman-Pearson (NP) classifiers, which aim to maximize clinical benefit while adhering to risk constraints, are crucial in many practical fields, including early cancer detection. However, applying these classifiers can be challenging due to discrepancies between the data distributions of the source and target populations, and the potential impact can be disproportionately severe for under-represented groups. We propose a semi-parametric, model-based approach for adapting NP classifier decision rules to different populations while equitably controlling classification errors specific to clinical applications. Our method involves a shift-adjustment strategy that leverages a small unlabeled sample and minimal auxiliary information from the target population alongside the labeled source data. This approach enhances the fairness of the learned decision rules and ensures they are consistently tailored to the target population. We demonstrate the method's performance through theoretical studies and simulations and illustrate the approach with an example from a prostate cancer study.
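As a point of reference, here is a minimal sketch of the classical NP calibration step that such methods build on: fit any scorer on labeled source data, then choose the decision threshold from held-out class-0 scores so that the type-I error is controlled at level alpha. The shift-adjustment strategy described in the abstract is not reproduced here, and all data and sample sizes are illustrative.

```python
# Sketch of Neyman-Pearson threshold calibration on source data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy source data: class 0 ~ N(0, I), class 1 ~ N(1, I)
X = np.vstack([rng.normal(0, 1, (400, 3)), rng.normal(1, 1, (400, 3))])
y = np.r_[np.zeros(400), np.ones(400)]
clf = LogisticRegression().fit(X, y)

# Calibrate on a held-out class-0 sample: an upper quantile of its scores
# serves as the threshold, so P(score > t | class 0) is near alpha (the NP
# umbrella literature gives the exact order statistic for a high-probability
# guarantee).
alpha = 0.05
s0 = clf.predict_proba(rng.normal(0, 1, (300, 3)))[:, 1]
t = np.quantile(s0, 1 - alpha)

def np_predict(Xnew):
    """Classify as 1 only when the score clears the calibrated threshold."""
    return (clf.predict_proba(Xnew)[:, 1] > t).astype(int)

print("empirical type-I error:",
      np_predict(rng.normal(0, 1, (2000, 3))).mean())
```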
Keywords
Algorithm fairness
Data shift
Neyman-Pearson Classifier
Speaker
Yingqi Zhao, Fred Hutchinson Cancer Research Center
In observational studies, propensity score (PS)-based causal inference techniques are commonly used to address selection bias in treatment assignment. Most existing PS research focuses on time-invariant treatments within a cross-sectional design; limited attention has been given to PS processes in a longitudinal context involving survival endpoints, and even less work exists on time-varying treatments. Time-varying propensity score matching methods, as proposed by Lu (2005), have addressed time-dependent treatment receipt but have primarily been limited to continuous outcome measures, with only modest extensions. These methods consider pretreatment characteristics at a specific time point t without fully leveraging the historical hazard information preceding time t. To bridge this gap, we introduce the dynamic propensity trajectory (DPT) framework and DPT-based matching (DPTM) techniques. These approaches achieve covariate balance across the entire study period, encompassing both time-invariant and time-varying covariates leading up to treatment initiation. In the primary analysis after matching, we quantify the causal treatment effects for time-to-event outcomes following treatment initiation. We apply the proposed methods to the Chronic Renal Insufficiency Cohort (CRIC) study to investigate the effects of antihypertensive medications in reducing the risk of cardiovascular disease among patients with chronic kidney disease. We also evaluate these methods in simulation studies, where our approaches outperform existing ones and achieve the smallest bias.
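For intuition, here is a highly simplified sketch of time-varying propensity scoring with risk-set matching; it is not the DPT/DPTM methodology, and the column names, pooled logistic model, and matching rule are illustrative assumptions.

```python
# Sketch: pooled logistic model for treatment initiation on person-period
# data, then nearest-PS matching within each period's risk set.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 200
# One row per subject per interval; covariates and names are illustrative
df = pd.DataFrame({
    "id": np.repeat(np.arange(n), 4),
    "time": np.tile(np.arange(4), n),
    "bp": rng.normal(130, 15, 4 * n),          # time-varying covariate
    "age": np.repeat(rng.integers(60, 85, n), 4),
})
logit = -8 + 0.04 * df["bp"] + 0.02 * df["age"]
df["start_treat"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

ps_model = LogisticRegression().fit(df[["bp", "age"]], df["start_treat"])
df["ps"] = ps_model.predict_proba(df[["bp", "age"]])[:, 1]

# Risk-set matching: at each time, pair each initiator with the not-yet-
# treated subject whose PS is closest (with replacement, for simplicity)
matches, already = [], set()
for t, grp in df.groupby("time"):
    grp = grp[~grp["id"].isin(already)]        # risk set: not yet treated
    treated = grp[grp["start_treat"] == 1]
    controls = grp[grp["start_treat"] == 0]
    for _, row in treated.iterrows():
        if controls.empty:
            break
        j = (controls["ps"] - row["ps"]).abs().idxmin()
        matches.append((row["id"], controls.loc[j, "id"], int(t)))
    already.update(treated["id"])
print(f"{len(matches)} matched pairs formed")
```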
Keywords
Causal Treatment Effect
Cox Proportional Hazards Model
Observational Study
Propensity Score
Time-dependent Confounders
Speaker
Ming Wang, Case Western Reserve University
In addition to the primary outcome, secondary outcomes are gaining prominence in contemporary biomedical research. They can be easily derived from traditional endpoints in clinical trials (source 1) and from compound or risk prediction scores in large-scale cohort studies or real-world data analyses (source 2). Despite being termed 'secondary,' these outcomes have significant potential to enhance estimation and inference in primary outcome analysis. This is particularly true when the primary outcome is a summary score derived from the secondary outcomes, since such a score may lack the detailed information specific to each secondary outcome. This talk will summarize the challenges of integrating information from secondary outcomes into primary data analysis and will describe recently developed tools to address these challenges. We will begin with an early version that considers only one secondary outcome (Tool1.0) and then move on to an updated version that can handle multiple secondary outcomes (Tool2.0). Building on the first two versions, we will describe the latest version (Tool3.0), which facilitates more robust information integration in a data-driven manner and has great potential applications in the era of big data. Real data examples will be provided, and future directions toward Tool4.0 will be discussed at the end of the talk.
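As a toy illustration of the underlying idea (a sketch, not the Tool1.0-Tool3.0 methodology): when the primary outcome is a summary score of secondary outcomes with unequal noise, combining per-outcome effect estimates with inverse-variance weights can be more precise than analyzing the summary score alone. All numbers below are simulated.

```python
# Sketch: precision gain from integrating secondary outcomes when the
# primary outcome is their summary score and the treatment effect is common.
import numpy as np

rng = np.random.default_rng(2)
n, effect, reps = 300, 0.5, 2000
sds = np.array([0.5, 1.0, 3.0])  # three secondary outcomes, unequal noise
naive, weighted = [], []
for _ in range(reps):
    x = rng.binomial(1, 0.5, n)                        # treatment indicator
    Y = effect * x[:, None] + rng.normal(0, sds, (n, 3))
    # Naive: difference in means of the summary score (primary outcome)
    summary = Y.mean(axis=1)
    naive.append(summary[x == 1].mean() - summary[x == 0].mean())
    # Integrated: per-outcome effects combined with inverse-variance weights
    d = Y[x == 1].mean(axis=0) - Y[x == 0].mean(axis=0)
    w = 1 / sds**2
    weighted.append((w * d).sum() / w.sum())
print("summary-score SD of estimate:", round(np.std(naive), 4))
print("weighted-combination SD:     ", round(np.std(weighted), 4))
```

Both estimators are unbiased for the common effect here; the weighted combination down-weights the noisiest secondary outcome and so has the smaller sampling variability.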
Keywords
Data integration
Statistical learning
Secondary outcomes