Tuesday, Aug 5: 2:00 PM - 3:50 PM
4129
Contributed Papers
Music City Center
Room: CC-212
Causal inference and machine learning in mental health
Main Sponsor
Mental Health Statistics Section
Presentations
Disparities in health or well-being experienced by racial and sexual minority groups can be difficult to study using the traditional exposure-outcome paradigm in causal inference, since potential outcomes with respect to variables such as race or sexual minority status are challenging to interpret. Decomposition analysis addresses this gap by considering causal impacts on a disparity via interventions on other, intervenable exposures that may play a mediating role in the disparity. However, decomposition analyses are conducted in observational settings and require untestable assumptions that rule out unmeasured confounders. Using the marginal sensitivity model, we develop a sensitivity analysis for unobserved confounders in studies of disparities. Under mild conditions, we use the percentile bootstrap to construct valid confidence intervals for disparities and for causal effects on disparities at given levels of confounding. We also explore amplifications that give insight into multiple confounding mechanisms. We illustrate our framework on a study examining disparities in youth suicide rates among sexual minorities using the Adolescent Brain Cognitive Development Study.
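As a rough illustration of the percentile bootstrap mentioned in this abstract, the sketch below builds an interval for a simple disparity, taken here to be a difference in group means. It is not the authors' procedure: it omits the weighting and the marginal sensitivity model entirely, and the function name `percentile_bootstrap_ci` and the outcome values are hypothetical.

```python
import random
from statistics import mean

def percentile_bootstrap_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the disparity mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        stats.append(mean(resample_a) - mean(resample_b))
    stats.sort()
    # Read the interval endpoints off the sorted bootstrap distribution.
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical outcome values for two groups (assumed for illustration, not study data).
outcomes_a = [10, 11, 12, 9, 10]
outcomes_b = [0, 1, 2, 1, 0]
ci_low, ci_high = percentile_bootstrap_ci(outcomes_a, outcomes_b)
```

In a sensitivity analysis of the kind described, the resampled statistic would presumably be a bound on the disparity at a fixed level of confounding rather than a plain difference in means.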
Keywords
causal inference
sensitivity analysis
weighting
health disparities
observational studies
causal decomposition analysis
Identifying patient subgroups with distinct clinical profiles can help personalize treatment, address unmet needs, and improve outcomes. To uncover these latent subgroups, we developed a 4-step machine-learning (ML) analytical framework and applied it to real-world claims data. The steps are: (1) automated feature extraction; (2) K-prototype clustering for subgroup identification; (3) XGBoost for risk factor selection; and (4) advanced visualizations for clinical interpretability. We identified 3 subtypes of patients with schizophrenia initiating oral olanzapine, each with distinct characteristics, adherence patterns, and treatment outcomes. A high-risk subgroup with poor adherence had severe psychiatric comorbidities, a heavier healthcare resource burden, and more substance use, yet showed the strongest treatment effectiveness, suggesting that a treatment option facilitating better adherence could improve outcomes. In contrast, the older multimorbid patient subgroup with better adherence had limited effectiveness. This study highlights the power of an ML-driven analytical framework in uncovering patient heterogeneity using real-world data, providing guidance for optimizing schizophrenia treatment in clinical practice.
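Step (2) of the framework can be sketched as follows. K-prototype clustering handles mixed data by combining squared Euclidean distance on numeric features with a weighted count of categorical mismatches. This is a toy, pure-Python version under stated assumptions: the feature values, the weight `gamma`, and the starting prototypes are hypothetical, and the abstract does not say which implementation or settings the authors used.

```python
from collections import Counter

def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on numeric parts plus gamma times categorical mismatches."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric + gamma * mismatches

def k_prototypes(numeric, categorical, init_indices, gamma=0.5, max_iter=20):
    """Toy k-prototypes: alternate assignment and prototype updates until labels stop changing."""
    protos = [(list(numeric[i]), list(categorical[i])) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest prototype.
        new_labels = [
            min(range(len(protos)),
                key=lambda c: mixed_dissimilarity(num, cat, protos[c][0], protos[c][1], gamma))
            for num, cat in zip(numeric, categorical)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: numeric means and categorical modes per cluster.
        for c in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the old prototype if a cluster empties out
            means = [sum(numeric[i][j] for i in members) / len(members)
                     for j in range(len(numeric[0]))]
            modes = [Counter(categorical[i][j] for i in members).most_common(1)[0][0]
                     for j in range(len(categorical[0]))]
            protos[c] = (means, modes)
    return labels, protos

# Hypothetical scaled numeric features and one categorical flag per record (not study data).
numeric = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.9, 0.8], [0.8, 0.9], [0.85, 0.95]]
categorical = [["yes"], ["yes"], ["no"], ["no"], ["no"], ["no"]]
labels, protos = k_prototypes(numeric, categorical, init_indices=[0, 3])
```

The choice of `gamma` controls how much a categorical mismatch counts relative to numeric distance, so numeric features are usually scaled before clustering.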
Keywords
machine learning
unsupervised clustering
feature engineering
real-world data
schizophrenia
personalized treatment
Among all racial groups in the US, the suicide rate among Black youth has increased the fastest in the past two decades, rising from 3.05 per 100,000 in 2001 to 5.99 per 100,000 in 2020. This alarming trend underscores the urgent need to study and prevent Black youth suicide as a top public health priority. Durkheim's Social Integration Theory posits that individuals are vulnerable to suicide when social integration is either extremely low or excessively high. The theory has been evaluated across various populations using separate measures of marital stability, residential stability, and religiosity in analytical models. However, to our knowledge, it has not yet been examined in the Black youth population. To address this gap, we propose a data-driven approach to develop a composite measure that captures a neighborhood's level of social integration. We then apply this measure to evaluate Durkheim's theory among Black children and youth (ages 10-17.9) with a mental health-related diagnosis between 10/1/2016 and 9/30/2022 in the INSIGHT Clinical Research Network (n=116,757), controlling for suicide attempt risk and protective factors identified through machine learning models.
Keywords
social integration
Black youth
suicide attempts
electronic health records
machine learning
neighborhood effects
Generalized latent factor analysis not only provides a useful latent embedding approach in statistics and machine learning, but also serves as a widely used tool across various scientific fields, such as psychometrics, econometrics, and social sciences. Ensuring the identifiability of latent factors and the loading matrix is essential for the model's estimability and interpretability, and various identifiability conditions have been employed by practitioners. However, fundamental statistical inference issues for latent factors and factor loadings under commonly used identifiability conditions remain largely unaddressed, especially for correlated factors and/or non-orthogonal loading matrices. In this work, we focus on maximum likelihood estimation for generalized factor models and establish statistical inference properties under commonly used identifiability conditions. The developed theory is further illustrated through numerical simulations and an application to a personality assessment dataset.
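The need for identifiability conditions can be seen in a small numerical sketch. If the model depends on the factors and loadings only through the product of the factor matrix and the transposed loading matrix, then any invertible transformation of the factors can be offset by a matching transformation of the loadings. The matrices below are hypothetical, and this product form is an assumption about the model rather than something stated in the abstract.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

# Hypothetical factors (3 subjects x 2 factors) and loadings (2 items x 2 factors).
factors = [[1, 0], [0, 1], [1, 1]]
loadings = [[1, 2], [3, 4]]

# An invertible 2 x 2 transformation and its inverse (determinant is 1).
g = [[2, 1], [1, 1]]
g_inv = [[1, -1], [-1, 2]]

original = matmul(factors, transpose(loadings))

# Transform the factors by g and the loadings by the transposed inverse of g.
alt_factors = matmul(factors, g)
alt_loadings = matmul(loadings, transpose(g_inv))
transformed = matmul(alt_factors, transpose(alt_loadings))
```

Here `transformed` equals `original` even though `alt_factors` and `alt_loadings` differ from the starting matrices, so the data alone cannot distinguish the two parameterizations without added constraints.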
Keywords
Maximum likelihood estimation
Generalized factor model
Limiting distributions
Personalized medicine faces substantial challenges in mental health due to the subjective and diverse nature of disease symptoms measured through multi-domain outcomes. Relying on a single summary measure for decision-making risks improving one symptom domain at the expense of another, underscoring the need for reliable effect estimation across multiple outcomes and various factors simultaneously. We propose a novel framework for learning individualized treatment effects with item response outcomes. This approach employs factor analysis to extract key disease factors from observed outcomes, leveraging them to construct a distributionally robust learning procedure. By jointly evaluating multi-domain treatment effects, the framework guarantees robust performance across a wide range of clinically relevant outcomes. Our method offers a computationally efficient algorithm with theoretical justification for simultaneously estimating factor loadings and treatment effects. Demonstrated in a randomized clinical trial for Major Depressive Disorder, it exhibits superior generalizability to external outcomes, underscoring its potential for advancing precision psychiatry.
Keywords
Adversarial learning
Distributional robust
Item response data
Latent factor model
Mental disorders
Precision medicine
Hepatitis C and HIV have drivers that interact to exacerbate each outcome. We used structural equation modeling (SEM) to characterize the hepatitis C and HIV syndemic among Medicaid beneficiaries.
We used CMS data to identify beneficiaries with chronic hepatitis C, defined as having an HCV RNA test code index date from 2016 to 2020 followed by an ICD-10 code for chronic hepatitis C ≥1 day after the index date. We included persons aged 18-64 who were enrolled in Medicaid for ≥12 months before and after the index date and were not dually enrolled in Medicare. SEM quantified relationships of factors before the index date with HIV diagnosis afterward. Each factor was a continuous construct representing the number of overdoses, substance use disorders (SUDs), or mental health disorders (MHDs). The model allowed for correlation between constructs to estimate odds ratios (ORs), controlling for age, sex, and state.
A total of 467,340 beneficiaries with chronic hepatitis C were included. Each construct was significantly associated with HIV: MHDs (OR=1.11), overdoses (OR=1.14), and SUDs (OR=1.29). Future modeling will include beneficiaries without hepatitis C and social latent factors to better characterize the syndemic.
Keywords
structural equation modeling
factor model
hepatitis C
syndemic
Medicaid
Co-Author(s)
Michelle Van Handel, Office of the Director, National Center for HIV, Viral Hepatitis, STD, and Tuberculosis Prevention
Hasan Symum
William Thompson, Centers for Disease Control and Prevention
Taiwo Abimbola, Office of the Director, National Center for HIV, Viral Hepatitis, STD, and Tuberculosis Prevention
First Author
Angela Estadt, CDC
Presenting Author
Angela Estadt, CDC