Generative Transformer for Longitudinal Biomarker and Diet Quality Data Representation

Hua Fang Co-Author
 
Honggang Wang Co-Author
Yeshiva University
 
Ashikur Nobel First Author
Yeshiva University
 
Ashikur Nobel Presenting Author
Yeshiva University
 
Tuesday, Aug 5: 11:05 AM - 11:20 AM
2052 
Contributed Papers 
Music City Center 
Modeling multidimensional longitudinal data from randomized controlled trials (RCTs) is inherently complex because of temporal dependencies, missing values, and changes in behavioral responses and outcomes over time. Traditional analysis methods often fail to capture the temporal and group-specific patterns present in such datasets. To overcome these limitations, we introduce MITransformer, a generative pretrained transformer framework enhanced with multiple imputation for robust contextual representation learning from longitudinal biomarker data that incorporate diet quality measurements. MITransformer reconstructs input features across time points, capturing temporal patterns and inter-variable relationships while handling missing data through multiple imputation. By applying dynamically scaled positional embeddings within the attention mechanism, the model preserves temporal relationships without distorting continuous data distributions. A gated integration mechanism selectively emphasizes input subsets, allowing the model to weight different input types according to their importance. The contextual embeddings generated by MITransformer improve representation quality across time, supporting better clustering and regression/classification outcomes. Our results demonstrate that these embeddings preserve biological and behavioral variation, enabling the model to distinguish between demographic subgroups such as gender without explicit labels. This approach enhances interpretability and analytical performance, laying the foundation for advanced applications such as digital twins, individualized health monitoring, and diet-related outcome prediction, thereby expanding the capabilities of conventional disease diagnosis and prognosis using biomarker data.
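
The abstract names three mechanisms (multiple imputation, scaled positional embeddings, and gated integration) without giving their formulas. The following is a minimal illustrative sketch of how such components might look, not the authors' implementation. Every function name, the normal-draw imputation rule, the RMS-ratio scaling rule, and the scalar gate weights are assumptions introduced here for illustration only.

```python
import math
import random


def impute_multiple(series, m=5, seed=0):
    """Return m completed copies of a series whose missing values are None.

    Assumption: each gap is filled with a draw from a normal distribution
    fitted to the observed values, a crude stand-in for a real imputation model.
    """
    rng = random.Random(seed)
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    var = sum((v - mean) ** 2 for v in observed) / max(len(observed) - 1, 1)
    sd = math.sqrt(var)
    return [[v if v is not None else rng.gauss(mean, sd) for v in series]
            for _ in range(m)]


def positional_encoding(num_steps, dim):
    """Standard sinusoidal positional encoding, one row per time point."""
    pe = []
    for t in range(num_steps):
        row = []
        for i in range(dim):
            angle = t / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def rms(matrix):
    """Root mean square over all entries of a list-of-lists matrix."""
    vals = [v for row in matrix for v in row]
    return math.sqrt(sum(v * v for v in vals) / len(vals))


def add_scaled_positions(x, ratio=0.1):
    """Add positional encodings rescaled to `ratio` times the RMS of x.

    Assumption: one possible reading of "dynamically scaled"; the positional
    signal stays small relative to the continuous measurements.
    """
    pe = positional_encoding(len(x), len(x[0]))
    scale = ratio * rms(x) / rms(pe)
    return [[xv + scale * pv for xv, pv in zip(x_row, p_row)]
            for x_row, p_row in zip(x, pe)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_fusion(biomarker, diet, w_b=0.0, w_d=0.0, bias=0.0):
    """Blend two aligned inputs elementwise with a sigmoid gate.

    Assumption: scalar gate weights; g near 1 favors the biomarker input,
    g near 0 favors the diet input.
    """
    fused = []
    for b_row, d_row in zip(biomarker, diet):
        row = []
        for b, d in zip(b_row, d_row):
            g = sigmoid(w_b * b + w_d * d + bias)
            row.append(g * b + (1.0 - g) * d)
        fused.append(row)
    return fused
```

In a full model the gate weights would be learned and the fused, position-tagged sequence would feed the attention layers; running the pipeline once per imputed copy and pooling the resulting embeddings is one plausible way to carry imputation uncertainty forward.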

Keywords

Biomarker

Diet Quality Index

Contextual Representation

Longitudinal Data Modeling

Generative Pretrained Transformer

Multiple Imputation 

Main Sponsor

Section on Statistical Learning and Data Science