Tuesday, Aug 5: 2:00 PM - 3:50 PM
0553
Invited Paper Session
Music City Center
Room: CC-208A
Applied
Yes
Main Sponsor
Biometrics Section
Co-Sponsors
ENAR
Section on Statistical Learning and Data Science
Presentations
The 20th century's digital progress has significantly transformed scientific research. Biomedical research in particular has benefited from new measurement technologies that make it possible to observe molecular entities we previously could not. A common feature of these new technologies is that they generate large and complex datasets. For example, instead of focusing on one gene at a time, we can now examine all genes together, shifting the approach from testing specific hypotheses to exploring the data and discovering new insights. However, the complexity of these data demands statistical expertise to distinguish meaningful patterns from chance or subtle systematic error, underscoring the critical role of statistics in biomedical research. Unfortunately, biomedical education has not kept pace with the demand for data analysis skills. In this talk, I will showcase various examples to illustrate the vital role of statistical analysis and effective data visualization in genomics.
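A minimal sketch, not taken from the talk, of the kind of pitfall the abstract alludes to: when thousands of genes are tested at once, some apparently significant results arise by chance alone, and multiple-testing control (here Benjamini-Hochberg FDR) is needed to separate signal from noise. All data below are simulated.

```python
# Simulated genome-wide testing: no gene is truly differential, yet many
# uncorrected p-values fall below 0.05 purely by chance.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_genes, n_per_group = 5000, 10

# Simulated expression values for two groups with no real differences.
group_a = rng.normal(size=(n_genes, n_per_group))
group_b = rng.normal(size=(n_genes, n_per_group))

# One t-test per gene, i.e. thousands of hypotheses tested simultaneously.
pvals = stats.ttest_ind(group_a, group_b, axis=1).pvalue

naive_hits = np.sum(pvals < 0.05)  # false positives expected from chance alone
rejected, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

print(f"Genes with p < 0.05, no correction: {naive_hits}")          # roughly 5% of 5000
print(f"Genes flagged after Benjamini-Hochberg FDR: {rejected.sum()}")  # typically 0
```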
Large language models (LLMs) are rapidly becoming the cornerstone of natural language processing due to their advanced capability to process unstructured text. In the context of rare disease diagnosis, LLMs have the potential to support clinical decision-making by automatically generating differential diagnoses or extracting granular disease phenotypes from patients' clinical notes in electronic health records (EHRs). However, variability in LLM outputs can arise from multiple sources, including differences in pre-training data, the models' probabilistic nature, and parameter settings, potentially impacting the consistency and reliability of downstream decision-making. Despite the growing popularity of using LLMs in biomedical research, their reproducibility in the context of rare disease diagnosis remains underexplored. Therefore, this study aims to evaluate the reproducibility of foundational LLMs, including OpenAI's ChatGPT and Meta's Llama models, in analyzing unstructured clinical notes from EHRs for rare disease diagnosis. Results from this study can provide insight into the reproducibility and robustness of LLMs to inform their reliable application in rare disease research and clinical decision support.
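One simple way to probe the run-to-run variability the abstract describes is to send the same de-identified note to a model repeatedly and summarize how often the returned diagnosis agrees. The sketch below assumes the OpenAI Python SDK, an illustrative model name, a placeholder note, and a toy agreement metric; it is not the study's actual protocol.

```python
# A minimal sketch of checking reproducibility of an LLM's top diagnosis
# for one clinical note. Model name, prompt, and metric are assumptions.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

note = "De-identified clinical note text goes here."  # placeholder input
prompt = (
    "Given the clinical note below, return only the single most likely "
    "rare disease diagnosis.\n\n" + note
)

answers = []
for _ in range(10):  # repeat the identical query to expose stochastic variation
    resp = client.chat.completions.create(
        model="gpt-4o",   # assumed model; a Llama model could be queried similarly
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,  # sampling temperature; lower values reduce variability
    )
    answers.append(resp.choices[0].message.content.strip().lower())

# Crude reproducibility summary: how concentrated are the repeated answers?
counts = Counter(answers)
top_answer, top_count = counts.most_common(1)[0]
print(f"Most frequent diagnosis: {top_answer!r} ({top_count}/10 runs)")
```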
Keywords
Large language model
Artificial intelligence
Reproducibility
Speaker
Cathy Shyr, Vanderbilt University Medical Center
Drug repurposing, the process of identifying new applications for existing approved drugs, offers a time- and cost-efficient approach to drug development. The explosive growth of biomedical data provides significant opportunities to advance drug repurposing and precision medicine. However, effectively integrating complex, heterogeneous data to uncover meaningful repurposing signals remains challenging. In this presentation, I will introduce our research group's work on AI-driven knowledge graph models, which systematically integrate genomic, phenotypic, pharmacological and patient data and leverage deep learning algorithms to identify candidate drugs for repurposing and personalized treatment strategies. Through case studies, I will illustrate the application of our approach in uncovering potential therapeutic signals and enabling personalized medicine.
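As background for the knowledge-graph approach, the sketch below shows a generic TransE-style link-prediction scorer, a common building block for ranking candidate (drug, treats, disease) triples. The entities, relations, and untrained random embeddings are purely illustrative assumptions; this is not the research group's model or data.

```python
# A minimal sketch of knowledge-graph link prediction with TransE-style scoring.
# In practice, embeddings would be learned from integrated genomic, phenotypic,
# pharmacological, and patient data rather than drawn at random.
import numpy as np

rng = np.random.default_rng(0)
dim = 32

entities = ["drug:metformin", "drug:aspirin", "gene:TP53", "disease:glioma"]
relations = ["targets", "treats"]

# Random (untrained) embeddings stand in for learned vectors.
ent_emb = {e: rng.normal(size=dim) for e in entities}
rel_emb = {r: rng.normal(size=dim) for r in relations}

def transe_score(head: str, relation: str, tail: str) -> float:
    """TransE plausibility: higher (less negative) means a more plausible triple."""
    return -np.linalg.norm(ent_emb[head] + rel_emb[relation] - ent_emb[tail])

# Rank candidate drugs for a disease by the plausibility of (drug, treats, disease).
candidates = ["drug:metformin", "drug:aspirin"]
ranked = sorted(candidates,
                key=lambda d: transe_score(d, "treats", "disease:glioma"),
                reverse=True)
print(ranked)
```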