Large Language Models for Statisticians and Data Scientists: Opportunities and Research Topics, Part 1

Weijie Su, Instructor
University of Pennsylvania
 
Emily Getzen, Instructor
 
Linjun Zhang, Instructor
Rutgers University
 
Monday, Aug 4: 8:30 AM - 12:00 PM
CE_16 
Professional Development Course/CE 
Music City Center 
Room: CC-107B 
Large Language Models (LLMs) have emerged as revolutionary AI tools for a wide range of machine learning tasks. However, harnessing their potential for statistical decision-making requires a comprehensive understanding of the associated risks and limitations. Key questions arise: How can we differentiate LLM-generated text from human-written text with statistical guarantees? How can we evaluate the uncertainty and confidence levels of LLM predictions for downstream tasks? How can we leverage LLMs for important applications in biomedical research? The rise of LLMs presents both challenges and intriguing opportunities for contemporary statisticians. This one-day short course aims to familiarize statisticians with recent research topics on LLMs and equip them with skills to integrate inferential ideas into LLMs, both for methodology development and practical applications.

Course topics include: (1) a concise introduction to LLM fundamentals, tailored for those new to transformers and deep learning; (2) a primer on statistical inference techniques specifically designed for text data using LLMs; and (3) an in-depth exploration of LLM applications in medical domains and the broader data science field.

While this course offers a comprehensive examination of the intersection between statistics and advanced AI, no prior knowledge of LLMs is required. Participants will gain valuable insights into this rapidly evolving field, positioning them to contribute to and benefit from the integration of LLMs in statistical science.

Main Sponsor

Section on Statistical Learning and Data Science