Harnessing Large Language Models: Opportunities and Challenges for Statistics

Abstract Number:

1770 

Submission Type:

Topic-Contributed Paper Session 

Participants:

Weijie Su (1), Qi Long (2), Kamalika Chaudhuri (3), Jiancong Xiao (4), Ludwig Schmidt (5), Qi Lei (6), Jiachen Wang (7)

Institutions:

(1) University of Pennsylvania, N/A, (2) N/A, N/A, (3) UCSD, N/A, (4) Department of Biostatistics and Epidemiology, University of Pennsylvania, N/A, (5) Department of Computer Science, University of Washington, N/A, (6) New York University, N/A, (7) Princeton University, N/A

Chair:

Qi Long  
N/A

Co-Organizer:

Qi Long  
N/A

Session Organizer:

Weijie Su  
University of Pennsylvania

Speaker(s):

Kamalika Chaudhuri  
UCSD
Jiancong Xiao  
Department of Biostatistics and Epidemiology, University of Pennsylvania
Ludwig Schmidt  
Department of Computer Science, University of Washington
Qi Lei  
New York University
Jiachen Wang  
Princeton University

Session Description:

Large language models have recently emerged as a transformative technology with vast potential for machine learning and data science applications. The more recent integration of the Code Interpreter into ChatGPT underscores the growing potential to enhance and broaden statistical and data science applications. However, this promise is counterbalanced by challenges, notably the risk of hallucination, which can inadvertently perpetuate misinformation. Such challenges echo the core theme of JSM 2024 and present a timely opportunity for statisticians to develop mechanisms that counter misinformation by leveraging statistical rigor.

This session aims to convene a diverse group of scholars deeply engaged in large language model research. Drawing from disciplines such as statistics, biostatistics, and machine learning, our speakers will present statistical perspectives on large language models and highlight the opportunities they open for the statistics community. Our intent is to foster a discussion that resonates with both theoretical and applied statisticians keen on harnessing large language models to advance the field of statistics.

Sponsors:

IMS 2
Section on Nonparametric Statistics 3
Section on Statistical Learning and Data Science 1

Theme: Statistics and Data Science: Informing Policy and Countering Misinformation

Yes

Applied

Yes

Estimated Audience Size

Medium (80-150)

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand and have communicated to my proposed speakers that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is nonrefundable.

I understand