Future of Statistics and Data Science in the Era of ChatGPT and LLMs

Siyuan Ma Chair
 
James Zou Panelist
Stanford University
 
Xijin Ge Panelist
South Dakota State University
 
Frauke Kreuter Panelist
 
Anna-Carolina Haensch Panelist
LMU Munich
 
Ali Rahnavard Panelist
The George Washington University
 
Himel Mallick Organizer
Cornell University
 
Frauke Kreuter Organizer
 
Wednesday, Aug 7: 10:30 AM - 12:20 PM
1036 
Invited Panel Session 
Oregon Convention Center 
Room: CC-251 
Since ChatGPT's inception in November 2022, we have entered a new era of AI-powered technologies that are revolutionizing every aspect of our lives. With the ability to source, collect, and analyze data, ChatGPT can create content, answer questions, and even write code with lightning-fast speed, providing personalized insights and recommendations in real time. As AI technology advances, more and more people are recognizing the potential long-term consequences, ethical implications, and privacy concerns of ChatGPT and related Generative AI (GAI) technologies and large language models (LLMs), which are poised to have a significant impact on our society. These revolutionary changes are affecting every line of business, and we are only beginning to understand the full extent of their impact.

As an example, ChatGPT's ability to generate introduction and abstract sections for scientific articles has raised ethical questions. Some scientific journals have even banned the use of ChatGPT as a co-author, as several papers have already listed it as such. In the healthcare industry, the potential uses and concerns surrounding ChatGPT are currently under scrutiny by professional associations and practitioners. Likewise, in the education sector, there are concerns about students using ChatGPT to outsource their writing assignments, as well as the risk of unintentional plagiarism resulting from the use of an AI tool that may output biased or nonsensical text.

These ethical concerns have also extended to the field of statistics and data science, where the use of ChatGPT and other AI tools can potentially amplify biases in data analysis and decision-making. As the use of AI in various industries becomes more prevalent, it is crucial to address these ethical implications and work towards developing responsible AI practices. In this late-breaking panel, we will delve into the background of ChatGPT and examine its impact on society as a whole, with a particular focus on its implications for the future of statistics and data science. Additionally, we will explore the potential for ChatGPT to enhance efficiency in our work processes.

Our aim is to gain a deeper understanding of the benefits and potential risks associated with ChatGPT's impact on the field of data science by analyzing its various applications. In this panel discussion, we intend to explore how ChatGPT can be leveraged to optimize efficiency and productivity while ensuring ethical and responsible use. By doing so, we hope to identify areas where this rapidly developing technology can be most effectively utilized for the greater good.

The panelists have diverse experiences: Dr. James Zou has published several papers on ChatGPT, focusing both on its performance over time and its role in data science education. Dr. Xijin Ge is the founder of RTutor.ai, which uses ChatGPT as the backend to translate natural language into R code. Drs. Frauke Kreuter and Anna-Carolina Haensch have jointly published a paper on the impact of ChatGPT in a classroom setting. Finally, Dr. Ali Rahnavard's research team is developing LLMs to analyze sequencing data and infer biological insights from omics data. As a result, the panel discussion will cover a range of viewpoints, addressing the benefits and potential risks of ChatGPT through a combination of pre-specified and live questions.

Applied

Yes

Main Sponsor

Section on Statistics and Data Science Education

Co Sponsors

Committee on Membership Retention and Recruitment
International Indian Statistical Association
Section on Statistical Computing
Section on Statistical Learning and Data Science