Thursday, Aug 7: 8:30 AM - 10:20 AM
0436
Invited Paper Session
Music City Center
Room: CC-207D
Interactive data graphics has a long history of innovation and impact in statistical practice. It is a critical area of research in data science, where statistics researchers contribute the important consideration of variation when making data visualizations. Developments in AI also have great potential to provide better support for visual methods. This session acknowledges Deborah F. Swayne's contributions to the field and serves as the vehicle for launching a new award in her honor. It is also a chance to reflect on where the field stands and where interactive graphics methodology might grow in the coming years.
data visualization
interactive graphics
applied statistics
data science
software
Applied
Yes
Main Sponsor
Section on Statistical Graphics
Co Sponsors
History of Statistics Interest Group
Section on Statistical Computing
Section on Statistical Learning and Data Science
Presentations
Large Language Models (LLMs) present new opportunities for interactive data analysis. LLM-based approaches may help resolve the tension between accessibility and flexibility that challenges the design of effective data analysis interfaces. Graphical user interfaces present simple, intuitive controls at the expense of flexibility, while programmatic interfaces support arbitrarily complex tasks, but only for those with sufficient time and skill. By allowing the user to express tasks in natural language, LLMs can interpolate between the two extremes. We present a prototype of an agent-based system for analyzing genomic data that is targeted at users with varying levels of computational skill.
Interactive statistical graphics started as specialized tools for data analysis, often requiring specific software or environments. With the advent of browsers and the unification of web standards, it became possible to leverage graphical capabilities more broadly. Tools in specific areas, such as maps, were built on this technology and became ubiquitous, effectively available to everyone. Although some attempts at web-based interactive visualization were made, they were often implemented as enhancements of existing static approaches or without taking into account the body of work on statistical interactive graphics. In this talk we showcase the key points in this parallel evolution, then propose and demonstrate a framework that attempts to make statistical interactive graphics ubiquitous by implementing the interactions they require using web-based technologies.
Keywords
interactive graphics
visualization
statistical graphics
Interactive data graphics have long empowered users to explore complex datasets through visual manipulation. However, there are barriers to entry for complex interactions. The Shiny Team presents {querychat}, a multilingual package that enables natural language interaction with data visualizations in both R and Python Shiny applications. Users can pose questions such as "Show only data from 2008 with highway MPG greater than 30" or "What's the average city MPG for SUVs vs compact cars?" and see immediate visual results.
{querychat} works by translating natural language into SQL queries that filter or transform data frames, making the resulting data available as reactive objects. This approach enhances reliability (LLMs excel at SQL generation), transparency (SQL queries can be displayed to users), and reproducibility (queries can be saved and reused). Shiny's reactive programming model allows for seamless propagation of LLM-generated transformations through its dashboard visualization pipelines. The same properties that make Shiny effective for rapid prototyping (event-driven architecture, reactive expressions, and component model) create a natural framework for embedding LLM capabilities that dynamically respond to a user's conversations.
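The translate-then-filter pattern described above can be illustrated with a minimal sketch. This is not {querychat}'s API: the function names `fake_llm_to_sql` and `answer`, the `cars` table, its columns, and the sample rows are all hypothetical, and the LLM step is replaced by a canned lookup so the example runs offline using only Python's standard library.

```python
import sqlite3

# Hypothetical sample rows standing in for a fuel-economy data frame:
# (vehicle_class, year, cty, hwy)
ROWS = [
    ("compact", 2008, 21, 31),
    ("suv", 2008, 14, 20),
    ("compact", 1999, 19, 29),
    ("midsize", 2008, 20, 32),
]


def fake_llm_to_sql(question: str) -> str:
    """Stand-in for the LLM step (an assumption, not a real model call).

    Returns canned SQL for one known question; raises KeyError otherwise.
    """
    canned = {
        "Show only data from 2008 with highway MPG greater than 30":
            "SELECT * FROM cars WHERE year = 2008 AND hwy > 30 ORDER BY hwy",
    }
    return canned[question]


def answer(question: str):
    """Translate a question to SQL, run it, and return (sql, rows).

    Returning the SQL alongside the rows is what lets an application
    display the query to the user and save it for reuse.
    """
    sql = fake_llm_to_sql(question)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE cars "
            "(vehicle_class TEXT, year INTEGER, cty INTEGER, hwy INTEGER)"
        )
        conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?)", ROWS)
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    return sql, rows
```

Calling `answer("Show only data from 2008 with highway MPG greater than 30")` returns the SQL string together with the two matching 2008 rows. In a Shiny application the filtered rows would feed a reactive object instead of being returned directly. That wiring is omitted here.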
Keywords
direct manipulation graphics
R
shiny
data
graphics