Tuesday, Aug 6: 2:00 PM - 3:50 PM
1854
Topic-Contributed Paper Session
Between the small datasets of classical statistical analysis and the massive databases of distributed systems lies "big-ish" data: datasets that can be read directly into R on a personal computer but are large enough to make common data operations slow. This session highlights recent work in developing and testing R tools, such as {arrow}, {data.table}, and {vroom}, that are designed to speed up analysis of these large in-memory datasets. We will share insights into the design, development, and maintenance of these tools, as well as examples of their use in real-world applications.
Applied
Yes
Main Sponsor
Section on Statistical Computing
Co Sponsors
Section for Statistical Programmers and Analysts
Presentations