Large-Scale Spatial Data Science

Marc Genton, Instructor
King Abdullah University of Science and Technology
 
Sameh Abdulah, Instructor
King Abdullah University of Science and Technology
 
Mary Lai Salvaña, Instructor
University of Connecticut
 
Monday, Aug 4: 8:30 AM - 5:00 PM
CE_12 
Professional Development Course/CE 
Music City Center 
Room: CC-107A 
This course, designed for data scientists, geospatial analysts, and researchers, covers advanced methods for large-scale geospatial data science. It focuses on three topics: modeling and prediction for large-scale data, accelerating geospatial data processing with multi- and mixed-precision techniques on modern hardware architectures, and parallelizing R code using the first parallel runtime system package in R. Participants will first explore ExaGeoStatCPP, a parallel framework for high-performance geostatistical computation that enables efficient modeling and prediction of large-scale geospatial datasets from both C++ and R. The course then turns to the MPCR package, which provides multi- and mixed-precision support on CPUs and GPUs; attendees will learn to integrate MPCR functions into their R workflows to balance performance against precision. Finally, participants will be introduced to RCOMPSs, a new runtime system designed to parallelize R code across HPC systems, with hands-on sessions that work through practical examples of parallelizing computations. By the end of the course, participants will have gained advanced skills in large-scale geospatial data science and be ready to apply them in their professional work.