Tuesday, Aug 4: 2:00 PM - 3:50 PM
1727
Topic-Contributed Paper Session
Applied
Yes
Main Sponsor
SSC (Statistical Society of Canada)
Co Sponsors
Section on Physical and Engineering Sciences
Section on Statistical Learning and Data Science
Presentations
The detection of repetition from a fast radio burst (FRB) source instantly ruled out cataclysmic models for at least the repeating FRB population. Repeaters also offered the first opportunities for high-precision localization and multi-wavelength burst follow-up. In the ten years since this discovery, we have identified and characterized a diverse population of repeating FRB sources, their host galaxies, and their burst properties. Still, the answer to the question "Do all FRBs repeat?" remains elusive. The Canadian Hydrogen Intensity Mapping Experiment's FRB search (CHIME/FRB) has discovered 95% of the known repeaters to date. In this talk, I'll walk through our repeater detection pipeline, which employs a Bayesian hierarchical non-homogeneous Poisson point-process framework, before introducing an upcoming sample of 30 new repeaters drawn from our catalog of more than 3,600 FRBs. Interestingly, the first FRB localized using the full CHIME/FRB Outrigger array appears energetically inconsistent with the known repeater population. I will discuss this result, along with other emerging lines of evidence, in our ongoing effort to address the question of whether all FRBs ultimately repeat.
Keywords
point processes
hierarchical bayesian
astrostatistics
transients
spatial point processes
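As a rough illustration of the non-homogeneous Poisson point-process idea mentioned in the abstract above, the sketch below evaluates the log-likelihood of a set of burst arrival times under a piecewise-constant rate. It is not the CHIME/FRB pipeline: the function name `nhpp_loglik`, the bin edges, the rates, and the burst times are all hypothetical, and the hierarchical Bayesian layer is omitted.

```python
import math

def nhpp_loglik(event_times, edges, rates):
    """Log-likelihood of event times under a piecewise-constant rate.

    edges: increasing bin boundaries (length n_bins + 1); hypothetical windows.
    rates: positive event rate in each bin (length n_bins).
    """
    if len(edges) != len(rates) + 1:
        raise ValueError("edges must have one more entry than rates")
    # Integrated intensity over the whole observing window.
    integral = sum(r * (hi - lo) for r, lo, hi in zip(rates, edges[:-1], edges[1:]))
    log_sum = 0.0
    for t in event_times:
        # Find the bin containing this event and add log(rate) for it.
        for r, lo, hi in zip(rates, edges[:-1], edges[1:]):
            if lo <= t < hi:
                log_sum += math.log(r)
                break
        else:
            raise ValueError(f"event time {t} lies outside the observing window")
    # Point-process log-likelihood: sum of log-intensities minus the integral.
    return log_sum - integral

# Made-up burst times and rates, for illustration only.
bursts = [0.4, 0.5, 2.7]
edges = [0.0, 1.0, 2.0, 3.0]
loglik = nhpp_loglik(bursts, edges, rates=[2.0, 0.1, 1.0])
```

With a single bin the expression reduces to the familiar homogeneous form, n log(rate) minus rate times duration, which gives a quick sanity check.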
There exists a tight scaling relation between a galaxy's stellar mass (M_*) and the number of globular clusters (N_GC) -- a class of old, massive, and dense star clusters -- that it hosts. However, in the astronomy literature, this relationship is often modeled as linear rather than with count models. In this work, we test the utility of Poisson and negative binomial (NB) models for describing the scaling relation. We introduce the use of zero-inflated versions of these models, which allow for larger zero populations (e.g., galaxies without GCs). To determine the value of these models, we evaluate them with a variety of predictive model comparison methods, including predictive intervals and the leave-one-out cross-validation criterion. We also develop a novel posterior predictive comparison method.
We find that the NB model is consistent with our data, but the naive Poisson is not. Moreover, we find that zero inflation of the models is not necessary to describe the large population of low-mass galaxies that lack GCs, suggesting that a single formation and evolutionary process acts over all galaxy masses. Under the NB model, there does not appear to be anything unique about the lack of GCs in many low-mass galaxies; they are simply the low-mass extension of the larger N_GC-M_* scaling relation.
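To make the count models named in this abstract concrete, here is a minimal sketch of the Poisson, negative binomial, and zero-inflated log-probabilities. The mean-dispersion (NB2) parameterization, the function names, and the toy counts are assumptions made for illustration; they are not the authors' model specification or data.

```python
import math

def poisson_logpmf(k, mu):
    """Log-probability of count k under a Poisson model with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def negbin_logpmf(k, mu, phi):
    """Negative binomial with mean mu and variance mu + mu**2 / phi."""
    return (math.lgamma(k + phi) - math.lgamma(phi) - math.lgamma(k + 1)
            + phi * math.log(phi / (phi + mu))
            + k * math.log(mu / (phi + mu)))

def zero_inflated_logpmf(k, pi0, base_logpmf):
    """Mixture: a structural zero with probability pi0, else the base model."""
    if k == 0:
        return math.log(pi0 + (1.0 - pi0) * math.exp(base_logpmf(0)))
    return math.log(1.0 - pi0) + base_logpmf(k)

# Made-up, overdispersed cluster counts (illustration only, not survey data).
counts = [0, 0, 1, 0, 3, 12, 0, 2, 25, 1]
mu = sum(counts) / len(counts)
ll_poisson = sum(poisson_logpmf(k, mu) for k in counts)
ll_negbin = sum(negbin_logpmf(k, mu, phi=0.5) for k in counts)
```

On counts like these, with many zeros and a few large values, the negative binomial attains a higher log-likelihood than the Poisson at the same mean, because its extra dispersion parameter absorbs both features without a separate zero-inflation component.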
Spacecraft sensor data frequently exhibit heteroskedastic noise, with measurement uncertainty varying across operating conditions, energy regimes, and signal strengths. Accurately modeling this variability is critical for reliable denoising, estimation, and downstream scientific inference. We present a scalable Gaussian process (GP) framework that extends the Vecchia approximation to accommodate input-dependent noise variance while retaining computational efficiency.
We demonstrate the approach using real spacecraft data from instruments aboard the International Space Station, where strong heteroskedasticity arises in low–signal-to-noise and high-energy regimes. Compared to homoskedastic GP models, the proposed method provides improved uncertainty calibration and more robust identification of physically meaningful signal. The framework scales to large datasets and is well suited for automated processing pipelines. Beyond spacecraft applications, the method is broadly applicable to remote sensing and other large, noisy scientific datasets with spatially or temporally varying measurement error.
Keywords
Gaussian processes
Scalable inference
Heteroskedasticity
Uncertainty quantification
Variance surface estimation
Vecchia approximation
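The sketch below illustrates how a Vecchia-style factorization can accommodate input-dependent noise in the simplest possible setting: one-dimensional inputs, a zero-mean GP, and a single conditioning neighbor (the previous point). It is an illustrative toy rather than the framework presented in the talk; the squared-exponential kernel, the log-linear noise model `noise_var`, and all parameter values are assumptions.

```python
import math

def kernel(x1, x2, sigma2=1.0, ell=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return sigma2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)

def noise_var(x, a=-2.0, b=0.5):
    """Hypothetical log-linear noise variance: tau^2(x) = exp(a + b * x)."""
    return math.exp(a + b * x)

def normal_logpdf(y, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def vecchia_loglik(xs, ys):
    """Vecchia approximation with one conditioning neighbor.

    Approximates log p(y_1, ..., y_n) by
    log p(y_1) + sum over i of log p(y_i | y_{i-1}),
    where each observation carries its own noise variance.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        var_i = kernel(x, x) + noise_var(x)
        if i == 0:
            total += normal_logpdf(y, 0.0, var_i)
            continue
        x_prev, y_prev = xs[i - 1], ys[i - 1]
        var_prev = kernel(x_prev, x_prev) + noise_var(x_prev)
        cov = kernel(x, x_prev)
        # Gaussian conditioning on the single previous observation.
        cond_mean = cov / var_prev * y_prev
        cond_var = var_i - cov ** 2 / var_prev
        total += normal_logpdf(y, cond_mean, cond_var)
    return total

# Toy inputs and responses, for illustration only.
approx = vecchia_loglik([0.0, 0.7, 1.5, 2.2], [0.3, -0.2, 0.5, 1.1])
```

Each term needs only a two-point covariance, so the cost grows linearly with the number of observations. Larger conditioning sets trade some of that speed for a closer match to the exact GP likelihood.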
Calibrating spacecraft plasma instruments is challenging when direct overlap between sensors is sparse, irregular, or confined to short time windows. We present a statistical framework for calibrating ionospheric plasma parameters derived from the Electric Propulsion Electrostatic Analyzer Experiment (ÈPÈE) to reference measurements from the Floating Potential Measurement Unit (FPMU) under such conditions. ÈPÈE, deployed on the International Space Station from March 2023 through April 2024, provides ion energy spectra during the peak of Solar Cycle 25, capturing highly variable topside ionospheric conditions. Our approach models each ÈPÈE energy spectrum as a Gaussian profile whose amplitude, mean energy, and width evolve smoothly in time. These latent parameters are coupled across adjacent time steps using an empirically weighted smoothness prior, allowing robust estimation even in the presence of noise and irregular sampling. Derived plasma quantities—density, floating potential, and temperature—are then statistically mapped to FPMU observations using regression models that incorporate nonlinear interactions and logarithmic scaling where physically appropriate. This framework enables calibration despite limited temporal overlap, reduces sensitivity to noise-driven dropouts, and preserves physically consistent temporal evolution. The resulting calibrated products improve data continuity and support studies of ionospheric variability and space-weather impacts on spacecraft systems and communications.
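One way to picture the time-coupled Gaussian-profile model described above is as a penalized least-squares objective: a misfit term for each spectrum plus a penalty on changes in amplitude, mean energy, and width between adjacent time steps. The sketch below assumes a quadratic first-difference penalty with fixed, hypothetical weights; the abstract does not specify the form of the empirically weighted prior, so this is only one plausible reading.

```python
import math

def gaussian_profile(energy, amplitude, mean, width):
    """Gaussian-shaped spectrum evaluated at a single energy."""
    return amplitude * math.exp(-0.5 * ((energy - mean) / width) ** 2)

def penalized_objective(energies, spectra, params, weights=(1.0, 1.0, 1.0)):
    """Squared misfit plus a first-difference smoothness penalty.

    spectra: one list of measured values per time step.
    params: one (amplitude, mean, width) tuple per time step.
    weights: hypothetical penalty weights for the three parameters.
    """
    misfit = 0.0
    for spectrum, (amp, mu, width) in zip(spectra, params):
        for e, obs in zip(energies, spectrum):
            misfit += (obs - gaussian_profile(e, amp, mu, width)) ** 2
    penalty = 0.0
    # Penalize parameter jumps between adjacent time steps.
    for prev, curr in zip(params[:-1], params[1:]):
        for w, p0, p1 in zip(weights, prev, curr):
            penalty += w * (p1 - p0) ** 2
    return misfit + penalty
```

Minimizing this objective over `params` (with any general-purpose optimizer) yields smoothly varying profiles. The penalty lets noisy or partially missing spectra borrow strength from their neighbors in time.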
Practitioners often take datasets they are given and run downstream analyses "plugging in" any estimates and/or uncertainties provided, regardless of origin (e.g., constructing a histogram of the estimated values, or selecting a subset based on the estimates and their uncertainties). This behaviour can lead to various biases, especially when dealing with estimates derived from Bayesian or machine learning-driven approaches (where the priors can have an outsized impact, especially when data are out of distribution) and when underlying estimators are multimodal. To address some of these challenges, I will present ongoing work on hybrid "Frasian" (Frequentist-Bayesian) inference approaches that can recalibrate Bayesian predictions under various marginal/conditional coverage settings using a separate calibration set. An application of these strategies to data on stars from the Gaia and APOGEE surveys will also be provided.
This work was done in collaboration with James Carzon, Luca Masserano, Joshua D. Ingram, Alex Shen, Antonio Carlos Herling Ribeiro Junior, Tommaso Dorigo, Michele Doro, Rafael Izbicki, and Ann B. Lee, as well as Ricardo Baptista.
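For readers unfamiliar with recalibration on a held-out calibration set, the sketch below shows split conformal prediction, a standard technique that targets marginal coverage. It is offered only as a simple analogue and is not the Frasian method described in the abstract; the function names and the absolute-residual score are assumptions.

```python
import math

def conformal_halfwidth(cal_predictions, cal_truths, alpha=0.1):
    """Interval half-width targeting marginal coverage of at least 1 - alpha.

    Uses absolute residuals on a calibration set that is separate from
    the data used to produce the predictions.
    """
    scores = sorted(abs(t - p) for p, t in zip(cal_predictions, cal_truths))
    n = len(scores)
    # Finite-sample-corrected rank of the score quantile.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee the requested coverage.
        return math.inf
    return scores[rank - 1]

def conformal_interval(prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return prediction - halfwidth, prediction + halfwidth
```

For example, `conformal_interval(5.0, conformal_halfwidth(preds, truths))` would wrap a new point estimate of 5.0 in an interval whose width is set by the calibration residuals rather than by the model's own reported uncertainty.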