Astrostatistics Interest Group: Student Paper Award

David Stenning Chair
Simon Fraser University
 
Aarya Patil Organizer
Max Planck Institute for Astronomy
 
Sunday, Aug 3: 2:00 PM - 3:50 PM
0769 
Topic-Contributed Paper Session 
Music City Center 
Room: CC-202B 

Applied

Yes

Main Sponsor

Astrostatistics Interest Group

Co Sponsors

Section on Bayesian Statistical Science
SSC (Statistical Society of Canada)

Presentations

A data-driven approach to stellar flare detection

We present a hidden Markov model (HMM) for discovering stellar flares in stellar light curve data. HMMs provide a framework to model time series data that are non-stationary; they allow for systems to be in different states at different times and consider the probabilities that describe the switching dynamics between states. In the context of stellar flare discovery, we exploit the HMM framework by allowing the light curve of a star to be in one of three states at any given time step: Quiet, Firing, or Decaying. This three-state HMM formulation is designed to enable straightforward identification of stellar flares, their duration, and associated uncertainty. These quantities are crucial for estimating a flare's energy and are useful for studies of stellar flare energy distributions. We combine our HMM with a celerite model that accounts for quasi-periodic stellar oscillations. Through an injection-recovery experiment, we demonstrate and evaluate the ability of our method to detect and characterize flares in stellar time series. We also show that the proposed HMM flags fainter and lower-energy flares more easily than traditional sigma-clipping methods. Lastly, we visually demonstrate that conducting detrending and flare detection simultaneously can mitigate the biased estimates that arise in multi-stage modelling approaches. Thus, this method offers a new route to calculating stellar flare energy. We conclude with an example application to one star observed by TESS, showing how the HMM compares with sigma-clipping on real data.
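As a rough illustration of the three-state idea, the following is a minimal sketch and not the authors' model: it decodes a toy, already detrended light curve with the Viterbi algorithm, assuming Gaussian emissions around a fixed mean flux per state. The transition matrix, emission means, noise level, and flux values are all hypothetical, and the celerite component for quasi-periodic oscillations is omitted.

```python
import numpy as np

STATES = ("Quiet", "Firing", "Decaying")


def viterbi(flux, log_init, log_trans, means, sigma):
    """Return the most likely state path for a Gaussian-emission HMM."""
    flux = np.asarray(flux, dtype=float)
    n, k = flux.size, len(means)
    # Log-likelihood of each observation under each state
    # (additive constants dropped; they are the same for every state).
    log_emit = -0.5 * ((flux[:, None] - means[None, :]) / sigma) ** 2
    score = log_init + log_emit[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # cand[i, j]: best score of a path ending in state i, then moving i -> j.
        cand = score[:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(k)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = np.argmax(score)
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


# Assumed, illustrative parameters (not taken from the abstract).
trans = np.array([
    [0.95, 0.05, 0.00],  # Quiet    -> Quiet or Firing
    [0.00, 0.50, 0.50],  # Firing   -> Firing or Decaying
    [0.20, 0.05, 0.75],  # Decaying -> Quiet, Firing, or Decaying
])
init = np.array([1.0, 0.0, 0.0])  # assume the star starts out Quiet
tiny = 1e-12                      # avoids log(0) for forbidden moves
means = np.array([0.0, 5.0, 2.0])  # hypothetical mean flux per state
sigma = 0.5                        # hypothetical noise level

# Toy detrended flux values containing one flare.
flux = np.array([0.0, 0.1, -0.1, 5.0, 4.8, 2.1, 1.9, 2.0, 0.1, 0.0])
path = viterbi(flux, np.log(init + tiny), np.log(trans + tiny), means, sigma)
labels = [STATES[s] for s in path]
flare_steps = int(np.sum(path != 0))  # time steps spent Firing or Decaying
```

In this toy setup, `labels` gives a state for every time step, so a flare's duration can be read off as the run of consecutive non-Quiet steps. A point estimate from Viterbi does not carry the uncertainty that the abstract describes; that would require posterior state probabilities rather than a single decoded path.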

Speaker

J. Arturo Esquivel F., University of Toronto

ChronoFlow: A Data-Driven Model for Gyrochronology

Stellar ages are critical to astronomy on a wide range of scales, but they are challenging to measure for low-mass main-sequence stars. One promising method for such stars is gyrochronology, which uses the evolution of their rotation rates, or "spindown". However, analytical gyrochronology models have historically struggled to capture the observed rotational dispersion in stellar populations. To properly understand this complexity, we have developed ChronoFlow: a flexible data-driven model built using a conditional normalizing flow. We show that it accurately captures observed rotational dispersion in open clusters, and we also apply ChronoFlow within a Bayesian inference framework to infer stellar ages. We recover cluster ages with a statistical uncertainty of 0.06 dex (≈ 15%), and individual stellar ages with a statistical uncertainty of 0.7 dex. Additionally, we conducted robust systematic tests to analyze the impact of extinction models, cluster membership, and calibration ages on ChronoFlow's performance. Our results show that ChronoFlow can estimate the ages of coeval stellar populations to the precision of the best literature models, and that it performs better for clusters of ages ~50-200 Myr than existing data-driven models. ChronoFlow is publicly available at https://github.com/philvanlane/chronoflow.

Speaker

Phil Van-Lane, University of Toronto

Discovery of Two Ultra-Diffuse Galaxies with Unusually Bright Globular Cluster Luminosity Functions via a Mark-Dependently Thinned Point Process (MATHPOP)

We present MATHPOP, a novel method to infer the globular cluster (GC) counts in ultra-diffuse galaxies (UDGs) and low-surface-brightness galaxies (LSBGs). Many known UDGs have a surprisingly high ratio of GC number to surface brightness. However, standard methods to infer GC counts in UDGs face various challenges, such as photometric measurement uncertainties, GC membership uncertainties, and assumptions about the GC luminosity functions (GCLFs). MATHPOP tackles these challenges using the mark-dependently thinned point process, enabling joint inference of the spatial and magnitude distributions of GCs. In doing so, MATHPOP allows us to infer and quantify the uncertainties in both GC counts and GCLFs with minimal assumptions. As a precursor to MATHPOP, we also address the data uncertainties coming from the selection process of GC candidates: we obtain probabilistic GC candidates instead of the traditional binary classification based on the color-magnitude diagram. We apply MATHPOP to 40 LSBGs in the Perseus cluster using GC catalogs from a Hubble Space Telescope imaging program. We then compare our results to those from an independent study using the standard method. We further calibrate and validate our approach through extensive simulations. Our approach reveals two LSBGs whose GCLF turnover points are much brighter than the canonical value, with Bayes factors of ∼4.5 and ∼2.5, respectively. An additional crude maximum-likelihood estimation and simulation study shows that their GCLF turnover points are approximately 0.9 mag and 1.1 mag brighter than the canonical value, with p-values of ∼10^-8 and ∼10^-5, respectively.

Speaker

Dayi Li, University of Toronto

Prediction Intervals for Astronomy Data with Covariate Error

Astronomers often deal with data where the covariates and the dependent variable are measured with heteroscedastic non-Gaussian error. For instance, while TESS and Kepler datasets provide a wealth of information, addressing the challenges of measurement errors and systematic biases is critical for extracting reliable scientific insights and improving the performance of machine learning models. Although techniques have been developed for estimating regression parameters for these data, few techniques exist to construct prediction intervals with finite-sample coverage guarantees. To address this issue, we tailor the conformal prediction approach to our application. We empirically demonstrate that this method gives finite-sample control over Type I error probabilities under a variety of assumptions on the measurement errors in the observed data. Further, we demonstrate how the conformal prediction method could be used to construct prediction intervals for unobserved exoplanet masses using established broken power-law relationships between masses and radii found in the literature.
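For readers unfamiliar with conformal prediction, here is a minimal sketch of the standard split conformal procedure with absolute-residual scores. It is not the tailored method the abstract describes: the straight-line predictor, the synthetic data generator, and all names (`split_conformal_interval`, `simulate`) are assumptions made for illustration. The simulated covariate is observed with heteroscedastic, heavy-tailed error to echo the setting above.

```python
import numpy as np


def split_conformal_interval(w_train, y_train, w_cal, y_cal, w_new, alpha=0.1):
    """Split conformal prediction interval around a least-squares line."""
    # Fit the point predictor on the training split only.
    slope, intercept = np.polyfit(w_train, y_train, deg=1)

    def predict(w):
        return slope * np.asarray(w, dtype=float) + intercept

    # Conformity scores: absolute residuals on the held-out calibration split.
    scores = np.abs(np.asarray(y_cal, dtype=float) - predict(w_cal))
    n = scores.size
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        raise ValueError("calibration set too small for this alpha")
    q = np.sort(scores)[k - 1]
    centre = predict(w_new)
    return centre - q, centre + q


def simulate(n, rng):
    """Hypothetical data: noisy covariate w and noisy response y."""
    x = rng.uniform(0.0, 10.0, size=n)  # true (unobserved) covariate
    w = x + rng.standard_t(df=3, size=n) * (0.1 + 0.05 * x)  # observed covariate
    y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n) * (0.5 + 0.1 * x)
    return w, y


rng = np.random.default_rng(0)
w_tr, y_tr = simulate(500, rng)
w_cal, y_cal = simulate(500, rng)
w_te, y_te = simulate(2000, rng)

lower, upper = split_conformal_interval(w_tr, y_tr, w_cal, y_cal, w_te, alpha=0.1)
coverage = np.mean((y_te >= lower) & (y_te <= upper))  # should be near 1 - alpha
```

The coverage guarantee here is marginal and relies only on the observed pairs being exchangeable, which is why the line fit does not need to be a good model of the error-free relationship. The interval has constant width in this sketch; adapting it to heteroscedastic errors would need a different conformity score.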

Speaker

Naomi Singer, North Carolina State University

Quantifying the Clustering Probability in Noisy Nonhomogeneous Spatial Data to Identify New Repeating Fast Radio Burst Sources from CHIME/FRB

In this paper, I introduce an approach to analyze nonhomogeneous Poisson processes (NHPPs) observed with noise, focusing on previously unstudied second-order characteristics of the noisy process. Utilizing a hierarchical Bayesian model with noisy data, I estimate hyperparameters governing a physically motivated NHPP intensity. I perform simulation studies to demonstrate the reliability of this methodology in accurately estimating hyperparameters. Leveraging the posterior distribution, I then infer the probability of detecting a certain number of events within a given radius, the k-contact distance. I demonstrate the methodology with an application to observations of fast radio bursts (FRBs) detected by the Canadian Hydrogen Intensity Mapping Experiment's FRB Project (CHIME/FRB). This approach makes it possible to identify repeating FRB sources by bounding or directly simulating the probability of observing k physically independent sources within some radius, or the probability of coincidence (PC). The new methodology improves the repeater detection PC in 86% of cases when applied to the largest sample of previously classified observations, with a median improvement factor (existing metric over PC from this methodology) of ∼3000.
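To make the probability of coincidence concrete, the sketch below covers the simplest noise-free case only and is not the hierarchical Bayesian method described above. For a Poisson process, the number of events in a region is Poisson distributed with mean equal to the integrated intensity, so the chance of seeing at least k unrelated events is one minus the Poisson probability of fewer than k. The function name, the constant intensity, the flat-sky disc area, and the numbers are all hypothetical.

```python
import math


def prob_at_least_k(expected_count, k):
    """P(N >= k) for N ~ Poisson(expected_count)."""
    below_k = sum(
        math.exp(-expected_count) * expected_count**j / math.factorial(j)
        for j in range(k)
    )
    return 1.0 - below_k


# Hypothetical numbers, chosen only for illustration.
intensity = 0.02  # events per square degree, assumed constant over the disc
radius = 1.0      # degrees; flat-sky approximation for the disc area
expected = intensity * math.pi * radius**2  # integrated intensity over the disc

# Chance that at least 2 independent events fall within the disc.
p_chance = prob_at_least_k(expected, k=2)
```

For a nonhomogeneous process, `expected` would be replaced by the integral of the intensity over the disc. Localization noise and uncertainty in the intensity hyperparameters, which are the focus of the abstract, are not handled by this sketch.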

Speaker

Amanda Cook, University of Toronto