Modeling Spatially Correlated Failure-time Data Under Two Distance Functions with an Application to Titan GPU Data

Jared Clark, Speaker
 
Monday, Aug 4: 3:05 PM - 3:25 PM
Topic-Contributed Paper Session 
Music City Center 
One common approach to the statistical analysis of spatially correlated data defines a correlation structure based solely on the physical distance between the locations of observed values. However, some data have a complex spatial structure that physical distance alone cannot adequately describe. In this line of research, the spatial failure-time data of interest describe GPUs that are linked through a series of wired connections, and the failure times of GPUs with few connections between them are expected to be highly correlated. The proposed lifetime regression model includes random effects that capture the dependency due to physical location as well as random effects that capture the dependency due to the number of logical connections between GPUs. The analysis of this GPU dataset serves as an example of models with multiple spatial random effects, and the ideas presented can be extended to other applications with complex spatial structures. A Bayesian modeling scheme is recommended for this class of analyses. The examples in this presentation use the software package Stan to produce Markov chain Monte Carlo draws for parameter estimation. The modeling approach is validated through a simulation study, which demonstrates the accuracy of the statistical inference. We also apply the developed framework to the large-scale Titan GPU failure-time data.
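To make the idea of two distance functions concrete, the following Python sketch simulates failure times with two additive spatial random effects: one correlated by physical (Euclidean) distance and one by the number of connections (hops) between units. It is only an illustration under stated assumptions, not the model from the talk: the lognormal lifetime form, the exponential correlation function, the toy 2-by-4 layout wired as a single chain, and all parameter values and function names (`hop_distances`, `exp_corr`, `simulate_failure_times`) are hypothetical choices made here.

```python
from collections import deque

import numpy as np


def hop_distances(adjacency):
    """All-pairs hop counts on an unweighted graph via breadth-first search."""
    n = len(adjacency)
    dist = np.full((n, n), np.inf)  # unreachable pairs stay at infinity
    for src in range(n):
        dist[src, src] = 0.0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[src, v] == np.inf:
                    dist[src, v] = dist[src, u] + 1.0
                    queue.append(v)
    return dist


def exp_corr(dist, range_param):
    """Exponential correlation exp(-d / range); an assumed functional form."""
    return np.exp(-dist / range_param)


def simulate_failure_times(d_phys, d_conn, beta0=8.0, sigma_phys=0.5,
                           sigma_conn=0.5, sigma_eps=0.3,
                           range_phys=2.0, range_conn=1.5, seed=0):
    """Draw lognormal failure times with two spatial random effects."""
    rng = np.random.default_rng(seed)
    n = d_phys.shape[0]
    jitter = 1e-9 * np.eye(n)  # small diagonal term for numerical stability
    chol_phys = np.linalg.cholesky(exp_corr(d_phys, range_phys) + jitter)
    chol_conn = np.linalg.cholesky(exp_corr(d_conn, range_conn) + jitter)
    u_phys = sigma_phys * chol_phys @ rng.standard_normal(n)  # location effect
    u_conn = sigma_conn * chol_conn @ rng.standard_normal(n)  # connection effect
    log_t = beta0 + u_phys + u_conn + sigma_eps * rng.standard_normal(n)
    return np.exp(log_t)


# Toy layout (hypothetical): 8 units on a 2-by-4 grid, wired as one chain 0-1-...-7.
coords = np.array([[i % 4, i // 4] for i in range(8)], dtype=float)
adjacency = [[j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)]

d_phys = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
d_conn = hop_distances(adjacency)
times = simulate_failure_times(d_phys, d_conn)
```

In this toy layout, units 3 and 4 sit at opposite ends of adjacent rows, so they are far apart physically but only one hop apart in the wiring, which is the kind of structure a single physical distance cannot represent. The chain wiring is chosen so that the hop-based exponential correlation matrix is positive definite; on a general network this is not guaranteed and would need to be checked.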