CS022 Sports Applications

Conference: Symposium on Data Science and Statistics (SDSS) 2023

05/25/2023: 3:45 PM - 5:15 PM CDT
Refereed

Room: Grand Ballroom B

Chair

Katie Bakewell, NLP Logix

Tracks

Practice and Applications

Presentations

Leveraging Sequential Play-by-Play Data to Adjust for Complementary Unit Performance in American College Football

American football is unique in that a team's offensive and defensive units typically consist of separate sets of players who do not share the field, which tempts one to evaluate them independently. Yet some aspects of a team's defensive (offensive) performance may directly affect the complementary unit, a concept typically referred to as "complementary football". For example, turnovers forced by the defense could lead to easier scoring chances for the offense, while the offense's ability to control the clock may in turn help the defense. Moreover, the ability to objectively rank teams' offenses and defenses could be of elevated importance in American college football (CFB) specifically, given the heavy title and playoff implications of such rankings. Our main goal is to identify the most consistently influential features of complementary football in a data-driven way, and then to adjust each team's offensive (defensive) performance for that of its complementary unit. To that end, for the 2014-2021 CFB seasons, we leverage sequential play-by-play data to alleviate the reverse causality that permeates game totals, focusing on how the complementary unit's (e.g., the defense's) performance on the preceding drive might affect the other unit's (the offense's) performance on the current drive. Variable selection methodologies are used to pick the most important complementary football features to adjust for, combined with strength-of-schedule and home-field considerations (both shown to be especially pivotal in the college game). Together, this leads to a better understanding of each team's offensive and defensive rankings and a more considered evaluation of their strengths and weaknesses.
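The preceding-drive idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the field names (`game_id`, `drive_no`, `offense`, `yards`, `turnover`) and the function `add_preceding_drive_features` are hypothetical, and real play-by-play data would need extra handling (halftime, kickoffs, overtime) that is ignored here. For each drive, the sketch attaches what the same team's defense did on the immediately preceding drive of the game.

```python
from typing import Dict, List


def add_preceding_drive_features(drives: List[Dict]) -> List[Dict]:
    """Attach the complementary unit's preceding-drive outcome to each drive.

    Hypothetical schema: each drive dict has game_id, drive_no, offense,
    yards, and turnover. The preceding drive counts only if the *other*
    team had the ball, i.e. the current offense's defense was on the field.
    """
    out: List[Dict] = []
    prev_by_game: Dict[str, Dict] = {}
    for d in sorted(drives, key=lambda r: (r["game_id"], r["drive_no"])):
        prev = prev_by_game.get(d["game_id"])
        row = dict(d)
        if prev is not None and prev["offense"] != d["offense"]:
            # The current offense's defense just faced the opposing offense.
            row["prev_def_forced_turnover"] = prev["turnover"]
            row["prev_def_yards_allowed"] = prev["yards"]
        else:
            # First drive of the game, or same team kept possession.
            row["prev_def_forced_turnover"] = None
            row["prev_def_yards_allowed"] = None
        out.append(row)
        prev_by_game[d["game_id"]] = d
    return out


# Toy data, invented purely for illustration.
drives = [
    {"game_id": "g1", "drive_no": 1, "offense": "A", "yards": 12, "turnover": True},
    {"game_id": "g1", "drive_no": 2, "offense": "B", "yards": 45, "turnover": False},
    {"game_id": "g1", "drive_no": 3, "offense": "A", "yards": 70, "turnover": False},
]
features = add_preceding_drive_features(drives)
```

Because each feature is measured strictly before the drive it is attached to, it cannot be caused by that drive's outcome, which is the motivation the abstract gives for working at the drive level rather than with game totals.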

Presenting Author

Andrey Skripnikov, New College of Florida

First Author

Andrey Skripnikov, New College of Florida

Hypothesis Testing for Multiple Groups of Compositional Data

Compositional data consist of multiple components that are parts of a whole, so the proportions across components must sum to 1. We show how such data can be modeled with a nested Dirichlet distribution and present a test to examine differences in the means of compositional data among G > 2 populations.
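As a rough illustration of the sum-to-1 constraint and of one way a nested Dirichlet can be built, the sketch below draws a composition in two stages: a top-level Dirichlet splits the whole into parts, and the last part is split again by a second Dirichlet. The two-level tree, the parameter values, and the function names are assumptions made for this example; the abstract does not specify the nesting structure or the test itself.

```python
import random
from typing import List


def sample_dirichlet(alphas: List[float], rng: random.Random) -> List[float]:
    """Draw one Dirichlet sample by normalizing independent Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]


def sample_nested_dirichlet(
    top_alphas: List[float], sub_alphas: List[float], rng: random.Random
) -> List[float]:
    """Two-level nested draw (assumed tree): the last top-level part is
    subdivided according to a second, independent Dirichlet."""
    top = sample_dirichlet(top_alphas, rng)
    sub = sample_dirichlet(sub_alphas, rng)
    return top[:-1] + [top[-1] * s for s in sub]


rng = random.Random(2023)  # fixed seed so the example is repeatable
composition = sample_nested_dirichlet([2.0, 3.0, 5.0], [1.0, 4.0], rng)
total = sum(composition)  # equals 1 up to floating-point error
```

The resulting vector has four components (two top-level parts plus the two pieces of the subdivided part), all positive and summing to 1, which is the defining property of compositional data.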

Presenting Author

Monnie McGee, Southern Methodist University

First Author

Bianca Luedeker

CoAuthor(s)

Monnie McGee, Southern Methodist University
Jacob Turner, Southern Methodist University