CS022 Sports Applications
Conference: Symposium on Data Science and Statistics (SDSS) 2023
05/25/2023: 3:45 PM - 5:15 PM CDT
Refereed
Room: Grand Ballroom B
Chair
Katie Bakewell, NLP Logix
Tracks
Practice and Applications
Presentations
American football is unique in that a team's offensive and defensive units typically consist of separate sets of players who are never on the field at the same time, which makes it tempting to evaluate the two units independently. Yet some aspects of a team's defensive (offensive) performance may directly affect the complementary unit, a concept commonly referred to as "complementary football". For example, turnovers forced by the defense can lead to easier scoring chances for the offense, while the offense's ability to control the clock may in turn help the defense. Moreover, the ability to objectively rank teams' offenses and defenses is of particular importance in American college football (CFB), where rankings carry heavy title and playoff implications. Our main goal is to identify the most consistently influential features of complementary football in a data-driven way, and then to adjust each team's offensive (defensive) performance for that of its complementary unit. To that end, we use sequential play-by-play data from the 2014-2021 CFB seasons to alleviate the reverse causality that permeates game totals, focusing on how the complementary unit's (e.g. the defense's) performance on the preceding drive might affect the other unit's (the offense's) performance on the current drive. Variable selection methods are used to pick the most important complementary football features to adjust for, alongside strength of schedule and home-field advantage (both shown to be especially pivotal in the college game). The result is a better-informed ranking of each team's offense and defense and a more careful evaluation of their strengths and weaknesses.
Presenting Author
Andrey Skripnikov, New College of Florida
First Author
Andrey Skripnikov, New College of Florida
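The preceding-drive idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the drive records, the field names (`game_id`, `drive_no`, `offense`, `points`, `turnover`) and the helper `add_previous_drive_features` are all hypothetical, and real play-by-play data would also need handling for halftime, kickoffs and overtime. The sketch attaches to each offensive drive a flag saying whether the same team's defense forced a turnover on the drive immediately before it.

```python
from collections import defaultdict

# Hypothetical toy drive records; every field name and value is an assumption.
drives = [
    {"game_id": 1, "drive_no": 1, "offense": "A", "points": 0, "turnover": True},
    {"game_id": 1, "drive_no": 2, "offense": "B", "points": 7, "turnover": False},
    {"game_id": 1, "drive_no": 3, "offense": "A", "points": 3, "turnover": False},
    {"game_id": 2, "drive_no": 1, "offense": "C", "points": 7, "turnover": False},
]


def add_previous_drive_features(drives):
    """Return copies of the drives with a lagged complementary-unit feature.

    prev_forced_turnover is True/False when the preceding drive in the same
    game belonged to the opponent (so this team's defense was on the field),
    and None when there is no such preceding drive.
    """
    by_game = defaultdict(list)
    for drive in drives:
        by_game[drive["game_id"]].append(drive)

    rows = []
    for game_drives in by_game.values():
        previous = None
        for drive in sorted(game_drives, key=lambda d: d["drive_no"]):
            row = dict(drive)
            if previous is not None and previous["offense"] != drive["offense"]:
                row["prev_forced_turnover"] = previous["turnover"]
            else:
                row["prev_forced_turnover"] = None
            rows.append(row)
            previous = drive
    return rows


rows = add_previous_drive_features(drives)
```

Because the feature is taken from a drive that ended before the current one began, the current drive's outcome cannot have caused it, which is the sense in which sequential data sidesteps the reverse causality present in game totals.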
Compositional data consist of multiple components that are parts of a whole, so the component proportions within each observation must sum to 1. We show how such data can be modeled with a nested Dirichlet distribution and present a test for differences in the means of compositional data among G > 2 populations.
Presenting Author
Monnie McGee, Southern Methodist University
First Author
Bianca Luedeker
CoAuthor(s)
Monnie McGee, Southern Methodist University
Jacob Turner, Southern Methodist University
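The sum-to-1 constraint described in the abstract above can be illustrated with a minimal sketch. It uses the ordinary Dirichlet distribution, not the nested Dirichlet model or the test from the talk, and the parameter vector `alpha` is a hypothetical choice. A Dirichlet draw is obtained by normalizing independent Gamma variates, and the mean of component i is alpha_i divided by the sum of all alpha values.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one composition from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def component_means(samples):
    """Average each component across a list of compositions."""
    n = len(samples)
    k = len(samples[0])
    return [sum(s[j] for s in samples) / n for j in range(k)]


rng = random.Random(0)
alpha = [2.0, 3.0, 5.0]  # hypothetical concentration parameters
samples = [sample_dirichlet(alpha, rng) for _ in range(5000)]

# Every composition lies on the simplex: non-negative parts that sum to 1.
sums = [sum(s) for s in samples]

# Empirical means should be close to the theoretical means alpha_i / sum(alpha).
means = component_means(samples)
expected = [a / sum(alpha) for a in alpha]
```

Since the components are tied together by the constraint, comparing group means one component at a time ignores their dependence, which is one motivation for a joint test on the whole composition.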