Regression models for estimation of park effects in Major League Baseball

Richard Levine Co-Author
San Diego State University
 
Jason Osborne First Author
North Carolina State University
 
Jason Osborne Presenting Author
North Carolina State University
 
Tuesday, Aug 6: 9:20 AM - 9:35 AM
3657 
Contributed Papers 
Oregon Convention Center 

Description

It is well known that some ballparks in Major League Baseball are more conducive to scoring than others. Estimation of "park factors" that quantify these differences has received considerable attention in industry and in the literature, but has not been without criticism. We make two contributions towards improving the estimation of these effects. First, we compute, for each ballpark, the runs and home runs achieved by all players (home and visiting) with plate appearances at the park when they visit all other parks. This "elsewhere" measure of performance can be used to quantify the offensive strength of schedule observed at each park. Second, we fit generalized linear models to test data to estimate probabilities of a variety of outcomes (e.g., home runs, doubles, foul outs) that are specific to batter-pitcher handedness combinations. These regression models use handedness-specific relative frequencies of events, computed from training data, as explanatory variables. The fitted models are then used to compute handedness-specific event probabilities adjusted to league averages of event probabilities, which we define as park factors.
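As a rough illustration of the "elsewhere" measure, the sketch below pools the players who batted at each park and computes their runs and home runs per plate appearance at all other parks. The record layout (batter, park, runs, home runs), the function name `elsewhere_rates`, the unweighted pooling, and the toy data are all assumptions made for this example; the abstract does not specify how the measure is aggregated or weighted.

```python
from collections import defaultdict

def elsewhere_rates(plate_appearances):
    """Return {park: (runs per PA, home runs per PA)} measured at all OTHER
    parks, pooled over every batter who had a plate appearance at that park.

    Each record is a hypothetical tuple: (batter, park, runs, home_runs).
    """
    # totals[batter][park] = [plate appearances, runs, home runs]
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0, 0]))
    batters_at = defaultdict(set)
    for batter, park, runs, hr in plate_appearances:
        t = totals[batter][park]
        t[0] += 1
        t[1] += runs
        t[2] += hr
        batters_at[park].add(batter)

    result = {}
    for park, batters in batters_at.items():
        pa = runs = hr = 0
        for b in batters:
            for other, (n, r, h) in totals[b].items():
                if other != park:  # exclude performance at the park itself
                    pa += n
                    runs += r
                    hr += h
        # No plate appearances elsewhere means the measure is undefined.
        result[park] = (runs / pa, hr / pa) if pa else (None, None)
    return result

# Toy data (invented for illustration only).
sample = [
    ("a", "X", 1, 1), ("a", "Y", 0, 0), ("a", "Y", 1, 0),
    ("b", "X", 0, 0), ("b", "Z", 1, 1),
    ("c", "Y", 0, 0), ("c", "Z", 0, 0),
]
rates = elsewhere_rates(sample)
```

In this toy data, a park whose visitors hit well elsewhere (high elsewhere rates) has faced a stronger offensive schedule, which is the kind of adjustment the abstract describes; a weighted version (for example, by plate appearances at the park) would be an equally plausible reading.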

Keywords

Generalized linear models.

Regression. Analysis of covariance. Covariate-adjustment.

Baseball. Park Factors. 

Main Sponsor

Section on Statistics in Sports