For everyone really into baseball AND Bayes' Theorem, this post is for you. I finally got around to posting the Python code implementing the algorithm described in my paper, A Point-Mass Mixture Random Effects Model for Pitching Metrics. Sabermetrics, the study of statistical patterns in baseball, is a huge mess. Everyone is proposing new metrics, and those evaluating them usually don't have the credentials or skills to do so correctly. In this paper, we take a statistically rigorous approach and argue that a metric must (i) have a large fraction of players who differ from the league average and (ii) give high confidence about which players are not league average. We of course define these requirements rigorously within the paper.
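To give a feel for what those two requirements mean, here is a toy sketch of a point-mass mixture. It is not the model from the paper and not the sampler in bb.py: it assumes each player's deviation from the league average is either exactly zero or drawn from a zero-mean normal "slab", that all variances and the prior fraction are known, and that the observed deviations (made-up numbers below) have normal noise. The function names and parameter values are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean normal with variance `var`."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def prob_not_average(y, noise_var, slab_var, prior_p):
    """Posterior probability that a player's true effect is nonzero.

    Toy model (an assumption, not the paper's):
      effect = 0                    with probability 1 - prior_p
      effect ~ Normal(0, slab_var)  with probability prior_p
      y      ~ Normal(effect, noise_var)
    """
    # Marginal likelihood of y under each mixture component
    alt = prior_p * normal_pdf(y, noise_var + slab_var)
    null = (1.0 - prior_p) * normal_pdf(y, noise_var)
    return alt / (alt + null)


# Made-up observed deviations from the league average, for illustration only
deviations = [0.000, 0.005, 0.020, -0.035]

probs = [
    prob_not_average(y, noise_var=0.01 ** 2, slab_var=0.02 ** 2, prior_p=0.3)
    for y in deviations
]

# Share of players we are at least 90% sure are not league average
frac_confident = sum(p > 0.9 for p in probs) / len(probs)

for y, p in zip(deviations, probs):
    print(f"deviation {y:+.3f}: P(not league average) = {p:.3f}")
print(f"fraction flagged with > 90% confidence: {frac_confident:.2f}")
```

In this toy setup, requirement (i) corresponds to the mixture weight on the slab being large, and requirement (ii) corresponds to the per-player probabilities sitting close to 0 or 1 rather than hovering in the middle. The actual code estimates these quantities by sampling instead of plugging in known values.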
The .tar.gz archive contains five files:
- bb-main.py – the main function that runs the sampler
- bb.py – class and function definitions
- BABIP.csv – a data file for the BABIP (batting average on balls in play) metric
- pitching_column_info.csv – index file needed because we were doing all runs concurrently on the Wharton Grid
- BABIP.sh – shell script for running the sampler under some default parameters for BABIP
A companion manuscript, A Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics, is currently under review.