Bayesian computation is so hot right now!

Posted: March 23rd, 2011 | Filed under: Python, Statistics

For everyone really into baseball AND Bayes’ Theorem, this post is for you. I finally got around to posting the Python code implementing the algorithm described in my paper, A Point-Mass Mixture Random Effects Model for Pitching Metrics. Sabermetrics, the study of statistical patterns in baseball, is a huge mess. Everyone is proposing new metrics, and those evaluating them usually don’t have the credentials or skills to do so correctly. In this paper, we take a statistically rigorous approach and argue that a good metric must (i) have a large fraction of players who are different from the league average and (ii) give high confidence about which players are not league average. We of course rigorously define these requirements within the paper.
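To give a feel for what "point-mass mixture" means here, below is a minimal sketch, not the model from the paper. It assumes a toy setup in which a player's true deviation from the league average is either exactly zero (the point mass) or drawn from a normal "slab", and the observed metric is that truth plus normal noise. The function names, the normal likelihood, and every number in the usage section are hypothetical choices made for illustration only.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def prob_not_average(y, league_mean, noise_var, slab_var, prior_nonzero):
    """Posterior probability that a player is NOT league average.

    Toy model (an assumption, not the paper's specification):
      with probability 1 - prior_nonzero the player's effect is exactly 0,
      otherwise the effect is drawn from Normal(0, slab_var);
      the observation is y ~ Normal(league_mean + effect, noise_var).
    """
    # Marginal density of y if the effect comes from the slab.
    slab = prior_nonzero * normal_pdf(y, league_mean, noise_var + slab_var)
    # Marginal density of y if the effect is exactly zero (the point mass).
    spike = (1.0 - prior_nonzero) * normal_pdf(y, league_mean, noise_var)
    return slab / (slab + spike)


if __name__ == "__main__":
    # All values below are made up for illustration.
    league_mean = 0.300   # hypothetical league-average value of the metric
    noise_var = 0.00042   # hypothetical sampling variance of one observation
    slab_var = 0.0001     # hypothetical spread of true non-zero effects
    for y in (0.300, 0.330, 0.360):
        p = prob_not_average(y, league_mean, noise_var, slab_var, 0.5)
        print(f"observed {y:.3f}: P(not league average) = {p:.3f}")
```

In this toy version, requirement (i) corresponds to the mixture weight `prior_nonzero` being large, and requirement (ii) corresponds to the per-player posterior probabilities sitting close to 0 or 1 rather than hovering near the prior. The actual sampler in the download estimates these quantities from data rather than taking them as fixed inputs.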

The .tar.gz contains 5 files:

  1. – the main function that runs the sampler
  2. – class and function definitions
  3. BABIP.csv – a data file for the BABIP (batting average on balls in play) metric
  4. pitching_column_info.csv – index file needed because we were doing all runs concurrently on the Wharton Grid
  5. – shell script for running the sampler under some default parameters for BABIP

A companion manuscript, A Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics, is currently under review.