A few days ago, I took a broad look at the 24 possible base/out states that determine the course of an individual inning. Now I’d like to go a little more in-depth and look at how we can use these states to quantify player performance. First, let’s look at our major tool: a number called Run Expectancy (RE). As its name suggests, this number shows the expected number of runs scored from a given base/out state to the end of the inning. It is important to note that RE does not take into account runs already scored in the inning. Here is a table of approximate run expectancies for the 2008 NL.
As common sense would imply, RE is much higher with fewer outs and with more men on base, which is what anyone with baseball knowledge would expect. The real question is what we can use this for. Let’s take a look at how run expectancy works for our sample inning from my last post.
BOTTOM 8TH INNING:
Baserunners: ___ Outs: 0 Runs Scored: 0 RE: .500
Sabathia was called out on strikes;
Baserunners: ___ Outs: 1 Runs Scored: 0 RE: .264 (Change: -.236)
Cameron singled to left;
Baserunners: 1__ Outs: 1 Runs Scored: 0 RE: .524 (Change: +.260)
Durham flied to right;
Baserunners: 1__ Outs: 2 Runs Scored: 0 RE: .230 (Change: -.294)
Braun homered [Cameron scored];
Baserunners: ___ Outs: 2 Runs Scored: 2 RE: .104 (Change: -.126 + 2 runs scored = +1.874)
Fielder struck out;
Baserunners: ___ Outs: 3 Runs Scored: 2 RE: .000 (Change: -.104)
Inning Summary: 2 R, 2 H, 0 E, 0 LOB. Cubs 1, Brewers 3.
So here we can assign a run value to each event: the change in RE plus any runs that scored on the play. Sabathia’s K was worth -.236 runs, Cameron’s single was worth +.260, Durham’s fly out was worth -.294, Braun’s HR was worth +1.874, and Fielder’s K was worth -.104. The important thing to remember is that these values are measured against the average. The theoretical average ML player would have a 0.00 RE24 (this is what FanGraphs calls the stat) at the end of the season (the closest example this year was Hunter Pence with a -0.23 RE24). For a hitter, RE24 covers only hitting and does not take defense into account, so it is not a total player value stat either. The pitcher is assigned the negative of each play’s value. So the pitcher here received +.236, then -.260, then +.294, then -1.874, then +.104. As a result, the total RE24 between offense and defense for each play (and thus each game) is 0.
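The bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the RE lookup holds just the five states that appear in the sample inning rather than the full 24-state table, and the names `run_value`, `plays`, `batting`, and `pitching` are made up for this example.

```python
# Run expectancy keyed by (baserunners, outs). Only the states that occur
# in the sample inning are included here, not the full 24-state table.
RE = {
    ("___", 0): 0.500,
    ("___", 1): 0.264,
    ("1__", 1): 0.524,
    ("1__", 2): 0.230,
    ("___", 2): 0.104,
}


def run_value(before, after, runs_scored):
    """Change in RE across a play, plus any runs that scored on it."""
    # Once the third out is made, the inning is over and RE is 0.
    re_after = 0.0 if after[1] == 3 else RE[after]
    return re_after - RE[before] + runs_scored


# (description, state before, state after, runs scored on the play)
plays = [
    ("Sabathia K", ("___", 0), ("___", 1), 0),
    ("Cameron 1B", ("___", 1), ("1__", 1), 0),
    ("Durham fly out", ("1__", 1), ("1__", 2), 0),
    ("Braun HR", ("1__", 2), ("___", 2), 2),
    ("Fielder K", ("___", 2), ("___", 3), 0),
]

# Batter credit for each play, rounded to three decimals like the log above.
batting = {name: round(run_value(b, a, r), 3) for name, b, a, r in plays}

# The pitcher receives the negative of each value, so every play sums to 0.
pitching = {name: -value for name, value in batting.items()}

for name in batting:
    print(f"{name:15s} batter {batting[name]:+.3f}  pitcher {pitching[name]:+.3f}")

inning_total = round(sum(batting.values()), 3)
print(f"Offense total: {inning_total:+.3f}")
```

The offense total for the inning works out to the 2 runs actually scored minus the .500 the inning was expected to produce at the start, which is another way of seeing that RE24 is measured against the average.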
One use of RE is to create something known as Win Expectancy. This is how the FanGraphs live game graphs work, and it is what WPA (Win Probability Added) and WPA/LI are derived from. A table showing all the numbers is contained in The Book, but it’s far too large to reproduce here. Win Expectancy, or WE, is derived from Run Expectancy along with empirical data on how often teams win given the current scoring margin. Like RE24, WPA sums to 0 between offense and defense on every play, and win expectancies are likewise measured against an average baseline.
WPA is almost completely a retrodictive stat: it is very good at telling you what happened, but it will not do a good job of telling you what happens next. However, we can still use RE to produce a very good predictive stat, and that is Linear Weights. Basically, Linear Weights ask this question: how much does RE change on the average 1B, 2B, 3B, HR, etc.? (This can cover as many events as you can think of: bunt, pickoff, sac fly, balk, whatever.) The answer is generally found using Retrosheet play-by-play data, and it is also what wOBA and WAR are based on.
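As a rough sketch of the idea, a linear weight is just the average run value of every play of a given type. The Python below reuses the five plays from the sample inning purely to show the mechanics. Five plays are nowhere near a meaningful sample, so the output is not a set of real linear weights. A real calculation would run over a full season of play-by-play data. The event labels and the `linear_weights` function are made up for this example.

```python
from collections import defaultdict

# (event type, run value) pairs taken from the sample inning above.
# "OUT" is a label invented here for Durham's fly out.
observations = [
    ("K", -0.236),
    ("1B", 0.260),
    ("OUT", -0.294),
    ("HR", 1.874),
    ("K", -0.104),
]


def linear_weights(observations):
    """Average the run value of each event type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, value in observations:
        totals[event] += value
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


weights = linear_weights(observations)
for event, weight in sorted(weights.items()):
    print(f"{event:4s} {weight:+.3f}")
```

Notice that the two strikeouts in the inning had different run values (-.236 and -.104) because they came in different base/out states. Averaging over every strikeout in a season is what strips out that context.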
Hopefully this doesn’t create more questions than it answers, and feel free to comment or email me at firstname.lastname@example.org if you have any questions on this stuff. I also recommend reading Tango, Lichtman, and Dolphin’s The Book, which completely changed my view on baseball.