Sabermetrics

Measuring the Accuracy of Baseball's Win-Percentage Estimators

Zach Fein, Analyst I, September 6, 2008

There are several formulas that can be used to estimate a team's "real" record: the Pythagorean Formula, Pythagenport, Pythagenpat, etc. Some use run differential and some use the ratio of runs scored to runs allowed.

The question is: Which is the most accurate? The least?

Using the Lahman Database, I tested 13 different methods on every team since 1921 (the end of the Dead-Ball era), excluding the strike-shortened 1981 and 1994 seasons, to find the most accurate way to estimate a team's expected record.

The RMSE (root-mean-square error) in the table is calculated by squaring each error (in this case, the difference between a team's actual wins and expected wins), averaging all of those squared errors, then taking the square root of the average.
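The calculation just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used for the study; the function name `rmse` is my own, and the three sample teams are taken from the luck table later in this article.

```python
import math

def rmse(actual_wins, expected_wins):
    """Root-mean-square error between actual and expected win totals."""
    if len(actual_wins) != len(expected_wins):
        raise ValueError("inputs must have the same length")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(a - e) ** 2 for a, e in zip(actual_wins, expected_wins)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Atlanta, Minnesota and the LA Angels: actual wins vs. expected wins.
actual = [62, 78, 85]
expected = [68.5, 78.3, 75.8]
print(round(rmse(actual, expected), 3))  # 6.506
```

Because the errors are squared before averaging, one badly missed team (the Angels here) moves the result much more than several near misses.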

The formulas for each method are at the end of this article, to save space.

Here are the results.

RMSE of each method since 1921
Method         RMSE (wins)
Pythagenport   3.990
Pythagenpat    3.992
Palmer-RPW     4.015
Tango-RPW      4.021
Pythag-1.83    4.022
Ben V-L        4.024
Pythag-2       4.096
RPW=10         4.104
Soolman        4.111
RPW=RPG        4.156
E.Cook         4.537
Double Edge    4.606
Kross          5.124


What's funny is that Clay Davenport, the inventor of Pythagenport, denounced his own method in favor of Pythagenpat, yet Pythagenport comes out as the best method when compared to actual records, if only barely (3.990 vs. 3.992).

Earnshaw Cook may have been the first to create a win-percentage estimator, and the Double Edge method created by Bill James was never actually used, so both can be forgiven for finishing near last.

The Kross method, on the other hand, cannot be, as it was supposedly a precise way to estimate winning percentage.

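Using the Pythagenport formula, we can identify which teams have been lucky or unlucky so far this season by comparing their actual wins to their Pythagenport expected wins. In the table, Diff. is expected wins minus actual wins, so a positive number marks an unlucky team and a negative number a lucky one.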

Team            Wins   Exp. Wins   Diff.
Atlanta           62        68.5     6.5
Cleveland         68        73.9     5.9
Toronto           75        80.0     5.0
San Diego         54        58.0     4.0
Seattle           55        58.9     3.9
Philadelphia      77        80.3     3.3
Baltimore         63        65.7     2.7
Boston            83        85.7     2.7
Chicago Cubs      85        87.7     2.7
Oakland           64        66.5     2.5
Detroit           67        68.7     1.7
LA Dodgers        71        72.6     1.6
Arizona           71        72.0     1.0
Washington        54        54.7     0.7
Chicago Sox       79        79.6     0.6
St. Louis         75        75.4     0.4
Minnesota         78        78.3     0.3
NY Mets           79        79.1     0.1
NY Yankees        75        74.7    -0.3
Cincinnati        63        61.4    -1.6
Colorado          67        65.4    -1.6
Milwaukee         81        78.5    -2.5
Pittsburgh        60        57.4    -2.6
San Francisco     60        57.2    -2.8
Kansas City       60        57.1    -2.9
Florida           72        67.6    -4.4
Texas             69        64.4    -4.6
Tampa Bay         85        79.1    -5.9
Houston           74        67.7    -6.3
LA Angels         85        75.8    -9.2
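As a rough sketch of how the last column and the ordering of the table above come about, the snippet below recomputes the difference for five of the teams and sorts them from most unlucky to most lucky. The helper name `luck_table` is hypothetical; the win totals are copied from the table.

```python
# (team, actual wins, Pythagenport expected wins), copied from the table above
rows = [
    ("LA Angels", 85, 75.8),
    ("Atlanta", 62, 68.5),
    ("NY Mets", 79, 79.1),
    ("Tampa Bay", 85, 79.1),
    ("Toronto", 75, 80.0),
]

def luck_table(rows):
    """Return (team, diff) pairs sorted from most unlucky to most lucky.

    diff is expected wins minus actual wins, as in the table's last column.
    """
    diffs = [(team, round(exp - wins, 1)) for team, wins, exp in rows]
    return sorted(diffs, key=lambda pair: pair[1], reverse=True)

for team, diff in luck_table(rows):
    label = "unlucky" if diff > 0 else "lucky" if diff < 0 else "even"
    print(f"{team:<10} {diff:+5.1f}  {label}")
```

A positive difference means the team has won fewer games than its runs suggest, which is why Atlanta sorts to the top and the Angels to the bottom.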


The Angels have won so many close games that they have far more wins than their runs would predict, and their closer has had more save opportunities as a result. Tampa Bay is also near the bottom of the table, among the "lucky" teams, and guess what: by expected wins, the Blue Jays (80.0) should be ahead of them (79.1)!

—————————————————————————————————————————————

Differential formulas

W% = X * (R - RA) / G + .5

Where X depends on the method:

Palmer-RPW: 1 / (10 * sqrt(runs per inning))

Tango-RPW: 1 / (RPG / 2 + 5), where RPG = (runs allowed + runs scored)/(games played)

RPW=10 : 0.1

RPW=RPG : 1 / RPG
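Here is a minimal Python sketch of the four differential methods above. The function name, the method keys and the `innings` parameter are my own choices for illustration. For Palmer-RPW I am assuming "runs per inning" means both teams' runs combined, and that a game lasts nine innings when no inning count is supplied.

```python
import math

def differential_win_pct(runs_scored, runs_allowed, games, method, innings=None):
    """W% = X * (R - RA) / G + .5, where X is 1 / (runs per win)."""
    total_runs = runs_scored + runs_allowed
    rpg = total_runs / games  # both teams' runs per game
    if method == "palmer":
        # Assumption: combined runs per inning, nine innings per game by default.
        if innings is None:
            innings = 9 * games
        x = 1 / (10 * math.sqrt(total_runs / innings))
    elif method == "tango":
        x = 1 / (rpg / 2 + 5)
    elif method == "rpw10":
        x = 0.1
    elif method == "rpg":
        x = 1 / rpg
    else:
        raise ValueError(f"unknown method: {method}")
    return x * (runs_scored - runs_allowed) / games + 0.5

# Hypothetical team: 800 runs scored, 700 allowed, 162 games.
for name in ("palmer", "tango", "rpw10", "rpg"):
    print(name, round(differential_win_pct(800, 700, 162, name), 4))
```

All four share the same shape and differ only in how many runs they say it takes to buy one win. At exactly 10 combined runs per game, Tango-RPW, RPW=10 and RPW=RPG all give X = 0.1.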

~~~

Ratio formulas:

W% = (RS^x)/(RS^x + RA^x)

Where x depends on the method:

Pythagenport: 1.5 * log10(RPG) + .45 (log10 is the base-10 logarithm)

Pythagenpat: RPG^.287

Pythag-1.83: 1.83

Pythag-2: 2
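The ratio methods can be sketched the same way. The function names and method keys below are hypothetical, and the Pythagenport logarithm is taken to be base 10. RPG is both teams' runs per game, as defined for Tango-RPW.

```python
import math

def ratio_exponent(method, rpg):
    """Exponent x for each ratio method; rpg is both teams' runs per game."""
    if method == "pythagenport":
        return 1.5 * math.log10(rpg) + 0.45  # assumes a base-10 logarithm
    if method == "pythagenpat":
        return rpg ** 0.287
    if method == "pythag-1.83":
        return 1.83
    if method == "pythag-2":
        return 2.0
    raise ValueError(f"unknown method: {method}")

def ratio_win_pct(runs_scored, runs_allowed, games, method):
    """W% = RS^x / (RS^x + RA^x)."""
    x = ratio_exponent(method, (runs_scored + runs_allowed) / games)
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

# Hypothetical team: 800 runs scored, 700 allowed, 162 games.
for name in ("pythagenport", "pythagenpat", "pythag-1.83", "pythag-2"):
    pct = ratio_win_pct(800, 700, 162, name)
    print(f"{name:<13} W% = {pct:.4f}, expected wins = {pct * 162:.1f}")
```

Pythagenport and Pythagenpat let the exponent move with the run environment, while the two fixed-exponent versions treat every season alike.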

~~~

Others

Ben V-L: W% = 0.91 * (RS-RA) / (RS+RA) + .5

Soolman: W% = (0.102 * RS - 0.103 * RA) / G + .505

E.Cook: W% = 0.484 * RS / RA

Double Edge: W% = (RS / RA * 2 - 1) / (RS / RA * 2)

Kross: For teams with RS>RA, W% = RS / (2 * RA) , and for teams with RA>RS, W% = 1 - RA / (2 * RS) . I used the first formula for teams with an equal number of runs scored and runs allowed.
