
Measuring the Accuracy of Baseball's Win-Percentage Estimators

by Zach Fein (Analyst)


September 06, 2008


Several formulas can be used to estimate a team's "real" record: the Pythagorean Formula, Pythagenport, Pythagenpat, and others. Some are based on run differential (runs scored minus runs allowed), and some are based on the ratio of runs scored to runs allowed.
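To make the two families concrete, here is a minimal sketch. It assumes a Pythagorean formula with a fixed exponent of 2 for the ratio family and a flat 10 runs per win for the differential family; the function names and the sample team are hypothetical, and the exact formulas tested in this article may differ in their details.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Ratio-style estimator: depends only on the ratio of runs scored to runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def runs_per_win_win_pct(runs_scored, runs_allowed, games, runs_per_win=10.0):
    """Differential-style estimator: every `runs_per_win` runs of differential is worth one win."""
    return 0.5 + (runs_scored - runs_allowed) / (runs_per_win * games)


# Hypothetical team: 800 runs scored, 700 runs allowed over 162 games.
GAMES = 162
ratio_wins = GAMES * pythagorean_win_pct(800, 700)
diff_wins = GAMES * runs_per_win_win_pct(800, 700, GAMES)

print(f"Pythagorean (exponent 2): {ratio_wins:.1f} expected wins")
print(f"Runs-per-win (10 runs):   {diff_wins:.1f} expected wins")
```

Both estimators put this team at roughly 91 to 92 wins; the methods mostly diverge for teams with extreme run totals or unusual run environments.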

The question is: Which is the most accurate? The least?

Using the Lahman Database, I tested 13 different methods on every team season since 1921 (the end of the Dead-Ball era) to find the most accurate way to estimate a team's expected record. I excluded 1981 and 1994, since both seasons were cut short by players' strikes.

The RMSE (root-mean-square error) in the table is calculated by squaring each team's error (here, the difference between its actual wins and its expected wins), averaging those squared errors, and then taking the square root of the average.
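The calculation described above can be sketched in a few lines. The function name and the three sample teams are hypothetical and only illustrate the arithmetic; they are not drawn from the Lahman Database.

```python
import math


def rmse(actual_wins, expected_wins):
    """Root-mean-square error between actual and expected win totals."""
    if len(actual_wins) != len(expected_wins) or not actual_wins:
        raise ValueError("inputs must be non-empty and the same length")
    # Square each team's error, average the squares, then take the square root.
    squared_errors = [(a - e) ** 2 for a, e in zip(actual_wins, expected_wins)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sample: errors of +3, -4, and 0 wins.
actual = [95, 81, 70]
expected = [92.0, 85.0, 70.0]
result = rmse(actual, expected)
print(f"RMSE: {result:.3f} wins")
```

Because the errors are squared before averaging, one badly missed team raises the RMSE more than several small misses do.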

To save space, the formulas for each method are listed at the end of this article.

Here are the results.

RMSE of each method since 1921

Method         RMSE (wins)
Pythagenport   3.990
Pythagenpat    3.992
Palmer-RPW     4.015
Tango-RPW      4.021
Pythag-1.83    4.022
Ben V-L        4.024
Pythag-2       4.096
RPW=10         4.104
Soolman        4.111
RPW=RPG        4.156
E.Cook         4.537
Double Edge    4.606
Kross          5.124


What's funny is that Clay Davenport, the inventor of Pythagenport, denounced his method in favor of Pythagenpat, yet Pythagenport turns out to be the most accurate method when compared with actual records, if only by 0.002 wins of RMSE.

Earnshaw Cook may have been the first to create a win-percentage estimator, and the Double Edge method created by Bill James was never actually used, so both can be forgiven for finishing near last.

The Kross method, on the other hand, cannot be, as it was supposedly a precise way to estimate winning percentage.

Using the Pythagenport formula, we can identify teams that have been lucky or unlucky by comparing their actual wins with their expected wins.
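A rough sketch of that comparison follows. The exponent used here, 1.5 * log10(runs per game) + 0.45, is the commonly cited form of Pythagenport; treat it as an assumption, since it may differ from the exact version behind the table. The function names and the sample teams are hypothetical.

```python
import math


def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Exponent that scales with the run environment (assumed commonly cited form)."""
    runs_per_game = (runs_scored + runs_allowed) / games
    return 1.5 * math.log10(runs_per_game) + 0.45


def pythagenport_expected_wins(runs_scored, runs_allowed, games):
    """Expected wins from a Pythagorean ratio with the Pythagenport exponent."""
    x = pythagenport_exponent(runs_scored, runs_allowed, games)
    win_pct = runs_scored ** x / (runs_scored ** x + runs_allowed ** x)
    return games * win_pct


def luck(actual_wins, runs_scored, runs_allowed, games):
    """Positive means the team won more games than its runs suggest."""
    return actual_wins - pythagenport_expected_wins(runs_scored, runs_allowed, games)


# Hypothetical teams: (name, actual wins, runs scored, runs allowed, games).
teams = [
    ("Team A", 95, 780, 720, 162),
    ("Team B", 78, 760, 700, 162),
    ("Team C", 81, 700, 700, 162),
]

# Sort from luckiest to unluckiest.
ranked = sorted(teams, key=lambda t: luck(t[1], t[2], t[3], t[4]), reverse=True)
for name, wins, rs, ra, g in ranked:
    print(f"{name}: {luck(wins, rs, ra, g):+.1f} wins versus expectation")
```

A team that scores exactly as many runs as it allows is expected to finish at .500, so any distance from 81 wins in a 162-game season counts entirely as luck under this measure.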


