Post originally published at the author's blog, with various code removed.
It is the favorite time of year for many a sports nerd like myself: the time when the Baseball Writers Association of America will make their picks for the Hall of Fame, and when the blogosphere is best equipped to mock and ridicule the inconsistent logic of many esteemed writers.
It is also a favorite time for anyone who has ever said, "I cannot believe they voted for...," or complained that someone was robbed (see Whitaker, Lou).
Last year's epic battle, in my mind, was the one about Edgar Martinez. Supporters cited batting numbers comparable to those of legends, put up in a mediocre hitting park, and a career that shows no signs of PED use. Detractors countered that he was a designated hitter and that his career was short.
When all was said and done, Martinez received a mere 36.2 percent of the vote, less than half of what one needs to reach the Hall of Fame. So, knowing this, what would Edgar's chance of reaching Cooperstown be?
Would you believe a 69.09 percent chance?
It seems counter-intuitive that a candidate who has yet to convince almost two-thirds of the voting base of his greatness, six years after his career ended, could see his fortunes change so rapidly.
However, it happens regularly: only two men from 1976-97 received a higher share of the vote on their first ballot and still missed out on election by the writers. One of them, Jim Bunning, eventually gained entry through the Veterans Committee.
As we saw from yesterday's post, the logit model can provide a powerful probability estimator given a dummy dependent variable. In this case, we test whether someone reached the Hall of Fame (y=1) or not (y=0).
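For readers who missed that post, here is a minimal sketch of how a one-variable logit model turns a first-ballot vote share into a probability. The function name and the coefficients are made up for illustration; they are not the fitted values from this analysis.

```python
import math

def logit_probability(intercept, slope, first_ballot_pct):
    """P(y = 1) under a logit model with a single predictor."""
    z = intercept + slope * first_ballot_pct
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only so the curve crosses 50-50 at 30 percent.
for pct in (10.0, 30.0, 50.0):
    print(pct, round(logit_probability(-3.0, 0.1, pct), 3))
```

The probability rises with vote share and passes one half exactly where the linear part, intercept + slope * share, equals zero.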
To perform this analysis, I looked at all Hall of Fame votes from 1976-97 and took the share of the vote each player received on his first ballot, excluding those who received less than five percent (which leaves a zero probability of being elected by the writers, and only a small chance of being elected by the Veterans Committee). Through this process, I obtained a data sheet of 59 players, as can be seen here.
Right away, one can make some general observations. Thirty-three of the 59 players listed were eventually elected to the Hall by the writers, a 55.93 percent success rate. Additionally, three more were elected by the Veterans Committee, bringing the total success rate of the group to 61.02 percent.
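The two success rates follow directly from the counts quoted above; a quick check, with variable names of my own choosing:

```python
players = 59             # first-ballot candidates at five percent or above
elected_by_writers = 33  # eventually elected by the BBWAA
added_by_veterans = 3    # elected later by the Veterans Committee

writers_rate = elected_by_writers / players
overall_rate = (elected_by_writers + added_by_veterans) / players

print(f"{writers_rate:.2%}")  # 55.93%
print(f"{overall_rate:.2%}")  # 61.02%
```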
Simply clearing the first obstacle of making it past the first ballot seems to bode well for the eventual success of candidates.
However, this analysis is imperfect. The success rate includes players who were elected on the first ballot and so faced no resistance in making the Hall. Once again, though, we can easily run a logit regression on the data.
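Since the original code was removed from this version of the post, here is a rough sketch of what such a regression involves: a one-variable logit fit by Newton-Raphson in plain Python. The `toy_share` and `toy_elected` lists are invented for illustration and are not the 59-player data sheet; in practice a statistics package would do this step.

```python
import math

def fit_logit(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # entries of X'WX (the negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented toy data: first-ballot vote share (percent) and eventual election (1/0).
toy_share = [6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 55, 60, 70]
toy_elected = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]

b0, b1 = fit_logit(toy_share, toy_elected)
print("intercept:", b0, "slope:", b1)
```

The toy outcomes deliberately overlap (an election at 22 percent, a miss at 38 percent); if the two groups were perfectly separated, the maximum-likelihood fit would not converge to finite coefficients.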
After performing the data analysis, one comes up with two equations, given in the original blog post: one for election by the BBWAA vote alone, and one for reaching the Hall by any route, Veterans Committee included.
The results are probably moderately encouraging for my fellow Edgar fans. Not even including the Veterans Committee option, Edgar currently stands at better than a coin flip's chance of reaching the Hall, at 56.32 percent. With the Veterans Committee, this probability jumps to 69.09 percent.
So where are the break-even (50-50) points for both equations? For just the BBWAA vote, it is at around 33.8 percent of the initial vote. For overall Hall of Fame chances, it is at around 28.6 percent.
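In a logit model with one predictor, the 50-50 point sits where intercept + slope * share = 0, that is, at share = -intercept / slope. Working backward from that, the sketch below backs out approximate coefficients from each break-even point and Edgar's probability at 36.2 percent. These are rough reconstructions from rounded figures, not the equations from the original post, and the function names are my own.

```python
import math

def log_odds(p):
    return math.log(p / (1.0 - p))

def probability(intercept, slope, pct):
    return 1.0 / (1.0 + math.exp(-(intercept + slope * pct)))

def implied_coefficients(break_even_pct, known_pct, known_prob):
    """Back out (intercept, slope) from the 50-50 point and one other point."""
    # Log-odds are zero at break-even, so the slope is rise over run from there.
    slope = log_odds(known_prob) / (known_pct - break_even_pct)
    intercept = -slope * break_even_pct
    return intercept, slope

# Figures quoted in the post: break-even points, and Edgar at 36.2 percent.
bbwaa = implied_coefficients(33.8, 36.2, 0.5632)
overall = implied_coefficients(28.6, 36.2, 0.6909)
print("BBWAA only (approx.):", bbwaa)
print("Overall (approx.):", overall)
```

Because the break-even points are only quoted to "around" one decimal place, the implied slopes (both close to 0.106 per percentage point of vote share) should be read as ballpark values.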
So for anyone concerned with various parts of the ballot, such as Jeff Bagwell's low initial support, rest assured: the numbers give him far better than a coin flip's chance of being enshrined.
Also note that Barry Larkin has a 91.9 percent chance of reaching Cooperstown, and heck, even Fred McGriff has a 32.2 percent chance of reaching the Hall someday given his first-ballot performance.