Numbers and baseball go hand in hand like peanut butter and jelly. Think about the history of the game and there is usually a number attached. You have numbers like 56, 714, 755, 511, and 59. Poll any group of regular baseball fans and they would be able to tell you the significance of all of those numbers. They could probably tell you the year in which those events happened (for the most part).
Stretch that to the NBA and NFL and you would be hard-pressed to find such a number. There are great moments in those sports to be sure, and they can certainly claim their piece of the American sports landscape. The NFL is arguably the most popular league in America today.
However, fans of those sports do not have the same connection to individual numbers. Quick, what is the record for rushing yards in a season? What is the record for touchdown passes? Who has the most receiving yards in a season? Those records just don’t have the same ring to them.
So, now that we have established statistics’ importance to the baseball fan, we have to establish the importance to the baseball teams and analysts. We start with the basics.
When we describe players, we usually attach a statistic to that description. Someone is a .300 hitter or a 30-home-run hitter. Those descriptions help us identify the quality of a player. They were established more than a century ago to distinguish between good, bad, and mediocre.
Skeptics of in-depth statistics often claim that statistics are a great way of predicting the past. I prefer to think of statistics as tools for painting a portrait of a team or player. Some people are only able to draw stick figures, while others can give those players and teams body, color, and perspective. The same is true with statistics. Some are very general and not particularly descriptive, while others are remarkable. The key is distinguishing between the two.
No artist can turn their subject into something real on the canvas. It can be lifelike, but not alive. The same is true with statistics. I can go into excruciating detail on the hair, eyes, nose, facial features, and body shape, but I can’t make them real.
Statistics will never replace watching the game. I can see a picture of the Grand Canyon, Westminster Abbey, or the Great Wall of China. Those pictures will never replace seeing them in person no matter how vivid they are. Studying statistics will never replace the joy of watching your team come from behind in the ninth, throw a no-hitter, or win a pennant.
Sports always come with some degree of unpredictability. That is what makes them entertaining in the first place. From a management standpoint, we can manage that unpredictability and limit its negative effects with good information. So, with the case made, we need to look at the two key qualities we will look for in any statistic: reliability and validity.
I’m not a mathematician or statistician, and I didn’t stay at a Holiday Inn Express last night. What I am is someone who studies sabermetrics religiously. I can tell you what these statistics are supposed to measure, and I can tell you whether they measure it well. What I can’t always do is explain the math behind them.
In statistical terms, reliability refers to whether results can be reproduced. For instance, if a hitter hits 30 home runs this year, can we reasonably expect him to do the same next year? We could ask the same question of batting average, ERA, slugging percentage, and so on.
Simply put, if those numbers aren’t reliable then they have little predictive value. It doesn’t matter what use we might have for them. They are unreliable.
So, our first task is to find those numbers which are reproduced regularly. If you want to look at players’ statistics you can quickly find those numbers. We will look for those numbers first and then move on to the next step in the process.
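One common way to put a number on reliability is to check how closely the same players' totals in one season track their totals in the next. The sketch below is only an illustration of that idea, not a method taken from any particular study. The home run totals are made up, the hitters are hypothetical, and the helper name pearson_r is my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# ASSUMPTION: made-up home run totals for six hypothetical hitters,
# listed in the same player order for two back-to-back seasons.
home_runs_year_1 = [30, 12, 25, 8, 41, 18]
home_runs_year_2 = [28, 15, 22, 10, 37, 20]

reliability = pearson_r(home_runs_year_1, home_runs_year_2)
print(f"year-to-year correlation: {reliability:.2f}")
```

A value near 1 would suggest the statistic repeats from year to year, while a value near 0 would suggest it mostly does not. Six players is far too few to conclude anything; a real check would use every qualifying player over many seasons.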
A number may be reliable, but it may not be valid. For instance, a pitcher may hit five batters every year in 200 innings. Yet that statistic has very little to do with how successful he is over that many innings. So, when we find those statistics that are reliable, we must then test them to see how much they correlate with success in a particular category (say runs scored or runs allowed).
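The validity test can be sketched the same way: line up a candidate statistic against the outcome we care about and see how strongly the two move together. Everything in the example below is assumed for illustration. The six pitchers are hypothetical, the numbers are invented, and strikeouts per nine innings is simply a second candidate I chose to contrast with hit batters.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# ASSUMPTION: invented season lines for six hypothetical pitchers,
# listed in the same pitcher order in every list.
runs_allowed_per_9 = [3.2, 4.8, 4.1, 3.6, 5.0, 3.0]
candidates = {
    "hit batters per 200 IP": [5, 5, 6, 4, 5, 6],
    "strikeouts per 9 IP": [9.8, 6.1, 7.4, 8.9, 5.8, 10.2],
}

# Correlate each candidate statistic with the outcome (runs allowed).
validity = {
    name: pearson_r(values, runs_allowed_per_9)
    for name, values in candidates.items()
}

# Strongest relationship first; the sign only tells us the direction.
for name, r in sorted(validity.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: r = {r:+.2f}")
```

In this toy data the hit-batter totals barely move with runs allowed, while the strikeout rate moves strongly in the opposite direction. That is the pattern the paragraph describes: a statistic can repeat every year and still say little about success.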
When you find both, you can use numbers from the past to predict the future. Of course, no one can predict the future 100 percent of the time. You are simply moving the odds in your favor.
People in all walks of life do it. They look at research in their field and use that research to apply better practices and techniques. Engineers do it, lawyers do it, doctors certainly do it, and teachers do it.
Shouldn’t those who make decisions within baseball and analyze the game do it as well?