My r-squared value for this regression was .916565.
Translated, this means that 91.66% of the variation in run output could be explained by these basic count statistics for each team. However, problems exist in this data. For one, four count statistics (CS, SB, SF, IBB) had a t-score (the 1st row of this output divided by the 2nd) with an absolute value under 2, meaning that the data lacked the evidence to suggest that any of these statistics has a real relationship with run scoring.
For the more casual reader, the real nonsense is that the coefficient for triples is higher than the coefficient for home runs. Are triples really more valuable than home runs? Of course not. You do not need to be a math major or a baseball guru to know this.
Because of this, I realized the need to expand my data. As I improved my model by adding more team-seasons, I kept finding that I needed to go back further still. Eventually, I stopped at 1967, after compiling 1,064 team-seasons of data. Using these numbers, I came up with this result:
| CS | SB | SO | SH | SF | HBP | IBB | UBB | HR | 3B | 2B | 1B | NSOBIP | b |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -0.1334 | 0.1603 | -0.0974 | -0.0063 | 0.6773 | 0.2990 | 0.2216 | 0.3256 | 1.4479 | 1.1813 | 0.6054 | 0.4977 | -0.1043 | 2.0437 |
| 0.0581 | 0.0226 | 0.0084 | 0.0390 | 0.0899 | 0.0543 | 0.0479 | 0.0109 | 0.0240 | 0.0728 | 0.0233 | 0.0125 | 0.0072 | 28.0662 |

| R-squared | Std. error of estimate | F | Degrees of freedom | Regression SS | Residual SS |
| --- | --- | --- | --- | --- | --- |
| 0.9509 | 21.2318 | 1565.527 | 1050 | 9174373 | 473327.4 |
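The t-score check described earlier (coefficient divided by standard error, flagged when the absolute value is under 2) can be repeated on this output. The following is a minimal sketch, not the original spreadsheet work: the variable names are my own, and it assumes the last two figures in the output are the regression and residual sums of squares, as in a spreadsheet `LINEST`-style layout.

```python
# Values copied from the regression output above.
names = ["CS", "SB", "SO", "SH", "SF", "HBP", "IBB", "UBB",
         "HR", "3B", "2B", "1B", "NSOBIP", "b"]
coefs = [-0.1334, 0.1603, -0.0974, -0.0063, 0.6773, 0.2990, 0.2216,
         0.3256, 1.4479, 1.1813, 0.6054, 0.4977, -0.1043, 2.0437]
std_errs = [0.0581, 0.0226, 0.0084, 0.0390, 0.0899, 0.0543, 0.0479,
            0.0109, 0.0240, 0.0728, 0.0233, 0.0125, 0.0072, 28.0662]

# t-score = coefficient / standard error (1st row divided by 2nd row).
t_scores = {n: c / s for n, c, s in zip(names, coefs, std_errs)}

# Flag terms whose |t| falls under the rule-of-thumb threshold of 2.
weak = [n for n, t in t_scores.items() if abs(t) < 2]

# Assumption: the final two figures are regression SS and residual SS.
ss_reg, ss_resid = 9174373, 473327.4
r_squared = ss_reg / (ss_reg + ss_resid)

for n, t in t_scores.items():
    print(f"{n:>6}: t = {t:8.2f}")
print("Under the |t| < 2 threshold:", weak)
print(f"R-squared from sums of squares: {r_squared:.4f}")
```

With the numbers in the table, only SH and the intercept b fall under the threshold, and the r-squared recomputed from the sums of squares matches the reported 0.9509.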




