Here's a Thought: Two Variables That Affect Pitching (and Many That Don't)

Nathaniel Stoltz, Jul 12, 2009

If you haven't already, definitely check out the last edition of "Here's a Thought."

Basically, that article is a statistical study showing that fastball velocity and usage have absolutely no impact on the effectiveness of the fastball.

I got a few comments right away, so I figured that this stuff interests people. So I decided to look around to try to find any sort of relationship between two variables in pitching.


So, I've looked at the relationship between a number of different things, and here's what I've found:

Fastball Velocity vs. Slider Effectiveness: This pair correlates slightly more strongly than fastball velocity and fastball effectiveness, but the low R-squared (.026) means that the two have little, if anything, to do with each other.

Fastball Velocity vs. Curveball Effectiveness: R-squared of .008. Completely inconsequential.

Fastball Velocity vs. Changeup Effectiveness: R-squared of .002. Completely inconsequential.
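For anyone who wants to reproduce figures like these, here is a minimal sketch of how an R-squared value can be computed by fitting a least-squares line through one variable against another. The function name `r_squared` and the sample numbers are made up for illustration; they are not the data behind this study.

```python
def r_squared(xs, ys):
    """R-squared of a least-squares line fit of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined R-squared")
    # Fit the line, then compare leftover error to total variation.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / syy


# Hypothetical sample: average fastball velocity (mph) and a pitch rating.
velocity = [88.5, 90.1, 91.7, 93.0, 94.6, 96.2]
rating = [0.4, -0.9, 1.1, -0.3, 0.8, -0.5]
print(round(r_squared(velocity, rating), 3))
```

A value near 0 means the line explains almost none of the variation in the second variable; a value of 1 means every point sits exactly on the line.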

So, there you have it.

Fastball velocity has no impact on the effectiveness of anything.

Frustrated with that, I decided to look at something else.

Fastball-Changeup Velocity Differential vs. Changeup Effectiveness: Nope, just a .015 R-squared. I guess velocity differential has nothing to do with it, and it's all just about arm speed and movement, which we can't quantify right now.

So that went nowhere.

I remembered seeing in this article the other day that five of the six pitchers who threw sub-70 mph curves got great results, so I ran a Curve Velocity vs. Curve Effectiveness graph, hoping to see a negative trend.

I didn't get any trend.

None of this is going anywhere.

But pitching can't be random. There's a good reason why good pitchers are good and bad pitchers are bad.

But what the hell is it?

To investigate this, I created a quick-and-easy measure of pitcher effectiveness. It looks like this:

(Fastball Usage*Fastball Effectiveness)+(Slider Usage*Slider Effectiveness)+...same for curveballs, changeups, splitters, cutters, and knuckleballs.
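As a rough sketch, the measure above is just a usage-weighted sum over pitch types. The code below assumes usage is stored as a fraction between 0 and 1 and that a pitch a pitcher doesn't throw contributes zero; both are my assumptions for the example, and the names `PITCH_TYPES` and `overall_effectiveness` and the sample pitcher are hypothetical.

```python
PITCH_TYPES = ("fastball", "slider", "curveball", "changeup",
               "splitter", "cutter", "knuckleball")


def overall_effectiveness(usage, effectiveness):
    """Sum of (usage * effectiveness) across all pitch types.

    usage: dict of pitch type -> share of pitches thrown (assumed 0 to 1).
    effectiveness: dict of pitch type -> rating for that pitch.
    Pitch types missing from either dict count as zero.
    """
    return sum(usage.get(p, 0.0) * effectiveness.get(p, 0.0)
               for p in PITCH_TYPES)


# Hypothetical three-pitch pitcher, not a real stat line.
usage = {"fastball": 0.60, "slider": 0.25, "changeup": 0.15}
effectiveness = {"fastball": 0.5, "slider": 2.0, "changeup": -1.0}
print(overall_effectiveness(usage, effectiveness))
```

Because each rating is weighted by how often the pitch is thrown, a great pitch that rarely gets used moves the total less than a mediocre pitch thrown most of the time.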

It's a pretty good measure of who's pitching well. The top 10 pitchers:

Jeremy Affeldt
Joe Nathan
Rafael Soriano
Ryan Franklin
Nick Masset
Trevor Hoffman
Matt Guerrier
Chris Carpenter
Mark DiFelice
Dan Haren

And the bottom 10:

Mike Lincoln
Rafael Rodriguez
Daisuke Matsuzaka
Kris Benson
Yusmeiro Petit
Chien-Ming Wang
Casey Janssen
Chris Ray
Ervin Santana
Hayden Penn

Anyway, with that variable in place, I started testing other variables to see if any correlated with it.

Suffice it to say, without going into R-squared details, that it simply doesn't matter how hard you throw a pitch or how often you throw it.

Since velocity and usage seemingly have no effect, I decided to turn my attention to control.

Now, I don't have data for control of individual pitch types, but I do have one useful stat: Zone Percentage, which simply measures how often a pitcher's pitches find the strike zone, no matter the outcome.

So I ran Zone Percentage vs. Overall Effectiveness.

Zero correlation.

Fine, I thought, maybe it has something to do with just strikes, not pitches in the zone. I looked at first-pitch strike percentage.

I got a stronger relationship between that and Overall Effectiveness, but if the two variables are related, it's only slight.

I decided to try one last thing before giving up and declaring pitching to be random.

Another nifty Fangraphs stat is Contact Percentage, which is simply the percentage of swings against the pitcher that make contact.

And finally, we have some semblance of correlation.

The R-squared value is just .118, but after staring at .00somethings for four hours, that's great.

So what that means is that as contact against a pitcher goes down, the pitcher's effectiveness goes up.
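One note on reading that result: R-squared by itself can't be negative, so the direction of the relationship comes from the sign of the underlying correlation coefficient. Here is a small sketch of that check. The function name `pearson_r` and the sample numbers are invented for illustration and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample: Contact Percentage and Overall Effectiveness.
contact_pct = [70.0, 74.5, 78.0, 81.5, 85.0]
overall = [1.2, 0.9, 0.1, 0.3, -0.6]
r = pearson_r(contact_pct, overall)
print("negative relationship:", r < 0)
print("R-squared:", round(r * r, 3))
```

For a straight-line fit with one variable, squaring this coefficient gives the R-squared, which is why a negative slope and a positive R-squared can describe the same graph.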

You're probably thinking, "Well, duh!"

Wouldn't you have said that pitching in the zone matters, velocity matters, and first-pitch strikes matter?

Those seemed pretty "Well, duh!" to me before today.

They all apparently don't matter, but contact does.

Rather than ending the study there, I decided to look at contact in the zone and contact out of the zone to see if one correlates more strongly.

I found that contact in the zone is a much better indicator of effectiveness, but that overall contact is better than either component independently.

I also found one other variable that correlates: percentage of pitches outside the zone swung at, or O-Swing percentage. While Z-Swing percentage (percentage of pitches in the zone swung at) has no correlation at all with Overall Effectiveness, O-Swing percentage has a nice R-squared of .102.
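To make the definitions concrete, here is a sketch of how these percentages fall out of pitch-by-pitch data. The input format (one tuple of three true/false flags per pitch) is an assumption made for this example, not how Fangraphs stores its data, and the sample pitches are invented.

```python
def pct(part, whole):
    """Percentage of part in whole, or None when there is nothing to divide by."""
    return 100.0 * part / whole if whole else None


def plate_discipline(pitches):
    """pitches: iterable of (in_zone, swung, made_contact) flags, one per pitch."""
    pitches = list(pitches)
    in_zone = [p for p in pitches if p[0]]
    out_zone = [p for p in pitches if not p[0]]
    z_swings = [p for p in in_zone if p[1]]
    o_swings = [p for p in out_zone if p[1]]
    swings = z_swings + o_swings

    def contacts(group):
        # Number of swings in the group that made contact.
        return sum(1 for p in group if p[2])

    return {
        "Zone%": pct(len(in_zone), len(pitches)),
        "Contact%": pct(contacts(swings), len(swings)),
        "Z-Contact%": pct(contacts(z_swings), len(z_swings)),
        "O-Contact%": pct(contacts(o_swings), len(o_swings)),
        "Z-Swing%": pct(len(z_swings), len(in_zone)),
        "O-Swing%": pct(len(o_swings), len(out_zone)),
    }


# Eight hypothetical pitches: four in the zone, four outside it.
sample = [
    (True, True, True), (True, True, False),
    (True, False, False), (True, True, True),
    (False, True, False), (False, False, False),
    (False, True, True), (False, False, False),
]
for name, value in plate_discipline(sample).items():
    print(name, value)
```

Note that the swing percentages are measured against pitches, while the contact percentages are measured against swings only.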

So, after looking through all the velocity, usage, contact, and swing variables, I've found two that indicate the overall effectiveness of a pitcher: Contact percentage and O-Swing percentage.

So those two variables affect the quality of pitching. That's what I've found so far. I'll be looking at whatever data I can find in the coming weeks, and will keep everyone posted as to what I find.