Cookin’ With Gas

Statistical analysis of the Baltimore Orioles on an almost weekly basis.

Pitch Data – What Is It Good For?

Posted by cookinwithgas on November 3, 2007

Have you ever checked out the Pitch Data provided by Baseball-Reference?  Have you ever wondered what exactly all the data means, or how best to use it?  I’ve spent a lot of time running the numbers and trying to see what useful information can be gleaned from it.  Hopefully this post will help to clear up some things. 

The first thing I felt I needed to know is the impact of each of the pitch stats on the performance of a pitcher.  I decided to determine this by running correlations between each stat and ERA.  Since ERA has its issues, I did the same thing with FIP ERA.  Here are some correlations (ERA listed first, then FIP ERA):

                       ERA      FIP
Strike%               -.42     -.47
In-Play Strike%        .35      .52
Swinging Strike%      -.35     -.53
Contact%               .34      .53
1st Pitch Strike%     -.34     -.36

No, the correlations aren’t that high, but they’re high enough for the purpose of this analysis.
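
For anyone who wants to run this kind of number themselves, here is a minimal sketch of a correlation calculation in Python. It assumes a plain Pearson correlation, which this post does not actually specify, and the `pearson` helper and the four sample seasons are made up for illustration rather than taken from the database.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical pitcher seasons (illustrative values only, not real data).
strike_pct = [60.0, 62.0, 64.0, 66.0]
era = [5.00, 4.60, 4.20, 3.80]

r = pearson(strike_pct, era)
print(round(r, 2))
```

The toy numbers fall on a perfectly straight line, so they produce the strongest possible negative correlation. Real pitcher seasons are far noisier, which is why the values in the table are so much closer to zero.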

The average correlation between HR-Rate and the two ERAs is .67. The three highest pitch stat correlations to HR-Rate are Contact% (.25), Swinging Strike% (.23), and In-Play Strike% (.23).

The average correlation between K/BB and the two ERAs is -.59. The three highest pitch stat correlations to K/BB are Strike% (.77), 1st Pitch Strike% (.63), and Swinging Strike% (.33).

The average correlation between K-Rate and the two ERAs is -.49. The three highest pitch stat correlations to K-Rate are In-Play Strike% (-.94), Contact% (-.87), and Swinging Strike% (.85).

The average correlation between BB-Rate and the two ERAs is .43. The three highest pitch stat correlations to BB-Rate are Strike% (-.87), 1st Pitch Strike% (-.75), and 0-2 Strike% (-.22).

The average correlation between BABIP and the two ERAs is .29. Interestingly enough, the correlations were pretty far apart (.55 to ERA; .03 to FIP). That ERA is so heavily influenced by a stat as inconsistent as BABIP (0.097 year-to-year correlation in this particular study) is a good indicator of the problems associated with ERA. The three highest pitch stat correlations to BABIP are 0-2 Hit% (.32), Strike% (-.08), and Called Strike 3% (-.07).

Compare the two sets of findings. The five pitch stats in the first table account for 12 of the 15 top-three correlations listed above. These are the five pitch data stats on which I’m going to focus.
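
The 12-of-15 tally can be checked directly against the lists above. A small Python sketch follows; the variable names are illustrative, and the stat names are copied from the text.

```python
# The five pitch stats from the first correlation table.
focus_stats = {
    "Strike%", "In-Play Strike%", "Swinging Strike%",
    "Contact%", "1st Pitch Strike%",
}

# Top three pitch-stat correlations for each outcome stat, as listed above.
top_three = {
    "HR-Rate": ["Contact%", "Swinging Strike%", "In-Play Strike%"],
    "K/BB": ["Strike%", "1st Pitch Strike%", "Swinging Strike%"],
    "K-Rate": ["In-Play Strike%", "Contact%", "Swinging Strike%"],
    "BB-Rate": ["Strike%", "1st Pitch Strike%", "0-2 Strike%"],
    "BABIP": ["0-2 Hit%", "Strike%", "Called Strike 3%"],
}

# Count how many of the listed slots are filled by one of the five focus stats.
hits = sum(stat in focus_stats for stats in top_three.values() for stat in stats)
total = sum(len(stats) for stats in top_three.values())
print(hits, "of", total)
```

The three misses are 0-2 Strike%, 0-2 Hit%, and Called Strike 3%.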

First, let me back up a bit. I have the pitch data stats and selected other stats from the last three years for every pitcher who appeared in a game for an AL team this year. I ended up with a database of 740 pitcher seasons, 164 of which consisted of at least 500 batters faced.

Strike Percentage (Str%) is the percentage of pitches that are strikes.  The overall Str% for the 740 pitcher seasons was 63%.  The average for the 164 pitchers with at least 500 batters faced was 64%.  I determined the Standard Deviation, and then compared the stats of the pitchers at each extreme and in the middle.

                       ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-Str%            3.99     3.98     6.37     3.84     1.06     .304
mid-Str%         4.27     4.33     6.30     2.34     1.03     .304
lo-Str%            5.02     4.96     6.18     1.49     1.10     .308
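
A rough sketch of how the hi/mid/lo split could be done in Python. The cutoff is an assumption: this post does not state how far from the mean a season had to be to count as an extreme, so the sketch uses one standard deviation. The `bucket_by_sd` name and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def bucket_by_sd(values, width=1.0):
    """Label each value 'hi', 'mid', or 'lo' relative to mean +/- width * SD.

    The one-SD default is an assumed cutoff, not a documented one.
    """
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation
    labels = []
    for v in values:
        if v > mu + width * sd:
            labels.append("hi")
        elif v < mu - width * sd:
            labels.append("lo")
        else:
            labels.append("mid")
    return labels

# Hypothetical Str% values (illustrative only).
sample_str_pct = [58, 62, 63, 64, 68]
labels = bucket_by_sd(sample_str_pct)
print(list(zip(sample_str_pct, labels)))
```

Once each season has a label, the ERA, FIP, K/9 and other columns are just averages within each group.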

In-Play Strike Percentage (StIn%) is the percentage of strikes thrown that are put into play.  The overall StIn% was 31%.  The average of the pitchers in this study was 32%.  Please note that B-R.com uses StI%, while I added an “n” to make it a little easier to read.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-StIn%           4.68     4.77     4.45     2.03     1.07     .304
mid-StIn%       4.45     4.47     6.20     2.41     1.10     .306
lo-StIn%           3.69     3.67     8.62     3.17     0.85     .304

Swinging Strike Percentage (StS%) is the percentage of strikes thrown that resulted in a swing and a miss.  The overall StS% was 14%.  The average of the pitchers in this study was 14%.  The comparison:
                       
                       ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-StS%           3.85     3.80     8.17     3.03     0.90     .304
mid-StS%        4.52     4.50     6.17     2.30     1.08     .309
lo-StS%           4.55     4.73     4.70     2.25     1.13     .299

Contact Percentage (Cntc%) is the percentage of times the batter made contact when swinging.  The overall Cntc% was 80%.  The average of the pitchers in this study was 81%.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-Cntc%         4.59     4.80     4.31     2.17     1.13     .300
mid-Cntc%      4.52     4.53     5.98     2.31     1.09     .307
lo-Cntc%         3.85     3.60     8.17     3.03     0.90     .304

1st Pitch Strike Percentage (1st%) is the percentage of times the first pitch to a batter was a strike.  The overall 1st% was 60%.  The average of the pitchers in this study was 59%.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-1st%            4.01     4.00     6.50     3.63     1.03     .308
mid-1st%         4.28     4.38     6.25     2.43     1.07     .302
lo-1st%            4.88     4.71     6.24     1.60     0.99     .310
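
The five definitions above can be sketched as simple ratios in Python. The `PitchCounts` fields and the sample counts are hypothetical, and the formulas follow the wording in this post rather than any official Baseball-Reference specification.

```python
from dataclasses import dataclass

@dataclass
class PitchCounts:
    # All field names are illustrative, not Baseball-Reference column names.
    pitches: int
    strikes: int
    in_play: int              # strikes put into play
    swinging_strikes: int     # swings and misses
    swings: int               # all swings
    contacts: int             # swings on which the batter made contact
    batters_faced: int
    first_pitch_strikes: int

def pct(part, whole):
    """Percentage of `whole` represented by `part` (0.0 if `whole` is zero)."""
    return 100.0 * part / whole if whole else 0.0

def pitch_rates(c):
    return {
        "Str%": pct(c.strikes, c.pitches),
        "StIn%": pct(c.in_play, c.strikes),
        "StS%": pct(c.swinging_strikes, c.strikes),
        "Cntc%": pct(c.contacts, c.swings),
        "1st%": pct(c.first_pitch_strikes, c.batters_faced),
    }

# Made-up counts for illustration only.
sample = PitchCounts(pitches=1000, strikes=630, in_play=200,
                     swinging_strikes=90, swings=450, contacts=360,
                     batters_faced=250, first_pitch_strikes=150)
rates = pitch_rates(sample)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Note that StIn% and StS% use strikes as the denominator, while Cntc% uses swings and 1st% uses batters faced, so the five rates are not directly comparable to one another.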

So now we have an idea of the types of numbers a pitcher might be expected to put up based on his pitch data.  

Strike%

I have a pitcher with a Str% of less than 60%.  How likely is it that he’ll raise it to an acceptable level?  There are 82 sets of pitcher seasons in my data set in which the pitcher faced at least 450 batters in back-to-back seasons.  Of those, two pitchers were able to raise their Str% by four points the next season, and one pitcher raised his by five points.  The kicker is that these pitchers were Mussina, Sabathia, and Shields – and each had a season 1 Str% of at least 62%.

Six of the 82 pitchers had a season 1 Str% of 60% or lower.  Their season 1 and season 2 rates:

Wright (2005)              60.3     60.2
Trachsel (2006)          60.0     58.7
Meche (2005)              59.5     61.1
Santos (2005)              59.1     59.5
Cabrera (2005)           59.1     57.2
Cabrera (2006)           57.2     58.1

Yes, four of the five pitchers listed there pitched for the Orioles in 2007.  No, these numbers don’t appear to bode well for Loewen, either Cabrera, Burres, Olson, Liz, Hoey, Cherry, Leicester, or Doyne.  On the bright side, Meche was able to raise his rate to 64% last year. 

Based on my admittedly small sample size, there is a 91.5% chance that a pitcher will raise or lower his Str% by three points or less in season 2. 
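
A sketch of how a figure like this could be tallied in Python, using only the six season pairs listed above as input, since the full 82-pair data set is not reproduced here. The `share_within` name and its three-point default are illustrative.

```python
def share_within(pairs, limit=3.0):
    """Percent of (season 1, season 2) pairs that changed by at most `limit` points."""
    stable = sum(1 for s1, s2 in pairs if abs(s2 - s1) <= limit)
    return 100.0 * stable / len(pairs)

# Season 1 and season 2 Str% for the six low-Str% seasons listed above.
low_str_pairs = [
    (60.3, 60.2),  # Wright (2005)
    (60.0, 58.7),  # Trachsel (2006)
    (59.5, 61.1),  # Meche (2005)
    (59.1, 59.5),  # Santos (2005)
    (59.1, 57.2),  # Cabrera (2005)
    (57.2, 58.1),  # Cabrera (2006)
]

print(share_within(low_str_pairs))

# A 91.5% share would correspond to 75 of the 82 pairs.
print(round(100 * 75 / 82, 1))
```

None of the six low-Str% seasons moved more than two points in either direction, so this subset is even more stable than the full group.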

In-Play Strike %

I have a pitcher with a StIn% higher than 35%.  How likely is it that he’ll lower it to an acceptable level?  Only Erik Bedard and AJ Burnett were able to lower their StIn% by four points the next season (and they had season 1 rates of 28% and 31%, respectively).   

16 of the 82 pitchers had a season 1 StIn% of 35% or higher.  Their season 1 and 2 rates:

Silva (2005)                 41        39
Wang (2005)               41        40
Wang (2006)               40        37
Silva (2006)                 39        36
Rogers (2005)             37        35
Westbrook (2006)       36        33
Buehrle (2006)            36        34
Jam Wright (2005)      36        36
Batista (2006)             35        32
Byrd (2005)                 35        34
Halladay (2006)           35        34
Ra Ortiz (2005)           35        34
Robertson (2005)        35        34
Pineiro (2005)             35        34
Blanton (2006)            35        34
Westbrook (2005)       35        36

No former Orioles on that list, which I find to be promising.  Only one of these pitchers was able to get his rate close to the league average of 31% in year two.  These numbers don’t bode well for Liz, Bradford, or Leicester.  Actually, I doubt this will prove to be an issue for Liz. 
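
The "close to league average" check can be sketched in Python from the table above. The one-point tolerance is an assumption, since "close" is not defined here, and the names are illustrative.

```python
LEAGUE_AVG_STIN = 31   # overall StIn% quoted earlier in this post
TOLERANCE = 1          # assumed meaning of "close": within one point

# (pitcher season, season 1 StIn%, season 2 StIn%) from the table above.
high_stin = [
    ("Silva (2005)", 41, 39), ("Wang (2005)", 41, 40),
    ("Wang (2006)", 40, 37), ("Silva (2006)", 39, 36),
    ("Rogers (2005)", 37, 35), ("Westbrook (2006)", 36, 33),
    ("Buehrle (2006)", 36, 34), ("Jam Wright (2005)", 36, 36),
    ("Batista (2006)", 35, 32), ("Byrd (2005)", 35, 34),
    ("Halladay (2006)", 35, 34), ("Ra Ortiz (2005)", 35, 34),
    ("Robertson (2005)", 35, 34), ("Pineiro (2005)", 35, 34),
    ("Blanton (2006)", 35, 34), ("Westbrook (2005)", 35, 36),
]

# Keep the seasons whose follow-up rate landed within the tolerance of average.
close_to_avg = [name for name, _, s2 in high_stin
                if s2 <= LEAGUE_AVG_STIN + TOLERANCE]
print(close_to_avg)
```

With a one-point tolerance only Batista qualifies. Widening it to two points would add the 2006 Westbrook season.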

Based on this analysis, there is a 92.3% chance that a pitcher will raise or lower his StIn% by three points or less in season 2. 

Swinging Strike %

I have a pitcher with a StS% lower than 11%.  How likely is it that he’ll raise it to an acceptable level?  Only Gil Meche has raised his StS% by more than 3 points in year 2.   

10 of 82 pitchers had a season 1 StS% of 10% or less.  Their season 1 and 2 rates:

Byrd (2005)                 10        9
Garland (2006)            10        10
Rogers (2005)             10        11
Buehrle (2006)            10        12
Meche (2005)              10        15
Trachsel (2006)          9          9
Byrd (2006)                 9          10
Wang (2005)               9          11
Silva (2006)                 8          9
Silva (2005)                 7          8

This is one reason I didn’t like the signing of Trachsel.  Only Meche on this list was able to raise his StS% to at least league average the following season.  None of the 11 pitchers with a season 1 StS% of 11% was able to post a league average rate the following season.  Bradford, Doyne, and Leicester probably won’t care too much for this stat. 

Based on this analysis, there is a 92.3% chance that a pitcher will raise or lower his StS% by three points or less in season 2. 

Contact %

How likely is it that a pitcher with a Cntc% of 85% or higher will lower it to an acceptable level the following season?  Only Meche, Bedard (twice), and Burnett have lowered their Cntc% by more than 3 points in year 2.   

12 of 82 pitchers had a season 1 Cntc% of 85% or higher.  Their season 1 and 2 rates:

Silva (2005)                 90        89
Silva (2006)                 89        88
Wang (2005)               88        86
Byrd (2006)                 88        86
Garland (2006)            87        87
Trachsel (2006)          87        87
Buehrle (2006)            86        84
Byrd (2005)                 86        88
Meche (2005)              86        79
Rogers (2005)             86        84
Garland (2005)            86        87
Wang (2006)               86        84

Does anyone wonder why I’ve never wanted Silva, Byrd, or Garland on my team, and why I’ve never been sold on Wang?  By the way, it is looking as if the Gil Meche 2005 to 2006 transformation is what we’re hoping to see out of a few Orioles pitchers.  This list is one reason I’m not a fan of Leicester, and why I’m hoping Doyne was hindered by injury. 

Based on this analysis, there is an 87.8% chance that a pitcher will raise or lower his Cntc% by three points or less in season 2. 

1st Pitch Strike %

This is the stat that so many announcers cite as being important.  How likely is it that a pitcher with a 1st% of 56% or lower will raise it to an acceptable level the following season?  Only Padilla, Wakefield, Mussina, Kazmir, Robertson, F Hernandez, and Sabathia have raised their 1st% by more than 4 points in year 2.

10 of 82 pitchers had a season 1 1st% of 55% or lower.  Their season 1 and 2 rates:

Fossum (2005)           55        50        55
Batista (2006)             55        54
Wang (2005)               55        56        55
Padilla (2005)              55        60        58
Wakefield (2006)         55        60
Meche (2005)              54        56        60
Kazmir (2005)             53        59        57
Cabrera (2005)           52        52        55
Cabrera (2006)           52        55
Santos (2005)             51        48        53

The third column shows how each pitcher did in the third year.  None of these pitchers has been able to get his 1st% up to an acceptable level.  This is another reason I’m losing confidence in the ability of Cabrera to turn things around.  This doesn’t bode well for Loewen, Burres, Olson, Liz, Hoey, Cherry, Doyne, Leicester, and F Cabrera.  Fortunately, Kazmir and Wang are pretty good examples of pitchers having success with a fairly low 1st%. 

Based on this analysis, there is a 74.4% chance that a pitcher will raise or lower his 1st% by three points or less in season 2.  This is somewhat promising.  In fact, five pitchers in the survey (6.1%) had a 6-point increase from year 1 to year 2.

Hopefully you now have a better idea of which pitch stats to focus on.

I would like to add one more thing.  I would recommend using pitch data only to help support conclusions, not to reach definitive conclusions on its own.

Thanks for reading. 

Thanks to Baseball-Reference for the great data.
