Cookin’ With Gas

Statistical analysis of the Baltimore Orioles on an almost weekly basis.

What Pitcher Stats Should I Look At?

Posted by cookinwithgas on December 29, 2007

What if I told you that I knew of a series of stats that could predict with 92% accuracy that a pitcher will finish with a sub 4.50 ERA?  What if I told you the same stats could predict with 75% accuracy that a pitcher will finish with a sub 4.00 ERA?  Would that pique your interest?

Let me back up.  Someone at Orioles Hangout asked me what stats I look for in a pitcher.  This is how I answered the question.  After posting that I realized that I didn’t set the parameters.  So I did some studying.  These are the boundaries that I came up with:

Strikeout % (Strikeouts / Batters Faced) of at least 17%

K/BB (Strikeouts / Unintentional Walks) of at least 2.5

Groundball % ( Groundballs / Balls in Play) of at least 45%

Strike % (Strikes / Pitches Thrown) of at least 64%

Swinging Strike % (Swinging Strikes / Strikes) of at least 15%
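For anyone who wants to run the same check on their own spreadsheet, here is a minimal sketch of the scoring. The thresholds come straight from the list above; the dictionary keys and the example stat line are made up for illustration, and I'm assuming "at least" means a pitcher who lands exactly on the minimum gets credit.

```python
# Minimums copied from the list above. The key names are hypothetical labels.
THRESHOLDS = {
    "k_pct": 0.17,       # strikeouts / batters faced
    "k_per_bb": 2.5,     # strikeouts / unintentional walks
    "gb_pct": 0.45,      # groundballs / balls in play
    "strike_pct": 0.64,  # strikes / pitches thrown
    "swstr_pct": 0.15,   # swinging strikes / strikes
}


def categories_met(stats):
    """Count how many of the five minimums a pitcher meets or beats."""
    return sum(1 for name, minimum in THRESHOLDS.items() if stats[name] >= minimum)


# Made-up stat line, not a real pitcher: misses only the groundball minimum.
example = {
    "k_pct": 0.21,
    "k_per_bb": 3.1,
    "gb_pct": 0.41,
    "strike_pct": 0.65,
    "swstr_pct": 0.16,
}

print(categories_met(example))
```

A pitcher scoring four or five here would land in the "Four" or "Five" groups used in the tables that follow.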

So what did I do once I settled on the parameters?  I have the traditional, batted ball, and pitch data stats for each of the last three seasons saved in an Excel file.  First, I looked at the 378 pitcher seasons in which the pitcher faced at least 500 batters in a single season.  Next, I looked at the 380 pitchers who faced at least 500 batters in total over the three seasons.

The following shows how the single season pitchers fared, grouped by the number of categories in which the pitcher met the minimum:

Five – 22 Pitchers – 4,654 IP – 3.54 ERA – 3.53 FIP ERA

Four – 53 Pitchers – 10,396 IP – 3.62 ERA – 3.57 FIP ERA

Three - 60 Pitchers – 11,355 IP – 4.00 ERA – 3.99 FIP ERA

Two - 65 Pitchers – 11,622 IP – 4.49 ERA – 4.37 FIP ERA

One - 110 Pitchers – 18,373 IP – 4.70 ERA – 4.65 FIP ERA

Zero - 68 Pitchers – 11,354 IP – 4.88 ERA – 4.94 FIP ERA

And the same stats for the 3-year pitchers:

Five - 21 Pitchers – 8,405 IP – 3.45 ERA – 3.46 FIP ERA

Four - 50 Pitchers – 15,095 IP – 3.51 ERA – 3.55 FIP ERA

Three - 64 Pitchers – 16,526 IP – 4.02 ERA – 4.03 FIP ERA

Two - 91 Pitchers – 23,011 IP – 4.43 ERA – 4.41 FIP ERA

One - 91 Pitchers – 26,675 IP – 4.70 ERA – 4.62 FIP ERA

Zero - 63 Pitchers – 18,427 IP – 4.86 ERA – 4.89 FIP ERA

What I liked about the above was the consistency of the ERA and FIP ERA between the 3-year and single year stats. 

Onto some other stats:

75 single season (SS) pitchers met at least four of the stat criteria.  Of these 75 pitchers, 69 (92%) finished with an ERA below 4.50, and all of them finished with a FIP ERA of 4.43 or lower.  56 of the 75 (75%) finished with an ERA of less than 4.00, while 65 (87%) finished with a sub 4.00 FIP ERA.

71 three-year (3Y) pitchers met at least four of the stat criteria.  Of these 71 pitchers, 67 (94%) finished with an ERA below 4.50, and all of them finished with a FIP ERA of 4.30 or lower.  58 of the 71 (82%) finished with an ERA of less than 4.00, while 64 (90%) finished with a sub 4.00 FIP ERA.
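The percentages in the two paragraphs above are just counts divided by group size. A quick sketch to reproduce them (the function name is mine, and I'm assuming the figures were rounded to the nearest whole percent):

```python
def pct(part, whole):
    """Whole-number percentage, rounded to the nearest percent."""
    return round(100 * part / whole)


# Counts quoted above for pitchers meeting at least four criteria.
single_season = {"ERA < 4.50": (69, 75), "ERA < 4.00": (56, 75), "FIP < 4.00": (65, 75)}
three_year = {"ERA < 4.50": (67, 71), "ERA < 4.00": (58, 71), "FIP < 4.00": (64, 71)}

for label, groups in (("SS", single_season), ("3Y", three_year)):
    for cutoff, (hits, total) in groups.items():
        print(f"{label} {cutoff}: {hits} of {total} = {pct(hits, total)}%")
```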

How about the inverse? 

178 (SS) pitchers met only one or none of the stat criteria.  Of these 178 pitchers, 114 (64%) finished with an ERA above 4.50, and 121 (69%) finished with a FIP ERA of 4.50 or higher.  71 of the 178 (40%) finished with an ERA of at least 5.00, while 49 (28%) finished with a FIP ERA of 5.00 or higher.

154 (3Y) pitchers met only one or none of the stat criteria.  Of these 154 pitchers, 104 (68%) finished with an ERA above 4.50, and 98 (64%) finished with a FIP ERA of 4.50 or higher.  66 of the 154 (43%) finished with an ERA of at least 5.00, while 41 (27%) finished with a FIP ERA of 5.00 or higher.

Food for thought.

Here’s hoping everyone has a great 2008.

Thank you


Pitch Data – What Is It Good For?

Posted by cookinwithgas on November 3, 2007

Have you ever checked out the Pitch Data provided by Baseball-Reference?  Have you ever wondered what exactly all the data means, or how best to use it?  I’ve spent a lot of time running the numbers and trying to see what useful information can be gleaned from it.  Hopefully this post will help to clear up some things. 

The first thing I felt I needed to know is the impact of each of the pitch stats on the performance of a pitcher.  I decided to determine this by running correlations between each stat and ERA.  Since ERA has its issues I did the same thing with FIP ERA.  Here are some correlations (ERA listed first, then FIP ERA):

Strike%                        -.42      -.47
In-Play Strike%             .35       .52
Swinging Strike%        -.35      -.53
Contact%                      .34       .53
1st Pitch Strike%          -.34      -.36

No, the correlations aren’t that high, but they’re high enough for the purpose of this analysis.
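If you have never run a correlation outside of Excel, this is roughly what is going on under the hood. It is a plain Pearson correlation written out by hand; the numbers in the example are invented purely to show the sign of the result and are not from my data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented values: higher Strike% paired with lower ERA.
strike_pct = [0.66, 0.64, 0.62, 0.60, 0.58]
era = [3.60, 4.10, 4.00, 4.80, 5.10]

# A negative result means ERA tends to fall as Strike% rises.
print(round(pearson(strike_pct, era), 2))
```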

The average correlation between HR-Rate and the two ERAs is .67. The three highest pitch stat correlations to HR-Rate are Contact% (.25), Swinging Strike% (.23), and In-Play Strike% (.23).

The average correlation between K/BB and the two ERAs is -.59. The three highest pitch stat correlations to K/BB are Strike% (.77), 1st Pitch Strike% (.63), and Swinging Strike% (.33).

The average correlation between K-Rate and the two ERAs is -.49. The three highest pitch stat correlations to K-Rate are In-Play Strike% (-.94), Contact% (-.87), and Swinging Strike% (.85).

The average correlation between BB-Rate and the two ERAs is .43. The three highest pitch stat correlations to BB-Rate are Strike% (-.87), 1st Pitch Strike% (-.75), and 0-2 Strike% (-.22).

The average correlation between BABIP and the two ERAs is .29. Interestingly enough, the correlations were pretty far apart (.55 to ERA; .03 to FIP). That ERA is so heavily influenced by a stat as inconsistent as BABIP (0.097 year-to-year correlation in this particular study) is a good indicator of the problems associated with ERA. The three highest pitch stat correlations to BABIP are 0-2 Hit% (.32), Strike% (-.08), and Called Strike 3% (-.07).

Compare the findings. The first set of pitch data stats accounted for 12 of the 15 stats listed in the next batch. These are the five pitch data stats on which I’m going to focus.

First, let me back up a bit. I have the pitch data stats and selected other stats from the last three years for every pitcher who appeared in a game for an AL team this year. I ended up with a database of 740 pitcher seasons, 164 of which consisted of at least 500 batters faced.

Strike Percentage (Str%) is the percentage of pitches that are strikes.  The overall Str% for the 740 pitcher seasons was 63%.  The average for the 164 pitchers with at least 500 batters faced was 64%.  I determined the Standard Deviation, and then compared the stats of the pitchers at each extreme and in the middle.

                       ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-Str%            3.99     3.98     6.37     3.84     1.06     .304
mid-Str%         4.27     4.33     6.30     2.34     1.03     .304
lo-Str%            5.02     4.96     6.18     1.49     1.10     .308
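The hi/mid/lo split used in these tables can be sketched as follows. I didn't spell out the exact cutoffs above, so treat the one-standard-deviation boundary here as an assumption for illustration; the sample values are made up as well.

```python
from statistics import mean, pstdev


def bucket(values, cutoff_sds=1.0):
    """Label each value hi, mid, or lo relative to the group mean.

    cutoff_sds is an assumed boundary (one standard deviation by default).
    """
    centre = mean(values)
    spread = pstdev(values)
    labels = []
    for value in values:
        if value >= centre + cutoff_sds * spread:
            labels.append("hi")
        elif value <= centre - cutoff_sds * spread:
            labels.append("lo")
        else:
            labels.append("mid")
    return labels


# Made-up Str% values for seven pitchers.
str_pct = [0.58, 0.62, 0.63, 0.64, 0.65, 0.66, 0.70]
print(list(zip(str_pct, bucket(str_pct))))
```

Once every pitcher has a label, the ERA, FIP, K/9 and so on for each group are just the combined totals of the pitchers in it.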

In-Play Strike Percentage (StIn%) is the percentage of strikes thrown that are put into play.  The overall StIn% was 31%.  The average of the pitchers in this study was 32%.  Please note that B-R.com uses StI%, while I added an “n” to make it a little easier to read.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-StIn%           4.68     4.77     4.45     2.03     1.07     .304
mid-StIn%       4.45     4.47     6.20     2.41     1.10     .306
lo-StIn%           3.69     3.67     8.62     3.17     0.85     .304

Swinging Strike Percentage (StS%) is the percentage of strikes thrown that resulted in a swing and a miss.  The overall StS% was 14%.  The average of the pitchers in this study was 14%.  The comparison:
                       
                       ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-StS%           3.85     3.80     8.17     3.03     0.90     .304
mid-StS%        4.52     4.50     6.17     2.30     1.08     .309
lo-StS%           4.55     4.73     4.70     2.25     1.13     .299

Contact Percentage (Cntc%) is the percentage of times the batter made contact when swinging.  The overall Cntc% was 80%.  The average of the pitchers in this study was 81%.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-Cntc%         4.59     4.80     4.31     2.17     1.13     .300
mid-Cntc%      4.52     4.53     5.98     2.31     1.09     .307
lo-Cntc%         3.85     3.60     8.17     3.03     0.90     .304

1st Pitch Strike Percentage (1st%) is the percentage of times the first pitch to a batter was a strike.  The overall 1st% was 60%.  The average of the pitchers in this study was 59%.  The comparison:

                        ERA     FIP       K/9       K/BB    HR/9    BABIP
hi-1st%            4.01     4.00     6.50     3.63     1.03     .308
mid-1st%         4.28     4.38     6.25     2.43     1.07     .302
lo-1st%            4.88     4.71     6.24     1.60     0.99     .310

So now we have an idea of the types of numbers a pitcher might be expected to put up based on his pitch data.  

Strike%

I have a pitcher with a Str% of less than 60%.  How likely is it that he’ll raise it to an acceptable level?  There are 82 sets of pitcher seasons in my data set in which the pitcher faced at least 450 batters in back-to-back seasons.  Of those, two pitchers were able to raise their Str% by four points the next season, and one pitcher raised his five points.  The kicker is that these pitchers were Mussina, Sabathia, and Shields, and each had a season 1 Str% of at least 62%.

Six of the 82 pitchers had a season 1 Str% of 60% or lower.  Their season 1 and season 2 rates:

Wright (2005)              60.3     60.2
Trachsel (2006)          60.0     58.7
Meche (2005)              59.5     61.1
Santos (2005)              59.1     59.5
Cabrera (2005)           59.1     57.2
Cabrera (2006)           57.2     58.1

Yes, four of the five pitchers listed there pitched for the Orioles in 2007.  No, these numbers don’t appear to bode well for Loewen, either Cabrera, Burres, Olson, Liz, Hoey, Cherry, Leicester, or Doyne.  On the bright side, Meche was able to raise his rate to 64% last year. 

Based on my admittedly small sample size, there is a 91.5% chance that a pitcher will raise or lower his Str% by three points or less in season 2. 
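The "three points or less" figure is simply the share of back-to-back seasons where the rate barely moved. Here is a small sketch using the six low-Str% pairs from the table above. The function name is mine, and the note that 91.5% lines up with 75 of the 82 pairs is my own arithmetic rather than a count taken from the spreadsheet.

```python
def share_within(pairs, max_change=3.0):
    """Fraction of (season 1, season 2) pairs that moved by max_change or less."""
    stable = sum(1 for year1, year2 in pairs if abs(year2 - year1) <= max_change)
    return stable / len(pairs)


# Season 1 and season 2 Str% for the six pairs listed above.
low_str_pairs = [
    (60.3, 60.2),  # Wright (2005)
    (60.0, 58.7),  # Trachsel (2006)
    (59.5, 61.1),  # Meche (2005)
    (59.1, 59.5),  # Santos (2005)
    (59.1, 57.2),  # Cabrera (2005)
    (57.2, 58.1),  # Cabrera (2006)
]

print(share_within(low_str_pairs))   # every one of these six stayed within 3 points
print(round(100 * 75 / 82, 1))       # 75 of 82 pairs works out to 91.5
```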

In-Play Strike %

I have a pitcher with a StIn% higher than 35%.  How likely is it that he’ll lower it to an acceptable level?  Only Erik Bedard and AJ Burnett were able to lower their StIn% by four points the next season (and they had season 1 rates of 28% and 31%, respectively).   

16 of the 82 pitchers had a season 1 StIn% of 35% or higher.  Their season 1 and 2 rates:

Silva (2005)                 41        39
Wang (2005)               41        40
Wang (2006)               40        37
Silva (2006)                 39        36
Rogers (2005)             37        35
Westbrook (2006)       36        33
Buehrle (2006)            36        34
Jam Wright (2005)      36        36
Batista (2006)             35        32
Byrd (2005)                 35        34
Halladay (2006)           35        34
Ra Ortiz (2005)           35        34
Robertson (2005)        35        34
Pineiro (2005)             35        34
Blanton (2006)            35        34
Westbrook (2005)       35        36

No former Orioles on that list, which I find to be promising.  Only one of these pitchers was able to get his rate close to the league average of 31% in year two.  These numbers don’t bode well for Liz, Bradford, or Leicester.  Actually, I doubt this will prove to be an issue for Liz. 

Based on this analysis, there is a 92.3% chance that a pitcher will raise or lower his StIn% by three points or less in season 2. 

Swinging Strike %

I have a pitcher with a StS% lower than 11%.  How likely is it that he’ll raise it to an acceptable level?  Only Gil Meche has raised his StS% by more than 3 points in year 2.   

10 of 82 pitchers had a season 1 StS% of 10% or less.  Their season 1 and 2 rates:

Byrd (2005)                 10        9
Garland (2006)            10        10
Rogers (2005)             10        11
Buehrle (2006)            10        12
Meche (2005)              10        15
Trachsel (2006)          9          9
Byrd (2006)                 9          10
Wang (2005)               9          11
Silva (2006)                 8          9
Silva (2005)                 7          8

This is one reason I didn’t like the signing of Trachsel.  Only Meche on this list was able to raise his StS% to at least league average the following season.  None of the 11 pitchers with a season 1 StS% of 11% was able to post a league average rate the following season.  Bradford, Doyne, and Leicester probably won’t care too much for this stat. 

Based on this analysis, there is a 92.3% chance that a pitcher will raise or lower his StS% by three points or less in season 2. 

Contact %

How likely is it that a pitcher with a Cntc% of 85% or higher will lower it to an acceptable level the following season?  Only Meche, Bedard (twice), and Burnett have lowered their Cntc% by more than 3 points in year 2.   

12 of 82 pitchers had a season 1 Cntc% of 85% or higher.  Their season 1 and 2 rates:

Silva (2005)                 90        89
Silva (2006)                 89        88
Wang (2005)               88        86
Byrd (2006)                 88        86
Garland (2006)            87        87
Trachsel (2006)          87        87
Buehrle (2006)            86        84
Byrd (2005)                 86        88
Meche (2005)              86        79
Rogers (2005)             86        84
Garland (2005)            86        87
Wang (2006)               86        84

Does anyone wonder why I’ve never wanted Silva, Byrd, or Garland on my team, and why I’ve never been sold on Wang?  By the way, it is looking as if the Gil Meche 2005 to 2006 transformation is what we’re hoping to see out of a few Orioles pitchers.  This list is one reason I’m not a fan of Leicester, and why I’m hoping Doyne was hindered by injury. 

Based on this analysis, there is an 87.8% chance that a pitcher will raise or lower his Cntc% by three points or less in season 2. 

1st Pitch Strike %

The stat that so many announcers allude to as being important.  How likely is it that a pitcher with a 1st% of 56% or lower will raise it to an acceptable level the following season?  Only Padilla, Wakefield, Mussina, Kazmir, Robertson, F Hernandez, and Sabathia have raised their 1st% by more than 4 points in year 2.   

10 of 82 pitchers had a season 1 1st% of 55% or lower.  Their season 1 and 2 rates:

Fossum (2005)           55        50        55
Batista (2006)             55        54
Wang (2005)               55        56        55
Padilla (2005)              55        60        58
Wakefield (2006)         55        60
Meche (2005)              54        56        60
Kazmir (2005)             53        59        57
Cabrera (2005)           52        52        55
Cabrera (2006)           52        55
Santos (2005)             51        48        53

The third column shows how each pitcher did in the third year.  None of these pitchers has been able to get his 1st% up to an acceptable level.  This is another reason I’m losing confidence in the ability of Cabrera to turn things around.  This doesn’t bode well for Loewen, Burres, Olson, Liz, Hoey, Cherry, Doyne, Leicester, and F Cabrera.  Fortunately, Kazmir and Wang are pretty good examples of pitchers having success with a fairly low 1st%. 

Based on this analysis, there is a 74.4% chance that a pitcher will raise or lower his 1st% by three points or less in season 2.  This is actually somewhat promising.  In fact, five pitchers in the survey (6.1%) actually had a 6 point increase from year 1 to year 2. 

Hopefully now you have a better idea of which pitch stats to focus on.

I would like to add one more thing.  I would recommend only using pitch data to help support conclusions as opposed to using the data to come to definitive conclusions. 

Thanks for reading. 

Thanks to Baseball-Reference for the great data.


You Have One Game to Win…

Posted by cookinwithgas on October 24, 2007

Someone over at the Orioles Hangout asked a great question the other day – if you have one game to win, who would be your starting pitcher? 

I took 20 pitchers who have each faced at least 72 batters during postseason play since 2002.  I originally used the pitchers listed in the OH thread, and then added others as their names came to me.  I only used the regular season numbers for the season in which the pitcher appeared in the postseason.  For instance, since 2002, Curt Schilling has appeared in the 2002, 2004, and 2007 postseasons.  So I compared his 2002, 2004, and 2007 regular season numbers to his 2002, 2004, and 2007 post season numbers. 

First, I want to attack the question using traditional stats, and my favorite – FIP ERA.  Overall, the 20 pitchers combined to post a 3.45 ERA in 12,400 regular season innings during the seasons in which they appeared in the playoffs.  The same 20 pitchers combined for a 3.88 ERA in 877 post season innings.  The regular season FIP ERA of the pitchers was 3.65 compared to 4.11 during post season play.  I personally found it interesting that the numbers represented a 12.4% increase in ERA compared to a 12.7% increase in FIP ERA. I will admit that I was surprised that the overall ERA actually increased.  It has always been my assumption that the overall ERA tends to go down in post season play.  Maybe that’s what I get for listening to announcers.

Now we know that the overall ERA goes up, but do the numbers give us a clue as to why?  BABIP stayed about the same (it increased from .288 to .292).  LOB% was relatively unchanged (73.6% up to 73.8%).  WHIP went up from 1.18 to 1.26.  K% dropped from 19.7% down to 18.1%, while BB% went from 6% up to 6.7%.  Command Rate dropped from 3.31 down to 2.72.  So the only relatively big changes seen thus far involved walks and strikeouts.  I suppose this really shouldn’t be a surprise considering these pitchers are facing better hitters.  Want another indicator that the pitchers are facing better hitting?  HR/OFFB% went up a good amount (10.6% up to 12.2%) – this explains the increased ERA as much as anything.

Of the 20 pitchers in the survey, only six had a better post season ERA than regular season ERA.  This caused me to think maybe the first comparison I made was negatively impacted by those at the bottom.  So I decided to do another comparison, this one taking a look at the pitchers in the survey with the 10 best regular season ERAs.  This time the ERA went from 2.99 in the regular season to 3.13 in the post season, and the FIP ERA went from 3.38 up to 3.67.  Those represent a 5% and 7% increase, respectively.

We’ve looked at the overall numbers.  How’d the pitchers do individually?
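The percent increases quoted above are a straight before-and-after comparison. A rough sketch of the arithmetic follows. Note that plugging in the two-decimal ERAs gives 12.5% and 12.6%, a hair off the 12.4% and 12.7% quoted; my assumption is that the quoted figures came from unrounded ERAs in the spreadsheet.

```python
def pct_change(before, after):
    """Percent change from the regular season figure to the post season figure."""
    return 100 * (after - before) / before


era_change = pct_change(3.45, 3.88)  # combined ERA, regular vs. post season
fip_change = pct_change(3.65, 4.11)  # combined FIP ERA, regular vs. post season

print(round(era_change, 1), round(fip_change, 1))
```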

Curt Schilling’s postseason ERA during the time span is 3.17, compared to his 3.39 regular season ERA in the affected seasons. 

The pitcher in the survey with the 5th best ERA improvement was Andy Pettitte (3.44 down to 3.13 in the postseason). 

Chris Carpenter’s post season ERA was 2.53 compared to 2.95 in the affected regular seasons. 

John Smoltz had the third best improvement in the survey – a 2.76 regular season ERA compared to a 1.96 post season ERA. 

Mariano Rivera’s post season ERA since 2002 was an amazing 0.70 compared to 2.06 in the regular season. 

The biggest ERA improvement from the regular season to the post season?   

Drum roll, please. 

Raise your hand if you guessed Josh Beckett.  His regular season ERA during the 2003 and 2007 seasons was 3.18, compared to 1.78 during those two post seasons.  Before someone says that it’s only two seasons, only three pitchers in the survey have faced more post season batters than Beckett since 2002. 

The three pitchers with the biggest increase in ERA?  Glavine (3.36 to 5.84), R Johnson (3.60 to 7.11), and Wang (3.73 all the way up to 7.58). 

So how did they rank in overall post season ERA since 2002? 

Rivera (0.70), Beckett (1.78), Smoltz (1.96), Carpenter (2.53), Pettitte (3.13), Schilling (3.17), Lackey (3.63), Oswalt (3.66), Santana (3.97), Wells (4.08), Zito (4.11), Mussina (4.19), Pedro (4.39), Clemens (4.50), Morris (4.96), Hudson (5.10), Glavine (5.84), Wakefield (5.91), R Johnson (7.11), Wang (7.58).

If you have one game to win, who would be your starting pitcher? 

Based on this survey, Josh Beckett seems to be the best bet to me. 

In my next post I’ll take a look at the question from a different angle. 

Thanks. 

Go Rockies!


The Schilling Theory

Posted by cookinwithgas on October 1, 2007

During Boston’s first trip to Baltimore, Gary Thorne talked about Curt Schilling’s theory of which team would win the division title.  According to Schilling, the opening day five-man rotation which made the most combined starts would win the division.  So, is Curt Schilling on to something? 

The opening day rotation for the Blue Jays made a total of 85 starts.  Their overall total was hurt by the early loss of Gustavo Chacin, who made only five starts.   

The Orioles rotation made 96 starts, with the early exits of Loewen and Wright hurting the overall total. 

The Yankees?  105 starts, with Pavano’s two starts bringing down the total. 

The big surprise to me was Tampa Bay’s rotation, which actually combined for 116 starts.  They actually had three pitchers (Kazmir, Shields, and Jackson) with at least 31 starts. 

For those scoring at home, it looks as if Schilling was right, at least in terms of which team would win the division.  The opening day Red Sox rotation combined for an amazing 140 starts, including at least 23 starts from each member of the rotation. 

The cases of Jeremy Guthrie and Roger Clemens caused me to take it a step further and add the sixth starter to each rotation.  How much of an effect did this have? 

Boston             150 starts

Tampa Bay     138 starts

New York         122 starts

Baltimore         122 starts

Toronto            112 starts 
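Here is the Schilling Theory boiled down to a few lines, using the start totals quoted in this post. The only thing I'm adding is the sorting; the dictionary and function names are just labels for illustration.

```python
# Combined starts by each team's opening day five-man rotation (from the post).
five_man = {"Boston": 140, "Tampa Bay": 116, "New York": 105, "Baltimore": 96, "Toronto": 85}

# Same totals after adding a sixth starter to each rotation.
six_man = {"Boston": 150, "Tampa Bay": 138, "New York": 122, "Baltimore": 122, "Toronto": 112}


def rank_by_starts(starts):
    """Teams ordered from most rotation starts to fewest."""
    return sorted(starts, key=starts.get, reverse=True)


print("Predicted division winner:", rank_by_starts(five_man)[0])
print("Six-man order:", rank_by_starts(six_man))
```

Comparing the sorted order with the actual standings is the quick way to see whether the theory holds beyond first place.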

Kudos to Schilling, at least in terms of forecasting the division winner.  The thing I can’t remember is if he said the final division standings would be determined in the same manner.  If he did, then he wasn’t so right.  I’m probably in the minority here, but seeing Tampa Bay rank so high should make the rest of the division real nervous going into 2008. 

Hearing Thorne talk about the Schilling Theory gave me an idea.  I made a list I called “You’ll Know the 2007 Orioles Had a Bad Season If…”  For instance, one item on my list was “if Rob Bell winds up facing at least 250 batters.” 

I never posted the list because I realized that there was no way so many things could go wrong with one pitching staff.  So I held on to the list until now. Here goes: I’ll know the 2007 Orioles are in trouble if …

  • Jon Leicester, Victor Zambrano, Kurt Birkins and someone named Victor Santos combine for more starts (12) than opening day rotation members Adam Loewen and Jaret Wright (9).

  • Jeremy Guthrie starts 26 games.  (Of course, if he somehow gives the team a 3.70 ERA over 175 innings pitched, I would consider this one to be a good thing.)

  • Daniel Cabrera shows signs of regression by posting a 5.55 ERA with a 23% lower K-Rate.

  • Erik Bedard is not able to start at least 30 games or garner 200 innings pitched.

  • Steve Trachsel is actually allowed to start 25 games.  (Of course, if he pulls off the miracle of all miracles and posts a 4.48 ERA, then this would be a good thing.)

  • Rob Bell ends up facing at least 250 batters, while Chris Ray faces fewer than 180.

  • Chris Ray and Danys Baez combine for a 5.52 ERA with only 19 saves and only 93 innings pitched.

  • Scott Williamson is released following 16 appearances and 14.3 innings pitched.

  • Someone or something named Rocky Cherry is a key cog in the bullpen by September.

  • Paul Shuey is needed to appear in 25 games.

  • The team would need to give starts to 13 pitchers.

  • The team would need to use 27 pitchers.

  • 19 of the 27 pitchers used posted an ERA higher than 5.00.  (In fact, I would have bet this could not possibly happen.)

  • The final overall team ERA is 5.19.

 As I said, I had to keep this list hidden because it was just too far out there.  It’s a good thing none of those things actually happened.

Okay, you got me.  I didn’t make this list five months ago.  No one has an imagination like that.


“The Great Orioles Teams Did Just Enough Offensively to Get By.” Really?

Posted by cookinwithgas on August 5, 2007

The great Orioles teams won because of “pitching, defense, and the three-run homer.”   

How many times have you heard this or a similar statement?  Yes, those teams had great pitching.  Yes, they typically played some darn good defense.  Yes, they hit quite a few home runs.  The one thing that seems to be overlooked when talking about pitching, defense, and the three-run homer is that you can’t hit a three-run homer if you don’t have people on base.  This, to me, is the hidden gem in the Earl Weaver philosophy.  

So I did some research.  Below is how the Orioles ranked in certain categories in the years they played in a World Series (the number in parentheses is the average rank). 

R/G > 1st – 2nd – 1st – 1st – 6th – 2nd (2.2)

OBP > 1st – 1st – 1st – 1st – 7th – 1st (2.0)

SLG > 1st – 2nd – 3rd – 2nd – 5th – 3rd (2.7)

OPS+ > 1st – 1st – 1st – 1st – 4th – 3rd (1.8) 

ERA > 4th – 1st – 1st – 1st – 1st – 2nd (1.7)

ERA + > 4th – 1st – 1st – 1st – 1st – 3rd (1.8)

Def Eff > 4th – 1st – 3rd – 2nd – 1st – 5th (2.7) 

Yes, pitching was a key part of the equation.  Those were obviously well balanced teams.  Notice that they led the AL in OBP in each of those seasons except for 1979.  These numbers also tell us something else about those teams that often seems to get overlooked.  They were very good offensive teams (finishing 1st or 2nd in runs 5 of 6 seasons, and 1st in OPS+ 4 times).   
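The average ranks in parentheses above are simple means of the six league ranks. A quick sketch to reproduce them, with the ranks typed in from the list (the dictionary layout is just my own way of holding them):

```python
from statistics import mean

# AL ranks in the six World Series seasons, copied from the list above.
ranks = {
    "R/G": [1, 2, 1, 1, 6, 2],
    "OBP": [1, 1, 1, 1, 7, 1],
    "SLG": [1, 2, 3, 2, 5, 3],
    "OPS+": [1, 1, 1, 1, 4, 3],
    "ERA": [4, 1, 1, 1, 1, 2],
    "ERA+": [4, 1, 1, 1, 1, 3],
    "Def Eff": [4, 1, 3, 2, 1, 5],
}

average_rank = {stat: round(mean(values), 1) for stat, values in ranks.items()}

for stat, avg in average_rank.items():
    print(f"{stat}: {avg}")
```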

I bring up the offensive prowess of these teams because of something that was recently posted on an Orioles internet message board.  Someone stated something to the effect that the great Orioles teams had great pitching, played great defense, and did enough to get by on offense.  The above numbers tell us they did quite a bit more than just get by. 

Nowick Gray wrote a very good column for The Orioles Hangout on this subject, and how the current front office seems to have forgotten a very important part of the equation.


Bedard a top 5?

Posted by cookinwithgas on July 22, 2007

There’s a lot of talk right now as to whether Erik Bedard’s current string of great outings makes him one of the best pitchers in baseball right now.  So where does he rank?

There are a lot of ways to look at this.  What I don’t get are those people who like to pull up the numbers from 3 or 4 years ago and use the overall numbers since then to prove that he isn’t.  Yes, if you look at the overall numbers since 2004, Roy Halladay has been a better pitcher than has Erik Bedard.  However, I say that anyone who really thinks Roy Halladay is a better pitcher right now than is Erik Bedard doesn’t know what he’s talking about.  Having said that, I can also understand why we need to look at more than just this year’s numbers.

So I decided to go back to June 5 of last year.  How does Bedard rank since then?  There have been 78 pitchers who have at least 200 innings pitched since that date.  Those are the 78 pitchers used for the following comparison.

Erik Bedard is third in ERA at 2.91 (behind Santana and Chris Young).

Bedard is fifth in H/9 at 7.44 (behind Young, Santana, Maine, and Zambrano).

Bedard is 41st in BB/9 with a very respectable 2.78 (the top three are Byrd, Maddux, and Sheets).

Bedard is 1st in K/9 at 10.05 (followed by Peavy, Hamels, and Santana).

Bedard is 13th in K/BB at 3.61 (the top three are Sheets, Sabathia, and Schilling).

Bedard is 12th in HR/9 at 0.74 (the top three are Wang, Lowe, and Peavy).

You probably know that one of my favorite measures of a pitcher is FIP ERA.  Here are the top five:

  1. Peavy – 2.96
  2. Bedard – 3.03
  3. Smoltz – 3.25
  4. Santana – 3.31
  5. Escobar – 3.34

And Bedard’s ranking in some of the counting stats: 

  • tied for 7th with 42 starts.
  • tied for 17th with 19 wins.
  • 12 pitchers have fewer than his 10 losses (but only Santana and Harang match his 42 starts among those with fewer losses).
  • Not a counting stat, but he is 13th with a .655 winning %.
  • He is one of 34 with a shutout (Sabathia, Hernandez, Lackey and Contreras have two each)
  • 15th in IP (he’s averaged 6.4 IP per start)
  • 1st in strikeouts with 300 (Santana has 296, Peavy 273, and Harang 264).

That’s a lot of information.  Bedard is in the top five in ERA, K/9, FIP ERA, and Strikeouts.  Compare the top five in ERA and FIP ERA:

  1. Santana (2.58) / Peavy (2.96)
  2. Young (2.64) / Bedard (3.03)
  3. Bedard (2.91) / Smoltz (3.25)
  4. Escobar (3.15) / Santana (3.31)
  5. Smoltz (3.16) / Escobar (3.34)

Santana, Bedard, Escobar and Smoltz appear in both lists.  If you were to give points based on rankings, Bedard ties with Santana for the most points (3 for 3rd, 4 for 2nd for a total of 7 points). 
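The points system is nothing fancy: 5 points for 1st down to 1 point for 5th, added across the two lists. A minimal sketch, with a function name of my own choosing:

```python
# Top five in ERA and in FIP ERA, in order, from the lists above.
era_top5 = ["Santana", "Young", "Bedard", "Escobar", "Smoltz"]
fip_top5 = ["Peavy", "Bedard", "Smoltz", "Santana", "Escobar"]


def rank_points(*rankings):
    """Award 5 points for 1st through 1 point for 5th and total them by pitcher."""
    points = {}
    for ranking in rankings:
        for position, name in enumerate(ranking):
            points[name] = points.get(name, 0) + len(ranking) - position
    return points


points = rank_points(era_top5, fip_top5)
for name, total in sorted(points.items(), key=lambda item: item[1], reverse=True):
    print(name, total)
```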

Part of the reason for determining who is the best pitcher is predicting future performance.  Of the four pitchers who made both lists, only Smoltz has a smaller difference in FIP ERA and ERA than does Bedard.

I think Erik Bedard has a darn good argument as one of the best five pitchers in MLB right now.  In fact, the only one I see who is really better (when taking things such as league and park into effect) is Johan Santana.

So I’ll say it:

Erik Bedard is the second best pitcher in baseball.


Posted by cookinwithgas on June 10, 2007

One of my newest, favorite toys is the Pitch Data supplied by Baseball-Reference.  The problem with the data is that there is so much of it that it is hard to decipher the importance of each stat contained in the data.  I’m sure there have been extensive studies of the data, but I haven’t seen them.  I did see a pretty good article on The Hardball Times website about it, but that’s about it.  I decided to do a small study of my own.

To do this, I needed some data.  I transferred the 2005 through 2007 data of every pitcher who has appeared in a 2007 American League game through May into an Excel spreadsheet.

The approach I decided to take is to first determine which of the stats a pitcher has the most control over.  (I would be remiss if I didn’t warn you up front that the data used in this study potentially suffers from the dreaded “small sample size.”)  I did this via year-to-year correlation of each stat (filtering out any pitcher who had faced fewer than 100 batters in either year one or year two, which left me with 224 pairs of seasons).  The results:

Cntc%             .732

StS%               .721

StI%                .719

Sw/In%           .700

K%                  .685

Strk %             .668

P/PA                .620

StL%               .594

1st%                .533

StF%               .516

SO c%            .484

ERA                 .272
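For the curious, the year-to-year correlation works like this: pair each pitcher's season with his next season, drop any pair where either season falls below 100 batters faced, then correlate the year one values with the year two values. The sketch below uses invented pitchers and numbers, and the data layout is just one convenient way to hold it, not how my spreadsheet is organized.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def season_pairs(seasons, min_bf=100):
    """Pair consecutive seasons where both meet the batters-faced minimum.

    seasons maps (pitcher, year) -> (batters_faced, stat_value).
    """
    pairs = []
    for (pitcher, year), (bf1, stat1) in seasons.items():
        following = seasons.get((pitcher, year + 1))
        if following is None:
            continue
        bf2, stat2 = following
        if bf1 >= min_bf and bf2 >= min_bf:
            pairs.append((stat1, stat2))
    return pairs


# Invented Cntc% data: pitcher A's 2007 season (90 BF) is too short to count.
seasons = {
    ("A", 2005): (600, 80), ("A", 2006): (650, 81), ("A", 2007): (90, 79),
    ("B", 2005): (300, 86), ("B", 2006): (320, 85),
    ("C", 2006): (150, 75), ("C", 2007): (200, 77),
}

pairs = season_pairs(seasons)
year_one = [p[0] for p in pairs]
year_two = [p[1] for p in pairs]
print(len(pairs), round(pearson(year_one, year_two), 3))
```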

The definition of each of the above stats can be found on each pitcher’s B-R page.  Just below the pitching stats line you’ll see Pitch Data Summary (Show or Hide).  Click on Show or Hide to see the data.  Below the data you will see the word Glossary; click on it to see the definitions.  I actually made up one of the above stats, Sw/In%.  Sw/In% is the percentage of strikes swung at by a batter that are put in play (a home run is counted as a ball in play in this instance).  Notice that seven of the 12 stats listed above have a correlation of at least 0.6.  Remember those stats.

Next, I needed to find out which stats correlate best with ERA.  To determine this, I decided to use the overall three-year data (filtering out any pitcher who faced fewer than 300 batters, which left me with 161 pitchers).  The results:

StI%        .430

Sw/In%   .421

Cntc%     .391

SO c%    .082

StL%       .046

StF%       -.122

P/PA        -.196

StS%       -.387

1st%        -.414

Strk %     -.427

K%          -.525

The correlations obviously aren’t as high for this set.  On the bright side, four of the stats I wanted to focus on based on the first chart each had a correlation of at least 0.4 (in absolute value).  The only outlier is 1st%.

I’m a big believer in the importance of striking out batters.  To this end, I decided to run a correlation of Pitch Data to K% (the percentage of batters faced who strike out).  This was performed in a similar fashion to the previous study.  The results:

StS%       .828

P/PA        .621

Strk %     .183

StF%       .174

1st%        .150

StL%       -.023

SO c%    -.119

ERA         -.525

Cntc%     -.859

Sw/In%   -.890

StI%        -.930 

No big surprises here.  One thing I’d like to point out.  The previous chart showed a relatively high correlation between 1st% and ERA, whereas this chart shows a low correlation between 1st% and K%.  I find it fascinating that there is such a low correlation between K% and 1st% – especially if you watch a lot of baseball and hear so many announcers talk about the importance of throwing strike one.  Because 1st% has a relatively low year-to-year correlation, I will not focus on its importance.

My opinion, based on these correlations, is that the pitch stats to focus on are:

StI%

Strk%

Sw/In%

Cntc%

StS%

So now that we know which stats we want to focus on, what do they mean in terms of Orioles pitchers?  This chart shows the ERA levels and expectancies at the various pitch data levels using three-year cumulative data.

StI%            Low         High        Median          AVG         <4.00       Between   >5.00

Lo                1.55         4.67         3.41                3.24         73%         23%            3%

Mid              1.99         6.46         4.19                4.19         38%         44%            17%

Hi                3.34         6.33         4.79                4.83         9%           63%            28%

The three-year average StI% (percentage of strikes thrown that are put into play) for the pitchers in this study is 31%, with 27% and 34% being at the extremes.  This chart tells us there’s a 73% chance that a pitcher with a 27% or lower StI% will finish with an ERA below 4.00, while there’s a 28% chance that a pitcher with a 34% or higher StI% will finish with an ERA above 5.00.   So how do the 2007 Orioles pitchers rate through Saturday? 

Ray              25%

Bedard          25%

Parrish          26%

Williamson    27%                   

Burres          29%

Walker          30%

Cabrera         31%

Bradford        33%

Guthrie         33%

Williams       33%                   

Trachsel        38%

Baez            41%
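The StI% table above can be rebuilt along these lines.  This is a sketch under assumptions, not my actual spreadsheet: the 27% and 34% cutoffs come from the text, but I’m assuming here that “Between” means an ERA from 4.00 through 5.00 and that AVG is a simple average of pitcher ERAs.  The sample pitchers are invented.

```python
from statistics import mean, median

LO_CUTOFF = 27.0  # StI% at or below this lands in the "Lo" group (from the text)
HI_CUTOFF = 34.0  # StI% at or above this lands in the "Hi" group (from the text)


def sti_group(sti):
    """Assign a pitcher to the Lo / Mid / Hi group by three-year StI%."""
    if sti <= LO_CUTOFF:
        return "Lo"
    if sti >= HI_CUTOFF:
        return "Hi"
    return "Mid"


def summarize(pitchers):
    """pitchers: list of (StI%, ERA) tuples -> one summary row per group."""
    groups = {"Lo": [], "Mid": [], "Hi": []}
    for sti, era in pitchers:
        groups[sti_group(sti)].append(era)

    table = {}
    for name, eras in groups.items():
        if not eras:
            continue  # skip empty groups rather than divide by zero
        n = len(eras)
        table[name] = {
            "Low": min(eras),
            "High": max(eras),
            "Median": median(eras),
            "AVG": round(mean(eras), 2),  # assumption: unweighted average
            "<4.00": sum(e < 4.00 for e in eras) / n,
            "Between": sum(4.00 <= e <= 5.00 for e in eras) / n,
            ">5.00": sum(e > 5.00 for e in eras) / n,
        }
    return table


# Invented (StI%, ERA) pairs, for illustration only.
sample = [(25.0, 3.10), (26.0, 4.20), (31.0, 4.50),
          (30.0, 3.80), (35.0, 5.40), (38.0, 4.70)]
table = summarize(sample)
for name, row in table.items():
    print(name, row)
```

The other four tables work the same way with different cutoffs, except that for Strk% and StS% the good group is the high one.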

Strk%              Low        High        Median          AVG         <4.00       Between   >5.00

Hi                    1.55        5.27         3.45                3.56         62%         31%            8%

Mid                  2.15        6.33         4.19                4.17         41%         43%            16%

Lo                    3.29        6.46         4.77                4.77         14%         59%            28%

The league average Strk% (percentage of pitches thrown that are strikes) for the pitchers in this study is 63%, with 66% and 60% being at the extremes.  This chart tells us there’s a 62% chance that a pitcher with a 66% or higher Strk% will finish with an ERA below 4.00, while there’s a 28% chance that a pitcher with a 60% or lower Strk% will finish with an ERA above 5.00.   Orioles pitchers in 2007:

Bradford        69%

Walker          67%                     

Guthrie          65%

Williams        65%

Bedard          64%

Ray               63%                     

Williamson    59%

Burres           59%

Cabrera        58%

Trachsel       57%

Parrish          56%

Baez             54%

Ouch.  This may be the single biggest stat that bothers me about Daniel Cabrera.  From what I’ve seen from eyeballing things, once a pitcher establishes himself as a sub 60% Strk% pitcher he typically stays there.  The only pitcher I’ve seen who has defied this is Randy Johnson. Of the 161 pitchers in the study, Bradford had the second highest Strk% (71%), while Cabrera had the 5th worst (58%). 

Sw/In%            Low     High         Median          AVG         <4.00       Between   >5.00

Lo                    1.55     5.18         3.45                3.32         73%         23%            3%

Mid                  2.62     6.46         4.25                4.24         38%         44%            17%

Hi                     3.39     6.33         4.78                4.80         9%           63%            28%

The three-year average Sw/In% for the pitchers in this study is 42%, with 37% and 47% being at the extremes.  This chart tells us there’s a 73% chance that a pitcher with a 37% or lower Sw/In% will finish with an ERA below 4.00, while there’s a 28% chance that a pitcher with a 47% or higher Sw/In% will finish with an ERA above 5.00.  

Parrish          34%

Ray               35%

Bedard          35%

Williamson    36%                     

Walker          39%

Burres           42%

Cabrera        44%

Guthrie          46%

Bradford        46%                     

Williams        48%

Baez             53%

Trachsel       57%

Trachsel had the third worst rate in the study.  This chart and the StI% chart are a couple of good examples of why I’m such a big fan of Erik Bedard. 
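Since Sw/In% is my own invention, here is a small sketch of how it can be computed next to Cntc%.  I’m assuming a swing at a strike ends one of three ways (a swinging strike, a foul, or a ball in play) and that those three counts are available; the function name and the counts below are hypothetical.

```python
def swing_rates(swinging, foul, in_play):
    """Return Sw/In% and Cntc% from raw counts of swing outcomes.

    Assumption: swings = swinging strikes + fouls + balls in play
    (home runs count as balls in play here).
    """
    swings = swinging + foul + in_play
    if swings == 0:
        raise ValueError("no swings recorded")
    return {
        # share of swings that put the ball in play
        "Sw/In%": 100 * in_play / swings,
        # share of swings on which the batter made any contact
        "Cntc%": 100 * (foul + in_play) / swings,
    }


# Hypothetical counts, for illustration only.
rates = swing_rates(swinging=180, foul=280, in_play=250)
print({name: round(value, 1) for name, value in rates.items()})
```

Because fouls count as contact but not as balls in play, Sw/In% can never be higher than Cntc% for the same pitcher.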

Cntc%            Low     High         Median          AVG         <4.00       Between   >5.00

Lo                    1.55     4.87         3.37                3.26         78%         22%            0%

Mid                  1.99     6.46         4.31                4.28         36%         46%            18%

Hi                     2.99     6.33         4.60                4.61         19%         56%            25%

The league average Cntc% (percentage of strikes thrown in which the batter makes contact) for the pitchers in this study is 80%, with 75% and 84% being at the extremes.  This chart tells us there’s a 78% chance that a pitcher with a 75% or lower Cntc% will finish with an ERA below 4.00, while there’s a 25% chance that a pitcher with an 84% or higher Cntc% will finish with an ERA above 5.00. 

  

Williamson    64%

Parrish          69%

Walker          74%

Ray               75%

Bedard          75%                     

Burres           78%

Cabrera        79%

Williams        80%

Baez             83%

Guthrie          83%                     

Bradford        85%

Trachsel       88%

One thing I like about this staff is that when they throw strikes they’re hard to hit.  Trachsel tied for the second highest rate in the study. 

StS%            Low        High        Median          AVG         <4.00       Between   >5.00

Hi                  1.55        5.28         3.40                3.42         71%         26%            3%

Mid               1.99        6.46         4.32                4.27         37%         45%            18%

Lo                 2.99        6.33         4.61                4.60         20%         56%            24%

The league average StS% (percentage of strikes thrown in which the batter swings and misses) for the pitchers in this study is 15%, with 18% and 12% being at the extremes.  This chart tells us there’s a 71% chance that a pitcher with an 18% or higher StS% will finish with an ERA below 4.00, while there’s a 24% chance that a pitcher with a 12% or lower StS% will finish with an ERA above 5.00.  I’ll admit that this is my favorite pitch data stat. 

Williamson    27%

Parrish          23%

Walker          20%

Bedard          18%

Ray               18%                     

Burres           15%

Cabrera        15%

Williams        14%

Wright           14%

Baez             14%                     

Guthrie          12%

Bradford        11%

Trachsel       8%

Trachsel tied for the lowest rate in the study.  Trachsel has obviously proven that a pitcher can still succeed while not doing well in pitch data stats.  The problem is that his margin for error is so much smaller than it is for other pitchers.


Tejada and Gibbons – Where’s the Power?

Posted by cookinwithgas on May 10, 2007

There’s been some discussion on the Orioles Hangout Message Board about the lack of power shown by Miguel Tejada and Jay Gibbons this season.  As with seemingly every such discussion these days, this one began with the question of whether (a lack of) steroids has played a role.  I can tell you that I’m neither qualified nor willing to answer that question.  One thing I do think I’m qualified to do, however, is to analyze a set of stats and try to gain some insight from that analysis.

I wrote this article towards the end of last season in which I talked about Tejada’s increasing tendency to hit more groundballs.  The fact is that the percentage of balls Tejada puts into play that are groundballs has increased every season since 2003 – from 44.04% in 2003 up to 52.59% this season.  This article received some criticism when it came out because many felt the reason for his increase in ground balls is because he’s being pitched differently due to a lack of protection from his teammates. 

What I didn’t say then that I should have said is that even if it is true that he’s being pitched differently, that’s not the point.  The point is that, for whatever reason, he’s putting the ball on the ground much more often than he’s putting it in the air – and the end result is fewer extra base hits.  Think about this – the Orioles cleanup hitter is one of the most prolific singles hitters in the game today.  Having a prolific singles hitter in the cleanup spot isn’t exactly a recipe for offensive success.  To Tejada’s credit, his propensity for hitting singles as opposed to home runs has not kept this from being his best season in terms of RC/27.

To the data.  I put together this file to show my work.  First, let me explain what I did.  Tejada has been pretty consistent – he has put 579, 593, 572, and 575 balls in play the last four seasons.  I wanted to normalize things, so I prorated all of his batted ball data to 600 balls in play.  I then put together the graphs to show his trends.  Yes, it would have been easier to just show the percentages, but I wanted to be different.

Check the top left graph.  On a normalized basis, he has hit 264, 279, 283, and 306 ground balls the last four years.  He is on pace to hit 316 ground balls per 600 balls in play this season.  He hasn’t been quite as consistent with line drives, although if you take out last season, he does have a consistent declining trend.
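The proration is simple enough to sketch in a few lines.  The function names are just for illustration; the 44.04% (2003) and 52.59% (2007) groundball rates are the ones quoted earlier in this article.

```python
NORMALIZED_BIP = 600  # every season is scaled to this many balls in play


def per_600(count, balls_in_play):
    """Scale a raw batted-ball count to a 600-balls-in-play season."""
    return count * NORMALIZED_BIP / balls_in_play


def per_600_from_pct(pct):
    """Same thing, starting from a percentage of balls in play."""
    return pct / 100 * NORMALIZED_BIP


gb_2003 = per_600_from_pct(44.04)
gb_2007 = per_600_from_pct(52.59)
print(round(gb_2003), round(gb_2007))
```

Those two results line up with the 264 and 316 ground balls per 600 balls in play shown in the graph.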

Fangraphs doesn’t separate outfield flyballs from infield flyballs, but I like to do so.  By not separating them, you get the bottom graph – which shows that he is reversing the declining flyball trend.  The problem is that he’s on pace to hit more infield flyballs per 600 balls in play – by a very wide margin.  That obviously isn’t a good thing. 

Look only at his outfield flyballs per 600 balls in play, and you see a trend you don’t want to see from your cleanup hitter.  The number of outfield flyballs he’s hit per 600 balls in play: 189, 179, 177, 149, and 129.  So even if 20% of all the outfield flyballs he hits become home runs, he’d end up with 26 home runs this year.  There are three problems.  One is that his best HR/OFFB rate over this time has been 19.2% (which would equate to 25 HR); another is that his combined rate the previous four seasons was only 16.5% (22 HR); the final problem is that his 2007 rate is only 8%, which would equate to only 10 HR.  If I had to guess, I’d say he’ll end up with right around 20 home runs.
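Here is the home run arithmetic as a quick sketch.  The 129 outfield flyballs and the four HR/OFFB rates are from the paragraph above; the function name and the labels are mine.

```python
OFFB_2007_PACE = 129  # outfield flyballs per 600 balls in play, 2007 pace


def projected_hr(outfield_flyballs, hr_per_offb):
    """Home runs implied by a flyball total and a HR-per-outfield-flyball rate."""
    return outfield_flyballs * hr_per_offb


scenarios = {
    "20% of OFFB leave the park": 0.20,
    "best single-season rate": 0.192,
    "previous four seasons combined": 0.165,
    "2007 so far": 0.08,
}

projections = {label: projected_hr(OFFB_2007_PACE, rate)
               for label, rate in scenarios.items()}
for label, hr in projections.items():
    print(f"{label}: {hr:.1f} HR")
```

Each scenario comes out within about one home run of the figure quoted above; any small gap comes from working with rounded inputs.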

As a Tejada and Orioles fan, I’m really hoping that OH’er Frobby is right in that Tejada’s a slow starter in the power department.  My fear is that he will continue to hit so many ground balls, and that his career high .377 BABIP (which is likely helped by his high number of ground balls, but is hurt by his relatively low LD%) will come back to earth.  If my fears come true, the Orioles may be wishing they had traded him while his value was still high.

Continue down on the PDF file, and you’ll see similar graphs for Gibbons – with his Balls In Play normalized to 550.

Thanks for reading.


The Orioles and LI

Posted by cookinwithgas on May 6, 2007

I have posted additional information on the stats page.  One addition is the 2007 wOBA leaders amongst qualified batters (through May 3).  The stats needed for the formula are from Baseball Prospectus. 

I have also added stats that show how Orioles pitchers and batters have done in the various levels of Leverage situations - Low, Medium, High, and Very High.  I’ve never seen them separated in this fashion.  Of course, that probably just means that I’m the only one who sees value in doing so.  We always hear about how certain batters do in clutch situations versus non-clutch.  I’m thinking this shows us just that.  Of course, I also have to give the standard sample size warning.

So what do the numbers tell us?  They’ve only had 31 opportunities in Very High Leverage situations, so I’ll skip those numbers initially.  Orioles batters actually perform better in Medium Leverage situations than they do in Low Leverage situations.  Their OPS by level (Low / Medium / High):

.637 / .781 / .689

So the jump in Medium Leverage situations is a big one (about 140 points of OPS), before they come back to earth somewhat in High Leverage situations.

One concern that I have is that their BB% and K% both go in the wrong direction as the situation gets tougher:

BB% – 8.7 / 7.1 / 5.2

K% – 15.0 / 15.8 / 24.3

It would be interesting to see how these numbers compare to league averages.

The team leader in OPS in Low Leverage situations is Freddie Bynum (who saw that coming?), who was helped immensely by his home run and only eight opportunities.  The true leader is Mora (with only a .770 OPS).  Brian Roberts walked 20% of the time in these situations.

Kevin Millar leads the team in Medium Leverage situations with a 1.289 OPS, not to mention that he walks 19% of the time.  Based on WPA, Millar counts for a win all by himself with .469 WPA points.

Miguel Tejada leads those with at least 10 opportunities with a .904 OPS in High Leverage situations – even though he has yet to walk in 16 plate appearances.  Patterson has really hurt the team in these situations (accounting for almost 1 loss all by himself).

Patterson, Roberts, Millar, Markakis, Gomez, and Gibbons have all come through in Very High Leverage situations.

I also track how the team performs in situations in which the LI was 2.0 or higher.  Through their first 30 games, the Orioles had 290 such situations – 145 offensively, 145 defensively.  This works out to 12% of all plays. 

Starting pitchers produced positive results in 40 of 62 opportunities.  They had an overall WPA of .645 (so they have been 1.29 wins over .500).  They’ve held opposing batters to a .598 OPS in these situations.  The downside is that opposing batters have walked 17% of the time in these situations.

Relievers produced positive results in only 47 of 83 opportunities.  They had an overall WPA of (-)1.136 (so they have been 2.72 wins below .500).  Opposing batters have hit to the tune of a .873 OPS (and .410 OBP) in these situations.  Ouch.

Position players produced positive results in 49 of 145 opportunities (34% success rate compared to 35% in other situations).  They had an overall WPA of .065 (so they have been 0.13 wins over .500).  They’ve posted a .771 OPS in these situations. 

Overall, they have a WPA of (-).543 in these situations.  The overall team WPA is about (-)1.000 (by virtue of having lost two more games than they’ve won), which means they have about a (-).457 WPA in sub 2.0 Leverage situations.
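The WPA bookkeeping above boils down to two small calculations, sketched here.  The convention is that a team’s WPA equals half of its wins minus losses, so doubling a WPA figure gives games relative to .500.  The function and variable names are only for illustration; the numbers are the ones quoted above.

```python
def games_vs_500(wpa):
    """Convert WPA to games over (+) or under (-) .500: wins - losses = 2 * WPA."""
    return 2 * wpa


starters = games_vs_500(0.645)          # starting pitchers, LI of 2.0 or higher
position_players = games_vs_500(0.065)  # position players, LI of 2.0 or higher

team_wpa = -1.000     # team has lost two more games than it has won
high_li_wpa = -0.543  # all plays with an LI of 2.0 or higher
low_li_wpa = team_wpa - high_li_wpa  # what is left for sub-2.0 situations

print(f"starters: {starters:+.2f} games vs .500")
print(f"position players: {position_players:+.2f} games vs .500")
print(f"sub-2.0 leverage WPA: {low_li_wpa:+.3f}")
```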

NOTE: The data needed for this article is from the great site, Fangraphs.


Posted by cookinwithgas on May 3, 2007

Peter Angelos, if you’re reading this, BLOW IT UP!  And I don’t just mean the team on the field.  Start at the relative top – with Mike Flanagan and Jim Duquette.  About the only member of the front office worth keeping is Jordan.  Replace Perlozzo, and the entire coaching staff.  Probably the only player I wouldn’t make available would be Markakis.

Unfortunately, none of those things will happen.  I get the feeling that if this team loses 90 games, everyone except maybe Perlozzo will keep his job (and it won’t surprise me if he returns), and the front office will feed us the same load of crap – “we need two big bats.”

End of rant – and yes, that felt better.

——————————————————————————————

I wrote another article for Orioles Hangout on WPA.  I love that stuff.  In fact, keep looking, and in a few days I’ll be posting Orioles-related information pertaining to WPA and Leverage Index.  I’ve been tracking the Orioles’ performance in each of the four levels of LI – Low, Medium, High, and Very High Leverage. 

I’m hoping to post this data in the stats section on Thursday.  I’ll provide some comment on it as well.
