06 August 2008

Adjusted ERA+ and leaders

The most important stat for determining the quality of a pitcher (at least over the course of a career) is ERA. There's really no argument that can be made against it. Wins are far too dependent on how the rest of the team (i.e., the batters) does; strikeouts certainly show domination, and career strikeout numbers show longevity as well. But in terms of doing his job, preventing runs, career ERA is the single number that shows how well this was accomplished. (Okay, perhaps ERA over a 5-10 year peak period might be better, so as not to diminish the careers of those who choose to keep playing into their "worse than average but far better than replacement" years.) The main problems with ERA are that it does not adjust for ballparks, leagues, mound heights, steroid usage of batters, or other factors that changed over the years.

Fortunately, the concept of adjusted ERA takes care of all this. And adjusted ERA+ makes reading adjusted ERA even easier. An aERA+ of 125 roughly means an ERA 25% lower than the league's, adjusting for park effects etc. (Okay, actually it's 20% lower, since aERA+ is 100 times the league ERA divided by the pitcher's ERA, so an aERA+ of 200 means an ERA 50% of that of the league.) An aERA+ of 125 also means you're likely to need to prepare a speech for Cooperstown. The vast majority of pitchers above 125 and almost all eligible starters above 130 are in the Hall.
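To make that arithmetic concrete, here is a minimal sketch in Python. It assumes the usual definition, aERA+ = 100 × (league ERA / pitcher ERA), with the park adjustment already folded into the league figure. The function name is made up for illustration.

```python
def era_fraction_of_league(aera_plus):
    """Pitcher ERA as a fraction of the (park-adjusted) league ERA.

    Assumes aERA+ = 100 * league_ERA / pitcher_ERA.
    """
    return 100.0 / aera_plus

# How far below league ERA does each aERA+ put a pitcher?
for plus in (100, 125, 200):
    frac = era_fraction_of_league(plus)
    print(f"aERA+ {plus}: ERA is {1 - frac:.0%} below league")
```

Under that assumption, 125 works out to 20% below league and 200 to 50% below.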

For a long time, there was a big gap between aERA+ #1 and #2, with Pedro Martinez's 160 towering over Lefty Grove's 148. This disappointing season has lowered Pedro to a mere 157, but it would take more than a few years of (relative) mediocrity to drop him below Grove or Walter Johnson. (Next on the list, for those curious, is Dan Quisenberry, which is probably the best Hall of Fame case for Quisenberry that could ever be made.)

But why bring all this up now? Because despite Martinez's struggles, there's now (as of last week) a much bigger gap between #1 and #2:
  1. Mariano Rivera   197
  2. Pedro Martinez   157

What happened? Rivera pitched his 1000th inning last week, making him eligible for career ERA awards. Perhaps the IP requirement should be raised so that the list isn't dominated by closers. But maybe that's not necessary. With the development of the minor-league (or even high school) relief specialist, fewer and fewer will reach this plateau with each passing year. Bochy and Bud Black's cautious use of Trevor Hoffman since he returned from surgery six years ago has kept him from breaking the 1000 IP mark. As of this evening, he's 22 IP short. Since he's been averaging .35 IP per game over the past four seasons, he's likely to end the season (and possibly his career? the fan in me hopes not) about five innings short. His aERA+ of 145 would place him 9th or 10th all time. Prior to this disastrous season, he would have been third.

(For the complete list, see Baseball-Reference.com)

The only other reliever on the ERA title horizon is Billy Wagner, whose aERA+ of 180 would place him comfortably in second, though still at a great distance below Rivera. However, with only 819 IP, he will likely need three more seasons to qualify. And anything could happen by then. Troy Percival, with an aERA+ of 151 but only 687 IP, averaging fewer than 50 IP/year at age 38, seems highly unlikely to qualify.

Of the other active high-saves players, Roberto Hernandez already qualifies at 131, a Hall of Fame number for a starter, but unlikely to get noticed for a reliever with "only" 326 saves. Jose Mesa is what we always thought him to be, exactly an average pitcher (aERA+ of 100), and thus probably a below-average reliever. Todd Jones is a little better, but nothing to write to Cooperstown about. Ditto Jason Isringhausen, 100 IP short; though kudos to him for having such dreadful recent performances that his manager is bringing back the (great) idea of a closer by committee.

And as for the brilliant young closers of today? Yes, very exciting. Yes, I know Papelbon has an aERA+ of 269(!); K-Rod at 186; Nathan at 152; Lidge at 147. But for an article on a stat that requires 1000 IP, please ring again when there are just a few hundred innings left to go. Why so pessimistic? Robb Nen, John Wetteland, Randy Myers, Tom Henke, Jeff Montgomery, Rod Beck, Ugueth Urbina, and (ah heck, throw in the newest member of the fraternity) Eric Gagne. If these names mean something to you, you'll understand.


04 August 2008

Hate the stat, love the statter

I spent part of the last week of July reading the various, mostly extremely short, obits of Jerome Holtzman, sportswriter extraordinaire (and the Chicago Cubs follower who did not incite the city of Sandy Eggo to riot by calling the selling of sushi and the installation of baby-changing stations in men's rooms at S.D.'s ballpark "the beginning of the decline of America"; q.v. one M. Royko). Throughout all the obits, the discussion centered on Holtzman's invention of the "save" as a baseball statistic.
(N.Y. Times obit.)

What puzzled me the most is, with the notable exception of The Hardball Times, how often commentators were blasting Holtzman for what a bad statistic the save is. To be sure, it is a bad statistic: it rewards (in the end, financially) players for participating in one facet of the game that has never been shown to be more significant than several others. It gives the same benefit to pitchers who bail their team out of the toughest of all situations as to closers who record a few outs with a pretty sizable lead. But consider again what the save replaced: a world where relief pitching had no worth on paper (and a world where the "Win", another terrible stat, was even more important). Also consider what it meant in the early '60s to introduce a new stat into the world. Computing saves retroactively for every baseball player was not as simple as a few lines of Perl and a download from Retrosheet. And the tinkering that Holtzman and others applied to the save suggests that he did not believe he had discovered the perfect, never-to-be-supplanted statistic. Admire what he did for the time in which he lived.

No, marshal your scorn for those who with the hindsight of time and better access to information still defend the save. Those who cling to a bad idea are far worse than those who throw the idea out to the marketplace in the first place.

P.S. I wish there were a stat with more sophistication than Ari Kaplan's "Fan Save Value", but with fewer lookup charts than his more accurate "Save Value." The former is not much of an improvement over the Save (and couldn't easily translate to a generalized "relief value"), while the latter has so many "hidden" constants relating to expected runs (which aren't really constant, but actually change from year to year) that I can't see it actually being adopted.

Here's a simple(r) formula for calculating expected runs, that you can carry around with you:

ER = (5 + total_runners + 3 * total_bases_occupied) * outs_left / 30

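Total runners is the number of men on base. Total bases occupied simply means the sum of the base numbers that have runners on them, so runners on second and third give 2 + 3 = 5. Outs left is 3 minus the current number of outs, so no outs means 3 outs left.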
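For anyone who would rather let a computer do the carrying around, the formula can be sketched in a few lines of Python. The function and argument names are my own choices for illustration; the arithmetic is just the formula above.

```python
def expected_runs(outs, bases):
    """Rule-of-thumb expected runs for the rest of the inning.

    outs  -- outs already recorded (0, 1, or 2)
    bases -- occupied bases, e.g. [2, 3] for runners on second and third
    """
    total_runners = len(bases)
    total_bases_occupied = sum(bases)
    outs_left = 3 - outs
    return (5 + total_runners + 3 * total_bases_occupied) * outs_left / 30

print(round(expected_runs(0, []), 2))         # bases empty, no outs
print(round(expected_runs(0, [1, 2, 3]), 2))  # bases loaded, no outs
print(round(expected_runs(2, [2, 3]), 2))     # second and third, two outs
```

Passing the bases as a list keeps both inputs (how many runners, and the sum of their base numbers) in one argument.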

Here's how this version of expected runs compares to the standard expected runs chart:

Outs  1B  2B  3B  My ERs  Table  Difference
0     0   0   0   0.50    0.54   -0.04
0     0   0   1   1.50    1.46    0.04
0     0   1   0   1.20    1.17    0.03
0     0   1   1   2.20    2.14    0.06
0     1   0   0   0.90    0.93   -0.03
0     1   0   1   1.90    1.86    0.04
0     1   1   0   1.60    1.49    0.11
0     1   1   1   2.60    2.27    0.33
1     0   0   0   0.33    0.29    0.04
1     0   0   1   1.00    0.98    0.02
1     0   1   0   0.80    0.71    0.09
1     0   1   1   1.47    1.47    0.00
1     1   0   0   0.60    0.55    0.05
1     1   0   1   1.27    1.24    0.03
1     1   1   0   1.07    0.97    0.10
1     1   1   1   1.73    1.60    0.13
2     0   0   0   0.17    0.11    0.06
2     0   0   1   0.50    0.38    0.12
2     0   1   0   0.40    0.34    0.06
2     0   1   1   0.73    0.63    0.10
2     1   0   0   0.30    0.25    0.05
2     1   0   1   0.63    0.54    0.09
2     1   1   0   0.53    0.46    0.07
2     1   1   1   0.87    0.82    0.05

As you can see, the formula slightly over-predicts expected runs (though with the important exception of the most common occurrence, no outs, no one on, which almost balances out the rest of the error). The only case where it's more than 13 hundredths of a run off is the rare case of no outs, bases loaded, which it overestimates by 1/3 of a run. If a formula is to overestimate any situation, I'm happy with it being this rare situation: an "oh shit!" moment for any incoming reliever. In any case, it's an easier formula to remember than 24 "random" numbers.


30 May 2008

Are bullpens underused?

This post was mostly written during Spring Training, so 2007 figures are used throughout. Life prevented posting until now.

Back in the "good old days" of baseball, bullpens were nearly non-existent. Two-man rotations were common, 600-inning seasons were possible, and one man pitched every inning of an entire major league season (Wondering who? See below). Since then, we've created four-man rotations, dedicated bullpens, closers, five-man rotations, long relievers, setup men, and, coming soon to a ballpark near you, the seventh-inning specialist. And throughout all this, grumpy old men--along with grumpy young men, grumpy old women, grumpy young women, and grumpy transsexuals of all ages--have decried the changes as a weakening of the quality of starting pitching.

But is it possible that everyone is wrong? Could the problems with pitching be traced to an underuse of those 6 to 8 "guns" not considered durable enough to start a game? Let's consider some baseball axioms before we look further. I like math, so I'll use some weird symbols, but try to explain them.

Axiom 1: ∂(ERA)/∂(IP) > 0

That is just to say that we expect ERA to rise for every additional inning pitched. I'll admit that this axiom might not hold for a former AA-pitcher getting his first few innings in at the major league level, or for someone coming back just off the disabled list. But for the most part, it's hard to disagree with, whether we're dealing with innings pitched within a game or innings pitched over a season.

Axiom 2: ERA(closer) < ERA(ace) ; ERA(setup man) < ERA(#2 starter) ; etc...

The ERA of your relief squad tends to be lower than that of the starting pitchers. Or to put it another way, if you only had to put your best pitcher in for one inning of work, he would almost certainly come from the bullpen. Compare the best reliever to the best starter on almost every club and you'll see that the best reliever comes out ahead. Usually even the best two or three relievers have better ERAs than aces--this holds true for great teams and miserable ones. In 2007, the Detroit Tigers had three relievers log more than 40 innings with a lower ERA than their ace, Justin Verlander, and two more with a lower ERA than their second-best pitcher to log at least 15 starts. The Padres (the team I know best): Even a triple-crown-winning Cy Young pitcher (Peavy) had an ERA bested by a setup man, Heath Bell (with 93 2/3 IP, not a small sample); and four more relief pitchers (Hoffman, Brocail, K. Cameron, and Justin Hampson) bested Chris Young, their second-best pitcher. We see the same patterns even for lousy teams: four Royals relievers outpitched Gil Meche's 3.67 ERA.

Compare the ERAs of bullpens as of mid-2007 to this analysis of starters at the end of the season. The average bullpen is about as good as the average no. 2 starter! In other words, it does not really matter if by the sixth inning your starter is still "feeling good." Unless he's your ace, or throwing a shutout or no hitter, or your bullpen is absolutely drained from a recent 18-inning game, it's time to call in some new arms. Do so and your expected chance of winning just went up.

All else being equal, every inning that you have someone on the mound with a higher ERA than someone else you could put out there is an inning of poor managing. All else being equal, bullpens should be used more until their ERAs rise to meet that of the starters.

Now here's the argument I'm ready to hear: all else is not equal. Not every inning is as important as every other. That's definitely true! The best relievers pitch in the most important situations: with the game close, and a win on the line if only they can keep from allowing any runs. So it makes sense that you want some of your lowest-ERA men pitching then. But when do starters begin pitching? They begin with the game tied, where any run allowed or not makes a huge difference in the probability of winning or losing the game -- nearly as important a situation as what setup men and closers pitch in. But there are several members of the bullpen who tend to pitch in less important innings; that is, blowouts in either direction. So if anything, the ERAs of bullpens should be substantially higher than those of starters. That they are not shows that bullpens are being over-rested and underused.

Trivia answer: Jim Devlin of the 1877 Louisville Grays pitched every inning in a 61 game season, compiling 559 innings pitched and allowing a total of 4 HRs. Though his ERA was a very good 2.25 (146 ERA+), it says something about the way official scorers have changed over the years: though he allowed only 140 earned runs, he allowed a total of 288 runs, or 8 more unearned runs than earned.

(Before I'm accused of copying this trivia information from Wikipedia, be sure to check who added it there in the first place).


Context and baseball statistics

From this week's MLB Power Rankings on ESPN.com:
In May, Hideki Matsui has more multi-hit games (nine) than he has games in which he hasn't gotten a hit (five)

We're supposed to take this to mean that Hideki Matsui is doing great this month. But what does it actually mean? How many hitters would we expect to have more multi-hit games than no-hit games? 1%? 10%? Stats like this out of context drive me crazy. It's pretty easy to figure it out, at least for simple cases.

The probability of not getting a hit in any at bat is (1 - Batting Average), so the probability of going 4 at bats without a hit is (1-BA) to the 4th power. Here's a little program (in Python) for figuring this out:

def noHit(BA):
    # chance of going hitless in four at bats
    return (1 - BA)**4

>>> [round(noHit(BA), 2) for BA in (.250, .300, .400)]
[0.32, 0.24, 0.13]

So we can see that even for an average player, a no-hit game happens in only about 1 of every 3 games, and for a good batter, or a good batter on a real roll, these things happen rather seldom. What about multi-hit games? If you have four at bats per game, there are 11 different ways of getting two or more hits (one way of getting four hits, 4 of 3, and 6 of 2). In the chart below, x = hit, and o = not hit:

xxxx = four hits

oxxx = three hits
xoxx
xxox
xxxo

xxoo = two hits
xoxo
xoox
oxxo
oxox
ooxx

xooo = one hit
oxoo
ooxo
ooox

oooo = no hits
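The chart can also be generated rather than typed. This is a small sketch using Python's standard library; it only counts the patterns, and makes no claim that the patterns are equally likely (they aren't, unless the batter happens to hit exactly .500).

```python
from itertools import product
from collections import Counter

# Every possible 4-at-bat game: 'x' = hit, 'o' = no hit (2**4 = 16 games)
games = ["".join(g) for g in product("xo", repeat=4)]

# Group the games by how many hits they contain
ways = Counter(g.count("x") for g in games)

for hits in range(4, -1, -1):
    print(hits, "hits:", ways[hits], "ways")

# Multi-hit means two or more hits
multi_hit_ways = sum(n for hits, n in ways.items() if hits >= 2)
print("multi-hit games:", multi_hit_ways)
```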

(For those interested, the number of ways of getting 4, 3, 2, 1, 0 hits, that is, 1, 4, 6, 4, 1, is the fourth row of Pascal's triangle.) So one way of calculating the probability of a multi-hit game is to find the probability of a single-hit game (4*BA*(1-BA)^3), add to it the probability of a no-hit game, and subtract the sum from 1:

def multiHit(BA):
    # 1 minus the chance of a no-hit game or a one-hit game
    return 1 - (noHit(BA) + 4*BA*(1 - BA)**3)

>>> [round(multiHit(BA), 2) for BA in (.250, .300, .400)]
[0.26, 0.35, 0.52]

So as long as you're getting 4 ABs per game, you don't really need to be a great hitter to expect to get more multi-hit games than no-hit games. In fact, a BA of just .267 will do it for you. Returning to the original post, we see that Matsui is doing better than 1:1 in May, getting a 9:5 Multi:No ratio. How good do you have to be to get that? A BA of about .320 will suffice. Matsui's BA has actually been slightly better than that in May, .337, but he's also been getting just under 4 ABs a game (3.83).

4 ABs a game (or even 3.8) is really only manageable if you're not having many plate appearances that don't count as at bats -- in other words, if you're not walking much. In April, Matsui was averaging only 3.1 ABs per game. This makes it much harder to have so many multi-hit games: if you have 3 ABs per game, you need to be batting above .348 to get more multi-hit games than no-hit games, and to have a whopping .411 to match Matsui's ratio.
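Those break-even averages can be found by brute force. The sketch below is one way to do it, not necessarily how the figures above were produced: it treats every at bat as an independent trial with success chance BA (the same simplification used throughout this post) and bisects on BA. The function names and the whole-number at-bat counts are assumptions for illustration.

```python
def hit_probs(ba, at_bats):
    """Return (P(no hits), P(two or more hits)) for one game.

    Assumes each at bat is an independent trial with success chance `ba`.
    """
    p_none = (1 - ba) ** at_bats
    p_one = at_bats * ba * (1 - ba) ** (at_bats - 1)
    return p_none, 1 - p_none - p_one

def ba_needed(ratio, at_bats, tol=1e-6):
    """Smallest BA whose multi-hit : no-hit ratio reaches `ratio` (bisection)."""
    lo, hi = 0.0, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2
        p_none, p_multi = hit_probs(mid, at_bats)
        if p_multi >= ratio * p_none:
            hi = mid   # mid is good enough; look lower
        else:
            lo = mid   # mid falls short; look higher
    return hi

for at_bats in (4, 3):
    print(at_bats, "ABs: 1:1 needs", round(ba_needed(1.0, at_bats), 3),
          "and 9:5 needs", round(ba_needed(9 / 5, at_bats), 3))
```

Bisection works here because a higher BA always raises the chance of a multi-hit game and lowers the chance of a hitless one, so the ratio only moves in one direction. The results should land within a point or so of the averages quoted above, depending on rounding.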

Looking closely at the numbers, there's much less to cheer about for fans of Godzilla: the rise in multi-hit games came almost entirely from a drop in walks (from 12 to 7), resulting in an OBP 40 points lower in May than in April. He did not make up the difference in SLG either, dropping 50 points there. So upon closer inspection, the numbers tell an entirely different story: in everything but BA, Matsui had a May that was worse than his April and below his career levels.


01 April 2008

The sort of mixed metaphors I love

"You talk about just inside! That was about a hemidemisemiquaver inside."
-- Jerry Coleman (Hall of Fame Broadcaster, 83 years old)
