Tuesday, January 8, 2013

How Do You Adjust for an Environment of Cheating?

Continuing with my Hall of Fame theme from today, this post will consider how one should go about mentally adjusting the numbers put up by different kinds of baseball players who competed in an era featuring widespread cheating, like the Steroid Era of the 1990s and early 2000s that most of the new players on this year's Hall of Fame ballot hail from. Here's the simple model of the environment of cheating that I'll be using: suppose that, at a certain point, a substantial portion of the hitters in the league start using some illegal substance that makes them better hitters. Nothing else changes, which is one difference between this model and the actual Steroid Era, when quite a few pitchers used steroids too. So, how should we go about correcting for this cheating, if we want to try to figure out what the "true talent" landscape was like? Let's look at it from various standpoints.

The Cheating Hitter
It's more or less obvious how we should treat the numbers put up by those hitters who were cheating: we reduce them somewhat. How much, or in what way, might not be obvious, but the basic idea is that we should penalize them a bit to take back what they improperly gained. Simple enough.

The Non-Cheating Hitter
Likewise, it seems sensible that the proper approach to those hitters who didn't cheat is just to leave their numbers alone. The one caveat is that, without correcting for the cheating, we would view most of the clean hitters as having been overshadowed by the cheaters; once we do correct for it, we should value their numbers a little more highly, because the people overshadowing them weren't coming by it honestly. And given the prevalence of adjusted-for-league-average stats, that's important: the league average for which those numbers are adjusted is itself skewed! Take John Olerud, for example, a presumptively-clean player whose career, from 1989 to 2005, coincided almost exactly with the Steroid Era. His .295/.398/.465 slash line, that of a truly great hitter who didn't specialize in hitting home runs, only got him a 129 OPS+ during his actual career, but players with very similar raw OPS numbers, around .863, in 2012 had more like a 140 OPS+. If we pretend that steroids never entered the league, in other words, a clean player like Olerud would look a lot better, not because his numbers got any better but because the frame of reference got uninflated. (Likewise, a .246/.289/.310 line from a defensively-gifted shortstop in today's game, say, Brendan Ryan, would look a lot better than the same line did when Rey Ordonez put it up during the height of the Steroid Era.)
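To make the frame-of-reference point concrete, here is a minimal sketch in Python. It uses the common simplified OPS+ formula, 100 * (OBP/lgOBP + SLG/lgSLG - 1), and leaves out the park adjustment that the real stat includes. Olerud's on-base and slugging figures come from the paragraph above; the two sets of league averages are made-up numbers chosen only to show the effect, not actual league data.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """Simplified OPS+ with no park adjustment."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# Olerud's career on-base and slugging percentages, from the text above.
OLERUD_OBP, OLERUD_SLG = 0.398, 0.465

# Hypothetical league baselines (assumptions, not real league averages).
inflated_league = {"lg_obp": 0.340, "lg_slg": 0.430}
deflated_league = {"lg_obp": 0.320, "lg_slg": 0.400}

inflated = ops_plus(OLERUD_OBP, OLERUD_SLG, **inflated_league)
deflated = ops_plus(OLERUD_OBP, OLERUD_SLG, **deflated_league)

print(f"vs. inflated league: {inflated:.0f}")
print(f"vs. deflated league: {deflated:.0f}")
```

The hitter's line is identical in both calls; only the baseline moves, and the adjusted number moves with it.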

The Pitcher
The correct thing to do in analyzing pitchers from such an era is to give them a little more credit than their basic numbers would suggest. After all, forces completely beyond their control caused them to face unfairly good hitters, which should unfairly depress their numbers, so they deserve to have those numbers reflated. All their numbers, that is, except league-adjusted ones. All the pitchers in the league had to face the same batch of cheating hitters, after all. So Pedro Martinez's 154 ERA+, the best ever for a starting pitcher, is completely legitimate and should be taken at face value (assuming we don't think he was on steroids himself, which I think we don't), despite the fact that his actual ERA is higher than that of, say, Andy Messersmith.
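The same idea for pitchers, as a rough sketch: a simplified ERA+ of 100 * lgERA / ERA, again ignoring park adjustments. The ERAs and league averages below are hypothetical round numbers, not Martinez's or Messersmith's actual figures; they only show how a higher raw ERA can still be the better league-adjusted mark.

```python
def era_plus(era, lg_era):
    """Simplified ERA+ with no park adjustment."""
    return 100 * lg_era / era

# Hypothetical values (assumptions, not real player or league data).
high_offense = era_plus(era=3.00, lg_era=4.50)  # pitcher in an inflated run environment
low_offense = era_plus(era=2.80, lg_era=3.50)   # lower raw ERA in a low-run environment

print(f"3.00 ERA in a 4.50 league: {high_offense:.0f}")
print(f"2.80 ERA in a 3.50 league: {low_offense:.0f}")
```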

Part of the moral that I'm trying to get at in this post is that it's very easy to adjust for league average, but that doesn't always tell you the whole story. The 1960s, for instance, were a very low-run environment, so pitchers' numbers get adjusted downward and hitters' numbers upward. If that low-run environment was just caused by something in the air, or, more plausibly, in the baseball, then that makes sense, and giving pitchers a bit more credit and hitters a bit less after the 1969 lowering of the mound is quite proper. But if the dominance of pitching came about because there happened to be a glut of really good pitchers all at the same time, then we shouldn't really be penalizing all these great pitchers for happening to hit the league at the same time as one another. We should, however, continue to give the hitters bonus points for having had to face such tough pitching. Conversely, if the true cause was simply that all the hitters were kind of lousy, then we shouldn't be giving the hitters any credit and we are quite right to exercise skepticism regarding the pitchers' performances during this era. Simply adjusting for league average can't tease this out. (That's part of my theory on why Babe Ruth isn't the best player ever: by being the first person to figure out that it was a good idea to hit lots of home runs, he earned himself quite a few years of playing in a league of comparatively mediocre hitters, which artificially inflates his adjusted stats.)
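A toy model shows why the league average alone can't distinguish these stories. Everything here is an assumption made up for illustration: the multiplicative formula, the 4.5 runs-per-game baseline, and the talent multipliers are not an established sabermetric method.

```python
def league_runs(environment, hitter_talent, pitcher_talent, baseline=4.5):
    """Toy model: league scoring rises with the run environment and
    hitter talent, and falls with pitcher talent."""
    return baseline * environment * hitter_talent / pitcher_talent

# Three hypothetical explanations for the same low-scoring league.
scenarios = {
    "dead ball, average talent": league_runs(0.8, 1.0, 1.0),
    "glut of great pitchers": league_runs(1.0, 1.0, 1.25),
    "lousy hitters": league_runs(1.0, 0.8, 1.0),
}

for name, runs in scenarios.items():
    print(f"{name}: {runs:.2f} runs per game")
```

All three scenarios produce the same league scoring level, so any stat adjusted only against that level treats them identically, even though each one calls for a different correction.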

This kind of analysis, by the way, is basically why I think Ken Griffey, Jr. will be inducted into the Hall of Fame unanimously on his first ballot. He clearly deserves it on the merits, and because he's the only one of the mid-1990s sluggers who people think was squeaky clean, I think the voters will all just be so damn eager to have someone they can vote for that everyone will vote for him. The only problem might be the idiots who simply refuse to vote for anyone who played in the Steroid Era. (Seriously, can you believe how awful some of these arguments are?)
