Sunday, April 7, 2013

Something is Wrong with Relief Pitcher WAR

In 2010, both Brian Matusz of the Baltimore Orioles and Neftali Feliz of the Texas Rangers were rookie-eligible pitchers. Matusz started 32 games for the Orioles, pitching 175.2 innings (around 5.5 innings per game), won 10 games and lost 12. He gave up 88 runs, 84 of them earned, for an ERA of 4.30, slightly below league average. He struck out 143 (7.3 per 9 innings), walked 63 (3.2/9 IP), and allowed 19 home runs (1.0 per 9), numbers good for a 4.05 FIP. Feliz, meanwhile, appeared in 70 games, all in relief, 59 of which he finished (which led the league). He pitched 69.1 innings, had a 4-3 record, and racked up 40 saves, third in the league behind Rafael Soriano and Joakim Soria. He allowed 21 runs, all earned, for an ERA of 2.73 and an adjusted ERA+ of 165. He struck out 71, walked only 18, and allowed just 5 home runs, for per-nine-inning rates of 9.2 K, 2.3 BB, and 0.6 HR and a 2.96 FIP. Oh, and he did all this in Arlington, a great hitter's environment (which is accounted for in the ERA+ number but nothing else). Those numbers, apparently, gave Brian Matusz, slightly-below-average not-particularly-innings-eating starter, 2.8 fWAR and 3.0 bWAR.* Dominant closer Neftali Feliz, meanwhile, got 2.0 fWAR and 2.3 bWAR.
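For anyone who wants to check the rate stats above, here is a minimal Python sketch. The helper names (`innings`, `per_nine`) are my own, and the only assumption is the usual box-score convention that "175.2" means 175 and two-thirds innings. FIP is left out because it needs inputs (hit batters, a league constant) that aren't quoted here.

```python
def innings(ip_notation: str) -> float:
    """Convert box-score innings such as '175.2' (175 and 2/3) to a decimal."""
    whole, _, thirds = ip_notation.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def per_nine(count: int, ip: float) -> float:
    """Scale a counting stat to a per-nine-innings rate."""
    return 9 * count / ip


matusz_ip = innings("175.2")
feliz_ip = innings("69.1")

# ERA is simply earned runs per nine innings.
print(f"Matusz ERA: {per_nine(84, matusz_ip):.2f}")  # 4.30
print(f"Feliz ERA:  {per_nine(21, feliz_ip):.2f}")   # 2.73

# The same helper gives strikeout, walk, and home run rates.
print(f"Matusz K/9: {per_nine(143, matusz_ip):.1f}")
print(f"Feliz BB/9: {per_nine(18, feliz_ip):.1f}")
```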

This can't be right, can it? Is it really true that a mediocre starter is more valuable than a great closer? This is an incredibly counter-intuitive result, so we should be skeptical of it. Of course, its counter-intuitive nature doesn't mean we should dismiss it out of hand; for a lot of people it's counter-intuitive that a solid defender at a medium defensive position like second base or center field who draws a lot of walks can be more valuable than the guy who leads the league in home runs and RBI but plays left field, badly, can't run, and draws about a dozen walks a year. But it's true, and the Wins Above Replacement statistic basically forces you to take all those other things besides the flashy power into account. So maybe it is true that Matusz was better than Feliz in 2010, or more valuable anyway, despite how weird that feels to me. But boy does it ever feel weird. And it's not as though I can't back up that feeling with objective numerical analysis.



Consider Win Probability Added. Using the pretty rigorously defined Win Expectancies for every possible state of a baseball game, in terms of run differential, inning, outs, and bases occupied, this stat tells you how much each event of a game shifted the odds (under neutral assumptions about the skills of the various players) on the game's ultimate outcome. It's a "story" statistic, in that it tells you how the game progressed but isn't necessarily great for evaluating players overall. Batters, for instance, don't really control whether they come up with the bases loaded or the bases empty, in an extremely important spot or a meaningless one. A starting pitcher might pitch seven scoreless innings and, if his offense is also getting blanked, rack up a massive WPA, but if his team puts up seven runs for him he'll barely be adding anything. It's the exact same individual performance, but in a different narrative context. We do tend to view the shutout when you're dueling a shutout from the opposing pitcher as more important than the shutout when you've got a huge lead, but it doesn't make sense to view that added importance as reflecting the pitcher's own skill level or anything.
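To make the bookkeeping concrete, here is a rough Python sketch of how WPA accumulates. The function name and the win expectancy values are invented for illustration; real values come from a win expectancy table, which I'm not reproducing here.

```python
from collections import defaultdict


def win_probability_added(events):
    """Sum each player's change in win expectancy.

    `events` is a list of (player, we_before, we_after) tuples, with win
    expectancy measured from that player's own team's point of view.
    """
    totals = defaultdict(float)
    for player, we_before, we_after in events:
        totals[player] += we_after - we_before
    return dict(totals)


# Made-up numbers: two scoreless innings in a tie game move the needle a lot...
tight_game = [("starter", 0.50, 0.58), ("starter", 0.58, 0.66)]
# ...while the same two innings with a big lead barely move it at all.
blowout = [("starter", 0.95, 0.96), ("starter", 0.96, 0.97)]

print(win_probability_added(tight_game))
print(win_probability_added(blowout))
```

Same pitching, very different totals, which is exactly the "narrative context" problem.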

Except for relief pitchers. Relief pitching is, after all, an intimately narrative occupation. You, the relief pitcher, are brought into the story at a particular point, deliberately, because your manager thinks you are the right person to be pitching at that moment in the story. Maybe you're a lefty specialist, and the moment is that it's the eighth inning with the other team's slugging lefty clean-up hitter coming up with two outs and runners on. Or maybe you're the closer, and the moment in the story is the conclusion (hopefully). Relief pitchers do control the circumstances under which they pitch, because their qualities as a pitcher determine under which circumstances they're brought in to pitch. So, let's evaluate Feliz and Matusz in 2010 using Win Probability Added to give full credit to Feliz for his place in the narrative of each game.

Matusz added 0.6 WPA over the course of 2010. That's better than nothing, obviously, but it's only slightly more than the WPA necessary for your team to win one game. (That's rigorously defined at 0.5, since each team starts at a 50% Win Expectancy and, if they win, ends up at 100%.) If we control for the vicissitudes of happenstance by dividing the WPA from each at-bat by the Leverage Index of the at-bat, a metric of the relative importance of each play in a game, Matusz' performance gets a little bit better, at 0.8 WPA/LI for the year. Feliz, however, generated 3.5 WPA on the year. That's the rough equivalent of 7 wins. Even if we mod out by Leverage Index, Feliz still registers at 1.9, more than twice what Matusz had. With the understanding of WPA as a "story" stat, we can say with a fair amount of confidence that Neftali Feliz did more to improve his team's story in 2010 than Brian Matusz did. A lot more.
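The arithmetic in that paragraph can be sketched in a few lines of Python. The function names are mine, and the play-by-play list at the bottom is invented purely to show the per-play division; only the season totals come from the numbers quoted above.

```python
WPA_PER_WIN = 0.5  # a win moves a team from 50% to 100% win expectancy


def wins_equivalent(wpa: float) -> float:
    """Express a WPA total in units of whole-game wins."""
    return wpa / WPA_PER_WIN


def wpa_per_li(plays) -> float:
    """Divide each play's WPA by its Leverage Index, then add them up.

    `plays` is an iterable of (wpa, leverage_index) pairs.
    """
    return sum(wpa / li for wpa, li in plays)


print(wins_equivalent(3.5))  # Feliz: 7.0
print(wins_equivalent(0.6))  # Matusz: about 1.2
print(round(1.9 / 0.8, 2))   # Feliz's WPA/LI relative to Matusz's

# Hypothetical plays: a big high-leverage out, a low-stakes walk, a quiet out.
sample_plays = [(0.10, 2.0), (-0.05, 1.0), (0.02, 0.5)]
print(round(wpa_per_li(sample_plays), 3))
```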

Why, then, does Matusz get so much more WAR than Feliz? Simple: he pitched more innings. The basic approach to pitching Wins Above Replacement is to take the number of innings a player pitched, see how many runs he gave up over those innings (or, for Fangraphs, how many runs you would've expected him to give up based on his strikeouts, walks, and home runs allowed), work out how many runs a replacement-level pitcher would've given up over those same innings, and, well, subtract. That makes an awful lot of sense for starters. It has the consequence of denigrating the very concept of relief pitching, because relievers simply don't pitch very many innings. The very, very best years of relief pitching ever come out a little below the nominal All-Star level for a starter. I don't know exactly how Fangraphs does it, but I do know that Baseball-Reference makes some effort to adjust their relief pitcher WAR for the leverage of the spots they get used in. It doesn't seem to be sufficient. One way or another, it's impossible for a dominant, dominant relief pitcher to register as more valuable than a slightly-above-average starter. This doesn't seem right, and Win Probability Added seems to suggest that it isn't right.
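Here is a deliberately simplified Python sketch of that "take the innings and subtract" logic. It is not how Fangraphs or Baseball-Reference actually compute WAR: it ignores park, league, leverage, and role adjustments, and the ten-runs-per-win conversion and both sample pitchers are assumptions I've made up to show how innings dominate the result.

```python
RUNS_PER_WIN = 10.0  # assumed rule-of-thumb conversion, not either site's figure


def runs_above_replacement(edge_per_nine: float, ip: float) -> float:
    """Runs saved versus a replacement pitcher over the same innings.

    `edge_per_nine` is how many fewer runs per nine innings the pitcher
    allows than a replacement-level pitcher would.
    """
    return edge_per_nine * ip / 9


def toy_war(edge_per_nine: float, ip: float) -> float:
    """Convert runs above replacement into wins (simplified)."""
    return runs_above_replacement(edge_per_nine, ip) / RUNS_PER_WIN


# Hypothetical pitchers, not Matusz and Feliz:
# a starter only 1.0 run per nine better than replacement, over 180 innings,
starter = toy_war(edge_per_nine=1.0, ip=180)
# and a reliever 2.5 runs per nine better than replacement, over 70 innings.
reliever = toy_war(edge_per_nine=2.5, ip=70)

print(f"starter:  {starter:.2f}")   # 2.00
print(f"reliever: {reliever:.2f}")  # 1.94
```

Under these assumptions the reliever is two and a half times better per inning and still comes out behind, purely because of the innings multiplier.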

There is one little wrinkle I haven't addressed yet. Baseball-Reference, at least, includes an adjustment for pitcher's role in its WAR calculus. This amounts to a penalty on relief pitchers. The logic here is that, as a class, relief pitchers put up better numbers than starters. This seems to suggest that it is easier to find a reliever who'll put up a 130 ERA+ than to find a starter who'll do so. I suppose that sort of makes sense, given the whole "replacement" thing. But I wonder whether it isn't somewhat perverse in this context. There's been a fad recently for teams trying to convert dominant relievers into starters: Daniel Bard with the Red Sox, Feliz with Texas, and the ultimately-rejected plan to do the same with Aroldis Chapman, he of the 103-mph fastball, in Cincinnati with the Reds. Oh, and of course Joba Chamberlain with the Yankees (heheh). I have a feeling this is inspired by the thought that, well, starters are just more valuable than relievers. They pitch more innings, after all, and since starters as a whole are lousy you'll make up for the apparently-reduced effectiveness for your converted reliever by replacing another even-more-mediocre starter. Except... it never seems to work. They always lose so much effectiveness that it doesn't seem like it was worth it. Exchanging a dominant reliever for a good starter seems like a good idea in advance, but when instead you get a mediocre starter it doesn't seem worth it in retrospect.

The moral of this story is that Wins Above Replacement seems to seriously devalue relief pitchers, arguably more than they deserve. Yes, they pitch fewer innings and yes, it does seem to be easier to put up good numbers in relief than as a starter. Still, when Win Probability Added tells us that a dominant closer can do about as much to make his team's story a good one as a dominant starter (Soria led the A.L. in WPA that year, only a short distance behind Roy Halladay of the Phillies in his Cy Young year), it seems a mistake to adopt as a comprehensive evaluation metric a system that denies the possibility of the most valuable pitcher in the league being a reliever. Think of it this way: suppose you were the G.M. of an expansion team conducting an expansion draft, picking players off the rosters of the current MLB teams. Would you rather take 2010 Neftali Feliz, or 2010 Brian Matusz?
