Tuesday, September 25, 2012

R.A. Dickey and Median vs. Mean

The other day I saw an analysis of the various National League 2012 Cy Young Award contenders by average Game Score (excluding Craig Kimbrel, who, being a relief pitcher, doesn't get Game Scores). Game Score, by the way, is a metric devised by Bill James to evaluate the quality of a starting pitcher's performance in a given start, where 50 is roughly average and higher is better. (Over 100 is technically possible, and under 0 is too, though I'm not sure I've ever seen it.) Since I saw this analysis on a Mets fan site, I wasn't surprised to see R.A. Dickey come out on top. Good for him, and in my opinion he deserves to win the award. But I'd like to delve a little further into the data and look at median Game Score, and then possibly some other variations as well.
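
For anyone who hasn't seen how the number is built, here is a minimal sketch of Bill James's original Game Score formula as I understand it. The function and parameter names are my own, and the 14 strikeouts in the example come from the box score of Matt Cain's perfect game rather than from anything in this post, so treat it as an illustration rather than an official implementation.

```python
def game_score(outs, strikeouts, hits, earned_runs, unearned_runs, walks):
    """Bill James's original Game Score for a single start.

    `outs` is the number of outs recorded (3 per full inning pitched).
    """
    innings_completed = outs // 3
    score = 50                                   # every start begins at 50
    score += outs                                # +1 per out recorded
    score += 2 * max(0, innings_completed - 4)   # +2 per inning completed after the 4th
    score += strikeouts                          # +1 per strikeout
    score -= 2 * hits                            # -2 per hit allowed
    score -= 4 * earned_runs                     # -4 per earned run
    score -= 2 * unearned_runs                   # -2 per unearned run
    score -= walks                               # -1 per walk
    return score


# A nine-inning perfect game with 14 strikeouts: 50 + 27 + 10 + 14
print(game_score(outs=27, strikeouts=14, hits=0,
                 earned_runs=0, unearned_runs=0, walks=0))  # 101
```

The same function shows why scores below 0 are possible: a start that gets knocked out in the first inning after ten hits and ten earned runs lands well under zero.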



Let's get a baseline by looking at those average numbers: for Dickey it's 62.4 in 31 starts; for Clayton Kershaw of the Dodgers it's 62.0 in 31 starts; for Matt Cain of the Giants it's 60.4 in 30 starts; for Gio Gonzalez of the Nationals it's 59.8 in 31 starts; and for Johnny Cueto of the Reds it's 57.1 in 31 starts. But now let's look at the median for each pitcher: for Dickey we've got a 64; for Kershaw it's a 66; for Cain it's 60.5; for Gio it's 62; and for Cueto it's 60. Okay, well, that didn't make things look any better for R.A., who slips behind Kershaw; mainly it just pushed Matt Cain back from third to fourth, which isn't surprising given that his perfect game, good for a 101 Game Score, props up his average but barely moves his median. Note, of course, that all of these starters look better by median than by average, because it's easier to have really sucky games than really great games, so the distribution is skewed toward the low end.
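
As a quick check on that comparison, here is a small Python sketch that ranks the five pitchers by each measure and computes how far each median sits above the mean. The numbers are the ones quoted above; the variable and function names are just mine.

```python
# (average Game Score, median Game Score) for each pitcher, as quoted above
scores = {
    "Dickey": (62.4, 64.0),
    "Kershaw": (62.0, 66.0),
    "Cain": (60.4, 60.5),
    "Gonzalez": (59.8, 62.0),
    "Cueto": (57.1, 60.0),
}


def ranking(index):
    """Names sorted best to worst by the mean (index 0) or the median (index 1)."""
    return sorted(scores, key=lambda name: scores[name][index], reverse=True)


by_mean = ranking(0)
by_median = ranking(1)

# A positive gap means the median is higher than the mean, which is what
# a distribution with a long tail of bad starts looks like.
gaps = {name: round(median - mean, 1) for name, (mean, median) in scores.items()}

print(by_mean)    # ['Dickey', 'Kershaw', 'Cain', 'Gonzalez', 'Cueto']
print(by_median)  # ['Kershaw', 'Dickey', 'Gonzalez', 'Cain', 'Cueto']
print(gaps)
```

Cain's gap is the smallest of the five at a tenth of a point, which fits the idea that one enormous start is lifting his average nearly up to his median.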

Now let's take a slightly different tack, looking at the number of starts each candidate has had in each ten-point range of Game Score. So, the first number will be the starts of 90 or better, the second number will be the starts of 80 to 89, and so on until the last number, which will be the starts of 29 or worse.

R.A. Dickey: 2/5/6/5/4/6/2/1
Clayton Kershaw: 0/1/12/6/6/3/2/1
Matt Cain: *2/1/4/9/5/6/3/0
Gio Gonzalez: 0/3/7/7/8/4/1/1
Johnny Cueto: 0/0/7/11/6/2/2/3

The asterisk, of course, is Cain's perfect game, which exceeded 100. Now, one thing I notice is that Dickey's got a much less bunched distribution than any of the others, with no one region of the scale having more than 6 of his starts. I also notice that Kershaw, Gonzalez, and Cueto haven't had a single start in the 90s or higher between them, and that Cueto hasn't even broken 80! What we see, in other words, is that a lot of Dickey's competition has been very consistently quite good, but they haven't produced the kind of dominance R.A. has on anything like the scale he has. The exception is Cain, who has the best individual game of this whole bunch, but unlike Dickey he's shown a complete inability to operate consistently in the 80s range, with just one start there to Dickey's 5, and just 4 starts in the 70s to Dickey's 6.
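
One rough way to put a number on "bunched" is to ask how big each pitcher's most crowded ten-point bin is. The sketch below does that with the counts from the table above; "peak share" is a made-up name for the fraction of a pitcher's starts that fall in his single busiest bin, not an established stat.

```python
# Starts per Game Score bin, in order: 90+, 80-89, 70-79, 60-69, 50-59, 40-49, 30-39, 29 or worse
bins = {
    "Dickey":   [2, 5, 6, 5, 4, 6, 2, 1],
    "Kershaw":  [0, 1, 12, 6, 6, 3, 2, 1],
    "Cain":     [2, 1, 4, 9, 5, 6, 3, 0],
    "Gonzalez": [0, 3, 7, 7, 8, 4, 1, 1],
    "Cueto":    [0, 0, 7, 11, 6, 2, 2, 3],
}

# Sanity check: each row should add up to the pitcher's number of starts.
totals = {name: sum(counts) for name, counts in bins.items()}

# Size of the single most crowded bin, and its share of all starts.
peak = {name: max(counts) for name, counts in bins.items()}
peak_share = {name: round(peak[name] / totals[name], 2) for name in bins}

for name in bins:
    print(f"{name}: {totals[name]} starts, busiest bin {peak[name]} ({peak_share[name]:.0%})")
```

By this measure Dickey's busiest bin holds about a fifth of his starts, while Kershaw's holds nearly two-fifths.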

Let's group this data another way: how many starts did each pitcher have in the "really good" category, how many in the "okay" category, and how many in the "not-so-good" category? I think there's some usefulness in looking at "really good" both as 80-and-up and as 70-and-up, and in looking at "not-so-good" as both 39-and-down and as 49-and-down. So:

80+ Starts: Dickey 7, Kershaw 1, Cain 3, Gonzalez 3, Cueto 0
70+ Starts: Dickey 13, Kershaw 13, Cain 7, Gonzalez 10, Cueto 7
Under-50 Starts: Dickey 9, Kershaw 6, Cain 9, Gonzalez 6, Cueto 7
Under-40 Starts: Dickey 3, Kershaw 3, Cain 3, Gonzalez 2, Cueto 5
50-79 Starts: Dickey 15, Kershaw 24, Cain 18, Gonzalez 22, Cueto 24
40-69 Starts: Dickey 15, Kershaw 15, Cain 20, Gonzalez 19, Cueto 19
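
All of these groupings are just sums over the ten-point bins listed earlier, so they can be recomputed mechanically. Here is a short sketch of that; the group labels and the helper name are mine, and each group is written as a range of bin positions (position 0 is the 90-or-better bin).

```python
# Starts per Game Score bin, in order: 90+, 80-89, 70-79, 60-69, 50-59, 40-49, 30-39, 29 or worse
bins = {
    "Dickey":   [2, 5, 6, 5, 4, 6, 2, 1],
    "Kershaw":  [0, 1, 12, 6, 6, 3, 2, 1],
    "Cain":     [2, 1, 4, 9, 5, 6, 3, 0],
    "Gonzalez": [0, 3, 7, 7, 8, 4, 1, 1],
    "Cueto":    [0, 0, 7, 11, 6, 2, 2, 3],
}

# Each group is an inclusive (first, last) range of bin positions.
groups = {
    "80 and up":   (0, 1),
    "70 and up":   (0, 2),
    "49 and down": (5, 7),
    "39 and down": (6, 7),
    "50 to 79":    (2, 4),
    "40 to 69":    (3, 5),
}


def count(name, first, last):
    """Total starts for one pitcher across bins first..last, inclusive."""
    return sum(bins[name][first:last + 1])


table = {
    label: {name: count(name, first, last) for name in bins}
    for label, (first, last) in groups.items()
}

for label, row in table.items():
    print(label, row)
```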

Here's another data point: Dickey has the worst Game Score of any of these pitchers. Why is that a good thing for him? Because you can only lose each game once. That start was very early in the season, in Atlanta, and it was raining and his knuckleball didn't work properly and he got lit up. As in, 4.1 innings, 8 runs, 3 home runs lit up. And the team lost that game exactly as much as Gio Gonzalez's team lost a game in which Gio pitched 5 innings of 2-run ball with a Game Score of 52. Bunching the runs you allow into a handful of really bad starts, in other words, is sort of like creating a "vote-sink" district for your opponent's political party: yes, that district or those games are utterly out of reach for you, but you then get to win all the other games/districts because all of your opponent's voters/runs have been poured into surplus in that small area.

Or, to put it another way, here's ERA and average Game Score dropping the worst three starts, with actual ERA listed in parentheses:

Dickey (2.66): 2.07, 65.9
Kershaw (2.68): 2.17, 65.1
Cain (2.86): 2.42, 62.9
Gonzalez (2.84): 2.38, 62.8
Cueto (2.84): 2.27, 60.8

Now, Johnny Cueto gains two-tenths of a point on Dickey in terms of Game Score from dropping the worst three starts, but Dickey improves more than his rival in every other comparison. In other words, Dickey's overall numbers are being weighed down, more than his rivals' are, by a handful of his worst starts, which you can assume the team will lose anyway. It's not a coincidence that Dickey is 19-6 despite having only a slight advantage in ERA over his competition and playing for a much, much worse team. (The Mets have clinched a losing season, while the Reds, Nationals, and Giants will win their divisions and the Dodgers are in competition for a Wild Card spot. Oh, and Kershaw's got the worst win-loss record of the five by far.) He's packaged his season into a very large number of good and/or great starts and a smaller number of utter stinkers. That's why the Mets keep winning Dickey's starts even when they can't win anything else.
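
To make that comparison concrete, here is a small sketch that computes how much each pitcher's ERA falls and average Game Score rises once the worst three starts are dropped. It uses only the figures quoted in this post (the full-season averages from earlier and the trimmed numbers above); the variable names are mine.

```python
# name: (actual ERA, ERA without worst 3, average Game Score, average without worst 3)
lines = {
    "Dickey":   (2.66, 2.07, 62.4, 65.9),
    "Kershaw":  (2.68, 2.17, 62.0, 65.1),
    "Cain":     (2.86, 2.42, 60.4, 62.9),
    "Gonzalez": (2.84, 2.38, 59.8, 62.8),
    "Cueto":    (2.84, 2.27, 57.1, 60.8),
}

# How much each pitcher's ERA falls when the worst three starts are dropped.
era_drop = {name: round(era - era_trim, 2)
            for name, (era, era_trim, gs, gs_trim) in lines.items()}

# How much each pitcher's average Game Score rises.
gs_gain = {name: round(gs_trim - gs, 1)
           for name, (era, era_trim, gs, gs_trim) in lines.items()}

for name in lines:
    print(f"{name}: ERA -{era_drop[name]:.2f}, Game Score +{gs_gain[name]:.1f}")
```

Dickey's ERA falls by 0.59, the biggest drop of the five, and only Cueto's Game Score gain (3.7 to Dickey's 3.5) is larger.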

So, I guess the conclusion is, any way you slice it R.A. Dickey deserves to win the Cy Young Award? Or, at least, deserves it over any of the other starters.
