Tuesday, February 19, 2013

The Dread Hand of Nate Silver

(Sorry it's been a while; I've been in a kind of law school admissions purgatory for the past couple of weeks.)

I happened to follow some links today and wind up on the Baseball Prospectus page that gives expected 2013 performances for each Major League Baseball team. They've got projected wins and losses, estimated odds of winning the division, of winning a wild card, of getting to the post-season (i.e. the previous two added together), even of getting to the not-the-silly-wild-card-game part of the post-season, and, of course, estimated odds of winning the World Series. And as I cast my eye over this page, what I couldn't help noticing was the obvious influence of Nate Silver (and/or the influence of BP on Nate). Here's one example: the projections for each team's wins and losses are given to one decimal place. The Mets, for instance, are forecast to win 80.6 games this year. The odds of the Mets winning 80.6 games are precisely zero, because you can't win six-tenths of a game. I recall a point during the last election cycle where Nate publicly defended presenting his electoral vote projections to the tenth of a vote, despite the impossibility of receiving tenths of votes. (I'll save telling you what his reason was for later in this piece, for dramatic effect.)

The similarities, however, don't end after the decimal point.

Another interesting thing is that they give separate odds for winning your division and winning a Wild Card. That's intriguing! It turns out that the way they split those two scenarios from each other is that they run a (large) number of simulations of the 2013 season, using their various projection tools, and then just count up all the times when X happened, and all the times when Y happened, etc. Presumably they could generate expected odds of all 15! possible year-end standings for each league (which would imply standings in each division and in the wild-card races), though given that there are far more potential standings than simulations (15! is more than 1.3 × 10^12), the odds would probably be pretty noisy. If that technique sounds familiar, it's because it's exactly how Nate Silver generates his odds for such scenarios as "Obama wins the election while losing Pennsylvania." He simply runs 25,001 simulations on a given day, and counts up how many of them feature Obama winning the election but losing Pennsylvania. Easy (if you have a good computer).
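For the curious, here's a toy sketch of that simulate-and-count idea. To be clear, this is not BP's actual system (or Nate's): the per-game win probabilities are numbers I made up, every game is treated as an independent coin flip rather than a real head-to-head schedule, and ties for first are broken at random.

```python
import random

# Hypothetical per-game win probabilities; NOT Baseball Prospectus's numbers.
TALENT = {
    "Nationals": 0.55,
    "Braves": 0.54,
    "Phillies": 0.50,
    "Mets": 0.50,
    "Marlins": 0.42,
}
GAMES = 162


def simulate_season(rng):
    """Return simulated win totals, treating each game as an independent coin flip."""
    return {
        team: sum(rng.random() < p for _ in range(GAMES))
        for team, p in TALENT.items()
    }


def division_odds(n_sims=2000, seed=2013):
    """Estimate each team's division odds by counting simulated division titles."""
    rng = random.Random(seed)
    titles = {team: 0 for team in TALENT}
    for _ in range(n_sims):
        wins = simulate_season(rng)
        best = max(wins.values())
        # Break ties for first place at random (a stand-in for a tiebreaker game).
        champ = rng.choice([t for t, w in wins.items() if w == best])
        titles[champ] += 1
    return {team: count / n_sims for team, count in titles.items()}


odds = division_odds()
for team, p in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(f"{team:10s} {p:.1%}")
```

The only "statistics" here is counting: a team's division odds are just the share of simulated seasons in which it finished first. Any other scenario (wild card, playoffs, whatever) would be counted the same way from the same batch of simulated seasons.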

But even that isn't the end of the connection. These projections are, in a very specific way, conservative. No team is projected to win 90 games, though the Detroit Tigers come close at 89.9. No actual major league team is projected to lose 90 games, and no, neither the Marlins nor the Astros count. Only three teams in baseball are given less than a 4% chance of making the playoffs, and no team is given more than an 82.6% chance. The Mets are given a 25.8% chance of making the playoffs, and a 1.4% chance of winning the World Series, and hell, a 13% chance of winning the division, which would mean finishing the season with a better record than the Nationals and the Braves. Most people, I think, would say those odds are laughably rosy. The Mets, as we all know, suck. They sucked last year, and the year before that, and the year before that, and, oh yeah, the year before that one as well, and they haven't done anything to get better, so why would they be any good this year?

The answer is a very interesting one, and is basically the key to the Nate connection: no reason, or, on the other hand, lots of reasons. There's no reason at all why we would expect the Mets to be particularly good this year. Yeah, they've been bad recently, and while they have some good players on their roster, they've got a lot of holes, and a whole bunch of their talent comes with predictable question marks. But there are a lot of things that could end up being reasons why the Mets were good in 2013. Santana might be healthy all year. Maybe Ike Davis and Daniel Murphy will have simultaneous breakout years, while David Wright shows no noticeable regression off his near-MVP-level 2012 campaign. Maybe the young pitching will pan out, or the bullpen will happen not to suck, or the outfield will happen not to suck. Maybe Travis d'Arnaud will win Rookie of the Year. Maybe Bryce Harper will get injured. Maybe neither Upton brother will play particularly well. Who knows: anything could happen. None of these things are that likely to happen, but any of them could.

That's basically a long-winded way of saying that the answer is uncertainty. They project the Mets to be an 80.6-win team, i.e. basically a league-average team (which is slightly bullish against the consensus, but hey, consensuses are never wrong, right?), but that's just the rough middle of a very rough cloud of probability. The Mets might win 90 games, or 70. It's not impossible that they'll be the ones challenging their own record for futility, and it's not impossible that they'll be challenging the 2001 Mariners' record for, you know, winning stuff. And in these BP projections, not only is that uncertainty quantified, it's embraced: not just its existence, but its magnitude. I look at these numbers, and I see a system whose entire point is accurately predicting not the end result, but the level of uncertainty. And that's Nate's trademark. It's the thing, really, that he's been arguing with people like Joe Scarborough about, because, in addition to not getting economics, Scarborough and the intellectual cohort he represents don't really get the concepts of probability and genuine statistical analysis. And it's the purpose of the decimal point: to make it clear that these "projections" are not predictions of the final result, but rather the expected value of a probability distribution. Precisely because we know for a fact that the 2013 Mets will not win 80.6 games, saying that they will has informational content of its own: there is uncertainty here! Pay attention to it, it's the important thing.
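If you want to see the "expected value of a probability distribution" point in numbers, here's a rough sketch. The big assumption (mine, not BP's) is that the Mets' season is 162 independent games, each won with probability 80.6/162. A real projection system would also be unsure about the team's true talent, so its distribution would be wider than this one.

```python
from math import comb

GAMES = 162
P_WIN = 80.6 / GAMES  # assumed per-game win probability implied by the projection

# Exact binomial probability of each possible win total, 0 through 162.
pmf = [
    comb(GAMES, k) * P_WIN**k * (1 - P_WIN) ** (GAMES - k)
    for k in range(GAMES + 1)
]

# The expected value is a probability-weighted average of whole-number outcomes,
# so nothing forces it to be a whole number itself.
expected_wins = sum(k * p for k, p in enumerate(pmf))
p_90_plus = sum(pmf[90:])
p_70_or_fewer = sum(pmf[:71])

print(f"expected wins: {expected_wins:.1f}")
print(f"P(90 or more wins): {p_90_plus:.3f}")
print(f"P(70 or fewer wins): {p_70_or_fewer:.3f}")
```

Every outcome in that list is a whole number of wins, yet the average comes out to 80.6, a total that has no entry in the list at all. The tails are the interesting part: even this deliberately narrow toy model leaves some room for 90 wins or for 70.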

It's almost like Nate has some sort of history with Baseball Prospectus, and their methods for "predicting the future."
