Thursday, May 23, 2013

Fractal Dimension for Hits

Not all hits are created equal. This is, of course, obvious: a single and a home run aren't even approximately equal in any relevant regard except that both count as base hits. But I mean more than that: not all singles are created equal, and not all doubles are created equal. (Actually, all triples and all home runs are kind of equivalent to each other, so let's focus on the less-flashy kinds of hit for now.) For instance, when a runner is on first and a double is hit, the runner will often manage to score, especially if there were two outs or he's a fast runner. However, I recently saw someone fail to score from second base on a double. The hit was a high fly ball to the shallow outfield, so the runner on second had to stay near the base in case it was caught, but the batter could just keep running during all that hang-time. So, when the ball dropped in, the batter made it to second, but the runner from second could only get as far as third. (With two outs that couldn't have happened; you'd just run on contact.) Similarly, one will occasionally see a runner from second advance zero bases on a single, if it's an infield single to the shortstop hit in just the right spot. Ordinarily, though, a runner from second has a good chance of scoring on a single.

To describe these differences in the impact of a base hit on the existing runners on base, I hereby invent the concept of a fractal dimension for base hits. It is defined as the average number of bases advanced on the hit by everyone involved in the play, batter included (excluding any bases taken on throws or due to errors), with the modification that any runner who scores while advancing fewer bases on the play than any runner behind him is given credit for the same number of bases as that following runner. Any hit with the bases empty has fractal dimension equal to its number of total bases, i.e. 1 for a single, 2 for a double, etc. All home runs are of dimension 4, as every runner on base scores on the play and has the batter behind him advancing 4 bases. (This is why it is necessary to give lead runners who score credit for the advances of trailing runners; otherwise a grand slam would have dimension 2.5, less than a bases-empty triple, let alone a solo home run, which is obviously unsatisfying.) All triples are also, I believe, of dimension 3; while the batter does not score, every baserunner will score, and will be able to take credit for the three bases taken by the batter. This is a slightly undesirable property of this metric, as ideally I'd like to be able to distinguish different triples, but as I mentioned above, I guess there just aren't different kinds of triples: they are all bases-clearing, every single time.

A bases-clearing double, however, has dimension 2.75, because the runner from first advanced three bases, giving the runners from second and third credit for three bases as well, but the batter only made it to second. The play I described earlier, where a runner from second failed to score on a double, would have dimension 1.5: two for the batter and one for the runner, averaged together. (Note that in this case the lead runner does not get credit for the advances of the trailing runner, since he did not score.) An infield single that fails to advance a runner from second has dimension 0.5. A single that scores a runner from second, on the other hand, is dimension 1.5, the same as the double that doesn't score that runner. Note that this is all very context-dependent, of course, and also not remotely related to any sort of statistical estimate of the run-value of different hits. The double that turns -2- into -23 without a run scoring and the single that turns -2- into 1-- with a run scoring are not in any obvious sense equivalent, as the bases they create are in different configurations and in one case but not the other you bank a run. Still, I think this is an interesting way to describe a hit, and to note when a single or double was leveraged well or poorly by the existing baserunners.
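The definition and the worked cases so far can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fractal_dimension` and the encoding of a play as (start_base, end_base) pairs, with the batter starting at base 0 and home plate as base 4, are hypothetical choices made for the example, not anything official.

```python
HOME = 4  # assumed encoding: 0 = batter's box, 1-3 = the bases, 4 = home plate


def fractal_dimension(play):
    """Mean credited advance over everyone involved in the play.

    `play` is a list of (start_base, end_base) pairs, one per runner,
    with the batter entered as starting at base 0.
    """
    credited = []
    for start, end in play:
        advance = end - start
        if end == HOME:
            # A runner who scores is credited with the largest advance made
            # by himself or by any runner who started behind him.
            trailing = [e - s for s, e in play if s < start]
            advance = max([advance] + trailing)
        credited.append(advance)
    return sum(credited) / len(credited)


examples = {
    "grand slam": [(3, 4), (2, 4), (1, 4), (0, 4)],
    "bases-clearing double": [(3, 4), (2, 4), (1, 4), (0, 2)],
    "double, runner from second held at third": [(2, 3), (0, 2)],
    "infield single, runner on second holds": [(2, 2), (0, 1)],
    "single scoring the runner from second": [(2, 4), (0, 1)],
}

for name, play in examples.items():
    print(f"{name}: {fractal_dimension(play)}")
```

Run as written, this should reproduce the figures quoted above: 4 for the grand slam, 2.75 for the bases-clearing double, 1.5 for both the double that fails to score the runner and the single that does, and 0.5 for the infield single.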

One can also extend the concept to something other than a hit. A walk, for instance, or a hit-by-pitch, always gives one base to the batter, and may or may not advance the existing baserunners at all. So a bases-loaded walk has dimension 1, as everyone moves up 90 feet, but a walk with runners on second and third has dimension 1/3, since two of the three baserunners involved in the play (counting the batter) stay put. Sacrifice bunts, sacrifice flies, and productive outs of all kinds are the reverse: the batter does not get a base, not even one, but other runners might. A sacrifice fly that drives in the sole baserunner from third has dimension 0.5. This also lets us distinguish between a sac fly with, say, runners on second and third that scores only the runner from third and one that scores that runner and also advances the runner from second to third base, a difference that can be crucially important if there were zero outs prior to the play. (The former has dimension one-third, the latter two-thirds.) Grounding into a fielder's choice at second base, say, gives one base to the batter while subtracting a base from the baserunner; if there were no other runners who advanced on the play, then, this results in a dimension of 0. Grounding into a double play has negative dimension, as it subtracts an existing base and doesn't add anything in return, unless, of course, there were other runners, say a runner from third with no outs who came around to score on the play.
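For these non-hit plays the arithmetic is just a mean of signed advances, since none of the cases above triggers the credit rule for scoring lead runners. Here is a rough sketch using exact fractions; the helper name `mean_advance` and the convention of listing the lead runner first and the batter last are assumptions for illustration. The double-play value assumes, as the text suggests, that the forced runner counts as minus one base and the batter as zero.

```python
from fractions import Fraction


def mean_advance(advances):
    """Plain mean of signed base advances, one entry per participant."""
    return Fraction(sum(advances), len(advances))


# Each list runs from the lead runner down to the batter (last entry).
plays = {
    "bases-loaded walk": [1, 1, 1, 1],
    "walk, runners on second and third": [0, 0, 1],
    "sac fly, lone runner on third scores": [1, 0],
    "sac fly, only the runner from third moves": [1, 0, 0],
    "sac fly, both runners move up": [1, 1, 0],
    "fielder's choice at second": [-1, 1],  # runner erased, batter reaches
    "double play, runner on first only": [-1, 0],  # assumed: -1 and 0
}

for name, advances in plays.items():
    print(f"{name}: {mean_advance(advances)}")
```

Using `Fraction` rather than floats keeps values like one-third and two-thirds exact, which makes them easier to compare with the figures in the text.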

Now, we do get a situation where the batter is penalized, in a sense, for having an extra runner on first or second who doesn't advance on his sacrifice fly. There are two possible ways to go about correcting for this. One would just be to look at the total base-advances generated by a play instead of the average. One could leave off, in this case, the thing about giving lead runners credit for the bases taken by their trailing runners if they score; then a grand slam would be worth 10 bases, the highest possible. Or one could keep that feature, which would give a slam dimension 16 and might leave some numbers between 0 and 16 as impossible dimensions. I'm not sure how you'd get 15, for instance; a bases-loaded triple would be 4*3 = 12, since each scoring runner would only get credit for 3 bases of advancement.
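The two totals can be checked with a short Python sketch. As before, this is an illustration under assumptions: the name `total_bases`, the `credit_trailing` flag, and the (start_base, end_base) encoding with the batter at base 0 and home at base 4 are all hypothetical.

```python
HOME = 4  # assumed encoding: 0 = batter's box, 4 = home plate


def total_bases(play, credit_trailing):
    """Sum of base advances on a play, optionally with the credit rule.

    `play` is a list of (start_base, end_base) pairs, batter at base 0.
    """
    total = 0
    for start, end in play:
        advance = end - start
        if credit_trailing and end == HOME:
            # Scoring runners take the best advance among the runners behind.
            advance = max([advance] + [e - s for s, e in play if s < start])
        total += advance
    return total


grand_slam = [(3, 4), (2, 4), (1, 4), (0, 4)]
bases_loaded_triple = [(3, 4), (2, 4), (1, 4), (0, 3)]

print(total_bases(grand_slam, credit_trailing=False))          # 10
print(total_bases(grand_slam, credit_trailing=True))           # 16
print(total_bases(bases_loaded_triple, credit_trailing=True))  # 12
```

Dividing either total by the number of participants gives back the averaged version, so the only real design choice here is whether the credit rule is switched on.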

The other possibility would be to, in essence, pretend that the bases are always loaded, and subjectively estimate what would've happened to any of those runners who happen actually to be fictional. This can encounter difficulties, however, because with the bases loaded I think it's not possible to have a hit of lower dimension than its total-base number. For instance, take the double that didn't score the runner from second. What happened to our fictional runner from first on that play? He can't have wound up at second, since the batter finishes there, but he also can't have wound up at third, since the runner from second finishes there.

So, I'm not sure there's a way to perfect this system, but I still think it's useful. We do subjectively feel, I think, that a ripped line drive to right field that allows a runner to go first to third on a single is a stronger and better hit than an infield hit that only gets the runner to second. Fractal dimension is a nice conceptually-simple way to quantify that sense.
