Thursday, July 15, 2010

My Problem With OPS

Here are the formulas for some baseball stats:

Batting Average = Hits over At Bats
On Base Percentage = (Hits plus Walks plus Hit By Pitch) over (At Bats plus Walks plus Hit By Pitch plus Sacrifice Flies)
Slugging Percentage = Total Bases over At Bats
On-Base Plus Slugging Percentage = On Base Percentage + Slugging Percentage
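For anyone who would rather see those four definitions as code, here is a minimal sketch in Python. The function names and the season line are made up for illustration; only the formulas come from the list above.

```python
def batting_average(h, ab):
    # Hits over at bats
    return h / ab


def on_base_percentage(h, bb, hbp, ab, sf):
    # (H + BB + HBP) over (AB + BB + HBP + SF)
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_percentage(tb, ab):
    # Total bases over at bats
    return tb / ab


def ops(h, bb, hbp, ab, sf, tb):
    # OPS is simply the two percentages added together
    return on_base_percentage(h, bb, hbp, ab, sf) + slugging_percentage(tb, ab)


# Hypothetical season line, not a real player
season = dict(h=150, bb=60, hbp=5, ab=500, sf=5, tb=250)

print(round(batting_average(season["h"], season["ab"]), 3))
print(round(ops(**season), 3))
```

Note that the two pieces of OPS are divided by different things, which is where the trouble starts.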

So, rendering OPS explicitly:

OPS = OBP + SLG
= (H+BB+HBP)/(AB+BB+HBP+SF) + TB/AB
= [AB(H+BB+HBP)]/[AB(AB+BB+HBP+SF)] + [(AB+BB+HBP+SF)TB]/[AB(AB+BB+HBP+SF)]
= (H*AB + BB*AB + HBP*AB + TB*AB + TB*BB + TB*HBP + TB*SF)/(AB^2 + AB*BB + AB*HBP + AB*SF)

In other words, on-base plus slugging percentage equals the sum of hits times at bats, walks times at bats, hit by pitches times at bats, total bases times at bats, total bases times walks, total bases times hit by pitches, and total bases times sacrifice flies, divided by the sum of at bats squared plus at bats times walks plus at bats times hit by pitches plus at bats times sacrifice flies.

What?

This is a meaningless construction. Sure, we know that a high OBP is good, and a high SLG is good, so a high OPS is good. But the stat itself is just awkward. You're adding things that don't want to add.
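If you don't trust my algebra, here is a quick sanity check: a sketch that computes OPS both as OBP + SLG and as the single expanded fraction, using exact rational arithmetic so rounding can't hide a mistake. The function names and the sample numbers are hypothetical.

```python
from fractions import Fraction


def ops_two_terms(h, bb, hbp, ab, sf, tb):
    # OBP and SLG computed separately, then added
    obp = Fraction(h + bb + hbp, ab + bb + hbp + sf)
    slg = Fraction(tb, ab)
    return obp + slg


def ops_single_fraction(h, bb, hbp, ab, sf, tb):
    # The expanded form: everything over the common denominator
    numerator = h*ab + bb*ab + hbp*ab + tb*ab + tb*bb + tb*hbp + tb*sf
    denominator = ab**2 + ab*bb + ab*hbp + ab*sf
    return Fraction(numerator, denominator)


# Hypothetical season line, not a real player
line = dict(h=150, bb=60, hbp=5, ab=500, sf=5, tb=250)

# The two forms agree exactly
assert ops_two_terms(**line) == ops_single_fraction(**line)
print(float(ops_single_fraction(**line)))
```

The two agree, but notice that the expanded denominator has no natural reading: it is at bats multiplied by a count of plate appearances, which is not a count of anything.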

A much better way to construct an overarching statistic that captures both the skill on-base percentage measures and the skill slugging percentage measures would be as follows:

(TB + BB + HBP)/(AB + BB + HBP + SF)

This makes sense: it's just like OBP, except you also get credit for your extra bases, thus taking power into account. Or we could also decide to give players credit for their base-stealing abilities, thus taking into account all the different ways that a player can move himself* forward one or more bases. I like to think of this summary statistic as the product of two factors, one of which I call Not Out Percentage, and the other of which I call Specialness Ratio:

NO% = (H + BB + HBP - CS)/(AB + BB + HBP + SF)
SPE = (TB + BB + HBP + SB - CS - CS3 - 2CSH)/(H + BB + HBP - CS) [CS3 is caught stealing third, CSH is caught stealing home].

The point of the first is simple: of the times that the player does not intentionally sacrifice himself, in what percentage does he avoid making an out? The second is about how many bases the player generates when he does not make an out. The caught stealings are subtracted in the top of that ratio because getting oneself caught stealing negates the bases one has generated, and of course being caught stealing third destroys two bases and being caught stealing home destroys three. One could argue for taking those double-counts and triple-counts out (though there are very few CSH anymore!), or for leaving them in. Anyway, one can then multiply these two stats together to produce:

(TB + BB + HBP + SB - CS - CS3 - 2CSH)/(AB + BB + HBP + SF)
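Here is a rough sketch of how the two pieces and their product might be computed. The class name, field names, and sample numbers are all invented for illustration. It also assumes CS counts every caught stealing, including those at third and home, which is the reading under which the extra CS3 and 2CSH terms add up to two and three bases lost.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class BattingLine:
    ab: int
    h: int
    tb: int
    bb: int
    hbp: int
    sf: int
    sb: int = 0
    cs: int = 0    # assumed to include caught stealing third and home
    cs3: int = 0   # caught stealing third
    csh: int = 0   # caught stealing home

    def chances(self):
        # Shared denominator: AB + BB + HBP + SF
        return self.ab + self.bb + self.hbp + self.sf

    def times_not_out(self):
        # Numerator of NO%, and denominator of SPE
        return self.h + self.bb + self.hbp - self.cs

    def net_bases(self):
        # Bases generated, minus bases destroyed by getting caught
        return (self.tb + self.bb + self.hbp + self.sb
                - self.cs - self.cs3 - 2 * self.csh)

    def not_out_pct(self):
        return Fraction(self.times_not_out(), self.chances())

    def specialness(self):
        return Fraction(self.net_bases(), self.times_not_out())

    def bases_per_chance(self):
        return Fraction(self.net_bases(), self.chances())


# Hypothetical season line, not a real player
player = BattingLine(ab=500, h=150, tb=250, bb=60, hbp=5, sf=5,
                     sb=20, cs=5, cs3=1, csh=0)

# The product of the two pieces is the combined stat, exactly
assert player.not_out_pct() * player.specialness() == player.bases_per_chance()
print(float(player.not_out_pct()), float(player.specialness()),
      float(player.bases_per_chance()))
```

Exact fractions are used so the product check holds with no rounding error; in practice one would print each value to three decimal places like any other rate stat.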

To my mind, this is a much simpler and more elegant total summary stat than OPS: how many bases, on average, does the player generate in a genuine plate appearance when he is trying to generate bases? OPS has no equivalent question that it answers; it's just a rough metric assumed to roughly correlate with quality play.