“In Defense of the RBI” was the title of Graham Womack’s provocative post here at HHS on Thursday, a post that led to much interesting discussion on the topic of Runs Batted In and alternative statistics. Let’s suppose we want a stat that does what the RBI stat does, but with some of its flaws corrected. How might that be done? One approach is after the jump.
Let’s say that we want a stat that measures a hitter’s success at driving in runs. Not his value as a player generally, or as a hitter overall. Not his success at getting on base or his power, except to the extent those abilities directly affect his success in driving in runs. And let’s also suppose we want a counting stat, not a rate stat. Counting stats, after all, have an elegant simplicity: if you want to know who the best player at accomplishing X is, the simplest and most intuitive way to find out is to count up who accomplished X the most times. The RBI is just that, a counting stat that measures a hitter’s success at driving in runs by counting up the number of runs he drove in. But it is also flawed, in that different hitters have dramatically different opportunities to drive in runs, most significantly as a result of how many base runners a batter’s teammates put on base and where in the batting order he bats. So merely counting up how many runs a hitter drives in may tell us more about his opportunities to drive in runs than about his actual success in doing so. Can we build a stat that counts success in driving in runs, as RBI does, but adjusts for opportunities? Here’s my modest approach to that problem.
In 2011, major league hitters had 4,344 plate appearances with the bases loaded. Those 4,344 PAs resulted in 2,842 runs batted in. That’s .654 RBI per bases loaded plate appearance, on average across the majors as a whole in 2011. Also in 2011, there were 33,073 PAs with just a man on first, nobody on second or third. Those 33,073 PAs with just a man on first resulted in 2,492 runs batted in, for a major league average of .075 RBI per man-on-first plate appearance. We can repeat this exercise for each of the eight man-on-base scenarios: bases empty, man on first, man on second, man on third, men on first and second, men on first and third, men on second and third, and bases loaded. Baseball-reference has splits of this type that you can look up for every season (and for every player and every team) going back to 1948.
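To make the arithmetic concrete, here is a minimal Python sketch of the rate calculation for the two situations quoted above. The dictionary layout and the names `LEAGUE_2011` and `rbi_rate` are just illustrative choices, and the other six situations are left out because their totals aren't given here; they would have to be looked up in the same splits.

```python
# 2011 major league totals quoted in the text: (plate appearances, RBI).
# Only the two situations whose totals appear above are filled in.
LEAGUE_2011 = {
    "bases_loaded": (4344, 2842),
    "man_on_first": (33073, 2492),
}


def rbi_rate(pa, rbi):
    """League-average RBI per plate appearance in one man-on-base situation."""
    return rbi / pa


league_rates = {
    state: rbi_rate(pa, rbi) for state, (pa, rbi) in LEAGUE_2011.items()
}

for state, rate in league_rates.items():
    print(f"{state}: {rate:.3f} RBI per PA")
# bases_loaded: 0.654 RBI per PA
# man_on_first: 0.075 RBI per PA
```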
Now we can check how well a particular hitter did in driving in runs, when compared to the major league average, given the particular opportunities he had to drive in runs. Let’s do an example. Matt Kemp in 2011 came to the plate with the bases loaded 10 times. We know from the numbers mentioned above that on average a major leaguer in 2011 produced 6.54 RBI in 10 PAs with the bases loaded. Kemp actually had 14 RBI in his 10 bases-loaded PAs in 2011, about 7.5 more than were produced on average (14 minus 6.54 is 7.46). So let’s credit Matt for 7.5 runs driven in above average over his 10 bases-loaded PAs. Also in 2011, Kemp had 147 PAs with just a man on first. Again, we know that on average, those PAs would have produced about 11 RBI in the 2011 major leagues (147 PAs x .075 RBI per man-on-first PA, the major league average, = 11.025). But Matt Kemp actually produced 21 RBI in his 147 PAs with a man on first, ten more than the average hitter.
We can repeat this exercise for each of the eight different man-on-base situations for Matt Kemp in 2011. I did that, and found that if Kemp had driven in runs at an average rate in each of the eight man-on-base situations he encountered, given the number of PAs he actually had in each of those situations, he would have ended 2011 with 78.5 RBI. But he really had 126 RBI in 2011, which means Kemp had 47.5 RBI over what a league average hitter would have produced with the same number of opportunities as Kemp had in each man-on-base situation. One might say, then, that Matt had 47.5 DOA, runs Driven-in Over Average. There’s an acronym that might stick in your mind, especially if you’re a fan of police procedurals or film noir.
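The whole calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `expected_rbi` and `doa` are mine, the league rates are the rounded figures quoted earlier, and only the two Kemp situations whose numbers appear in the text are filled in. A full-season figure would need all eight situations.

```python
# League-average RBI per PA in 2011, rounded as quoted in the text.
# The other six situations are omitted because their rates are not given here.
LEAGUE_RATE = {"bases_loaded": 0.654, "man_on_first": 0.075}

# Matt Kemp, 2011: (plate appearances, RBI) in the two situations quoted above.
KEMP_2011 = {"bases_loaded": (10, 14), "man_on_first": (147, 21)}


def expected_rbi(player, league_rate):
    """RBI a league-average hitter would produce given the player's PAs."""
    return sum(pa * league_rate[state] for state, (pa, _) in player.items())


def doa(player, league_rate):
    """Runs Driven-in Over Average: actual RBI minus expected RBI."""
    actual = sum(rbi for _, rbi in player.values())
    return actual - expected_rbi(player, league_rate)


# Per-situation breakdown, then the combined figure for these two situations.
for state, (pa, rbi) in KEMP_2011.items():
    average = pa * LEAGUE_RATE[state]
    print(f"{state}: {rbi} RBI vs {average:.2f} expected")
print(f"DOA over these two situations: {doa(KEMP_2011, LEAGUE_RATE):.1f}")
```

Summing over situations rather than dividing keeps DOA a counting stat: each situation contributes its own surplus or deficit, and a hitter with more opportunities has more chances to add to (or subtract from) his total.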
Here are the 2011 major league RBI leaders, along with their 2011 DOA numbers. Note how Ryan Braun has a relatively high DOA, and Ryan Howard a relatively low one. Howard had 116 RBI in 2011, and Braun 111. But an average major league hitter, given Ryan H.’s number of chances with each man-on-base situation, would have produced 83.3 RBI while an average major league hitter would have produced only 64.9 RBI with Ryan B.’s opportunities. So Braun ends up with more DOA than Howard, despite having fewer RBI. Braun had more success driving in runs given the opportunities he had to work with than Howard did.
Matt Kemp 126 RBI, 47.5 DOA
Prince Fielder 120 RBI, 39.3 DOA
Curtis Granderson 119 RBI, 44.0 DOA
Robinson Cano 118 RBI, 44.2 DOA
Adrian Gonzalez 117 RBI, 33.5 DOA
Ryan Howard 116 RBI, 32.7 DOA
Ryan Braun 111 RBI, 46.1 DOA
Mark Teixeira 111 RBI, 41.1 DOA
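Once the expected-RBI totals are in hand, the Howard and Braun comparison is one subtraction each. A small Python sketch, using only the actual and expected figures quoted above (the names `HITTERS` and `doa_from_totals` are illustrative):

```python
# (actual RBI, RBI an average hitter would have had with the same
# opportunities), both as quoted in the text for 2011.
HITTERS = {
    "Ryan Howard": (116, 83.3),
    "Ryan Braun": (111, 64.9),
}


def doa_from_totals(actual, expected):
    """Runs Driven-in Over Average from season totals, to one decimal."""
    return round(actual - expected, 1)


# Rank by DOA rather than by raw RBI.
ranked = sorted(
    HITTERS, key=lambda name: doa_from_totals(*HITTERS[name]), reverse=True
)
for name in ranked:
    actual, expected = HITTERS[name]
    print(f"{name}: {actual} RBI, {doa_from_totals(actual, expected)} DOA")
# Ryan Braun: 111 RBI, 46.1 DOA
# Ryan Howard: 116 RBI, 32.7 DOA
```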
One issue with runs Driven-in Over Average is that it is not easy to calculate in bulk with the standard tools we use here — baseball-reference and the Play Index, fangraphs and so on. I did the calculations for the eight guys above one at a time, and doing a lot more would have been a major effort. But I’m really more going for the concept here than a bulk implementation: an idea about evaluating the specific achievement that goes into RBI but adjusting for opportunities. DOA may not be the next widely accepted stat — indeed its fate may be hinted at by its name. But I hope I’ve at least helped move forward the conversation about RBI that Graham has so eloquently begun.