DOA: Driven-in Over Average

“In Defense of the RBI” was the title of Graham Womack’s provocative post here at HHS on Thursday, a post that led to much interesting discussion on the topic of Runs Batted In and alternative statistics. Let’s suppose we want a stat that does what the RBI stat does, but with some of its flaws reduced. How might that be done? One approach is after the jump.

Let’s say that we want a stat that measures a hitter’s success at driving in runs. Not his value as a player generally, or as a hitter overall. Not his success at getting on base or his power, except to the extent those abilities directly affect his success in driving in runs. And let’s also suppose we want a counting stat, not a rate stat. Counting stats, after all, have an elegant simplicity: if you want to know who the best player at accomplishing X is, the simplest and most intuitive way to do that is to count up who accomplished X the most times. The RBI is just that, a counting stat that measures a hitter’s success at driving in runs, by counting up the number of runs he drove in. But it is also flawed in that different hitters have dramatically different opportunities to drive in runs, most significantly as a result of how many base runners a batter’s teammates put on base and where in the batting order he bats. So merely counting up how many runs a hitter drives in may tell us more about his opportunities to drive in runs than about his actual success in doing so. Can we build a stat that counts success in driving in runs, as RBI does, but adjusts for opportunities? Here’s my modest approach to that problem.

In 2011, major league hitters had 4,344 plate appearances with the bases loaded. Those 4,344 PAs resulted in 2,842 runs batted in. That’s .654 RBI per bases loaded plate appearance, on average across the majors as a whole in 2011. Also in 2011, there were 33,073 PAs with just a man on first, nobody on second or third. Those 33,073 PAs with just a man on first resulted in 2,492 runs batted in, for a major league average of .075 RBI per man-on-first plate appearance. We can repeat this exercise for each of the eight man-on-base scenarios: bases empty, man on first, man on second, man on third, men on first and second, men on first and third, men on second and third, and bases loaded. Baseball-reference has splits of this type that you can look up for every season (and for every player and every team) going back to 1948.
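For anyone who wants to check the arithmetic, here is a minimal sketch of the league-rate step in Python. It uses only the two sets of 2011 totals quoted above; the dictionary keys and the function name are my own labels, and the other six base states would have to be filled in from the same splits.

```python
# League totals quoted in the post for 2011: (plate appearances, RBI).
# Only two of the eight base states are given in the text; the names of
# the keys are illustrative labels, not Baseball-Reference identifiers.
LEAGUE_2011 = {
    "bases_loaded": (4344, 2842),
    "man_on_first": (33073, 2492),
}

def league_rbi_rate(pa: int, rbi: int) -> float:
    """Average RBI per plate appearance in one base state."""
    return rbi / pa

rates = {state: league_rbi_rate(pa, rbi) for state, (pa, rbi) in LEAGUE_2011.items()}
for state, rate in rates.items():
    print(f"{state}: {rate:.3f} RBI per PA")
```

Rounded to three places, these reproduce the .654 and .075 figures used in the rest of the post.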

Now we can check how well a particular hitter did in driving in runs, when compared to the major league average, given the particular opportunities he had to drive in runs. Let’s do an example. Matt Kemp in 2011 came to the plate with the bases loaded 10 times. We know from the numbers mentioned above that on average a major leaguer in 2011 produced 6.54 runs in 10 PAs with the bases loaded. Kemp actually had 14 RBIs in his ten bases-loaded PAs in 2011, about 7.5 more than were produced on average. So let’s credit Matt for 7.5 runs driven in above average, over his 10 bases loaded PAs. Also in 2011, Kemp had 147 PAs with just a man on first. Again, we know that on average, those PAs would have produced about 11 RBI in the 2011 major leagues (147 PAs x .075 RBI per man-on-first PA, the major league average, = 11.025). But Matt Kemp actually produced 21 RBI in his 147 PAs with a man on first, ten more than the average hitter.
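The two Kemp examples can be written as one small function. This is only a sketch of the per-situation step, using the rounded league rates from the text; the function name is something I made up for illustration.

```python
def driven_in_over_average(player_pa, player_rbi, league_rate):
    """Actual RBI minus the RBI an average hitter would get in the same PAs."""
    expected = player_pa * league_rate
    return player_rbi - expected

# Matt Kemp, 2011, with the rounded league rates quoted in the text.
kemp_loaded = driven_in_over_average(10, 14, 0.654)   # 14 - 6.54
kemp_first = driven_in_over_average(147, 21, 0.075)   # 21 - 11.025

print(f"bases loaded: {kemp_loaded:+.2f}")
print(f"man on first: {kemp_first:+.3f}")
```

The unrounded results are 7.46 and 9.975, which the text rounds to 7.5 and ten.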

We can repeat this exercise for each of the eight different man-on-base situations for Matt Kemp in 2011. I did that, and found that if Kemp had driven in runs at an average rate in each of the eight man-on-base situations he encountered, given the number of PAs he actually had in each of those situations, he would have ended 2011 with 78.5 RBI. But he really had 126 RBI in 2011, which means Kemp had 47.5 RBI over what a league average hitter would have produced with the same number of opportunities as Kemp had in each man-on-base situation. One might say, then, that Matt had 47.5 DOA, runs Driven-in Over Average. There’s an acronym that might stick in your mind, especially if you’re a fan of police procedurals or film noir.
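Put as code, the season figure is just the per-situation differences added up. A rough sketch follows, assuming the splits are held in plain dictionaries (the names are illustrative). Only two of Kemp's eight splits are quoted in this post, so the dictionary call below is deliberately partial, and the full-season figure is taken from the totals above (126 actual, 78.5 expected).

```python
def season_doa(player_splits, league_rates):
    """Sum (actual RBI - expected RBI) over every base state supplied.

    player_splits: {state: (plate_appearances, rbi)} for one hitter
    league_rates:  {state: league-average RBI per PA}
    """
    total = 0.0
    for state, (pa, rbi) in player_splits.items():
        total += rbi - pa * league_rates[state]
    return total

# Only two of Kemp's eight 2011 splits appear in the post, so this is a
# partial sum over those two states (league rates rounded as in the text).
kemp_partial = {"bases_loaded": (10, 14), "man_on_first": (147, 21)}
rates_2011 = {"bases_loaded": 0.654, "man_on_first": 0.075}
partial_doa = season_doa(kemp_partial, rates_2011)

# Over all eight states the same sum reduces to season RBI minus expected RBI.
kemp_full_doa = 126 - 78.5

print(f"two-state partial: {partial_doa:+.1f}, full season: {kemp_full_doa:+.1f}")
```

The full-season result, 47.5, matches Kemp's entry in the leader list.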

Here are the 2011 major league RBI leaders, along with their 2011 DOA numbers. Note how Ryan Braun has a relatively high DOA, and Ryan Howard a relatively low one. Howard had 116 RBI in 2011, and Braun 111. But an average major league hitter, given Ryan H.’s number of chances with each man-on-base situation, would have produced 83.3 RBI while an average major league hitter would have produced only 64.9 RBI with Ryan B.’s opportunities. So Braun ends up with more DOA than Howard, despite having fewer RBI. Braun had more success driving in runs given the opportunities he had to work with than Howard did.

Matt Kemp 126 RBI, 47.5 DOA
Prince Fielder 120 RBI, 39.3 DOA
Curtis Granderson 119 RBI, 44.0 DOA
Robinson Cano 118 RBI, 44.2 DOA
Adrian Gonzalez 117 RBI, 33.5 DOA
Ryan Howard 116 RBI, 32.7 DOA
Ryan Braun 111 RBI, 46.1 DOA
Mark Teixeira 111 RBI, 41.1 DOA
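To see the Braun and Howard comparison at a glance, the list above can be re-sorted by DOA, with the expected RBI backed out as RBI minus DOA. This is just a sketch over the eight rows shown; the tuple layout is my own.

```python
# (player, 2011 RBI, 2011 DOA) exactly as listed above.
leaders = [
    ("Matt Kemp", 126, 47.5),
    ("Prince Fielder", 120, 39.3),
    ("Curtis Granderson", 119, 44.0),
    ("Robinson Cano", 118, 44.2),
    ("Adrian Gonzalez", 117, 33.5),
    ("Ryan Howard", 116, 32.7),
    ("Ryan Braun", 111, 46.1),
    ("Mark Teixeira", 111, 41.1),
]

# Re-rank by DOA instead of raw RBI.
by_doa = sorted(leaders, key=lambda row: row[2], reverse=True)

for name, rbi, doa in by_doa:
    expected = rbi - doa  # RBI an average hitter would have had
    print(f"{name:18s} {rbi:4d} RBI  {expected:5.1f} expected  {doa:+5.1f} DOA")
```

Sorted this way, Braun moves up to second and Howard drops to the bottom of the eight, which is the point of the paragraph above.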

One issue with runs Driven-in Over Average is that it is not easy to calculate in bulk with the standard tools we use here — baseball-reference and the Play Index, fangraphs and so on. I did the calculations for the eight guys above one at a time, and doing a lot more would have been a major effort. But I’m really more going for the concept here than a bulk implementation: an idea about evaluating the specific achievement that goes into RBI but adjusting for opportunities. DOA may not be the next widely accepted stat — indeed its fate may be hinted at by its name. But I hope I’ve at least helped move forward the conversation about RBI that Graham has so eloquently begun.

85 Comments

Neil L.

13 years ago

Whoa, birtelcom, an amazing post. It will take me days to wrap my head around all the implications.

Is there a simple formula for Driven In Over Average? I mean in terms of widely-available stats.

Does it require comparison to a league average for a given season, or is it only dependent on base-out data over many decades?

Neil L.

13 years ago

I love the play in the acronym, birtelcom, DOA. As if it is fated to fail before it starts.

So DOA is an answer to all of the weaknesses in RBI as a counting stat. Hmmm …..

Mike L

13 years ago

Birtelcom, which of those RBI are “clutch” RBI?

no statistician but

13 years ago

Reply to Mike L

This stat is interesting for what it does, but like RBI, it makes no distinction between a grand slam when the score is already 8-1 and a single that brings a man home from 2nd base to win the game. In fact, also like RBI, it tends to weigh in for the lesser accomplishment re winning the game in that particular instance. Nothing at all wrong with that as long as what it does show is not, like RBI, assumed to mean something else.

bstar

13 years ago

Reply to no statistician but

nsb, how is winning the game with a single that brings home the winning run a lesser accomplishment? Wouldn’t WPA suggest otherwise? Also, aren’t hits, doubles, triples, walks, runs, home runs, grand slams, also making no distinction between leverage index? Why do we have to tag RBI or DOA with this marker when most counting stats are inherently leverage independent?

no statistician but

13 years ago

Reply to bstar

I was implying just the opposite of what you inferred. To me the single is more important but DOA doesn’t show that, but the opposite.

Winning the game is the point of playing the game at most levels. Most stats don’t distinguish grades of importance re the circumstances in which an event occurs.

I’ll say here what I’ve said before: clutch hitting exists and it’s important, but it is situational. The DOA stat, like RBI, overlooks the importance of the situation.

Author

birtelcom

13 years ago

Reply to no statistician but

Baseball-reference already has a whole table worth of stats for every player that cover game situation leverage, including Win Probability Added. The DOA concept was not intended to replicate any of that. It was only intended to tweak the RBI stat to provide a count of runs driven in that adjusts for the fact that players differ in the number of opportunities they get with different men on base situations that heavily affect how easy or hard it is to rack up RBIs. In-game leverage is an extremely important issue that deserves and gets its own statistical coverage; just not… Read more »

no statistician but

13 years ago

Reply to no statistician but

birtelcom:

I think we’re saying the same thing, just with different emphases. Your original post was about DOA, and my response was directed to it and the whimsical question posed by Mike L.

Actually, I think the DOA idea gives us a telling distinction, and with some tweaking for things like walks, as discussed by others in this discussion, it could be a real winner, driving in some important runs, metaphorically speaking.

Mike L

13 years ago

Reply to Mike L

Birtelcom, sorry for being flip or whimsical-I do like the new rubric. I also think there’s a core alternate reality regarding what is “clutch”. Your piece does one side-it compares players to league average, which should go a long way, over a career, to analyzing how a player would stack up against an average player given the same number of opportunities. There’s another side, which is how a player does compared to his “non clutch opportunity” self. A .750 OPS player who manages .825 with men on base-is he more “clutch” than the .825 player who performs the same with… Read more »

Author

birtelcom

13 years ago

Reply to Mike L

Mike: As you may know, b-ref has a stat called “Clutch” within the Win Probability tables that is actually derived by comparing a player’s performance in high leverage situations to his performance in low-leverage situations. A hitter who is excellent in both high-leverage and low-leverage situations but is a little less excellent in high leverage situations than in low leverage situations will have a negative Clutch number (even though you still may want him up in those important situations because he is still a good hitter).

Neil L.

13 years ago

Please help me here, birtelcom.

The season formula for DOA is the sum, over all seven men-on-base situations, of the differences between the player’s RBI in those situations and the league average for that season?

I wish it were a more easily-calculated statistic but that doesn’t diminish its value.

DOA would have to be fine-tuned for each year for offensive context and strikeouts et cetera.

bstar

13 years ago

Fantastic and thought-provoking post, birtelcom, with a good dose of your subtle, witty humor as well. I feel I need to light a pipe and go stroke my long beard and give this one a good poring over. It sure makes a lot of sense on the surface. Well done.

Neil L.

13 years ago

Reply to bstar

bstar, your Brian Wilson beard? 🙂

bstar

13 years ago

Reply to Neil L.

The world only needs one of those.

Lawrence Azrin

13 years ago

Reply to bstar

Only one what – one Brian Wilson?

Then I’ll choose the guy in the Beach Boys who wrote, arranged, and produced all those great songs (my personal preference is mostly from the 1963-66 era).

Neil L.

13 years ago

Interesting that players like Jose Bautista and Miguel Cabrera don’t feature prominently on the DOA list, whereas other players do.

Or were their DOAs not calculated? (I guess they hope they were DOA …. sorry, couldn’t resist. 🙂 )

Hank G.

13 years ago

This seems to be an improvement over straight RBI, but it is still limited by opportunities. Who is better at driving in runs, a player that drives in 20 runs over average where the average is 100, or a player who drives in 15 runs over average where the average is 50?
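One way to look at the two hypothetical players in this comment is to compare the ratio of actual to expected RBI alongside the difference. The sketch below is only an illustration of that idea, not part of the DOA definition in the post, and the function name is invented.

```python
def actual_over_expected(actual_rbi, expected_rbi):
    """Ratio of RBI actually driven in to the league-average expectation."""
    return actual_rbi / expected_rbi

# Player A: 20 over an expectation of 100; Player B: 15 over an expectation of 50.
player_a = actual_over_expected(100 + 20, 100)
player_b = actual_over_expected(50 + 15, 50)

print(f"A: +20 DOA, ratio {player_a:.2f}")
print(f"B: +15 DOA, ratio {player_b:.2f}")
```

By the counting version Player A comes out ahead (20 vs. 15); by the ratio Player B does (1.30 vs. 1.20). Which one is "better at driving in runs" depends on which question is being asked.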

Richard Chester

13 years ago

birtelcom: Maybe I am wrong but aren’t you supposed to subtract solo HRs from the player’s RBI totals before computing DOA?

birtelcom

13 years ago

Reply to Richard Chester

I thought about that, but to me a guy who hits a home run does drive himself in and it is reasonable to credit him for that in this context, the same as RBI does. It just depends on what you want out of this sort of stat — if you only want to know how well a player drives in other guys, and don’t want that contaminated by home runs, then that’s another way of doing it. More a matter of taste than a right way or wrong way, I think.

bstar

13 years ago

Reply to birtelcom

birtelcom, if we’re to include solo home runs, don’t we need to compute a league-average RBI expectancy based on HR hit with no one on, similar to the way we did with every other runners-on-base situation?

Author

birtelcom

13 years ago

Reply to bstar

Yes, bases empty is one of the seven different “man-on-base” states that I used in calculating my sample eight guys for 2011.

Shping

13 years ago

Well done birtelcom. DAM (Driven-in Above the Mean)? Actually I like yours better. The only flaw I see is that it doesn’t factor in the raw number of RBI opportunities that a batter gets. Players on higher-scoring teams and in more favorable batting-order-positions will still have an unfair advantage over, say, the leadoff hitter for the Padres, to use an extreme example. Of course we know that a leadoff hitter gets fewer RBI opportunities and we don’t expect him to drive in many runs. But it would still be nice to know if someone with a low RBI total… Read more »

bstar

13 years ago

Reply to Shping

Shping, interesting observations. I’ll apologize to anyone else who’s had to see me post this link a third time, but it’s so apropos I have to. Here’s Others Batted In, from Baseball Prospectus. It measures raw baserunner RBI opportunities (PA_ROB = plate appearances with runners on base) and efficiency at driving other players on base in (OBI%). Disregard the highlighted raw OBI (it’s the same as RBI minus RBIs from scoring yourself on a home run). Anyway, here’s the link (again, sorry) for the 2012 leaders:

http://www.baseballprospectus.com/sortable/index.php?cid=1091213

Shping

13 years ago

Reply to bstar

Thanks bstar. I suspected there were stats like that out there. It’s very useful, for example, to see the RBI% broken down by baserunner location. Ichiro, of course, scores very low in driving in runners from first, much better from 2nd or 3rd base. And traditional power hitters, of course, have more balanced numbers by comparison.

Just wondering if there’s any way to somehow meld these stats with DOA.

#17 makes some interesting points as well.

AlbaNate

13 years ago

Am I misunderstanding something, or was Kemp actually 7.46 RsBI above expected with the bases loaded? (14-6.54=7.46)

Author

birtelcom

13 years ago

Reply to AlbaNate

Thank you — you are absolutely right. I’ve corrected the text.

Shping

13 years ago

Following up on my own #11 and also #9. I’m trying to get a handle on DOA by looking at someone like Ichiro. As many of you know, Seattle is experimenting with him batting 3rd this year. He has always had a reputation for being a good clutch hitter (whatever that means), despite the fact that he is by no means a power hitter. And he plays on a team that definitely struggles to score runs. So even if Ichiro rebounds this year and hits .330 with an OPS above .800, I don’t expect him to drive in 100 runs.… Read more »

Author

birtelcom

13 years ago

Reply to Shping

Leadoff men and players on low-run-scoring teams will probably never reach the very top of the DOA list, I think, because DOA is still a counting stat and like all counting stats still depends on total opportunities. But the effect I think you will see is, for example, that a leadoff guy for a low-scoring team would inevitably be well below average in RBI for a starting player, whereas with DOA, as long as that guy is a good hitter who knocks in the few runs he has a chance to knock in, he will have a positive DOA that… Read more »

AlbaNate

13 years ago

I like this idea a lot…one minor quibble is that if a player has a large percentage of his “man on third” plate appearances come with two outs, it would adversely affect his total. I’m not sure that this is a realistic argument, but I would think that where you bat in the order would affect your odds of coming up with two outs, and your odds of driving a runner in from third is greatly decreased if there are two outs. Another thing about this that bothers me a bit is that if you walk a lot, or if… Read more »

AlbaNate

13 years ago

Reply to AlbaNate

If I’m doing the math right, Barry Bonds drove in 12.29 runs fewer than expected in 2004.

Bonds numbers that year were mind-boggling: .609 OBP, .812 SLG, OPS+ of 263, but only 101 RBIs…and 232 walks.

Author

birtelcom

13 years ago

Reply to AlbaNate

I thought about the man on third, two outs vs. less than two outs issue. If one were really serious about doing this DOA thing, you are right that one should break down all the situations that include a man on third into “two outs” and “less than two outs” situations. That would make the calculation process more complicated, but it would also make it more accurate, because it is indeed meaningfully easier to get an RBI with a man on third and less than two out than with two out. This was a little more detailed than I wanted to… Read more »

Brendan

13 years ago

Reply to AlbaNate

Birtelcom, I applaud your logical, well-reasoned approach to the RBI issue. This is a valuable contribution. Two random thoughts: 1) Intentional walks might throw another interesting twist into the mix. I don’t know that my thinking on this matter is clear, but IBB are always intended to set up a situation that is favorable to the defensive team (create a force situation or bring a weak hitter to the plate). Assuming the defensive team is correct in employing the strategy, on balance plate appearances following IBBs will be associated with fewer RBIs. Therefore, those plate appearances (following IBB) might need… Read more »

Editor

John Autin

13 years ago

Reply to Brendan

Brendan (point 1) — I would not assume that teams generally make wise decisions about issuing IBBs.

Brendan

13 years ago

Reply to John Autin

Of course, it’s a calculated risk (or sometimes an act of desperation).

Michael Sullivan

13 years ago

Reply to Brendan

I don’t think you need to do 1. Most of the time, the situation that calls for an IBB is the base/out situation, so having a lower chance will be reflected in the average RBI from that runners on base situation. I suppose it’s marginally more likely that a slow guy gets walked and a few other things that are dependent on more than the base situation, but I suspect these factors are small compared to the main one which birtelcom’s algorithm already accounts for.

Richard Chester

13 years ago

I did an analysis for Ryan Howard for his 2008 season when he batted .251 and had 146 RBIs. His DOA calculated to 69.4 RBIs. His BA with men on base was .309, considerably above his overall BA. He was especially good with a runner on second (.379) and on first and third (.333).

Chad Evely

13 years ago

Interesting idea. I just ran your process for all players in 2011 if anyone wants to check out a complete list:

http://www.statisticalmeanderings.com/2012/04/high-heat-stats-doa.html

Author

birtelcom

13 years ago

Reply to Chad Evely

That is fantastic, thank you!

AlbaNate

13 years ago

Reply to Chad Evely

Well, my thoughts about getting walked a lot and having lower than expected RBIs do not seem to hold up. All the BB leaders are also the players with the most RBIs over expected.

JDV

13 years ago

Reply to AlbaNate

It still depends on whether the ‘opportunities’ are considered for each AB, or for each PA. If only ABs are being looked at here, your original thought almost certainly holds up.

bstar

13 years ago

Reply to Chad Evely

Chad, that’s phenomenal. How many internet six-packs of brew would I need to buy you to convince you to run a few more years? What I’m really interested in is Ryan Howard, whose allegedly inflated RBI totals from 2006-09 are kind of a lightning rod for this RBI debate. I would love to know how he compared DOA-wise for those years vs. other NL sluggers. Other metrics have suggested he indeed was quite good in this period at driving runners in.

Chad Evely

13 years ago

Reply to bstar

Done and done… I’ve got 1956-2011 running as we speak. I’ll let you know when they’re done but to whet your appetite: in 1956 Mickey Mantle had a +61.1 DOA and Nellie Fox had -21.8.

bstar

13 years ago

Reply to Chad Evely

Fantastic, Chad, thanks a lot!

Chad Evely

13 years ago

Reply to bstar

I just updated my blog post with links to spreadsheets with every year from 1956-2011… I haven’t gotten a chance to check them out so I suspect I may not be overly productive at work tomorrow.

http://www.statisticalmeanderings.com/2012/04/high-heat-stats-doa.html

bstar

13 years ago

Reply to bstar

We’re all indebted. Are you using Retrosheet, Chad?

Chad Evely

13 years ago

Reply to bstar

My pleasure. And, yes, I used Retrosheet.

An excellent idea, birtelcom.

bstar

13 years ago

Reply to bstar

I can tell right off just looking at a few of the recent years that these tables you’ve provided might change some people’s ideas about Ryan Howard, for one. I’m up most of the night so I am just gonna get lost in these numbers. Please stop by more often, Chad.

Neil L.

13 years ago

Reply to Chad Evely

Awesome list, Chad. I just added your site to my favorites.

Chad Evely

13 years ago

Reply to Neil L.

That’s great. Unfortunately I don’t get to post as often as I’d like to (1-year-old daughter with another on the way) but get my fix from all the interesting things they do on this site.

Voomo Zanzibar

13 years ago

Reply to Chad Evely

Chad E,

I have a one-year old daughter, and I have enough time to come here and make a smart ass comment once a day while I am sitting on the toilet.

Could you briefly explain to a semi-luddite how the hell you created that document?

Please tell me that there is a way to create all of those hundreds of player-links in an automated manner.

Editor

John Autin

13 years ago

Reply to Voomo Zanzibar

Voomo — I’m also dying to know Chad’s methods.

But creating the player links is no problem. I and others routinely copy out P-I search results into Excel, and they retain the links that they have in the P-I pages.

Also, there is a page on B-R where you can type or paste player names and then click the button to “Linkify Text.”
http://www.baseball-reference.com/friv/link_players.cgi

Chad Evely

13 years ago

Reply to Voomo Zanzibar

Voomo, Typically I make my smart-ass comments WHILE sitting on the toilet… it frees up a bunch of time but, trust me, you don’t want to borrow my smart phone. I’m a programmer and, being interested in baseball stats, have spent some free time in the last year or so writing code that reads Retrosheet event files which I can then use to do things like this DOA calculation very easily. Every player has an ID in Retrosheet and even though it differs from the B-R ID, I can use Sean Lahman’s DB to find the B-R ID of each… Read more »

kds

13 years ago

Of course there are 8 possible runners-on-base states: none, 1, 2, 3, 1-2, 1-3, 2-3, 1-2-3. In my long post in the last RBI thread I described how for Pujols-Howard ’08 I looked at who had a higher % at each of the 8 states. I subtracted HBP and IBB since these are not choices of the batter. I included UIBB since they are choices, to some extent. If you are doing this with some large database that codes all plays (Retrosheet or the Lahman database), you may want to just use all 24 base-out states. This will enable you… Read more »

Author

birtelcom

13 years ago

Reply to kds

–Thanks so much for the correction on the runners on base states; yes, of course when you include the no men on scenario, there are eight different states. I’ve corrected the error in the text of the post. –Good point that in the comments on Graham’s original post you were already doing something much like this DOA idea. Great minds…. –One way to run the numbers would indeed be to do all 24 possible base/out states. Some of the base/out states though really represent distinctions that are not likely to be meaningful for this exercise. For example, man on first… Read more »

Evan

13 years ago

Reply to birtelcom

It’s one of those interesting quirks of baseball that a batter has a better chance of getting an RBI from a runner on 3rd if there are less than 2 outs (because of SF), but benefits from there being 2 outs when attempting to get an RBI from a runner on 1st or 2nd (because the runner will have a better chance of advancing an extra base on a ball hit in the air). If we take the RBI expectancies for each of the 8 base-occupied situations and compare these figures based upon how many outs there are we can get… Read more »

Fireworks

13 years ago

Like the post, birtelcom, but you need a better acronym. RBIOA, or something. DOA might rightly be DOA as per your above comment. I would also like if someone could figure out a way to turn DOA into a rate stat, since that corrects for lack of opportunity and playing time. As it is, though, I like this stat and would like to see it referenced sometimes (which is why you need something better than ‘DOA’). It would be harder for someone to make the claim that a guy is a ‘run-producer’ or ‘drives in runs’ if he has an… Read more »

Author

birtelcom

13 years ago

Reply to Fireworks

Yes, I thought about using an acronym less gruesome than DOA. But then there are so many sabermetric acronyms out there, and seemingly a new one popping up somewhere every week, I was looking for something eye-catching. In an environment where zombies, vampires and gladiatorial teenagers are cultural touchstones, why not DOA for a baseball acronym? It’s an acronym that would fit right in on the new spinoff “CSI Miami Marlins”.

Fireworks

13 years ago

Reply to birtelcom

You need to tweet this to David Cone. He’s one of the few guys on TV whose job it is to talk baseball that actually cares to support the things he says with (useful, illuminating) data.

Fireworks

13 years ago

Ignore my last sentence. Replied before looking at Chad’s link.

Fireworks

13 years ago

Looks like Joe Carter was consistently good at driving in runs. Better than I would’ve expected. Of course it goes without saying that his low OBP still meant that his contribution was not quite worth the glowing praise bestowed upon him; he still made a lot of outs and wasn’t on base a ton for the guys behind him to drive in, but nonetheless “he drives in runs” is accurate for Joe Carter. Of course that would be tempered a bit with a DOA-related stat that accounts for all the opportunities that ended in a walk. Carter would suffer some… Read more »

Timmy Pea

13 years ago

Another awesome post! I am of the opinion that a player may have a special skill that helps him drive in runs. Take away 25 RBI’s from Joe Carter every year and he is a lousy player. Joe’s RBI totals don’t elevate him to great player status, but they do offer some redemption to his career. I think we’ve all seen a slugger have a bad year and limp to 95 or 100 RBIs. He still had a lousy year, and yes as the RBI haters point out anybody batting 4th will probably drive in a few runs regardless. As… Read more »

Lawrence Azrin

13 years ago

Reply to Timmy Pea

“I think we’ve all seen a slugger have a bad year and limp to 95 or 100 RBIs” –

DAVID ORTIZ, 2009:
99 RBI, but a 101 OPS+ on a .238 BA, 28 HR – literally an “average” batting year by OPS+.

Next two years: in 2010, 102 RBI on a 137 OPS+; 2011, 96 RBI on a 154 OPS+. Just over 600 PA for all three seasons.

This is one of innumerable data points that show that raw seasonal RBI totals can be extremely deceiving, and not that useful in evaluating players.

Dave V.

13 years ago

Great stuff, Chad. One random question – did Don Mattingly get left off the 1980s search? I’ve searched for him by both first and last name and can’t find him listed (I do see him in the 1990s search).

Dave V.

13 years ago

Oops nevermind my comment @54, as I realized my mistake.

Editor

Doug

13 years ago

Nice work, birtelcom. Couple of thoughts. While this is a counting stat, it would be an odd sort of counting stat to track over the season for a couple of reasons: (i) it wouldn’t be whole numbers, and (ii) it would fluctuate up and down through the season. All other stats that are referred to as counting stats are whole number (or understood fractions for IP), and only accumulate in one direction. To track a stat like this, presumably would have to use the average performance stats from previous season (or avg of 3 previous seasons, or something along those… Read more »

AlbaNate

13 years ago

Reply to Doug

WAR is another counting stat that has fractions and that can rise and fall through a season.

Editor

Doug

13 years ago

Reply to AlbaNate

Right you are.

But that is another example (like DOA) of a different kind of counting stat. The kind that are hard to get your head around (or, my head, anyway) if tracking them during the year.

philaphan

13 years ago

Interesting post, apologies if this was touched on (I skimmed several comments), but wouldn’t using a linear-weights-based run expectancy be a better baseline to use than the previous season’s RBI totals? It seems like DOA can only be calculated retroactively, and that players’ numbers will be skewed by having their own RBI’s as part of the league average. In the Matt Kemp example, all of his RBI’s are driving up the league average (albeit very slightly) that he is being measured against. Again, not sure if this was already discussed or considered, but it seems like an improvement.
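To put a rough size on the "albeit very slightly" point, here is a sketch that recomputes the 2011 bases-loaded league rate with Kemp's own bases-loaded PAs removed. It uses only the figures quoted in the post; the leave-one-out adjustment itself is an illustration of the commenter's concern, not something the post does, and the function name is made up.

```python
def league_rate_excluding(league_pa, league_rbi, player_pa, player_rbi):
    """League RBI per PA in one base state with one player's PAs taken out."""
    return (league_rbi - player_rbi) / (league_pa - player_pa)

# 2011 bases-loaded totals from the post; Kemp had 14 RBI in 10 such PAs.
with_kemp = 2842 / 4344
without_kemp = league_rate_excluding(4344, 2842, 10, 14)

# Kemp's bases-loaded runs over average under each baseline.
doa_with = 14 - 10 * with_kemp
doa_without = 14 - 10 * without_kemp

print(f"league rate: {with_kemp:.4f} with Kemp, {without_kemp:.4f} without")
print(f"Kemp bases-loaded DOA: {doa_with:.3f} vs {doa_without:.3f}")
```

For this one situation the baseline moves by less than two thousandths of an RBI per PA, so the effect on a single player's total is small, though it is not zero.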

Chad Evely

13 years ago

I finished generating career DOA values for every player with at least one plate appearance between 1956 and 2011. There is one spreadsheet that breaks down by baserunning situation and another by player age… I updated previous blog post with links: http://www.statisticalmeanderings.com/2012/04/high-heat-stats-doa.html I think I may modify to incorporate a couple of the mentioned improvements and re-run later tonight. These changes will be: – Eliminate plate appearances ending with BB, IBB and HBP from baseline dataset as well as player calculations. – Take outs into account. Creating separate situations for 0, 1 or 2 outs will create 24 situations instead… Read more »

bstar

13 years ago

Reply to Chad Evely

Chad, I agree with IBB and HBP being taken out, but consider this about eliminating all BB from the dataset: that player who drew the walk may have had several good pitches to get a hit and drive runners in, he may have fouled several off, so to completely eliminate these at-bats from consideration is questionable to me. Also, unless the bases are loaded, a walk isn’t going to drive a runner in anyway, similar to the way a single with a man on first is very unlikely to produce an RBI. Are we going to eliminate all singles with… Read more »

birtelcom

13 years ago

Reply to Chad Evely

-I see no reason to distinguish between none out and one out, for this purpose. The only situations where the out state really makes any difference are situations that include a man on third (third only, first and third, second and third, bases loaded) and even then the only meaningful distinction is between two outs and less than two outs. So a total of twelve states I think will cover all the meaningful distinctions that might come up. –I’m personally not a fan of taking one run driven in out of the equation for each homer, but I can understand why you… Read more »
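The twelve-state count can be checked with a short sketch: the four base states without a man on third stay as they are, and the four with a man on third are each split into "two outs" and "less than two outs". The state labels and the function name below are my own shorthand.

```python
# The eight runners-on-base states, written as the occupied bases.
BASE_STATES = ["empty", "1", "2", "3", "1-2", "1-3", "2-3", "1-2-3"]

def doa_state(bases: str, outs: int) -> str:
    """Collapse a (bases, outs) pair into one of the twelve proposed states."""
    if "3" in bases.split("-"):
        # Man on third: only two outs vs. fewer than two outs matters.
        return f"{bases}, {'2 outs' if outs == 2 else '<2 outs'}"
    # No man on third: the number of outs is ignored.
    return bases

all_states = sorted({doa_state(b, o) for b in BASE_STATES for o in (0, 1, 2)})
print(len(all_states))
for s in all_states:
    print(s)
```

Collapsing the 24 base-out combinations this way leaves exactly twelve distinct states.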

Chad Evely

13 years ago

Reply to birtelcom

I’m glad I was able to help visualize this stat a little bit. I’ve always been conflicted about RBIs, understanding the stat has its limitations but realizing that there is some value to what it’s telling us so it’s interesting to help explore an alternate way to measure the skill that RBI is supposed to measure.

Chad Evely

13 years ago

Reply to Chad Evely

Alright, well, it sounds like we all agree that removing IBB and HBP from the equation and expanding to 12 states is the way to go, so I’ll set that as my base. I just finished setting up the process to allow me to turn on/off whether to include walks and whether to include the RBI from a HR. I’ll run the four different combinations to allow birtelcom to determine which one he thinks best accomplishes what we’re trying to measure… it is his baby after all. It’ll take ~1.5 hours to run each setting so I’ll probably just post… Read more »
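
The two on/off switches described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (`batter`, `state`, `event`, `rbi`), the function names, and the sample rows are all made up here and are not the actual process or data. The baseline for each state is recomputed from whatever plate appearances survive the filters, so each setting gets its own averages.

```python
from collections import defaultdict
from itertools import product

def filter_and_score(pas, include_walks, include_hr_batter_rbi):
    """Apply the two toggles to a list of PA records (hypothetical layout)."""
    rows = []
    for pa in pas:
        if pa["event"] in ("IBB", "HBP"):
            continue  # always dropped in the agreed base setting
        if pa["event"] == "BB" and not include_walks:
            continue
        rbi = pa["rbi"]
        if pa["event"] == "HR" and not include_hr_batter_rbi:
            rbi -= 1  # do not credit the batter for driving himself in
        rows.append((pa["batter"], pa["state"], rbi))
    return rows

def doa(pas, include_walks, include_hr_batter_rbi):
    """Actual RBI minus the per-state average RBI, summed per batter."""
    rows = filter_and_score(pas, include_walks, include_hr_batter_rbi)
    totals = defaultdict(lambda: [0, 0])  # state -> [rbi_sum, pa_count]
    for _, state, rbi in rows:
        totals[state][0] += rbi
        totals[state][1] += 1
    result = defaultdict(float)
    for batter, state, rbi in rows:
        rbi_sum, pa_count = totals[state]
        result[batter] += rbi - rbi_sum / pa_count
    return dict(result)

# Made-up sample rows, for illustration only.
SAMPLE = [
    {"batter": "A", "state": "loaded", "event": "HR", "rbi": 4},
    {"batter": "A", "state": "loaded", "event": "BB", "rbi": 1},
    {"batter": "B", "state": "loaded", "event": "OUT", "rbi": 0},
    {"batter": "B", "state": "loaded", "event": "IBB", "rbi": 1},
    {"batter": "B", "state": "empty", "event": "HR", "rbi": 1},
    {"batter": "A", "state": "empty", "event": "OUT", "rbi": 0},
]

for walks, hr_rbi in product((True, False), repeat=2):
    print(walks, hr_rbi, doa(SAMPLE, walks, hr_rbi))
```

Because the baseline comes from the same filtered rows, the DOA values across all batters sum to zero under every setting, which is a handy sanity check.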

kds

13 years ago

I think whether or not to include UIBB and the batter on HR is mostly just a matter of exactly what questions we are trying to answer. I don’t think there is one clearly right or wrong answer. I would like to see tables with Actual_RBI/Expected_RBI as well as the Actual-Expected that have been produced. Some stats, OPS+ would be the most familiar, eliminate pitchers batting in figuring out the average. So the average NL team as of Saturday had an OPS+ of 92 because the 100 average is figured without pitchers batting. I wonder if we might not get… Read more »
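
To make the ratio-versus-difference idea concrete, here is a small Python sketch. The two league rates are the 2011 figures quoted in the post (2,842 RBI in 4,344 bases-loaded PAs, 2,492 RBI in 33,073 man-on-first PAs); the player's PA counts and RBI total are hypothetical, and only two of the base states are shown.

```python
# League RBI per PA in 2011 for two of the base states, from the post.
LEAGUE_RBI_PER_PA = {
    "bases_loaded": 2842 / 4344,   # about .654
    "man_on_first": 2492 / 33073,  # about .075
}

def expected_rbi(pa_by_state):
    """RBI an average hitter would collect given these opportunities."""
    return sum(LEAGUE_RBI_PER_PA[state] * n for state, n in pa_by_state.items())

def doa_and_ratio(actual_rbi, pa_by_state):
    """Return (Actual - Expected, Actual / Expected)."""
    exp = expected_rbi(pa_by_state)
    return actual_rbi - exp, actual_rbi / exp

# Hypothetical player: 20 bases-loaded PAs, 150 man-on-first PAs, 30 RBI.
player_pas = {"bases_loaded": 20, "man_on_first": 150}
diff, ratio = doa_and_ratio(30, player_pas)
print(round(diff, 2), round(ratio, 2))
```

The difference stays a counting stat and rewards playing time, while the ratio behaves like a rate stat, so the two can rank the same players differently.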

birtelcom

13 years ago

Reply to kds

PAs by pitchers constituted only about 3.2% of the major league total of PAs last season, and those were after all actual PAs; pretending they didn’t happen seems a little odd to me. Does including those pitcher PAs mean that the “average” hitter performs a little worse than what we would get if we were looking at the average “non-pitcher”? Yes, but as long as you apply the same “average” formula to every player, I’m not sure it makes much difference for DOA purposes. Yes, a few more non-pitchers will come out “above average” than “below average” but if it’s good… Read more »

bstar

13 years ago

Good point about eliminating the pitchers’ OPS from the equation. I guess we’ll leave that to birtelcom to decide.

kds, what are you using to make the distinction between UIBB and regular BB?

kds

13 years ago

Reply to bstar

BB-IBB. Yeah, we are mixing the batter’s choice with the pitcher’s choice. Ideally we would try to figure out: 1) the absolute minimum walk rate when the pitcher totally wants to avoid a walk (e.g., close game, bases loaded); 2) how many additional walks a pitcher would give up in different base/out/score situations; 3) how much the identity of the batter matters for the pitcher. Once we’ve done all that we will be able to estimate how much of each walk was up to the pitcher vs the batter. I don’t know if we could get the signal to noise… Read more »

bstar

13 years ago

Reply to kds

Wow, BB-IBB. So all non-IBB walks are “unintentionally intentional”? That’s a huge leap. I would wager that over 75% of all walks are ones the pitcher would immediately want to take back and are not intentional in the least, but that’s just my opinion. I can’t recall the last time I saw a pitcher walk the leadoff hitter in an inning and have it look like something he “intended” to do. In fact, leadoff walks seem more infuriating to both pitcher and fan than a leadoff single.

bstar

13 years ago

Reply to kds

Oh, wait, let me correct myself before you need to reply, kds. UIBB simply means “unintentional walks”; I was misreading it to mean “unintentional intentional walks”. Sorry, now I get it, although I do not think these should be subtracted from RBI opportunities.

Author

birtelcom

13 years ago

Reply to bstar

I’ve long been a skeptic about the “intentional walk” stat, and I never rely on it for anything substantive. The intentionalness of a walk is on a continuum; it’s not a yes or no issue. It seems to me that if the batter got to first base on four balls it’s a walk, and separating walks into intentional and unintentional categories doesn’t really add anything. But I’m eccentric about some things. As for walks and DOA, I certainly understand the argument that “penalizing” a hitter for failing to drive in a man on base by taking a walk is sort… Read more »

Chad Evely

13 years ago

I just created a new post with links to a whole lot of data. Rather than settle on any of the variations suggested, I tried to incorporate them all and came up with 5 different data sets to give an idea of how the variations affect the results. I think I got all of the big ones in there. Let me know what you think.

http://www.statisticalmeanderings.com/2012/04/high-heat-stats-doa-continued.html

Lawrence Azrin

13 years ago

Alternate title to DOA: “RBI/OA” {RBI/Over Average}?

Author

birtelcom

13 years ago

Reply to Lawrence Azrin

The guys in the engineering department like it; the guys in the marketing department (just after the second martini, before the third cigarette) say “too many characters”.

kds

13 years ago

Reply to birtelcom

Need a like check-box. No! We’re a stats site, we have to be all serious. Oh wait, it’s OK, we can keep stats on the likes.
