Pyth-Off: The Race for the Best Luck-Free Record

Most HHS readers will be familiar with the “pythagorean expectation” in baseball. That’s the concept, originally formulated by Bill James, that over time one can get a better sense of a team’s true talent level by looking at its runs scored and runs allowed (within a relatively simple mathematical formula) than at its actual win-loss record. The theory, which has stood the test of time and further study well, is that actual win-loss records can be subject to random fluctuations in the distribution of runs across games, but that with a large enough sample to smooth out such fluctuations, runs scored and runs allowed will prevail as the more accurate indicator of the most talented teams.
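For readers who want to see the “relatively simple mathematical formula” spelled out, here is a minimal sketch in Python. It assumes the classic form of the expectation, runs scored squared divided by the sum of runs scored squared and runs allowed squared. The function names and the 700/600 run totals are made up for illustration, and the 1.83 exponent mentioned in the comments is the value commonly attributed to b-ref’s version, which should be treated as an assumption here.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic Bill James form; 1.83 is the value
    commonly attributed to Baseball-Reference (an assumption here).
    Assumes at least one of the two run totals is positive.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_record(runs_scored, runs_allowed, games, exponent=2.0):
    """Convert the expected winning percentage into a (wins, losses) record."""
    wins = round(pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# Hypothetical team: 700 runs scored, 600 allowed over a 162-game season.
print(pythagorean_record(700, 600, 162))        # classic exponent of 2
print(pythagorean_record(700, 600, 162, 1.83))  # assumed b-ref-style exponent
```

The only design choice is the exponent, which is why it is left as a parameter. A team that scores exactly as many runs as it allows comes out at .500 under any exponent.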

The 2012 race for the best pythagorean-expectation-based record in the AL is a tight one right now. Through last night’s games, the Rangers have (using b-ref’s version of the Pythagorean calculation) a record of 76-53, merely one game ahead of the Yankees, whose Pythag record is 75-54. You have to go back to 2005 to find a season in which an Eastern Division team failed to hold the best Pythag record in the AL. After the jump is a list of the season-by-season leaders in Pythag record in the American League during the three-division era.
If a different team than the Pythag leader had the AL’s actual best regular season win-loss record, that team appears in parentheses:

2011: Yankees
2010: Yankees (Rays)
2009: Yankees
2008: Red Sox (Angels)
2007: Red Sox (Indians/Red Sox tie)
2006: Yankees/Tigers tie (Yankees)
2005: Indians (White Sox)
2004: Red Sox (Yankees)
2003: Mariners (Yankees)
2002: Angels (Yankees)
2001: Mariners
2000: A’s (White Sox)
1999: Yankees
1998: Yankees
1997: Yankees (Orioles)
1996: Indians
1995: Indians
1994: White Sox (Yankees)

The Cardinals currently have a 1.5 game lead on the Nats for the 2012 NL Pythag standings lead, despite trailing the Nats and four other NL teams in actual winning percentage. The Cards in the NL Central are four games ahead of the Reds in Pythag terms, even as they trail the Reds by seven full games in the division standings that count.

Here are the year-by-year, league-wide Pythag leaders in the NL from 1994 on (again, the best real-world record in the league is in parens if different from the Pythag leader):

2011: Phillies
2010: Phillies
2009: Dodgers
2008: Cubs
2007: Rockies (Diamondbacks)
2006: Mets
2005: Cardinals
2004: Cardinals
2003: Braves
2002: Giants (Braves)
2001: Diamondbacks (Cards/Astros tie)
2000: Giants
1999: Diamondbacks (Braves)
1998: Braves/Astros tie (Braves)
1997: Braves
1996: Braves
1995: Braves/Reds (Braves)
1994: Expos


57 Comments
Chris
12 years ago

I would never have guessed the Rockies had the best Pythagorean W-L record in the NL in 2007.

brp
12 years ago
Reply to  birtelcom

Pyth just accounts for runs for/runs against, right? I would think a team in a higher scoring park like Colorado might do a tiny bit better in Pythagorean W-L because of a higher chance to score a lot of runs and get some blowout wins.

Obviously they could also suffer due to blowout losses so it’s probably moot, but it seems possible at least. Overall I have to agree with what I believe to be Jim’s point at #6 in that Pythag is not very helpful in any real way and is more “interesting” than anything.

Jim Bouldin
12 years ago
Reply to  birtelcom

I have to disagree with your second paragraph there, but it gets into exactly what is meant by the phrase “probably better/worse than it seems”. By what criteria do we make that judgement? In what sense is a team that scores a lot of runs in blowout wins but loses more than its share of close games “better than it seems”? Until that question is addressed, I don’t see the point. I don’t agree that the Orioles are likely “worse than they seem”. Indeed, their superior record in close games argues that they have a better chance than most in the playoffs,… Read more »

mosc
12 years ago
Reply to  birtelcom

He’s saying there’s no such thing as being good in 1-run games. It’s all down to luck in terms of when the run scoring and run producing sides of baseball combine for a team to lead to a W. A better indication of a team’s ability is the number of runs they score and the number of runs they allow. How those things add up on a given day is purely coincidence. The crux of it is saying that players play the same, on both sides of the ball, regardless of the score. That assumption leads to the rest.

Jim Bouldin
12 years ago
Reply to  birtelcom

I don’t think he’s saying that mosc. I addressed that very issue at length a few weeks back and the evidence is strong that the Orioles (at that time 19-6 and now 24-6 in one-run games) are doing that by chance. Some type of skill is coming into play there. John discussed a couple of possibilities of just what that might be, but I’d like to hear some opinions from Oriole fans. And since playoff games are typically tight, this bodes well for them, as well as the White Sox, which is also why I think the Tigers will be… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

correction, that should read:

“evidence is strong that the Orioles…are *NOT* doing that by chance.”

John Autin
Editor
12 years ago
Reply to  birtelcom

I still don’t think Baltimore’s 24-6 record in one-run games is strong evidence of anything useful, but I’m not going to step into that debate again, especially since I’m already facing a bitter dose of medicine.

On a tangent, David Schoenfield of ESPN’s SweetSpot blog linked to this FanGraphs piece about what to expect from the Orioles (though not specifically about one-run games):
http://www.fangraphs.com/blogs/index.php/slowly-back-away-from-the-pythag-expectation/

The gist: They say that for predicting a 2nd-half record from 1st-half data, the actual record is more predictive than the expected record.

Jim Bouldin
12 years ago
Reply to  birtelcom

Bill James is assumedly a very smart guy, but I have no idea why he thought the pythag was useful in any way, practical or theoretical. I think it’s one of the most useless statistics I’ve ever encountered. As for the Orioles, my working theory is that it’s due largely to something about Showalter’s affect on those guys (and likely Thome as well–very smart move picking him up IMO). I agree exactly with this quote from the NYT: “In virtually every category of winning that has been called unsustainable or based in luck, the Orioles have excelled. With a 24-6… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

err…Showalter’s *effect* on those guys that should be. Not that I don’t think he likes his players mind you.

Too much coffee and too little sleep recently.

bstar
12 years ago
Reply to  birtelcom

I’m not a big fan of the Pythag thing either but I don’t believe one-year records in one-run games are any indication of any sort of skill whatsoever. There may be some one-year wave of confidence that the O’s are running on right now; that is, they feel confident in one-run games right now because they’ve won such a high percentage of close games this year but I don’t think it tells us anything about what kind of one-run team they will be next year. This dynamic reminds me a lot of creating turnovers in football. Just because an NFL… Read more »

no statistician but
12 years ago
Reply to  birtelcom

For Jim Bouldin, mainly: Jim, I tend to agree that completely ascribing success in one-run games to “luck” doesn’t conform to reality. The difficulty I see in any wholesale evaluation, though, derives from the variety of situations that result in one-run outcomes. To name a few: the 1-0 shutout; closing the gap to one run late in the game but losing (or—holding on to that single run advantage in the end); the one run margin that stays the same from the fifth inning on; the tie game broken in late or extra innings; the come from behind victory. The latter… Read more »

Hartvig
12 years ago

Is there an easy way to determine the biggest under- and over-performing teams (vs. the Pythagorean expectation) of all time, or which manager was the best or worst?

I ask because in a discussion long ago I looked into Gene Mauch- who, as I expected, consistently underperformed- until his final seasons as a manager with the Angels.

Andy
Admin
12 years ago

Why don’t YOU pyth off?

Hub Kid
12 years ago
Reply to  Andy

Raising this whole idea appears to be a good way to p/…
really anger Orioles fans.

Brooklyn Mick
12 years ago
Reply to  Hub Kid

Low attendance at Camden Yards is evidence of the savvy nature of the Baltimore fanbase.

And I bet many other teams wish they were playing the Pythagorean Orioles (61-70) instead of the team that actually takes the field (73-58).

Jim Bouldin
12 years ago

A group of us have investigated the “pythag” using extensive game simulations over a number of simulated seasons. In a surprising result that none of us could have anticipated, it turns out that in order to win baseball games, it’s highly helpful to score >= 1 run more than your opponent! Some would go so far as to say “mandatory”, although others might consider such a view as bordering on unnecessary and counter-productive extremism. The corollary of this surprising finding, surprisingly enough, is that, surprisingly, those teams that win more games than they lose also tend, surprisingly, to also score… Read more »

Jim Dreyer
12 years ago
Reply to  Jim Bouldin

That is quite a radical theory you have there. Can you see if the same holds true for other sports such as football or basketball? I think I may use this as a betting edge. If I “know” that 1 team is going to score >=1 than the other team, that may be an angle worth exploiting.

As Doug Collins once said “Any time Detroit scores more than 100 points and holds the other team below 100 points, they almost always win.”

“Almost” being the key word.

mosc
12 years ago

I would love to see a test of various methods for predicting yearly win loss totals. Compare pythagorean w/l to a method tied to ops, just for example, and see which comes ahead. You WAR lovers can try that one out too. There is a “best” answer here and I’m surprised someone hasn’t tried to figure it out. We’ve got years of sample size to pull from.

JDV
12 years ago

Oriole fan and ‘pyth’ skeptic. I think the Orioles’ success in close games and extra-inning games is impressive…something beyond dumb luck. That said, I do think they’re over-performing. I think their difficulty in scoring runs in the absence of home runs, their sub-par defense, and their unstable rotation will likely limit their progression from here on out. To the extent that those factors affect ‘runs for’ and ‘runs against’, fine, but ‘pyth’ itself is worthless. What is the point of claiming that a team that didn’t do well should have, or that a team that did well shouldn’t have?… Read more »

Hub Kid
12 years ago
Reply to  JDV

Within a season I can see how runs scored and runs allowed may help predict a team’s future and ability. But, as a poster put it a while ago, pixie dust can last for a whole season and into the post season, and outlast any of this. Team spirit, or a good manager, leadership, or clutch ability or any number of other unquantifiable things including luck may have something to do with it. When a season is over, and any possible predictive value of pythagorean record is gone, it still interests me as a fan. As someone with a rooting… Read more »

e pluribus munu
12 years ago

I don’t have objections to the formula, and it seems to me to get at one of the most interesting issues in baseball: the ability to deploy talent when it’s most valuable – a talent that some see as the spark that drives a clutch player or team, and others as the phlogiston of baseball theory – an outdated myth. (This relates to a list of interesting HHS discussions, most recently whether measures like ERA+ are deficient because they treat all earned runs allowed as equal, unduly penalizing strong pitching seasons that include some super-blowups – a question behind John… Read more »

Brent
12 years ago

The Cardinals are working hard this week to get their Pythag record in line with their actual record, losing the last three games 9-0, 5-0 and 8-1.

Brent
12 years ago

bstar @18: Kansas City Chiefs under Schottenheimer turnover differential (with NFL rank in parentheses): 1989 -8 (21st); 1990 +26 (1st); 1991 +11 (5th); 1992 +18 (1st); 1993 +10 (5th); 1994 +12 (2nd); 1995 +12 (1st); 1996 +3 (11th); 1997 +14 (3rd); 1998 +1 (14th). I think the combination of an aggressive pressure on QB defense with a conservative, run first offense probably can consistently have a positive TO ratio in the NFL.

bstar
12 years ago
Reply to  Brent

Well done, Brent. Poor Schottenheimer was underrated. I saw my Broncos beat his Browns in the AFC Championship game in ’86 (The Drive) and ’87 (The Fumble). Cleveland was a better team both years and in my opinion one of the better teams to never make a Super Bowl. And those Joe-Montana-led Schottenheimer Chiefs teams of the early ’90s were better than history remembers also.

Paul E
12 years ago
Reply to  bstar

bstar:
Not to mention that recent Marty-led Charger team (14-2 ?) that Norv Turner took over and just totally “Norved” into mediocrity………

bstar
12 years ago
Reply to  Paul E

When will San Diego learn that Norv Turner is a great offensive coordinator but not a good head coach, Paul?

I believe the loss in the playoffs that 14-2 team sustained got Marty fired. Several years before he got hired as Redskins head coach, went ?8-8? his first year, and got fired because Daniel Snyder had his Steve Spurrier moment. Again….poor Marty.

Jim Bouldin
12 years ago

Lots of good comments above. Setting aside my overly sarcastic response at #6, the main problem I have with the pythag is embedded in the notions of how many games a team “should have won” and “luck”. It seems that some jump straight to the conclusion that if a pythag prediction differs from an actual record, that “luck/unluck” was the cause thereof. I take great exception to that as a general explanation. Also, to clarify, I include all those typically “intangible” things such as confidence, focus, calmness under pressure, etc., as definite skills (skills subject to positive and negative feedback,… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

It’s an excellent argument birtelcom, clear and well raised. John and others have also raised it, or parts of it, as well. It’s entirely possible that I’m making more of the “skill not luck” argument than is warranted, or at least not clarifying my position well enough. Let me try to do the latter, incorporating your comment and the others above. I actually think a number of the arguments offered here are not incompatible with each other, but care and precision are required in the discussion. I think there are (at least) two issues here. One, already mentioned by others,… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

I left out one sentence there. After: “Whatever the reason, they have some type of confidence in their ability to pull out the close ones, and enough baseball ability to actually execute it.” should come: However, this phenomenon is indeed ephemeral, or perhaps situational, in that it doesn’t repeat the following year, and even the players and coaches may not know why. But that doesn’t mean that it’s not due to the players actually performing better than their opponents while it *is* happening. Commonly referred to as “chemistry”, but chemistry is not at all the same thing as luck in… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

Another way to look at it is that we have two essentially contradictory pieces of information.

The first piece of evidence is James’ study that shows that the historically best and worst pythag expectation teams rarely repeat the feat the following year.

The second is that if, using all 1-r games over the last 20 years as a basis, one computes a binomial expectation of going 24-6 in 1-r games, when chance alone determines the outcome of each game, is extremely improbable.

The question then is how to resolve these two. Does one trump the other necessarily (or obviously)?

Jim Bouldin
12 years ago
Reply to  Jim Bouldin

I seem to have lost the ability to compose a sentence, and paragraph, correctly. Bear with me and replace the above with the following. ==================================== Another way to look at it is that we have two *apparently* contradictory pieces of information. The first is James’ study that shows that the historically best and worst pythag expectation teams rarely repeat the feat the following year. The second is that, using the set of 1-r games over the last 20 years as a basis, the probability of going 24-6 in 1-r games after 30 such games, is extremely low (p < .001),… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

I already accounted for the multiple season effect. More than accounted actually–I ran one million trials of team-seasons (actually partial seasons, using the number of 1-r games played to date for each of the 30 teams*). For each trial, a randomly chosen number of wins from the binomial distribution was chosen, and the corresponding win % computed. The probability of achieving an .800 win % came out at .00026. For just the standard binomial prob of 24 wins in 30 games I get .00016, or just slightly lower, and about 1/4 the value you’re getting. Be careful with Excel, it… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

OK, I just read the Neyer piece and needless to say, I disagree with him on the luck issue.

e pluribus munu
12 years ago
Reply to  birtelcom

It should be obvious from my earlier comment that I feel Jim is on the right track in his general approach: luck/skill is not an adequate set of options to explore this issue. It seems to me as though the only way to get at what’s going on with the O’s – or with any team that the Pythagorean results flag as under- or over-performing – is to review the games anecdotally one by one, looking as closely as possible at what actually happened to assess the degree to which individual/team agency (on both W and L sides) determined the… Read more »

Jim Bouldin
12 years ago
Reply to  birtelcom

birtelcom, your value of .0007 for the binomial prob of 24 out of 30 is correct. I mistakenly used 25 out of 30, which gives .00016. In R you have to be careful how you phrase the binomial probability statement or you will add one extra success to the computation, which I did.

So, you did it right and Excel is computing it right.
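The arithmetic behind the two figures in this exchange can be checked directly. This is a minimal sketch, assuming each one-run game is an independent coin flip (p = 0.5), which is the null hypothesis being debated rather than an established fact. The function name `binom_tail` is made up for illustration.

```python
from math import comb


def binom_tail(wins, games, p=0.5):
    """P(X >= wins) for X ~ Binomial(games, p): the chance of at least
    `wins` victories in `games` independent games, each won with probability p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# At least 24 wins in 30 one-run games under the coin-flip assumption.
print(f"{binom_tail(24, 30):.5f}")  # 0.00072
# Starting the tail one win too high (25 or more) reproduces the smaller figure.
print(f"{binom_tail(25, 30):.5f}")  # 0.00016
```

The tail starting at 24 wins comes to roughly .0007 and the tail starting at 25 to roughly .00016, which is consistent with the off-by-one described above.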

tag
12 years ago
Reply to  birtelcom

Birtelcom, This to me is where the binomial distribution argument breaks down. Many times teams don’t have a 50-50 chance of winning a one-run game. For the O’s story in particular, see my posts under Saturday AL notes. Eight of their one-run wins came when they took large early leads and frittered them away. Now, depending on exactly when those leads were whittled down, there might have been only one or two innings when the trailing team was only trailing by a single run. The team leading in this case, especially when it has a bullpen as good as the… Read more »

Jim Bouldin
12 years ago

As long as we’re talking Orioles, they’ve already won 3 more games than they did all of last year. And this seems not to be the first time the franchise has achieved a remarkable turnaround in short order:

Year    W    L    W%
2011   69   93  .426
2012   72   58  .554

1967   76   85  .472
1968   91   71  .562
1969  109   53  .673

1988   54  107  .335
1989   87   75  .537

I wonder where that 88/89 turnaround ranks among one year turnarounds. The 68 to 69 jump may be equally impressive, given how hard it is to make a big jump… Read more »

Richard Chester
12 years ago
Reply to  birtelcom

Decreases can be more startling. The A’s went from .651 in 1914 to .283 in 1915.

Lawrence Azrin
12 years ago

Connie Mack basically dismantled the A’s between 1914 and 1915 because:
– he concluded having a perennial pennant-winner wasn’t as profitable as he thought it would be
– competition from the Federal League was raising MLB salaries, especially for his big stars (Collins, Baker)

Jim Bouldin
12 years ago
Reply to  birtelcom

It’s the jump from 1968 to 69 that really has me interested. An 18 game improvement for a team that had already won 91 games–that’s impressive.

The Mariners, however, went from a 91-win team to a 116-win team in 2001, which must rank as the most astounding improvement by an MLB team ever.

Doug
12 years ago
Reply to  Jim Bouldin

Part of the reason was an expansion year. The other was that some pitchers had more success than others adjusting to the new mound and strike zone.

e pluribus munu
12 years ago
Reply to  Jim Bouldin

I don’t know, Jim. There have been a lot of big one-season leaps: The Braves leapt 25 wins from ’13 to ’14, like the Mariners, but in a 154-game schedule; they added 33 wins over ’35/’36 and 29 wins over ’90/’91. The A’s ’46/’47 added 29 wins (of course, the ’14/’15 negative 53 wins is much more impressive); the Mets leapt 27 wins over ’68/’69 (following a 12 game gain the previous year). Along the lines of the Mariners, going from very good to superlative, the 1906 Cubs moved to 116 wins from 92, gaining one fewer than Seattle,… Read more »

e pluribus munu
12 years ago

Replying to my own post, because I realized I’d failed to look above Jim’s #48, where he and others had already noted most of these. Only the ’06 Cubs example is really relevant.

Richard Chester
12 years ago
Reply to  Jim Bouldin

The 1998 D’backs went from 65 wins to 100 wins in 1999.

oneblankspace
12 years ago
Reply to  Jim Bouldin

The 1988 O’s, as you may recall, started the season 0-21 or something like that.

Brooklyn Mick
12 years ago
Reply to  oneblankspace

The 1987 O’s weren’t a good team either. Their final record of 67-95 was capped off with a 7-24 finish, all under the helm of Cal Ripken Sr. After an 0-6 start in 1988 Cal Sr. got the ax and was replaced by Frank Robinson, under whose guidance the O’s went on to lose 15 more straight en route to a last place finish with a dismal 54-107 record. These were the final days of Edward Bennett Williams as the club owner, who had focused much of his attention on negotiations with the city of Baltimore for the long-term deal that would… Read more »

Mike L
12 years ago
Reply to  birtelcom

birtelcom, as a Yankee fan, I certainly hope so. The home team is reeling from injuries and age. However, Baltimore is a big hearted town, I have friends there, and they will forgive.

Bryan
12 years ago

Seven times since 2001, the team with the best W-L record didn’t have the best pythag record and both teams made the playoffs. *Every time*, the team with the better pythag record advanced farther in the postseason than the team with the better W-L. Even if we throw in the ties in ’06 and ’07, the ’06 Tigers, who tied NY in pythag but not W-L, beat them in the playoffs and the ’07 Red Sox, who won the pythag outright but tied in W-L, beat the Indians in the playoffs. Can’t take much from such a small sample, but… Read more »

tag
12 years ago
Reply to  Bryan

Bryan,

From what I’ve read, the reason good pythag teams don’t record better W-L records is mainly due to bad luck in their one-run games (an argument that a lot of people around here don’t want to hear). In fact the team with the better pythag but inferior W-L record could very well be “superior,” as evidenced by better records in wins by 3-, 5- and 10-run margins, which, I believe and I think has been borne out in various studies, is far more indicative of team strength than records in one-run games.

Bryan
12 years ago
Reply to  tag

Tag, I’m a firm believer in the predictive value of a team’s pythagorean record. I’ve dismissed this year’s Orioles on that basis more times than I’m willing to admit. What surprised me in tinkering around with the numbers above was how consistently pythag predicted postseason success, since postseason baseball isn’t the same game. Teams use fewer pitchers and rarely give their stars nights off in October. It’s often hypothesized that teams that were able to outperform their pythags over a long season did so because they were blown out a lot when their weaker pitchers were on the mound, but… Read more »