Pythagorean Estimated Wins for SPs (1893-2011)

[Another work-in-progress. Your observations are welcomed, but please review the “Known shortcomings” at the end of the post.]

Taking suggestions from readers Mike L. and kds, I used the Pythagorean method to calculate the estimated career wins for all starting pitchers with at least 1,000 innings from 1893-2011 (min. 60% of career games as a SP).

Method: First, I calculated each pitcher’s Pythagorean winning percentage (P%), based on his average runs allowed per 9 innings (R/9) and the MLB scoring average per 9 innings over the length of his career (Lg R/9). Note that this is based on all runs, not earned runs. To adapt the Pythagorean formula for this purpose, “Runs Scored” is replaced by the league scoring average (Lg R/9), and “Runs Allowed” is replaced by the pitcher’s R/9. Also, I used the advanced formula’s exponent of 1.83, rather than 2. Thus, the formula is:

P% = (Lg R/9)^1.83 / [ (Lg R/9)^1.83 + (R/9)^1.83 ]
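
As a rough illustration, the formula can be written as a small Python function. The function name and the sample inputs are invented for this sketch; only the 1.83 exponent and the substitution of Lg R/9 and R/9 come from the method described above.

```python
def pythag_pct(lg_r9, r9, exponent=1.83):
    """Pythagorean winning percentage for a pitcher.

    lg_r9: MLB scoring average per 9 innings (stands in for "Runs Scored")
    r9:    the pitcher's runs allowed per 9 innings (all runs, not just earned)
    """
    scored = lg_r9 ** exponent
    allowed = r9 ** exponent
    return scored / (scored + allowed)


# Hypothetical inputs, not taken from the table:
# a pitcher allowing 4.00 R/9 in a 4.50 R/9 environment.
print(round(pythag_pct(4.50, 4.00), 3))
```

A pitcher whose R/9 equals the league rate comes out at exactly .500, and anyone allowing fewer runs than the league rate comes out above it.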

Next, I calculated two sets of estimated wins and losses for each pitcher: one based on his actual decisions (W:1, L:1), and one based on his estimated decisions (W:2, L:2). To figure estimated decisions, I first calculated the average IP per decision for this set of pitchers, which came out to 8.77; then I divided each pitcher’s IP by 8.77 to arrive at his estimated decisions. The columns D:1 and D:2 are the difference between the estimated wins by each method and the actual wins. The total estimated wins by each method is within 0.5% of the total actual wins; both totals are less than the actual total.
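
The two sets of estimates might be computed along these lines. This is a sketch under one assumption that the text does not spell out: that estimated wins are simply P% multiplied by the relevant number of decisions, with losses as the remainder. The function name and the sample pitcher are hypothetical.

```python
IP_PER_DECISION = 8.77  # group average reported in the post


def estimated_records(p_pct, wins, losses, innings):
    """Return both sets of estimated wins/losses and their win differences.

    Assumption: estimated wins = P% * decisions (not stated in the post).
    """
    actual_decisions = wins + losses
    w1 = p_pct * actual_decisions           # method 1: actual decisions
    l1 = actual_decisions - w1

    est_decisions = innings / IP_PER_DECISION
    w2 = p_pct * est_decisions              # method 2: estimated decisions
    l2 = est_decisions - w2

    return {
        "W:1": w1, "L:1": l1, "D:1": w1 - wins,
        "W:2": w2, "L:2": l2, "D:2": w2 - wins,
    }


# Hypothetical pitcher: 200-150 in 3,100 IP with a P% of .600.
rec = estimated_records(0.600, 200, 150, 3100.0)
print({k: round(v, 1) for k, v in rec.items()})
```

A positive D:1 or D:2 means the estimate credits the pitcher with more wins than he actually recorded.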

The table is currently sorted in order of D:1, i.e., the greatest gain in estimated wins over actual wins based on actual decisions. Note that, in order to present as many relevant columns as possible, I’ve truncated the “Years” column to show only the last 2 digits of their final year. I don’t believe this should cause any confusion.

[table id=38 /]

 

Known shortcomings of this method:

  • Because there’s no park adjustment, this method should tend to help pitchers who spent many years in low-run parks and hurt those in high-run parks.
  • Estimating wins based on career totals will produce somewhat different results than doing so for each individual year. (Unfortunately, copying into Excel the year-by-year stat lines for over 700 pitchers is more work than I care to do on a volunteer basis.)
  • The MLB scoring average used for each pitcher is a straight average of the MLB rates over the length of his career, rather than a weighted average. If a pitcher logged a small number of innings over a number of years in a very different run environment than the bulk of his career, the MLB scoring average used here for that pitcher will not be as accurate as a weighted average would be.
  • Estimated decisions are based on the group’s average of 8.77 IP per decision. I have not adjusted for the slight variation in this figure across eras, which may complicate the task of comparing pitchers from different eras. Also, the figure of 8.77 for this group of SPs with 1,000+ IP is higher than the figure for all SPs, because the 1,000-IP standard biases the group towards good pitchers, which means more wins, which means more IP per decision. This fact is basically moot when comparing pitchers within this group, but may distort a comparison to pitchers not listed here or to some abstract standard.
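
To make the straight-versus-weighted point concrete, here is a minimal sketch comparing the two averages. The season data are invented for illustration (six full seasons in a low-scoring environment, then four token seasons in a high-scoring one) and are not real league figures; the function names are likewise hypothetical.

```python
# Hypothetical career: (MLB R/9 that season, pitcher's IP that season).
# These numbers are invented for illustration, not real league data.
seasons = [(3.6, 200.0)] * 6 + [(4.9, 8.0)] * 4


def straight_avg(seasons):
    """Each season counts equally, as in the method used for the table."""
    return sum(lg for lg, _ in seasons) / len(seasons)


def ip_weighted_avg(seasons):
    """Each season counts in proportion to the innings pitched in it."""
    total_ip = sum(ip for _, ip in seasons)
    return sum(lg * ip for lg, ip in seasons) / total_ip


print(round(straight_avg(seasons), 2))
print(round(ip_weighted_avg(seasons), 2))
```

With these invented inputs the straight average is about 4.12 and the IP-weighted average about 3.63, so the straight average would credit this pitcher with a noticeably higher-scoring environment than the one in which he threw nearly all of his innings.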

57 Comments
Lawrence Azrin
12 years ago

This method of adjusting career W-L totals makes the HOF arguments for Rixey, Waddell, Walsh, and Drysdale look much stronger.

Kaiser Wilhelm never should have started that war back in 1914.

Hank G.
12 years ago
Reply to  Lawrence Azrin

It also burnishes the HOF argument for Babe Ruth. Of the top ten pitchers in winning percentage, he is the only one that is estimated to have been better than his W-L record. That may be an anomaly of his not having a decline phase as a pitcher.

The great majority of the top 100 in winning percentage lose estimated wins, over half of them double digits. I don’t know if that’s indicative of anything, but it’s interesting.

The two major exceptions are Walter Johnson and Carl Hubbell.

Dr. Doom
12 years ago
Reply to  John Autin

File this comment under “times I miss the ‘Like’ button.”

kds
12 years ago
Reply to  Hank G.

Part of it may be an artifact of not weighting the seasons by decisions or IP. He had 5 decisions, 31 IP in 4 years with the Yankees in the high-scoring lively ball era, vs. 135 and 1190.1 in 6 years with the Sox in the dead ball era. This method counts those Yankee years as 40% when figuring what his league average run support would be, rather than about 3%. The increase in run scoring after the dead ball era was probably the fastest change up or down in ML history, so improper weighting of seasons in careers…

kds
12 years ago
Reply to  John Autin

Thanks, I tried to weasel word it because I hadn’t done the research. Could you do the drops also? I’m betting some part of 1894-1908 beats the 60’s and 1911-1917, but I’ve been wrong before. Like, tuday.!

bstar
12 years ago
Reply to  John Autin

Well, there’s 1977, 1993, and 1994 on the biggest rise in R/g charts again. ’77 is rumored to be the first time Rawlings changed the ball. It probably happened again, sometime between ’92 and ’93. There is almost indisputable scientific evidence that the ball used in the mid 90s and onward was a different ball than the previous one.

MikeD
12 years ago
Reply to  Lawrence Azrin

Rixey, Waddell, Walsh and Drysdale?

No, Yes, Yes and Yes.

Not that anyone asked.

Howard
12 years ago
Reply to  Lawrence Azrin

Drysdale benefits quite a bit on this chart because he played most of his career in a park very favorable to pitchers.

Howard
12 years ago
Reply to  John Autin

Good points. I thought it would have been more than 62%.

Paul E
12 years ago

JA:
Thanks for the research and hard work – couldn’t help but notice Roberts and Blyleven joining the 300 win club and Maddux tying Spahn’s actual win total. Glavine, Johnson, Wynn, and Grove drop out of the 300 win club…

AND, of course, the luckiest “pitch to the score” hurler of all time, Jack Morris, makes an appearance as number 774 of 778 on your chart. I guess we’ll hear about that Hall of Fame debate momentarily.

Hank G.
12 years ago
Reply to  Paul E

The estimates also heavily penalize Andy Pettitte and David Wells. I don’t think too many people think of Wells as a HOFer, but I think some people consider Pettitte at least marginally qualified.

bstar
12 years ago
Reply to  Hank G.

If CC Sabathia signs an extension and wants to end up a Yankee for life, he should start showing up as a guy who got/gets a big boost from having a potent offense hitting for him. Will this ultimately hurt his chances or help them? I would say help, especially if he gets close to or over 300 wins. It’s certainly not hurting Andy Pettitte and Jack Morris’ case right now.

bstar
12 years ago
Reply to  bstar

Edit: not hurting Pettitte and Morris’ case (in the minds of those who hold a vote).

Lawrence Azrin
12 years ago
Reply to  Hank G.

I think that Wells’ HOF case is hurt a lot by the perception that he was the ultimate mercenary, going from team to team his entire career. This is only partially true; he did change teams 11(!) times, but five were because of trades. Pettitte is seen as more of a True Yankee (yeah, I know about the three years with the Astros), plus he’s got all that post-season success, although Wells’ playoff record is also rather impressive (in 125 IP over 17 series, a 10-5 W-L record and 3.17 ERA). Their WAR are almost identical (50.7/49.9), and the “Hall…

Lawrence Azrin
12 years ago
Reply to  Lawrence Azrin

Andy Pettitte and David Wells are each other’s #1 Most Similar pitchers; the actual number is even the same (898). How did I miss that the first time around?

Hank G.
12 years ago

I understand your reason for using commas instead of a decimal point, but it does make it impossible to sort on WAR correctly.

kds
12 years ago

Good stuff. When I click on the table it only allows me to see the top 100 at most. (I can click on a header to see the bottom 100.) Any way to see parts of the table that are not top or bottom by one of the measures listed? One adjustment you did not mention is for the team’s defense. Probably the easiest way to do this would be to take the Rdef number from their rWAR. My guess is that this would not be as important in general as some of the items you listed above, park effects,…

kds
12 years ago
Reply to  kds

Well, I discovered what the switches at the bottom right of the table are for, so now I can look at any of the 778 entries.
TZ estimates that the Orioles’ defense saved Palmer 144 runs over the years. Phil Niekro lost about 110 runs to bad defense. If you combine that with Atlanta being a good hitter’s park, his D1 and D2 should go up considerably.

kds
12 years ago
Reply to  John Autin

I don’t know what Sean uses at B-Ref. Pythagenpat is usually said to be the best and most complete variation of a runs-to-wins estimator. It uses (RS + RA)^.287, with runs taken per game, to get the exponent to use in the Pythagorean equation. With per-game scoring more than twice as high in 1896 as in 1908, it was easier to double your opponents’ scoring over a number of games, so you would expect a lower winning % than doing the same in a lower-scoring time. There were some papers that may be relevant to this in the SABR magazine “By the…

Fireworks
12 years ago

I knew Ryan would be near the top. I would guess that there’s still perhaps something about being a high K, high BB, low H guy that is hard to account for, but I remember going year by year through Ryan’s career and figuring just from the records of the other starting pitchers on his team and his ERA (not scientific at all) that he probably lost a win a year or thereabout and said to myself he shoulda been +25 in wins and -25 in losses. Funny that your analysis here is about the same. Not that it proves…

PhilM
12 years ago

This is near and dear to my heart: I’ve compiled 600+ pitchers with annual win-loss and run data. I like to use ERA+ to produce “park neutral” LeagueR, which removes the smoothing of career-span average league factors. (And then I prefer negative binomial to Pythagorean, but that’s just me.) I’d put up another list, but I’ll refrain for now; suffice it to say that Eppa Rixey (Go ‘Hoos!) looks like a HOFer to me, too.

Doug
Editor
12 years ago

“The total estimated wins by each method is within 0.5% of the total actual wins.” Known shortcomings notwithstanding, that level of accuracy seems sufficient for using this method to identify lucky and unlucky pitchers. The remaining question would be whether luckiness or unluckiness is a peculiar trait of certain pitchers. I’m guessing the pitchers at either end of your list probably did better or worse than their estimated wins on a fairly consistent basis. Begs the question whether the something that is being measured really is luck or is it something else if some number of pitchers enjoy or lack…

kds
12 years ago
Reply to  Doug

When judging a pitcher by his Won/Loss record, I think it is fair to call everything outside his pitching that can affect it luck. And the biggest thing we are looking at here is his teammates’ offensive abilities. So, I think it is fair to say that almost any Yankee pitcher was lucky to have such a good offense. (Ruffing, Gomez, Ford, Guidry, Pettitte, etc.) And some have been quite unlucky in that respect, starting with Walter Johnson, on many bad Senators teams.

Richard Chester
12 years ago
Reply to  kds

Lefty Gomez once said “I’d rather be lucky than good”.

PhilM
12 years ago

It’s results like Ed Brandt that give me pause: career ERA+ of 101 (6 seasons above average and 5 below), career K/BB of 1.13 . . . I don’t see him as “unlucky” to this extent. My NegBin “team-neutral” record for him is 134-133, so 13 games “unlucky.” My top 25 (gotta have a list!):

Pitcher            W    L    bW   bL   Luck
Johnson, Walter    417  279  477  219  -60
Garvin, Ned        57   97   95   59   -38
Young, Cy          511  316  546  281  -35
Rixey, Eppa        266  251  299  218  -33
Garver, Ned        129  157  162  124  -33
Friend, Bob        197  230  230…

PhilM
12 years ago
Reply to  PhilM

And on the other end, Clark Griffith. Should over 3000 innings of 123 ERA+ result in only two games over .500? I like the idea, but maybe there’s just too much smoothing by using the career endpoints.

PhilM
12 years ago
Reply to  John Autin

Great point, JA — those unearned runs occasionally rear up and throw a wrench into my reams of spreadsheets.

And great work to do all this! I have Excel spreadsheets with annual data for 678 starters (and 37 relievers), if you’re interested and/or crazy. . . .

Mike L
12 years ago

John A. This is terrific. Looks like it shows something else. If you have a decent amount of talent and durability it’s good to be a Yankee pitcher. They score and have had great bullpens.

Mike L
12 years ago
Reply to  John Autin

yeesh. Brings back those bad memories. Leary, BTW, was the second pick of the 1979 draft. 13 years, career ERA+ of 90.

Voomo Zanzibar
12 years ago
Reply to  Mike L

Yeah, not that most people have much sympathy for “suffering” Yankees fans, but can everybody just agree on this blog to not post anything that refers in any way to River Ave, 1989 – 1992 ?

MikeD
12 years ago
Reply to  John Autin

Interesting stuff.

I do find it interesting that four of the five biggest net losers are all from the past generation (1980s forward), and three of the four were contemporaries, pitching within the last 15 years.

Don’t know what it means, but I’m guessing it means something.

MikeD
12 years ago
Reply to  John Autin

I can’t imagine that being on the Yankees didn’t boost their W%! Excellent offense and a man named Rivera to help close out games. I didn’t mention they were all teammates at various times since I wasn’t sure how much that impacted the career records of Wells and Rogers as shown by your charts. Pettitte has spent the bulk of his career with the Yankees, so it’s easy to see how he benefited, but Wells and Rogers spent a smaller portion of their careers with the Yankees. Rogers the least amount of the three and I seemed to remember it…

Mike L
12 years ago
Reply to  John Autin

Note also that Whitey Ford, Allie Reynolds, and Ron Guidry are also big net beneficiaries. It’s not just the 1995-2012 era.

Ed
12 years ago
Reply to  John Autin

John – Griffith had 5 seasons near the end of his career where he pitched less than 10 innings. I’m guessing that’s why there’s a discrepancy with him?

bstar
12 years ago

The last time we talked about estimated wins, I mentioned an old Bill James study comparing Rick Reuschel and Jack Morris. He took the extra step and actually had Reuschel pitching in Morris’ environment and vice versa, and was able to prove you could almost flip-flop their W-L records by doing so. Though to a lesser extent, John’s study of adjusting a player’s win total based on estimated wins shows a similar result. Here are the players’ actual W-L records compared to their W1-L1:
Jack Morris: actual W-L 254-186, W1-L1 224-216, ERA+ 105
Rick Reuschel: actual W-L 214-191, W1-L1 224-191, ERA+…

bstar
12 years ago
Reply to  bstar

Geez. Typo again. Reuschel’s adjusted W1-L1 should be 224-181. Sorry.

Mike L
12 years ago

I want to come back to this “Yankee” distortion, because it’s not likely a coincidence, and may point to a wrinkle in the evaluation. There are two kinds of luck: the first is how good the team you are pitching on is, and the second is how lucky you are in getting the marginal win. Yankee pitchers are “lucky” systemically because they have pitched on good-hitting teams with deep bullpens. So, using method 1, if you compare the average runs allowed by a Yankee pitcher to the league average, that shows what he should be doing if he were on…

Mike L
12 years ago

John A, just a thanks if I hadn’t expressed it well enough. Whatever the methodology’s minor flaws, it’s really interesting and the data seem intuitively correct (at least for those pitchers I’ve observed). Well worth scrolling through 700 plus results.