[Another work-in-progress. Your observations are welcomed, but please review the “Known shortcomings” at the end of the post.]
Taking suggestions from readers Mike L. and kds, I used the Pythagorean method to calculate the estimated career wins for all starting pitchers with at least 1,000 innings from 1893-2011 (min. 60% of career games as a SP).
Method: First, I calculated each pitcher’s Pythagorean winning percentage (P%), based on his average runs allowed per 9 innings (R/9) and the MLB scoring average per 9 innings over the length of his career (Lg R/9). Note that this is based on all runs, not earned runs. To adapt the Pythagorean formula for this purpose, “Runs Scored” is replaced by the league scoring average (Lg R/9), and “Runs Allowed” is replaced by the pitcher’s R/9. Also, I used the advanced formula’s exponent of 1.83, rather than 2. Thus, the formula is:
P% = (Lg R/9)^1.83 / [ (Lg R/9)^1.83 + (R/9)^1.83 ]
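For anyone who wants to reproduce the P% column, here is a minimal Python sketch of the formula above. The function and parameter names (`pythagorean_pct`, `lg_r9`, `r9`) are my own labels for illustration, not anything from the original spreadsheet.

```python
def pythagorean_pct(lg_r9: float, r9: float, exponent: float = 1.83) -> float:
    """Pythagorean winning percentage for a pitcher.

    lg_r9    -- MLB runs per 9 innings over the pitcher's career (stands in for "Runs Scored")
    r9       -- the pitcher's runs allowed per 9 innings (stands in for "Runs Allowed")
    exponent -- 1.83, the exponent used in the post, rather than the classic 2
    """
    if lg_r9 <= 0 or r9 < 0:
        raise ValueError("lg_r9 must be positive and r9 must be non-negative")
    scored = lg_r9 ** exponent
    allowed = r9 ** exponent
    return scored / (scored + allowed)


# A pitcher who allows runs at exactly the league rate comes out at .500;
# one who allows fewer runs than the league comes out above .500.
league_average_pitcher = pythagorean_pct(4.50, 4.50)
better_than_league = pythagorean_pct(4.50, 3.50)
```

The 4.50 and 3.50 figures are made-up inputs, chosen only to show the direction of the result.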
Next, I calculated two sets of estimated wins and losses for each pitcher: one based on his actual decisions (W:1, L:1), and one based on his estimated decisions (W:2, L:2). To figure estimated decisions, I first calculated the average IP per decision for this set of pitchers, which came out to 8.77; then I divided each pitcher’s IP by 8.77 to arrive at his estimated decisions. The columns D:1 and D:2 show each method’s estimated wins minus the pitcher’s actual wins. The total estimated wins by each method are within 0.5% of the total actual wins; both totals are lower than the actual total.
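The two estimates can be sketched in Python as follows. This assumes the usual Pythagorean step, that estimated wins are P% times decisions and estimated losses are the remainder; the post does not spell that step out. The helper names and the sample pitcher are hypothetical.

```python
IP_PER_DECISION = 8.77  # group average IP per decision reported in the post


def estimated_decisions(innings_pitched: float,
                        ip_per_decision: float = IP_PER_DECISION) -> float:
    """Estimated decisions from innings pitched (used for W:2 and L:2)."""
    return innings_pitched / ip_per_decision


def estimated_record(p_pct: float, decisions: float) -> tuple[float, float]:
    """Split a number of decisions into estimated wins and losses.

    Assumption: wins = P% * decisions, losses = decisions - wins.
    """
    wins = p_pct * decisions
    return wins, decisions - wins


# Hypothetical pitcher: 250-200 actual record, 4,000 IP, P% of .560.
actual_w, actual_l, ip, p_pct = 250, 200, 4000.0, 0.560

w1, l1 = estimated_record(p_pct, actual_w + actual_l)         # method 1: actual decisions
w2, l2 = estimated_record(p_pct, estimated_decisions(ip))     # method 2: IP / 8.77

d1 = w1 - actual_w  # D:1, estimated wins minus actual wins
d2 = w2 - actual_w  # D:2, same comparison for method 2
```

Keeping the decisions count as a separate input makes it easy to swap in an era-specific IP-per-decision figure later.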
The table is currently sorted by D:1, i.e., by the greatest gain in estimated wins over actual wins based on actual decisions. Note that, in order to present as many relevant columns as possible, I’ve truncated the “Years” column to show only the last 2 digits of each pitcher’s final year. I don’t believe this should cause any confusion.
[table id=38 /]
Known shortcomings of this method:
- Because there’s no park adjustment, this method should tend to help pitchers who spent many years in low-run parks and hurt those in high-run parks.
- Estimating wins based on career totals will produce somewhat different results than doing so for each individual year. (Unfortunately, copying into Excel the year-by-year stat lines for over 700 pitchers is more work than I care to do on a volunteer basis.)
- The MLB scoring average used for each pitcher is a straight average of the MLB rates over the length of his career, rather than a weighted average. If a pitcher logged a small number of innings over a number of years in a very different run environment than the bulk of his career, the MLB scoring average used here for that pitcher will not be as accurate as a weighted average would be.
- Estimated decisions are based on the group’s average of 8.77 IP per decision. I have not adjusted for the slight variation in this figure across eras, which may complicate the task of comparing pitchers from different eras. Also, the figure of 8.77 for this group of SPs with 1,000+ IP is higher than the figure for all SPs, because the 1,000-IP standard biases the group towards good pitchers, which means more wins, which means more IP per decision. This fact is basically moot when comparing pitchers among this group, but may distort a comparison to pitchers not listed here or to some abstract standard.
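To illustrate the straight-versus-weighted point in the list above, here is a small Python sketch comparing the two averages. The season data is invented for illustration, and weighting by the pitcher's innings in each season is my assumption about what a weighted average would look like here.

```python
def straight_average(seasons: list[tuple[float, float]]) -> float:
    """Plain mean of the league R/9 figures, one per season (the method used in the post)."""
    return sum(lg_r9 for lg_r9, _ in seasons) / len(seasons)


def ip_weighted_average(seasons: list[tuple[float, float]]) -> float:
    """Mean of the league R/9 figures weighted by the pitcher's IP in each season."""
    total_ip = sum(ip for _, ip in seasons)
    return sum(lg_r9 * ip for lg_r9, ip in seasons) / total_ip


# Hypothetical career as (league R/9, pitcher's IP) per season:
# two full seasons in a 4.00 environment, then 20 innings in a 5.00 environment.
career = [(4.00, 200.0), (4.00, 200.0), (5.00, 20.0)]

straight = straight_average(career)     # the short final season counts as a full third
weighted = ip_weighted_average(career)  # the short final season counts for 20 of 420 IP
```

In this made-up case the straight average overstates the run environment the pitcher actually worked in, which would inflate his P%.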