We all know the outcome of a playoff series is a random enough event that it’s pointless to spend 15+ hours building a model to predict the winners. So I spent 15+ hours last week building a model to predict the winners. It’s based on FanGraphs’ Runs Above Replacement for pitchers and Runs Above Average for hitters, fielders, and baserunners. More after the jump.
Let’s start here. My best guess at each team’s postseason roster and the role each player will play:
[table id=256 /]
Of course, not every team will employ 12 pitchers and 13 position players in these exact roles. You may know something I don’t know, and I’m more than willing to update this table and the next few based on information you leave in the comments below.
Now, for the pitchers, let’s look at their Runs Above Replacement per inning pitched in 2015, per FanGraphs. This includes games pitched for other teams, where applicable:
[table id=257 /]
These figures correspond to the same cells in the prior table, so you may have to maximize both tables to read them. The biggest number, that .308 in the closer row in the Yankees column, belongs to Andrew Miller. It looks a lot like Kershaw’s .302 in the top right corner. The smallest number, half a run below replacement, represents Marco Gonzales. He may not be on the Cardinals’ postseason roster. As you’ll see below, it won’t make much difference.
On to the position players. Here are Fielding Runs Above Average (Def) per plate appearance for the players I expect to take the field at some point this month:
[table id=258 /]
As you may have ascertained, these figures include the positional adjustment, which is why most shortstops and catchers are positive, while most first basemen and corner outfielders are negative. I used PA as a baseline, rather than innings fielded, because the positional adjustment would make for some wonky numbers if a player was primarily a DH, but played a few innings in the field. The biggest numbers belong to Rangers utilityman Hanser Alberto (.055) and Cubs backup backstop David Ross (.046), who had just 104 and 182 PA, respectively. Among the regulars, Kevin Pillar ranks as the best fielder, having saved a quarter of a run every 10 PA. The worst scores are Chris Colabello’s -.056 and Pedro Alvarez’s -.044.
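To get a feel for how little playing time sits behind the two biggest rates, here is a minimal sketch that turns a per-PA rate back into an implied season total. The function name is hypothetical, and because the rates quoted above are rounded to three decimals, the totals are approximations rather than the figures FanGraphs lists.

```python
def implied_total_runs(rate_per_pa, plate_appearances):
    """Convert a runs-per-PA rate back into an approximate season total."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return rate_per_pa * plate_appearances


# (Def per PA, PA) pairs quoted in the paragraph above.
small_samples = {
    "Hanser Alberto": (0.055, 104),
    "David Ross": (0.046, 182),
}

# Rounded to one decimal because the input rates are themselves rounded.
implied = {
    name: round(implied_total_runs(rate, pa), 1)
    for name, (rate, pa) in small_samples.items()
}

for name, runs in implied.items():
    print(f"{name}: roughly {runs} fielding runs above average")
```

A handful of runs over a hundred-odd plate appearances is enough to top the table, which is why the part-timers' rates deserve some skepticism.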
Let’s do the same for hitters, showing total hitting and baserunning Runs Above Average (Off) per plate appearance for the guys likely to bat:
[table id=259 /]
These lineups are all either actual lineups used at some point in the last week of the regular season or actual lineups tweaked to include a missing player or two. I gave all the NL teams a DH to put them on even footing with the AL teams. Obviously, these are more formidable lineups than we’ll see from NL teams before the World Series. The best hitter in the sample is Josh Donaldson at .069, with teammates Jose Bautista and Edwin Encarnacion 2nd and 4th, respectively, at .057 and .055. The Cubs claim the other two in the top five: Anthony Rizzo at .056 and Kris Bryant at .050. At the other end, Hanser Alberto and St. Louis’s Pete Kozma each cost their teams 8 runs every hundred times they came to the plate this year.
The next step is to project how many innings each pitcher will pitch and how many plate appearances each player will get in the playoffs. We know that teams will lean on their top starters and best relievers to the greatest extent possible, and that players who hit at the top of the order will bat more than the guys at the bottom (by about 1/9 of a PA per lineup spot per game). Using some real data (for instance, starters averaged 5.8 IP per start this year, but the #1 starters on the ten playoff teams averaged 6.7) and some guesses (a best-of-five series has a 20% chance of ending in a sweep and a 30% chance of going the distance), I made these projections and multiplied them by a weighted expected number of innings in a series of each length (best-of-1, best-of-5, best-of-7). The only contribution for the player I deemed Pinch Runner was Baserunning Runs Above Average per game times an expected one opportunity per playoff game. I called the sum of the pitching, fielding, hitting, and baserunning scores Playoff Runs.
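The arithmetic behind that paragraph can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual model: the 50% chance of a four-game series is only the remainder implied by the two guesses above, `leadoff_pa` is a hypothetical input the text gives no figure for, and all function names are made up for the example.

```python
# Guessed series-length odds for a best-of-five. The 20% sweep and 30%
# five-game figures come from the text; the 50% four-game share is simply
# the remainder, an inference rather than a stated number.
BEST_OF_FIVE = {3: 0.20, 4: 0.50, 5: 0.30}


def expected_games(length_probs):
    """Probability-weighted number of games in a series."""
    if abs(sum(length_probs.values()) - 1.0) > 1e-9:
        raise ValueError("series-length probabilities must sum to 1")
    return sum(games * prob for games, prob in length_probs.items())


def pa_per_game(lineup_spot, leadoff_pa):
    """PA per game for a lineup spot (1-9), dropping 1/9 of a PA per spot.

    leadoff_pa is a hypothetical input; the text does not supply it.
    """
    if not 1 <= lineup_spot <= 9:
        raise ValueError("lineup_spot must be between 1 and 9")
    return leadoff_pa - (lineup_spot - 1) / 9


def playoff_runs(rate, volume_per_game, length_probs):
    """Runs per IP (or PA) times projected IP (or PA) per game,
    scaled by the expected number of games in the series."""
    return rate * volume_per_game * expected_games(length_probs)


games = expected_games(BEST_OF_FIVE)  # about 4.1 games
```

Under those guesses a best-of-five runs about 4.1 games on average, and the gap between the leadoff and ninth spots works out to 8/9 of a plate appearance per game.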
You may notice that the baseline for pitchers is replacement level, while the baseline for position players is average. This is because these are the run values FanGraphs makes available. For those with a full season of innings pitched or plate appearances, this doesn’t make much difference when aggregating scores, since everyone gets credit for the same ~2 wins between replacement level and average. For those with less playing time, really good or really bad hitters over small samples drive their teams’ Playoff Runs up or down more than pitchers do. This effect is mitigated to some extent by my projections not giving more than seven plate appearances in any series to a player not in the designated starting nine.
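A quick sketch shows how much that seven-PA cap limits the damage. The -.080 rate is the "8 runs every hundred times" figure quoted earlier for the weakest hitters in the sample; the function name and the 20-PA input are hypothetical, chosen only to show the cap kicking in.

```python
BENCH_PA_CAP = 7  # from the text: at most seven PA per series for non-starters


def series_offense_runs(off_per_pa, projected_pa, is_starter):
    """Hitting and baserunning runs credited over one series.

    Bench players have their plate appearances capped; starters do not.
    """
    pa = projected_pa if is_starter else min(projected_pa, BENCH_PA_CAP)
    return off_per_pa * pa


# A bench bat costing 8 runs per 100 PA, with a deliberately high
# (hypothetical) 20 projected PA to show the cap taking effect.
capped = series_offense_runs(-0.08, 20, is_starter=False)
uncapped = series_offense_runs(-0.08, 20, is_starter=True)

print(f"capped: {capped:.2f} runs, uncapped: {uncapped:.2f} runs")
```

Even the worst small-sample hitter can move a team's total by only about half a run per series when coming off the bench.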
Let’s start by looking at the four teams playing in the Wild Card games and observing how they stack up in a one-game projection:
[table id=260 /]
The Astros look better than the Yankees, not just because of Dallas Keuchel, but because of superior hitting and fielding. The Cubs are the best of the bunch, led by Jake Arrieta, but again besting their foes in all three categories.
Here’s what all the playoff teams look like over best-of-five or -seven series. For simplicity’s sake, I did not shuffle starters around based on which teams are forced to burn their aces in the Wild Card games. It rarely makes a difference in terms of which team looks better than its potential opponent.
[table id=261 /]
[table id=262 /]
So, using the Playoff Runs model, the Blue Jays, Royals, Mets, and Cubs reach the Championship Series and the Mets beat the Blue Jays for the title. The Mets may be a little overrated here based on small samples for Conforto and Wright, and no one knows whether their young starters have enough innings left in the tank to get through three more rounds, but on a rate basis, their 25-man roster had a better 2015 than any other playoff team’s. That may not give them a real edge against Kershaw and Greinke, but it’s worth knowing.
For a look at each projected series, with notes about flaws in the model and how they could impact each series, read the companion piece here.