Fun With Win Expectancy
Baseball is a complex game full of multifaceted and convoluted statistics. There are statistics that assign a value to how many runs are created by a runner’s speed on the bases. Other statistics measure how many of a batter’s balls in play go for hits. There is even a statistic that measures how much a relief pitcher helped or hindered his team’s chances of winning a game. That’s actually a stat. Shutdowns and Meltdowns are tracked: a relief pitcher gets a Shutdown if he increases his team’s win probability by at least 6%, and a Meltdown if he makes the team at least 6% more likely to lose. If you are wondering how well Felix Hernandez pitches during day games against teams with animal mascots (Tigers, Marlins, Rays, Orioles, Cardinals, Blue Jays, Cubs, Diamondbacks) after he has had a turkey sandwich for lunch, that statistic is probably on the internet somewhere. Basically, the availability of all of this data can make the sport overwhelming.
But really, isn’t the point of baseball to score more runs than the other team while keeping them from scoring? This is the same principle as in all other team sports: do more of something while keeping your opponent from doing that same thing. If that is true, you should be able to look at the number of runs a team has scored and allowed and determine its win-loss record. That works for the most part, but actual records deviate enough that Bill James* created a formula, called the Pythagorean Win-Loss, to estimate how many games a team SHOULD have won based on the number of runs it scored and allowed.
* Bill James is the ultimate baseball statistician. His work has revolutionized the way the game of baseball is viewed with regard to statistics. He coined the term sabermetrics, and in 2006 he was named one of the 100 most influential people in the world by Time magazine. And he’s from the great state of Kansas.
This formula can be used to determine which teams have more “luck” than others, but it does not explicitly determine skill and talent level of teams. Simply put, the formula can be used to look at the relationship between runs scored and allowed and project a win-loss record.
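For anyone who wants to play along at home, here is a minimal sketch of the calculation in Python. It assumes the classic version of the formula, runs scored squared divided by the sum of runs scored squared and runs allowed squared. Some sites use a slightly different exponent (1.83 is a common one), so the exponent is left as a parameter. The function names and the "luck" helper are my own, purely for illustration, and the sample numbers at the bottom are made up rather than taken from any real team.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    Assumes the classic form RS^x / (RS^x + RA^x) with x = 2 by default.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games_played, exponent=2.0):
    """Wins a team 'should' have after games_played games."""
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games_played


def luck(actual_wins, runs_scored, runs_allowed, games_played, exponent=2.0):
    """Actual wins minus expected wins. Positive means 'lucky'."""
    return actual_wins - expected_wins(
        runs_scored, runs_allowed, games_played, exponent
    )


if __name__ == "__main__":
    # Hypothetical team, NOT real standings data.
    rs, ra, games, wins = 700, 650, 146, 80
    print(f"Expected win pct: {pythagorean_win_pct(rs, ra):.3f}")
    print(f"Expected wins:    {expected_wins(rs, ra, games):.1f}")
    print(f"Luck (wins):      {luck(wins, rs, ra, games):+.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500, and the further the two totals drift apart, the more lopsided the expected record gets.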
It’s also fun to compare the current standings with the Pythagorean expected standings to see which division leader has benefited from “luck,” or which teams would be projected to win more games than they have. Below are the standings as of September 12. Runs scored (RS) and runs allowed (RA) are the inputs that determine the Pythagorean standings.
The first-order column is based on the Pythagorean win-loss formula above. The second-order column substitutes expected runs scored and allowed for actual runs scored and allowed. The expected number of runs is calculated using more complex statistics that are determined by a team’s skill, or lack thereof, in order to eliminate luck from the equation. Third-order wins are second-order wins adjusted for strength of schedule. The higher-order winning percentages have been shown to predict a team’s future actual winning percentage better than the actual and first-order winning percentages do. Cool stuff!
Using third-order winning percentage, the Red Sox gain about 5 wins on the Yankees, and they would be in first even though they are 3 1/2 games back in the actual standings. The Dodgers would be ahead of the incredibly lucky Diamondbacks. The NL West in general would have a much more exciting race, with four teams likely within 2 games of each other. The injury-plagued Twins would have the worst record in baseball; the formula implies the Twinkies should have lost about 8 more games than they actually did. I guess that means the Astros have something to be excited about, since the formula projects them to have about 7 more wins. The Indians’ record would be about the same as the Royals’, which seems about right. The Indians had a very lucky season.
Other than projecting wins and losses, the Pythagorean Win-Loss formula could* be used for trash talking. Just imagine the reaction you’ll receive at a sports bar in Phoenix when you say the Dodgers are better than the Diamondbacks because the Pythagorean Win-Loss formula provides indisputable evidence based on runs scored and runs allowed by each team that the Diamondbacks are about 10 wins worse than their record indicates. That should go over real well.
*And should!