There are many advanced projection systems that do a great job of projecting team wins. I’m not interested in recreating one of those or coming up with my own system; rather, I want to set a baseline for what a projection system should hope to accomplish. You’ll see what I mean in a few moments.
Test #1: Every Team Is The Same
This is the simplest baseline: let’s project each team to go 8-8. If you did that in every season from 1989 to 2014, your model would have been off by, on average, 2.48 wins per team. This is calculated by taking the absolute value of the difference between 0.500 and each team’s actual winning percentage, and multiplying that result by 16. So that should be the absolute floor for any projection model: you have to come closer than that.
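As a rough sketch of that calculation, the average miss for the 8-8 baseline could be computed along these lines. The function name and the four sample records are hypothetical, invented for illustration; they are not the 1989 to 2014 data.

```python
def mean_abs_error_wins(projected_pcts, actual_pcts, games=16):
    """Average absolute miss, in wins, between projected and actual winning percentages."""
    if len(projected_pcts) != len(actual_pcts):
        raise ValueError("need exactly one projection per team")
    misses = [abs(p - a) * games for p, a in zip(projected_pcts, actual_pcts)]
    return sum(misses) / len(misses)


# Hypothetical records for four teams: 12-4, 10-6, 7-9 and 3-13.
actual = [12 / 16, 10 / 16, 7 / 16, 3 / 16]
baseline = [0.500] * len(actual)  # project every team to go 8-8

baseline_error = mean_abs_error_wins(baseline, actual)
print(f"Average miss: {baseline_error:.2f} wins per team")
```

The same function works for any of the later tests: swap in a different list of projected winning percentages and compare the result against the 8-8 figure.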
Test #2: Every Team Does What They Did Last Year
Looking at all teams from 1990 to 2014, I calculated their winning percentages in that season (Year N) and in the prior season (Year N-1). If you used the previous year’s record to project this year’s record, you would have been off by, on average, 2.84 wins per team. That’s right: you are better off predicting every team to go 8-8 than to predict every team to repeat what they did last season.
But that’s simply an artifact of regression to the mean: it doesn’t mean last year doesn’t matter, just that we need to be a little smarter about how we use it. If you run a regression using last year’s winning percentage to predict this year’s winning percentage, the best-fit formula (R^2 = 0.11) is Year_N_Win% = 0.338 + 0.327 * Year_N-1_Win%. In other words, we give last year’s winning percentage only about one-third weight when projecting this year’s winning percentage.
So instead of using last year’s raw winning percentage, let’s use a regressed version of it. This improves our model by reducing the delta to 2.35 wins per team, about 17% better than the 2.84 we got from using last year’s record unadjusted.
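For readers who want to run this kind of regression themselves, here is a minimal ordinary least squares sketch in plain Python. The helper name `fit_line` and the five data pairs are assumptions made up for illustration, so the fitted numbers will not match the 0.338 and 0.327 reported above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical (Year N-1 win%, Year N win%) pairs, invented for illustration only.
prior_pcts = [0.750, 0.625, 0.500, 0.375, 0.250]
current_pcts = [0.625, 0.500, 0.5625, 0.4375, 0.375]

intercept, slope, r_squared = fit_line(prior_pcts, current_pcts)
print(f"Win% = {intercept:.3f} + {slope:.3f} * prior Win% (R^2 = {r_squared:.2f})")
```

A slope well below 1 is the signature of regression to the mean: good teams are projected to stay above average, but by less than they were the year before.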
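To make the regressed projection concrete, here is a small sketch that plugs last year’s record into the best-fit formula from the text (0.338 plus 0.327 times last year’s winning percentage). The function name and the sample records are hypothetical.

```python
INTERCEPT = 0.338  # constant from the best-fit formula in the text
SLOPE = 0.327      # weight on last year's winning percentage


def regressed_projection(prior_wins, games=16):
    """Project this year's wins from last year's wins using the regressed formula."""
    prior_pct = prior_wins / games
    return (INTERCEPT + SLOPE * prior_pct) * games


# Hypothetical teams that went 12-4, 8-8 and 4-12 last year.
for prior_wins in (12, 8, 4):
    print(prior_wins, round(regressed_projection(prior_wins), 1))
```

A 12-4 team projects to roughly 9.3 wins and a 4-12 team to roughly 6.7, so every projection is pulled most of the way back toward 8-8.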
Test #3: Pythagenpat Winning Percentage
Another option is to use a regressed form of Pythagenpat Winning Percentage. The best-fit formula (R^2 = 0.13) is Year_N_Win% = 0.302 + 0.400 * Year_N-1_Pythagenpat_Win%. Just looking at the coefficients, you can see that this is slightly more precise than just using last year’s winning percentage: the constant has dropped by about 0.04 (from 0.338 to 0.302), while the weight on last year’s data has jumped from about 1/3 to 2/5. By using this regressed version of Pythagenpat Winning Percentage, we reduce the delta to 2.31 wins per team.
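The text does not spell out the Pythagenpat calculation, so the sketch below rests on an assumption: the usual form, in which the exponent is points per game (both teams combined) raised to a small power, with 0.251 as a commonly cited NFL value. That constant, the function names, and the sample point totals are not from this article.

```python
def pythagenpat_win_pct(points_for, points_against, games=16, pat_exponent=0.251):
    """Pythagenpat winning percentage.

    ASSUMPTION: pat_exponent=0.251 is a commonly cited NFL value; the
    article does not state which value was used.
    """
    exponent = ((points_for + points_against) / games) ** pat_exponent
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def regressed_pythagenpat_wins(points_for, points_against, games=16):
    """Apply the article's best-fit formula to last year's Pythagenpat win%."""
    prior_pct = pythagenpat_win_pct(points_for, points_against, games)
    return (0.302 + 0.400 * prior_pct) * games


# Hypothetical team that scored 400 points and allowed 320 last year.
print(round(pythagenpat_win_pct(400, 320), 3))
print(round(regressed_pythagenpat_wins(400, 320), 1))
```

Because Pythagenpat is built from points scored and allowed rather than wins and losses, it is less affected by luck in close games, which fits the larger weight it earns in the regression.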
Test #4: Offensive and Defensive SRS Ratings
Unlike in our other tests, using offensive and defensive Simple Rating System ratings allows us to take advantage of the fact that offenses are more consistent than defenses. The best-fit formula using these grades (R^2 = 0.15) is:
Year_N_Win% = 0.501 + 0.0146 * Off_SRS_Year_N-1 + 0.0075 * Def_SRS_Year_N-1
This tells us that offensive SRS grades are nearly twice as important as defensive SRS grades when projecting future performance. If we use this regressed formula of offensive and defensive SRS ratings, we reduce the delta to 2.30 wins per team. That’s obviously not much of an improvement, although it is about 0.02 wins per team better than the prior test. (It only looks like 0.01 due to rounding; the delta is 0.019 wins per team lower than with the regressed version of Pythagenpat.)
Test #5: Offensive and Defensive SRS Ratings, Excluding Return Scores
Using the numbers from Tom M. available here, we can ignore non-offensive scores. The best-fit formula is:
Year_N_Win% = 0.501 + 0.01527 * Off_SRS_Year_N-1 + 0.0092 * Def_SRS_Year_N-1
Unsurprisingly, there’s not much of a difference here, but it does slightly improve on the previous test.
So here is my conclusion from today’s research: at a minimum, any projection system should aim to reduce the average difference between projected and actual wins, over a long enough period, to below 2.30 wins per team.
Before we conclude, let’s look at the 2014 data, and what that would mean for 2015 projections. Here’s how to read the table below. Denver had an Offensive SRS (which excludes non-offensive scores) of +9.0 last year, and a Defensive SRS (which again excludes non-offensive scores) of +0.1.[1] Using the formula above, we would project Denver to win 10.2 games this year. Now, Vegas had the Broncos as a 10-win team as of August 5th. However, Vegas projected the 32 teams to win a combined 264 games, and there are only 256 games in a season. So I’ve scaled each team’s Vegas wins total by a factor of 256/264, which drops Denver down to 9.7 projected wins. So this formula has the Broncos with 0.5 more wins than what Vegas is projecting.
| Rk | Tm | Off SRS | Def SRS | Proj | Vegas | Vegas (Adj) | Diff |
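The Denver row can be reproduced with a short sketch. The coefficients, Denver’s SRS ratings, the 10-win Vegas line, and the 264 and 256 totals all come from the text; the function names are hypothetical.

```python
def srs_projected_wins(off_srs, def_srs, games=16):
    """Projected wins from last year's offensive and defensive SRS (Test #5 formula)."""
    win_pct = 0.501 + 0.01527 * off_srs + 0.0092 * def_srs
    return win_pct * games


def scale_vegas_total(vegas_wins, league_vegas_wins=264, league_games=256):
    """Scale a Vegas win total so the league-wide sum matches the games actually played."""
    return vegas_wins * league_games / league_vegas_wins


# Denver: Offensive SRS +9.0, Defensive SRS +0.1, Vegas total of 10 wins.
proj = srs_projected_wins(9.0, 0.1)
vegas_adj = scale_vegas_total(10)
diff = proj - vegas_adj

print(round(proj, 1), round(vegas_adj, 1), round(diff, 1))
```

This prints 10.2, 9.7 and 0.5, matching the Proj, Vegas (Adj) and Diff figures quoted for Denver.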
All 32 teams are projected within 1.2 wins of their Vegas wins total, and 22 of the teams within 0.5 wins. Vegas isn’t projecting as much regression to the mean for the Colts, Seahawks, and Packers, and that’s understandable. And, I suppose, the same could be said of Oakland, Washington, and Jacksonville, three of the four teams that this model projects to win at least half a game more than Vegas. The only other team “overrated” by these projections is the 49ers, who had one of the worst offseasons you can have.
All of that is to say that I think this model can be pretty useful as a baseline for team projections entering a season, which could have a lot of implications for us in the future.
[1] Denver’s defense was better than league average last year, but there were a lot of drives in Broncos games. That led to more scoring for both Denver and its opponents.