## Guest Post: DVOA-Adjusted Pythagorean Expectation

Just above these words, it says “posted by Chase.” And it was literally posted by Chase, but the words below the line belong to Bryan Frye, a longtime reader and commenter who has agreed to write this guest post for us. And I thank him for it. Bryan lives in Yorktown, Virginia, and operates his own great site at http://www.thegridfe.com/, where he focuses on NFL stats and history.

In February, Chase used a regressed version of Football Outsiders’ DVOA metric to derive 2014 expected wins. If you are reading this site, you probably have some familiarity with Football Outsiders and DVOA, FO’s main efficiency statistic. Given the granularity of DVOA, it is no surprise that Year N DVOA correlates more strongly with Year N + 1 wins (correlation coefficient of .39) than Year N wins does (correlation coefficient of .32).

By now, even casual NFL fans probably have at least heard of Pythagorean wins, and regular readers of this site are certainly familiar with the concept. Typically, an analyst uses Pythagorean records to see which teams overachieved and underachieved, which can help us predict next year’s sleepers and paper tigers. Well, I wondered what would happen if we combined the two formulae to make a “DVOA-adjusted Pythagorean Expectation” (or something cooler sounding; you be the judge).

Going back to 1989, the earliest year for DVOA, I used the offensive, defensive, and special teams components of DVOA to adjust the normal input for Pythagorean wins (points). Because DVOA is measured as a percentage, I adjusted the league average points per team game accordingly (I split special teams DVOA between offense and defense). Let’s use Seattle, which led the league in DVOA in 2013, as an example.

In 2013, the league average points per game was 23.4. Last year, Seattle had an offensive DVOA of 9.4% and a defensive DVOA of -25.9% (in Football Outsiders’ world, a negative DVOA is better for defenses).  The Seahawks also had a special teams DVOA of 4.7%.  So to calculate Seattle’s DVOA-adjusted points per game average, we would use the following formula:

23.4 + [23.4 * (9.4% + 4.7%/2)] = 26.15 DVOA-adjusted PPG scored

And to calculate the team’s DVOA-adjusted PPG allowed average, we would perform the following calculation:

23.4 + [23.4 * (-25.9% – 4.7%/2)] = 16.79 DVOA-adjusted PPG allowed

Insert these numbers into the Pythagorean[1] formula, and you get:

[26.15^2.67 / {(26.15^2.67) + (16.79^2.67)}] * 16 = 12.2 wins
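The Seattle arithmetic above can be reproduced with a minimal Python sketch. The function names are hypothetical (they are not from the post), DVOA values are passed as fractions rather than percentages, and a 16-game season is assumed.

```python
def dvoa_adjusted_ppg(league_ppg, off_dvoa, def_dvoa, st_dvoa):
    """Return (points scored, points allowed) per game after the DVOA adjustment.

    Special teams DVOA is split evenly: half is added to the offense,
    half is subtracted from the defense (lower is better for defenses).
    """
    scored = league_ppg * (1 + off_dvoa + st_dvoa / 2)
    allowed = league_ppg * (1 + def_dvoa - st_dvoa / 2)
    return scored, allowed


def pythagorean_wins(scored, allowed, exponent=2.67, games=16):
    """Pythagorean expected wins with the 2.67 exponent used in the post."""
    win_pct = scored**exponent / (scored**exponent + allowed**exponent)
    return win_pct * games


# 2013 Seattle: league average 23.4 PPG, DVOA of 9.4% / -25.9% / 4.7%
scored, allowed = dvoa_adjusted_ppg(23.4, 0.094, -0.259, 0.047)
wins = pythagorean_wins(scored, allowed)
print(f"{scored:.2f} scored, {allowed:.2f} allowed, {wins:.1f} wins")
# prints "26.15 scored, 16.79 allowed, 12.2 wins"
```

Passing another team's three DVOA components gives that team's adjusted points and expected wins in the same way.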

Do this for all teams since 1989[2], and you get a correlation coefficient between Year N DVOA-adjusted Pythagorean Wins and Year N+1 actual wins of .38.[3] Behind the scenes, Chase asked me if I can prove this is better than the more granular projection model he created. The answer, of course, is no. I can’t. These numbers aren’t regressed and are, thus, more distant from the mean; hence, we have a slightly lower correlation coefficient. However, I do believe that this model gives us plenty to chew on regarding next season’s possible surprise teams. But don’t take my word for it; see for yourself.

The table below applies the DVOA-adjusted formula to the 2013 season and compares each team’s DVOA-Adjusted Pythagorean wins with its actual wins, as a guide to the upcoming season. Here’s how to read the table: The Colts had the 13th most expected wins based on the methodology described above. In 2013, Indianapolis had an offensive DVOA of 4.3%, a defensive DVOA of 0.9%, and a special teams DVOA of -0.1%. This gives the Colts 24.39 and 23.62 DVOA-adjusted points for and points allowed per game averages, respectively. Using the Pythagorean formula, those numbers are good for an expected winning percentage of 0.521, or 8.3 Expected Wins. In reality, the Colts won 11 games, giving them 2.7 wins over expectation. That final column is the metric by which the table is sorted.

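| Rk | Team | Off DVOA (%) | Def DVOA (%) | ST DVOA (%) | Adj PF/G | Adj PA/G | Exp Win% | Wins | Exp Wins | Diff |
|---|---|---|---|---|---|---|---|---|---|---|
| 13 | IND | 4.3 | 0.9 | -0.1 | 24.39 | 23.62 | 0.521 | 11 | 8.3 | 2.7 |
| 6 | NE | 16.4 | 4.2 | 6.7 | 28.02 | 23.6 | 0.613 | 12 | 9.8 | 2.2 |
| 7 | SF | 9.1 | -4.6 | 3.7 | 25.96 | 21.89 | 0.612 | 12 | 9.8 | 2.2 |
| 2 | DEN | 33.5 | -0.2 | -1 | 31.12 | 23.47 | 0.68 | 13 | 10.9 | 2.1 |
| 3 | CAR | 7.9 | -15.7 | 1 | 25.37 | 19.61 | 0.665 | 12 | 10.6 | 1.4 |
| 8 | CIN | 0.4 | -12.6 | 1.2 | 23.63 | 20.31 | 0.6 | 11 | 9.6 | 1.4 |
| 4 | NOR | 16 | -5.8 | -2.5 | 26.85 | 22.34 | 0.621 | 11 | 9.9 | 1.1 |
| 5 | KC | 3 | -6.7 | 7.8 | 25.01 | 20.92 | 0.617 | 11 | 9.9 | 1.1 |
| 19 | GNB | 8.6 | 14.4 | -0.3 | 25.38 | 26.8 | 0.464 | 8.5 | 7.4 | 1.1 |
| 27 | NYG | -22 | -11.4 | -5.1 | 17.66 | 21.33 | 0.376 | 7 | 6 | 1 |
| 24 | NYJ | -15.3 | -5.6 | 2.1 | 20.07 | 21.84 | 0.444 | 8 | 7.1 | 0.9 |
| 1 | SEA | 9.4 | -25.9 | 4.7 | 26.15 | 16.79 | 0.765 | 13 | 12.2 | 0.8 |
| 10 | ARI | -2.4 | -16.4 | -4.1 | 22.36 | 20.04 | 0.572 | 10 | 9.2 | 0.8 |
| 23 | BAL | -21.7 | -8.7 | 6.3 | 19.06 | 20.63 | 0.447 | 8 | 7.2 | 0.8 |
| 22 | MIA | -1.8 | 2.4 | -2.4 | 22.7 | 24.24 | 0.456 | 8 | 7.3 | 0.7 |
| 9 | PHI | 22.9 | 4.9 | -2.8 | 28.43 | 24.87 | 0.588 | 10 | 9.4 | 0.6 |
| 11 | SD | 23.1 | 17.5 | 0.8 | 28.9 | 27.4 | 0.535 | 9 | 8.6 | 0.4 |
| 17 | DAL | 7.5 | 13.8 | 3.4 | 25.55 | 26.23 | 0.483 | 8 | 7.7 | 0.3 |
| 32 | JAX | -29.8 | 10.9 | 2.5 | 16.72 | 25.66 | 0.242 | 4 | 3.9 | 0.1 |
| 15 | PIT | 4.4 | 4 | 0.5 | 24.49 | 24.28 | 0.506 | 8 | 8.1 | -0.1 |
| 20 | TEN | 1.4 | 4.2 | -3.2 | 23.35 | 24.76 | 0.461 | 7 | 7.4 | -0.4 |
| 31 | OAK | -16.7 | 10.3 | -7.1 | 18.66 | 26.64 | 0.279 | 4 | 4.5 | -0.5 |
| 12 | CHI | 13.3 | 8.7 | 2 | 26.75 | 25.2 | 0.54 | 8 | 8.6 | -0.6 |
| 16 | DET | -1.9 | -0.8 | -0.4 | 22.91 | 23.26 | 0.49 | 7 | 7.8 | -0.8 |
| 14 | STL | -9.5 | -5.7 | 6.3 | 21.91 | 21.33 | 0.518 | 7 | 8.3 | -1.3 |
| 26 | MIN | -4.7 | 10.5 | 3.8 | 22.74 | 25.41 | 0.427 | 5.5 | 6.8 | -1.3 |
| 18 | BUF | -11.5 | -13.8 | -5.6 | 20.05 | 20.83 | 0.475 | 6 | 7.6 | -1.6 |
| 28 | CLE | -14.4 | 8.2 | 0.9 | 20.14 | 25.21 | 0.354 | 4 | 5.7 | -1.7 |
| 29 | WAS | -10 | 4.2 | -12 | 19.66 | 25.79 | 0.326 | 3 | 5.2 | -2.2 |
| 25 | ATL | 3.2 | 13.5 | -0.1 | 24.14 | 26.57 | 0.436 | 4 | 7 | -3 |
| 30 | HOU | -18.9 | 2.5 | -5.1 | 18.38 | 24.58 | 0.315 | 2 | 5 | -3 |
| 21 | TB | -10.4 | -6.8 | -1.5 | 20.79 | 21.98 | 0.463 | 4 | 7.4 | -3.4 |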

The biggest over- and underachievers

Once again, Andrew Luck’s Colts defied all odds and exceeded their expected win total – this time by nearly three whole games. The numbers and history say Indianapolis is due for a letdown this year. Fortunately for the Colts, they play a very easy schedule. Oh, and they still have Luck under center.

The Patriots, 49ers, and Broncos are the only other teams to exceed expectations by at least two wins. I have long since given up on trying to predict regression for the Brady/Belichick Patriots, but I do think the teams out west are due for a stumble. A playoff spot is all but automatic for a Peyton Manning-led team, but even a theoretically upgraded defense is unlikely to propel Denver to more than 12 wins. Not with their schedule. As for the Niners: they face an even tougher schedule, and their all-world offensive line has looked abysmal this preseason.

The Buccaneers underperformed by 3.4 games, according to this model. Replacing Greg Schiano with Lovie Smith alone could account for those wins. They may not win the NFC South, but look for them to improve significantly this season.

Houston and Atlanta both won three games fewer than expected; the former can blame the offense, while the latter blames the defense.

Despite having the best defensive player in football, Houston couldn’t get it together last year, winning three games fewer than expected. If Jadeveon Clowney even touches his potential, and if Ryan Fitzpatrick can provide above-replacement-level play, the Texans should sniff a winning season (Bill Barnwell is also on board the Houston train).

Prior to injury, Julio Jones was putting up Lance Alworth type numbers, so his return is huge for the Falcons (obviously). However, they sacrificed depth and defense to get Jones, which may have disastrous long-term results. Was last year an anomaly or the beginning of a downward spiral? I don’t know, but I can’t wait to find out.

Washington also failed to meet expectations in a major way. I really hope for RG3’s sake that an offensive line built mostly by Mike Shanahan can hold up to Jay Gruden’s offensive system. If not, Joe Theismann may get his wish to see more of Kirk Cousins.[4]

Oh, and for Chase: the 2013 Jets don’t grade out as significant overachievers in this analysis, at least compared to his Pythagenpat records post.

A few caveats.

There is a lot of information to which DVOA is not privy. It does not know that Dave Gettleman hates Cam Newton (and really hates Steve Smith). It doesn’t know that the AFC South has a creampuff schedule, or that both West divisions have the unfortunate proposition of facing each other.

As Chase pointed out in his post on the subject, DVOA also doesn’t know that Aaron Rodgers will obviously make a huge difference for Green Bay or that New England could get a full season out of Rob Gronkowski.[5]

DVOA doesn’t know about current events. It doesn’t know that Sam Bradford is out for the year (however, stats tell us that might not matter anyway) or that Nick Foles probably won’t maintain his incredible interception rate.

Even if it did know all of that, even DVOA can’t tell us who this year’s Chiefs or Falcons will be. But DVOA-Adjusted Pythagorean Expectation can help us spot things we may have missed just looking at wins and losses.

1. I used the number 2.67 instead of 2.37 based on the work by Jim Glass here; for purposes of this post, using 2.67 does provide a higher correlation coefficient, lending support to his work.
2. Excluding 1994, 1998, and 2001.
3. This also gives us a best fit formula of 3.78 + .53 * DVOAPyth Wins, but regressions bring everything too close to 8 wins for my liking.
4. My wife is a Skins fan. She will cry if this happens. That is the only vested interest I have in any professional sports team.
5. At least, I think that’s allowed to happen.
• Chase Stuart

Thank you Bryan for the contribution.

• It’s awesome to be a part of the site. Thanks for letting me do some weird stuff, Chase!

Great stuff. Idea, since you have the dataset.

Multivariate regression. Year n offDVOA, defDVOA, stDVOA, and pythWins vs year N+1 wins.

You can also convert the DVOA numbers into in-season z-scores to control for changes in how the game is played.

• When I ran the regression you mentioned (since 2002, not since 1989), I ended up with a much wider range of projections than I am used to seeing from any regressed numbers. Keeping in mind I converted DVOA percentages to decimals, I got this formula:

Year N + 1 wins = 23.87 + OffDVOA*26.21 – DefDVOA*26.82 + STDVOA*24.21 – DVOAPyth*1.99

R = .37

The highest projection, as you might have guessed, is for 2007 New England, with a whopping 13.11. The lowest is for 2002 Detroit, at 3.88.

Most embarrassing misses:
The 6-10 2003 Steelers had 6.8 projected 2004 wins, which they exceeded by 8.2.
The 2012 Texans had 10.2 projected 2013 wins, which they missed by 8.2!
Bereft of Peyton Manning, the 2011 Colts missed their projection by 7.6. But…I mean, come on.
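As a rough illustration, the regression formula quoted in this comment can be applied directly. The function name is hypothetical, and the sketch assumes that DVOAPyth is expressed in wins (the comment does not say so explicitly) and that the DVOA inputs are decimals, as the comment states. The Seattle inputs are taken from the post; the resulting number is not quoted anywhere in the thread.

```python
def projected_next_year_wins(off_dvoa, def_dvoa, st_dvoa, dvoa_pyth_wins):
    """Apply the regression coefficients quoted in the comment above.

    DVOA values are decimals (9.4% -> 0.094); dvoa_pyth_wins is assumed
    to be DVOA-adjusted Pythagorean wins on a 16-game scale.
    """
    return (
        23.87
        + 26.21 * off_dvoa
        - 26.82 * def_dvoa
        + 24.21 * st_dvoa
        - 1.99 * dvoa_pyth_wins
    )


# 2013 Seattle inputs from the post: 9.4% / -25.9% / 4.7% DVOA, 12.2 DVOA-Pyth wins
seattle = projected_next_year_wins(0.094, -0.259, 0.047, 12.2)
print(round(seattle, 1))  # prints 10.1
```

The negative coefficient on DVOAPyth largely offsets the large intercept and DVOA coefficients, which is one reason the individual terms are hard to interpret on their own.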

• nate

It’s conceptually straightforward to incorporate scheduling information into any win predictions like this, and although doing it by hand is a little tedious, it’s very easy and fast on a computer. The log5 formula (http://en.wikipedia.org/wiki/Log5) or something similar can be used to convert two pythagorean win expectations into head-to-head win probabilities. I think that incorporating strength of schedule would probably lead to better predictions for teams like the 2013 Colts.
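A minimal sketch of the log5 idea mentioned here, in Python. The function names are hypothetical, and the three-opponent slate is only an illustration built from expected win percentages in the table in the post, not a real schedule.

```python
def log5(p_a, p_b):
    """Probability that team A beats team B.

    p_a and p_b are each team's expected win percentage against
    an average opponent (here, Pythagorean win expectations).
    """
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


def schedule_expected_wins(team_pct, opponent_pcts):
    """Sum head-to-head win probabilities over a list of opponents."""
    return sum(log5(team_pct, opp) for opp in opponent_pcts)


# Colts (0.521) against the Seahawks (0.765), using the post's expected win%
colts_vs_seahawks = log5(0.521, 0.765)

# Illustrative slate only: Colts vs. JAX (0.242), HOU (0.315), TEN (0.461)
slate_wins = schedule_expected_wins(0.521, [0.242, 0.315, 0.461])
print(f"{colts_vs_seahawks:.3f}")
```

Against an exactly average opponent (0.500), log5 returns the team's own win percentage, so a weak schedule raises the projected total and a strong one lowers it.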

The format of the Pythagorean Expectation is the same as a prediction made based on a logistic regression on the log of the predictor. That means that if you have a history of the predictor and the outcomes, you can run a logistic regression to generate exponents in a sensible way. It also provides a natural way to make multivariate Pythagorean Expectation formulae.
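The equivalence described here can be written out in a short derivation. The symbols are assumptions for illustration: PF and PA are points for and against, k is the Pythagorean exponent, and the r_i and beta_i in the last line are hypothetical predictor ratios and fitted coefficients.

```latex
\begin{align*}
\text{Win\%}
  &= \frac{PF^{k}}{PF^{k} + PA^{k}}
   = \frac{1}{1 + (PA/PF)^{k}} \\
  &= \frac{1}{1 + e^{-k \ln(PF/PA)}}
   = \sigma\bigl(k\,(\ln PF - \ln PA)\bigr),
  \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\[4pt]
\text{Multivariate form:}\quad
\text{Win\%} &= \sigma\Bigl(\beta_0 + \sum_i \beta_i \ln r_i\Bigr)
\end{align*}
```

Under this reading, a logistic regression of outcomes on ln(PF/PA) with no intercept estimates the exponent k as its single slope, and adding more log-ratio predictors gives a multivariate version of the same formula.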

• I ran a schedule-adjusted 2014 projection here: http://nflsgreatest.co.nf/2014/05/2014-expected-wins-ii/

I didn’t use log5, but the results didn’t seem too wacky to me. If anything, they just made me really sad for Raiders fans.

• On my lunch break, I used 2014 projections from both SRS-adjusted and DVOA-adjusted Pythagorean win% for each team and incorporated schedule using the log5 formula.

I came up with this:

```
TM	sw	dw
SEA	12.1	11.8
CAR	11.9	10.3
DEN	10.1	10.3
CIN	10.3	9.9
NE	10.4	9.8
NOR	11.3	9.8
PHI	8.8	9.8
KC	9.6	9.4
IND	10.6	9.3
SF	10.8	8.9
CHI	6.8	8.6
TEN	9.0	8.5
PIT	7.7	8.5
ARI	9.3	8.5
SD	8.3	8.2
DAL	7.9	8.0
BUF	7.4	7.9
DET	7.7	7.9
STL	7.6	7.6
BAL	6.7	7.6
MIA	8.4	7.6
TB	6.9	7.3
NYJ	5.6	7.0
GB	6.9	7.0
MIN	6.1	6.9
ATL	6.9	6.7
NYG	5.8	6.4
HOU	6.2	6.2
CLE	5.4	6.2
WAS	4.9	5.6
JAX	4.4	4.5
OAK	4.2	4.0
```

Where sw is SRS-adjusted projected 2014 wins and dw is DVOA-adjusted projected 2014 wins.

Using the original regression I ran based on Nick’s comment as a baseline for 2014 projected wins, the log5 method takes away 2.1 wins from Oakland and adds at least one win to Cincy, Philly, Indy, and Seattle. The entire AFC South gets a boost, really.

Here’s what I did for strength of schedule. It’s a nice way to look at the whole length of the season.

http://www.ninersnation.com/2014/8/25/6065585/dvoa-strength-of-schedule-grids

• I have a similar Excel file, but it isn’t quite as visually appealing as yours. I actually used the 2014 schedule Chase made available for upload and just ran a bunch of vlookups from there.

You should probably add a 1 to DVOA.

• ubrab

Very good guest post, as usual on this site. Thanks Bryan

• Interesting and cool stuff. One thing that jumped out to me was the bigger adjustments for the Colts, Falcons, and Bucs than FO gets with their estimated wins (which also adjusts for schedule). But one much more notable difference looking at it now is the Broncos. FO gives them 14.1 estimated wins, while DVOA-adjusted Pyth wins gives them 10.9!

• Regarding the Broncos, it is mostly just a function of using the league average points as the baseline and the way Pythagorean expectation favors defensive teams.

Using regular Pythagorean wins (with the 2.67 exponent), the Broncos are good for 12.1 wins. This is because they scored 37.9 and allowed 24.9 ppg. But when I used DVOA to adjust the 23.4 league average ppg, Denver is only credited with 31.1 and 23.5 pf and pa.

Denver had a 207 point differential, compared to Seattle’s 186; however, they scored 1.52 points for every point allowed, versus 1.81 for Seattle. Because the Pythagorean formula uses the ratio rather than the gross differential, defense always wins. This is an issue that Pythagenpat generally accounts for, and it is why I think PPat is better for comparing expected wins across different scoring environments.
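The Denver gap described in this reply can be checked with a short sketch. The function name is hypothetical; the points-per-game figures are the ones quoted in the reply, and the 2.67 exponent and 16-game season follow the post.

```python
def pythagorean_wins(scored, allowed, exponent=2.67, games=16):
    """Expected wins from the scoring ratio; only scored/allowed matters."""
    ratio = (scored / allowed) ** exponent
    return games * ratio / (1 + ratio)


# Denver's actual 2013 scoring, as quoted above
actual_ratio = 37.9 / 24.9
actual_wins = pythagorean_wins(37.9, 24.9)

# Denver's DVOA-adjusted scoring, built from the 23.4 league average
adjusted_ratio = 31.1 / 23.5
adjusted_wins = pythagorean_wins(31.1, 23.5)

print(f"actual:   ratio {actual_ratio:.2f}, {actual_wins:.1f} wins")
print(f"adjusted: ratio {adjusted_ratio:.2f}, {adjusted_wins:.1f} wins")
# prints "actual:   ratio 1.52, 12.1 wins"
# prints "adjusted: ratio 1.32, 10.9 wins"
```

Because the formula depends only on the ratio, pulling Denver's scoring down toward the league-average baseline shrinks the ratio from about 1.52 to about 1.32, which accounts for the drop of roughly 1.2 expected wins.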

I have a ridiculous spreadsheet at home that has just about every stat I ever use in raw, above average, above replacement, and z-score (as well as PFR-style + score) format; I might have to bust it out for future study in the DVOA-Pyth area.

Also, I didn’t ask Chase how he did it, but I counted ties as half wins (so the Packers and Vikings had 8.5 and 5.5 wins this year, respectively).