## Pythagenpat Records in 2013

Brett Keisel.

For years, sports analysts have used Pythagorean records as a more granular measure of team strength than pure record. We’re not exactly at the point where Pythagorean records are mainstream, but I think, at least among readers of this blog, people are pretty comfortable using Pythagorean records.

For the uninitiated, the use of Pythagorean records in sports dates back at least 30 years, and probably longer. Bill James is generally credited with popularizing this approach in baseball, and the same analysis has since been applied to just about every other sport. The formula to calculate a team’s Pythagorean winning percentage is always some variation of:

(Points Scored^2) / (Points Scored ^2 + Points Allowed^2)

My research has found that for football, the best-fit exponent is 2.57. However, football is subject to points inflation. The best-fit exponent for the NFL in 1972 is not necessarily the best one for 2002 or 2013. This is particularly relevant now, as the 2013 season was the second-highest-scoring in history.1 Moreover, the same exponent that works for a Broncos game does not necessarily work for a Panthers game.
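As a minimal sketch of the constant-exponent version, the function below applies the formula above with 2.57 as the default exponent. The function name and its parameters are my own illustrative choices, and the Seattle point totals are the ones quoted later in this post.

```python
def pythagorean_pct(points_for, points_against, exponent=2.57):
    """Pythagorean winning percentage with a single, constant exponent."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# 2013 Seahawks: 417 points scored, 231 allowed, 16 games
pct = pythagorean_pct(417, 231)
print(f"{pct:.3f} winning pct, {16 * pct:.1f} expected wins")
```

Passing `exponent=2.0` gives the textbook form of the formula; any other constant works the same way.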

In 2011, Football Outsiders began using PythagenPORT ratings instead of Pythagorean ratings. As Neil describes, Pythagenport ratings “employ a ‘floating’ exponent that varies with the scoring environment in which a team played, recognizing that a single point is more important to winning in lower-scoring environments.” So instead of using 2.57, or 2.37, or some other constant as the exponent, the Pythagenport ratings derive a different exponent for each team based on the following formula:

Exponent = 1.5 * log10((PF + PA) / G)
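To see how the floating exponent moves with the scoring environment, here is a small sketch of the Pythagenport exponent formula above. The two teams are hypothetical (300 and 450 points on each side of the ball over 16 games), chosen only to contrast a low-scoring and a high-scoring environment.

```python
import math


def pythagenport_exponent(points_for, points_against, games):
    """Pythagenport exponent: 1.5 * log10 of total points per game."""
    return 1.5 * math.log10((points_for + points_against) / games)


# Hypothetical teams, not real seasons
low = pythagenport_exponent(300, 300, 16)   # 37.5 total points per game
high = pythagenport_exponent(450, 450, 16)  # 56.25 total points per game
print(f"low-scoring exponent:  {low:.2f}")
print(f"high-scoring exponent: {high:.2f}")
```

The higher-scoring environment gets the larger exponent, which is the sense in which a single point matters more in lower-scoring games.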

But Pythagenport winning percentage is just one variant. Another one is known as Pythagenpat, which derives the appropriate exponent using this formula for baseball:

Exponent = [ (Runs Scored + Runs Allowed) / G]^.285

Neil found that Pythagenpat records correlated ever so slightly better with actual records than Pythagenport, which also jibes with the results of this study. However, for football, the appropriate exponent (in the formula used to derive the appropriate exponent) from 1970 to 2013 was 0.2477, not 0.285.
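For readers curious how a best-fit value like this could be found, here is one possible approach, sketched under loud assumptions: a simple grid search that minimizes squared error between Pythagenpat wins and actual wins. This is not necessarily the method Neil or I used, and the six-team sample (2013 PF, PA, and wins copied from the table later in this post) is far too small to reproduce 0.2477. It only illustrates the mechanics.

```python
GAMES = 16

# (PF, PA, actual wins) for a handful of 2013 teams from this post's table
TEAMS_2013 = [
    (417, 231, 13.0),  # SEA
    (606, 399, 13.0),  # DEN
    (395, 376, 7.0),   # DET
    (290, 387, 8.0),   # NYJ
    (276, 428, 2.0),   # HOU
    (247, 449, 4.0),   # JAX
]


def pythagenpat_wins(pf, pa, power, games=GAMES):
    """Expected wins when the exponent is (points per game) ** power."""
    exponent = ((pf + pa) / games) ** power
    pct = pf ** exponent / (pf ** exponent + pa ** exponent)
    return pct * games


def sum_squared_error(power, teams=TEAMS_2013):
    """Total squared gap between Pythagenpat wins and actual wins."""
    return sum((pythagenpat_wins(pf, pa, power) - wins) ** 2
               for pf, pa, wins in teams)


# Candidate powers from 0.150 to 0.400 in steps of 0.001
candidates = [round(0.15 + 0.001 * i, 3) for i in range(251)]
fit = min(candidates, key=sum_squared_error)
print(f"best-fit power on this tiny sample: {fit}")
```

With decades of team seasons in place of six rows, the same search would be a reasonable first pass at the exponent.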

It’s important to keep in mind that the differences between Pythagenport and Pythagenpat are too minor to have any practical effect. The same is mostly — but not entirely — true with either of those versus the Pythagorean record. In any event, I plan to use Pythagenpat records for the rest of this post.

Like Neil, I also derived the best-fit exponent, but over a different dataset. From 1990 to 2013, the best exponent to use in the Pythagenpat formula is 0.251. Let’s use the 2013 Seahawks as an example. Seattle scored 417 points and allowed 231 points over 16 games last season. That means we use the following equation to get the best-fit exponent for the 2013 Seahawks:

[ (417 + 231) / 16 ] ^ 0.251

That number is 2.53. Then, we calculate the team’s Pythagenpat winning percentage using the following equation:

(417 ^ 2.53) / [(417 ^ 2.53) + (231 ^ 2.53)]

That gives Seattle a 0.817 Pythagenpat winning percentage, equivalent to 13.1 wins. Seattle actually won 13 games.
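The Seattle calculation can be reproduced with a few lines of Python. This is a sketch rather than the exact code behind the numbers in this post; the function names are illustrative, and 0.251 is the 1990 to 2013 best-fit value quoted above.

```python
def pythagenpat_exponent(pf, pa, games, power=0.251):
    """Floating exponent: total points per game raised to `power`."""
    return ((pf + pa) / games) ** power


def pythagenpat_pct(pf, pa, games, power=0.251):
    """Pythagenpat winning percentage using the floating exponent."""
    x = pythagenpat_exponent(pf, pa, games, power)
    return pf ** x / (pf ** x + pa ** x)


# 2013 Seahawks: 417 points scored, 231 allowed, 16 games
exponent = pythagenpat_exponent(417, 231, 16)
pct = pythagenpat_pct(417, 231, 16)
print(f"exponent {exponent:.2f}, pct {pct:.3f}, wins {16 * pct:.1f}")
# prints: exponent 2.53, pct 0.817, wins 13.1
```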

For Denver, the correct exponent is 2.83, since the Broncos scored 606 points and allowed 399 points. That gives Denver a Pythagenpat winning percentage of 0.765, or 12.2 wins. Had we used the same 2.53 exponent for Denver, the Broncos would have only 11.9 wins. In essence, Pythagorean records (which use a constant exponent) are slightly biased against high-scoring teams.2 Pythagenpat solves that problem, which arguably needed solving, since it seems like Peyton Manning and Tom Brady (and now Andrew Luck) almost always exceed their Pythagorean win totals. The table below shows each team’s Pythagenpat win totals in 2013:

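| Rk | Team | PF | PA | Wins | Pyth Wins | Diff |
|---|---|---|---|---|---|---|
| 1 | SEA | 417 | 231 | 13 | 13.1 | -0.1 |
| 2 | DEN | 606 | 399 | 13 | 12.2 | 0.8 |
| 3 | CAR | 366 | 241 | 12 | 11.8 | 0.2 |
| 4 | SFO | 406 | 272 | 12 | 11.8 | 0.2 |
| 5 | KAN | 430 | 305 | 11 | 11.4 | -0.4 |
| 6 | CIN | 430 | 305 | 11 | 11.4 | -0.4 |
| 7 | NOR | 414 | 304 | 11 | 11.0 | 0.0 |
| 8 | NWE | 444 | 338 | 12 | 10.8 | 1.2 |
| 9 | ARI | 379 | 324 | 10 | 9.6 | 0.4 |
| 10 | IND | 391 | 336 | 11 | 9.6 | 1.4 |
| 11 | PHI | 442 | 382 | 10 | 9.5 | 0.5 |
| 12 | SDG | 396 | 348 | 9 | 9.3 | -0.3 |
| 13 | DET | 395 | 376 | 7 | 8.5 | -1.5 |
| 14 | PIT | 379 | 370 | 8 | 8.3 | -0.3 |
| 15 | DAL | 439 | 432 | 8 | 8.2 | -0.2 |
| 16 | GNB | 417 | 428 | 8.5 | 7.7 | 0.8 |
| 17 | STL | 348 | 364 | 7 | 7.5 | -0.5 |
| 18 | TEN | 362 | 381 | 7 | 7.5 | -0.5 |
| 19 | MIA | 317 | 335 | 8 | 7.4 | 0.6 |
| 20 | CHI | 445 | 478 | 8 | 7.2 | 0.8 |
| 21 | BAL | 320 | 352 | 8 | 7.0 | 1.0 |
| 22 | BUF | 339 | 388 | 6 | 6.6 | -0.6 |
| 23 | MIN | 391 | 480 | 5.5 | 5.8 | -0.3 |
| 24 | ATL | 353 | 443 | 4 | 5.6 | -1.6 |
| 25 | NYG | 294 | 383 | 7 | 5.4 | 1.6 |
| 26 | CLE | 308 | 406 | 4 | 5.2 | -1.2 |
| 27 | NYJ | 290 | 387 | 8 | 5.2 | 2.8 |
| 28 | TAM | 288 | 389 | 4 | 5.1 | -1.1 |
| 29 | OAK | 322 | 453 | 4 | 4.6 | -0.6 |
| 30 | WAS | 334 | 478 | 3 | 4.4 | -1.4 |
| 31 | HOU | 276 | 428 | 2 | 3.9 | -1.9 |
| 32 | JAX | 247 | 449 | 4 | 2.8 | 1.2 |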
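The Denver comparison discussed before the table can be checked with a short sketch that runs the same team through a fixed exponent and a floating one. The helper names are illustrative; 2.53 is Seattle’s exponent from the worked example, used here only as the "constant" for contrast.

```python
GAMES = 16


def pythag_pct(pf, pa, exponent):
    """Pythagorean-style winning percentage for a given exponent."""
    return pf ** exponent / (pf ** exponent + pa ** exponent)


def pythagenpat_exponent(pf, pa, games=GAMES, power=0.251):
    """Floating exponent based on total points per game."""
    return ((pf + pa) / games) ** power


# 2013 Broncos: 606 points scored, 399 allowed
den_pf, den_pa = 606, 399
floating = pythagenpat_exponent(den_pf, den_pa)
wins_floating = GAMES * pythag_pct(den_pf, den_pa, floating)
wins_fixed = GAMES * pythag_pct(den_pf, den_pa, 2.53)

print(f"floating exponent {floating:.2f}: {wins_floating:.1f} wins")
print(f"fixed exponent 2.53: {wins_fixed:.1f} wins")
# prints 12.2 wins for the floating exponent and 11.9 for the fixed one
```

The gap of about a third of a win is the bias against high-scoring teams that a constant exponent introduces.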

Remember, these results do not incorporate strength of schedule or anything other than points scored and points allowed. But there are some interesting takeaways:

• Houston was a league-worst 2-9 in 2013 in games decided by 7 or fewer points. Not surprisingly, the Texans also underachieved the most as far as Pythagenpat wins to actual wins, falling 1.9 wins short of “expectation.” That could be good news for Bill O’Brien, who is essentially taking over a 4-win team with the number one draft pick while (perhaps) inheriting the expectations of a 2-win squad.
• The New York teams are on the other side of the coin, with the Jets standing out as the egregious example. The Jets were an incredible 5-1 in games decided by 7 or fewer points, and Geno Smith tied Brady for the league lead in fourth-quarter game-winning drives. That may raise expectations prematurely for Gang Green in 2014, as the Jets finished 8-8 despite ranking just 27th in Pythagenpat wins. The Giants exceeded their Pythagenpat win total for a different reason: there were 5 games New York played that were decided by 17 or more points, and Big Blue went 0-5 in those games.

If you’ve made it this far, you probably think a lot of this makes sense. But some of you may be wondering why we use Pythagenpat wins instead of real wins in the first place. The simple answer is that they are a better predictor of future wins. The correlation coefficient between real wins in Year N and real wins in Year N+1 is 0.32, while the correlation coefficient between Pythagenpat wins in Year N and real wins in Year N+1 is 0.36. Here’s another way to explain the difference.

I ran a linear regression using real wins in Year N as my input and real wins in Year N+1 as my output for the period ranging from 1990 to 2012, excluding 1994, 1998, and 2001.3 The best-fit formula was:

Year N+1 wins = 5.388 + 0.3265 * Year N wins

When using Pythagenpat wins as the input, you instead get the following best-fit formula:

Year N+1 wins = 4.813 + 0.3995 * Year N Pythagenpat Wins

We all know that predicting future wins is hard. After all, we don’t just write down however many wins a team had last year as our best guess for this year. That’s why, in the first formula, the weight placed on the Year N wins variable is pretty small: a team only gets credit for 33% of each win. For example, an 11-win team is regressed down to a 9.0-win team, while a 5-win team is regressed up to a 7.0-win team. The 6-win gap shrinks to 1.96 games, a reflection of the uncertainty in predicting records.

While far from perfect, using Pythagenpat wins does remove some of that uncertainty. Teams get credit for 40% of their Pythagenpat wins, which is a significant difference. A team with 11 Pythagenpat wins is projected to win 9.2 games, while a team with 5 Pythagenpat wins is projected to win only 6.8 games. Here, the 6-win gap shrinks to 2.4 wins, indicating less uncertainty.4
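Both regression formulas are easy to apply directly. The sketch below hard-codes the coefficients quoted above; the function names are illustrative.

```python
def project_from_real_wins(year_n_wins):
    """Year N+1 wins projected from Year N real wins."""
    return 5.388 + 0.3265 * year_n_wins


def project_from_pythagenpat(year_n_pyth_wins):
    """Year N+1 wins projected from Year N Pythagenpat wins."""
    return 4.813 + 0.3995 * year_n_pyth_wins


for wins in (11, 5):
    real = project_from_real_wins(wins)
    pyth = project_from_pythagenpat(wins)
    print(f"{wins} wins -> {real:.1f} (real-wins formula), "
          f"{pyth:.1f} (Pythagenpat formula)")
# 11 wins -> 9.0 and 9.2; 5 wins -> 7.0 and 6.8
```

The larger slope in the Pythagenpat formula is what keeps more of the original 6-win gap intact.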

The table below shows each team’s 2013 wins and 2013 Pythagenpat wins, along with projected 2014 wins based on each of the two regression formulas. Here’s how to read it: Seattle won 13 games in 2013 and had 13.1 Pythagenpat wins. Based on the regression for real wins, we would project Seattle to win 9.6 games in 2014. Based on the regression for Pythagenpat wins, we would project Seattle to win 10 real games in 2014.

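| Team | 2013 Wins | 2013 Pyth Wins | 2014 Proj Wins (Using Real Wins) | 2014 Proj Wins (Using Pyth Wins) |
|---|---|---|---|---|
| SEA | 13 | 13.1 | 9.6 | 10.0 |
| DEN | 13 | 12.2 | 9.6 | 9.7 |
| CAR | 12 | 11.8 | 9.3 | 9.5 |
| SFO | 12 | 11.8 | 9.3 | 9.5 |
| KAN | 11 | 11.4 | 9.0 | 9.4 |
| CIN | 11 | 11.4 | 9.0 | 9.4 |
| NOR | 11 | 11.0 | 9.0 | 9.2 |
| NWE | 12 | 10.8 | 9.3 | 9.1 |
| ARI | 10 | 9.6 | 8.7 | 8.6 |
| IND | 11 | 9.6 | 9.0 | 8.6 |
| PHI | 10 | 9.5 | 8.7 | 8.6 |
| SDG | 9 | 9.3 | 8.3 | 8.5 |
| DET | 7 | 8.5 | 7.7 | 8.2 |
| PIT | 8 | 8.3 | 8.0 | 8.1 |
| DAL | 8 | 8.2 | 8.0 | 8.1 |
| GNB | 8.5 | 7.7 | 8.2 | 7.9 |
| STL | 7 | 7.5 | 7.7 | 7.8 |
| TEN | 7 | 7.5 | 7.7 | 7.8 |
| MIA | 8 | 7.4 | 8.0 | 7.8 |
| CHI | 8 | 7.2 | 8.0 | 7.7 |
| BAL | 8 | 7.0 | 8.0 | 7.6 |
| BUF | 6 | 6.6 | 7.3 | 7.5 |
| MIN | 5.5 | 5.8 | 7.2 | 7.1 |
| ATL | 4 | 5.6 | 6.7 | 7.1 |
| NYG | 7 | 5.4 | 7.7 | 7.0 |
| CLE | 4 | 5.2 | 6.7 | 6.9 |
| NYJ | 8 | 5.2 | 8.0 | 6.9 |
| TAM | 4 | 5.1 | 6.7 | 6.8 |
| OAK | 4 | 4.6 | 6.7 | 6.7 |
| WAS | 3 | 4.4 | 6.4 | 6.6 |
| HOU | 2 | 3.9 | 6.0 | 6.4 |
| JAX | 4 | 2.8 | 6.7 | 5.9 |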

Of course, Pythagenpat wins are just part of the story. Perhaps they’re 40% of the story. For a team like the Jets that wildly overachieved, some regression is to be expected. But that can be countered with team improvement. The Jets are in very good cap shape, will have the 69th pick in the draft from Tampa Bay5 on top of their regular picks, and could see improvement from a number of young players (Smith, Dee Milliner, Stephen Hill, Sheldon Richardson, Quinton Coples, etc.). A year ago, the Colts were the team that most wildly exceeded their Pythagenpat expectation, but a young roster based around Andrew Luck improved in 2013. We’ll see if the Jets can pull off the same feat.

On the other hand, the Falcons, Texans, Titans, and Vikings were the next four teams whose real win totals most exceeded their Pythagenpat wins in 2012, and those teams dropped from 41 real wins in 2012 to 18.5 real wins in 2013. And the Lions, Seahawks, Giants, and Saints were the biggest Pythagenpat underachievers in 2012, and improved from 31 wins to 38 in 2013. Pythagenpat winners and losers will have exceptions, like the 2013 Colts, but it’s important to remember that they’re the exception, not the rule.

1. In fact, it came in just four hundredths of a point behind the 10-team, 12-game 1948 schedule.
2. Remember, Pythagorean records were used because points differential is biased in favor of high-scoring teams. A team that scores 600 points and allows 400 points is not as good as a team that scores 400 and allows 200. So that’s why we use Pythagorean records, because labeling both teams as +200 teams is not an accurate way to measure team strength.

A 600/400 team is projected to win 11.1 games if you use 2.00 as the exponent. A team that scores 400 points and allows 267 points is also expected to win 11.1 games if you use 2.00 as the exponent. No matter the exponent, the two teams will be projected to win the same number of games. If we made the exponent in the Pythagorean equation an absurd number like 10, the number jumps to 15.7 wins, but it’s 15.7 wins for both.

But when using Pythagenpat winning percentage, a higher exponent is used for the high scoring team and a lower exponent for the lower-scoring team. If we use the exponent 2.5, both teams win 11.7 games. But if you keep the low-scoring team’s exponent at 2.5, and move the high-scoring team’s exponent to 2.9, they’re projected to win 12.2 games.

And since the Pythagenpat exponent is just Total Points per game raised to a power, well, a high-scoring team will always have a higher Pythagenpat exponent. In other words, Pythagorean winning percentage solves the problem of teams that score a lot of points being overvalued, but it goes slightly too far in the other direction.

3. Why? Because the expansion Texans, Browns, Panthers, and Jaguars were below-average teams and weren’t around in Year N-1, that inflates the average win total for the teams that were around. The average team from 1990 to 2012 won 8.021 games in Year N+1 because of the expansion teams. While a more elegant solution may be available, we have enough years in our sample that I simply decided to just throw out the 1994, 1998, and 2001 seasons. For those curious, if you just use the period from 1990 to 2012, the regression formula would be Year N+1 wins = 5.550 + 0.309 * Year N wins.
4. Here’s a third way to show why Pythagenpat wins are better. I used both Pythagenpat and Actual Wins to run a multiple regression analysis to predict future wins. The coefficient on the Pythagenpat wins was significant at every level, but the coefficient on Real wins was not (p = 0.47). So using both Pythagenpat and Real Wins is not preferable to using just Pythagenpat wins. Of course, there are issues with multicollinearity with this experiment (or maybe not), which is why I’ve relegated it to the footnotes.
5. Assuming Darrelle Revis is not released.
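As a rough check on the argument in footnote 2, the sketch below compares a 600/400 team with a 400/267 team. With any shared exponent the two come out essentially identical, and they separate only when the high-scoring team gets a larger exponent. The function name is illustrative, and the exponents are the ones used in the footnote.

```python
GAMES = 16


def pythagorean_wins(pf, pa, exponent, games=GAMES):
    """Expected wins from points scored/allowed and a given exponent."""
    pct = pf ** exponent / (pf ** exponent + pa ** exponent)
    return pct * games


# Same exponent for both teams: only the PF/PA ratio matters
for exponent in (2.0, 2.5, 10.0):
    high = pythagorean_wins(600, 400, exponent)
    low = pythagorean_wins(400, 267, exponent)
    print(f"exponent {exponent}: {high:.1f} vs {low:.1f} wins")

# Floating exponent: 2.9 for the high-scoring team, 2.5 for the other
print(f"{pythagorean_wins(600, 400, 2.9):.1f} vs "
      f"{pythagorean_wins(400, 267, 2.5):.1f} wins")
```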
• A Leap at the Wheel

“Brett Keisel.”

You owe me a new keyboard and half a can of Diet Dr Pepper.

• That was awesome.

It also strangely made me think about when I was in high school and played in an online baseball simulation game where the players were all philosophers. We had to make up ratings for the players, and some of us usually made them up based on the work of the philosophers. It was ever-so-slightly nerdy . . .

• JeremyDe

Was this for a class? Because I remember doing something similar in either 10th or 11th grade philosophy class.

• Nope, it was just on my own for fun. We didn’t have philosophy classes in my high school (it was a small, low-income town). Unless of course you count the Bible class, which they claimed was a literature class. We also did not have internet access in the school for my first two years, so online anything was impossible to do for class.

• JeremyDe

I had the opposite experience. 4 years of ‘religion’ in my catholic high school (we did offer a philosophy course that no one ever took). 3 of those years were spent with 2 ‘religion’ teachers who essentially taught philosophy classes. Later, we found out that they were lying to the administration about how much religious discussion they were including in their lectures.

That is, as you put it, ‘ever-so-slightly nerdy’ and awesome.

• Arif

I think part of the reason Pythagenpat exceeds traditional Pythagorean Wins in predictive capacity is drive rate. I’m curious to see whether or not Pythagorean Wins as applied to point differential per drive would work better than Pythagorean wins on a team level.

Denver scored more points and gave up more points than a generalized team, but also had the fourth-most drives.

A quick glance at the low-drive teams (Carolina and San Diego) tells me I may be wrong, though.

• Laverneus Dinglefoot

I would be interested to see how incorporating SRS into the system would affect the outcome. For instance, 2013 Seattle’s SRSO is 4.1 and SRSD is 8.9. Since the league average for PPG is 23.4, Seattle’s SRS adjusted score is 27.5 – 14.5 (23.4+4.1 for; 23.4-8.9 against).

I got 13 wins using 2.37 as the exponent and 11 using [(27.5-14.5)/16]^.285 = 1.32 as the exponent.

I don’t have the time or database knowhow to do this on a large scale, but it could be an interesting topic for future research.

• very good