**[Today’s post is brought to you by Neil Paine, my comrade at Pro-Football-Reference.com and expert on all things Sports-Reference related. You can follow Neil on Twitter, @Neil_Paine.]**

Most fans like to think of the NFL’s playoff system as being the final word on each team’s season: run the table and you’re the champs, the “best team in football”; lose, and your season means nothing. But what if I told you that the NFL playoffs have gotten a lot more random in recent seasons? Would it change your attitude if you knew we were getting closer to the point where every playoff outcome might as well be determined by a coin flip?

To research this phenomenon, I want to explore two models of predicting playoff games: one powered by as much information as possible, the other completely ruled by randomness. I then want to simulate the last 34 postseasons, and see how much of a predictive edge that information actually gives you. If it’s giving you less of an edge, it means the playoffs are being ruled more by randomness.

First, I grabbed every playoff game since 1978 and looked at the Vegas lines. To convert from a pointspread to a win probability, you have to use Wayne Winston’s assumption that “the probability [...] of victory for an NFL team can be well approximated by a normal random variable margin with a mean of the Vegas line and a standard deviation of 13.86.” If the Patriots are favored by 7 over the Ravens, this means you can calculate their odds of winning in Excel via:

*p(W) = (1-NORMDIST(0.5,7,13.86,TRUE))+0.5*(NORMDIST(0.5,7,13.86,TRUE)-NORMDIST(-0.5,7,13.86,TRUE)) = 69.3%*

This gives us a good prediction — in fact, perhaps the best possible prediction — of the outcome going into the game. So for each playoff, I’m going to say a “Smart” fan picks winners based on these numbers; 69.3% of the time he’ll pick the Patriots, and 30.7% of the time he’ll pick the Ravens. Of course, we also need a control, a fan who picks completely at random, so I’m also going to track a “Dumb” fan who thinks every single game is a coin flip.
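For readers without Excel, the same calculation can be sketched in Python. This is only a minimal translation of the formula above, not the code used for the post; the function names `normal_cdf` and `win_probability` are made up for illustration, and the 13.86 standard deviation comes from the Winston quote.

```python
from math import erf, sqrt

SIGMA = 13.86  # standard deviation of the final margin, per the Winston quote


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) variable."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def win_probability(spread, sd=SIGMA):
    """Chance that a team favored by `spread` points wins the game.

    Margins above +0.5 count as wins; the slice between -0.5 and +0.5
    (a tie on the continuous scale) is split evenly between the teams.
    """
    p_win = 1.0 - normal_cdf(0.5, spread, sd)
    p_tie = normal_cdf(0.5, spread, sd) - normal_cdf(-0.5, spread, sd)
    return p_win + 0.5 * p_tie


# Patriots favored by 7 over the Ravens
print(f"{win_probability(7):.1%}")
```

For the 7-point example this comes out to roughly 69.3%, in line with the Excel result, and a pick'em game (spread of 0) comes out to exactly 50%.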

I’m going to simulate these decision-making processes for the Smart and Dumb fans in every playoff since 1978, running through each year 1,000 times. How much better at picking do you think the Smart fan will be than the Dumb one?
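A rough sketch of what such a simulation might look like is below. It is an illustration under stated assumptions, not the actual code behind the numbers in this post: the `GAMES` list is invented sample data (spread for the favorite, and whether the favorite won), and `simulate` is a hypothetical helper name.

```python
import random
from math import erf, sqrt


def win_probability(spread, sd=13.86):
    """Favorite's win probability from the spread (same formula as the Excel one)."""
    cdf = lambda x: 0.5 * (1.0 + erf((x - spread) / (sd * sqrt(2.0))))
    return (1.0 - cdf(0.5)) + 0.5 * (cdf(0.5) - cdf(-0.5))


# HYPOTHETICAL sample data: (points the favorite was favored by, did the favorite win?)
GAMES = [(7.0, True), (3.0, False), (10.0, True), (1.5, True), (4.5, False)]


def simulate(games, trials=1000, seed=0):
    """Return (smart_rate, dumb_rate): the share of correct picks over all trials."""
    rng = random.Random(seed)
    smart_hits = dumb_hits = 0
    for _ in range(trials):
        for spread, favorite_won in games:
            # Smart fan picks the favorite with probability equal to its win odds.
            smart_picks_favorite = rng.random() < win_probability(spread)
            # Dumb fan flips a fair coin.
            dumb_picks_favorite = rng.random() < 0.5
            smart_hits += smart_picks_favorite == favorite_won
            dumb_hits += dumb_picks_favorite == favorite_won
    total = trials * len(games)
    return smart_hits / total, dumb_hits / total


smart_rate, dumb_rate = simulate(GAMES)
print(f"Smart: {smart_rate:.1%}  Dumb: {dumb_rate:.1%}")
```

Swapping the made-up `GAMES` list for the real lines and results of each postseason would give the per-era hit rates discussed in the rest of the post.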

Well, over the course of the whole sample, the Smart fan averaged a little more than 204 correct picks in 356 games, which is good for a 56.6% rate. The Dumb fan had 178 correct picks, a 50% success rate. In other words, being “Smart” gave you an edge of 6.6 percentage points over the fan who picked Aaron Eckhart-style.

But what I really want to know is whether this number has changed over time. The logical comparison I wanted to make was pre- and post-free agency, but it turns out there is practically no difference. From 1978 through 1993, the Smart fan picked winners at a 56.6% rate (6.8% better than his Dumb counterpart), and from 1995-2011, he picked at a 56.3% clip (6.2% better than the Dumb fan). That observed difference, less than half a percentage point, can be chalked up completely to random variation, so there’s no evidence that the playoffs have been more or less random in the salary cap era.

However, if you compare pre-2005 to post-2005, you see a major difference that cannot be explained away by chance alone. From 2005-2011, the Smart fan would have picked only 53.2% of playoff games correctly; that’s an edge of just 3.2 percentage points over the Dumb fan from 2005-11, vs. 6.6 percentage points over the course of the full sample!

Let me restate this finding: the difference between an intelligent prediction of NFL playoff games and a pure coinflip *has been sliced in half* in the last seven postseasons. In other words, the playoffs are more random now than they’ve ever been in the last 35 years, something we’ve all seen anecdotally with the 2005 Steelers, both Giants championships (especially last year, when they were actually outscored during the regular season), and the 2008 Cardinals’ unexpected SB run, among others.

So does this change how you feel about the playoffs? Do you still think the “best team” is synonymous with the Super Bowl Champion, or do you think it’s more of a crapshoot than ever before?

{ 22 comments }

But why 2005? Where’s the theory? If you just took the data and looked for statistical significance then it could simply be a type I error. I’d need more to change my mind. That said, is anyone really arguing that the Super Bowl champ is the best team?

I don’t think it’s a crap shoot but the chance that the true best team wins in any given year is relatively small. Just to keep it simple let’s say the best team would win 80% of the time against any other team in the playoffs. I think that’s pretty dominant considering the competition but if too low, feel free to correct. Even still, they’d only expect to win three games in a row (which assumes the best team got a bye) a little over 50% of the time (.8*.8*.8=51.2%).

Agreed, Quinton. You always have to be careful about multiple endpoint analysis. Although I don’t think Neil’s point is necessarily to predict that they will continue to be more random, but rather just to note and describe the historical pattern.

One answer as to how often the best team wins comes from Doug Drinen: roughly 24% of the time.

http://www.pro-football-reference.com/blog/?p=57

It seems like your methodology double counts against the Smart fan the uncertainty left in the point spread prediction.

That is… if the Patriots are favored by 7, and based on Vegas’s history that 7-point favorites have a 69.3% chance of winning, then the Smart fan who is going to use the most information available should choose the favorite 100% of the time and expect when you compare his predictions to real life results that he’ll be right 69.3% of the time for games with spreads of 7 points.

But you’re making a very sizable segment of the Smart fans ignore the best information available when your stated goal is to have them use said information. You’re making them choose the winner that their information says is less likely. Doing so for games with a point spread of 7 drops the expected value of correct predictions for all Smart fans from 69.3%, to 57.4%.

So I think your results are largely driven by the method which isn’t following the stated goal of using the most information available.
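The arithmetic behind that 57.4% figure can be checked with a short sketch. The function names here are illustrative only; the calculation assumes the spread-implied probability is exactly right and that the fan's pick is independent of the game's outcome.

```python
def matching_accuracy(p):
    """Expected hit rate when the fan picks the favorite with probability p.

    The fan is right when he picks the favorite and it wins (p * p), or
    picks the underdog and it wins ((1 - p) * (1 - p)).
    """
    return p * p + (1 - p) * (1 - p)


def always_favorite_accuracy(p):
    """Expected hit rate when the fan always picks the more likely winner."""
    return max(p, 1 - p)


p = 0.693  # favorite's win probability for a 7-point spread
print(f"Probability matching: {matching_accuracy(p):.1%}")
print(f"Always the favorite:  {always_favorite_accuracy(p):.1%}")
```

For a 7-point favorite this gives about 57.4% for the randomized picker against 69.3% for the fan who always takes the favorite, which is the gap described in the comment above.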

All good points, GregR.

Perhaps a simpler way to look at this is to compute the average point spread of playoff games. I’d guess it might be smaller in recent years, which implies more evenly matched games, and thus a greater likelihood of upsets. Or you could use SRS to show the same thing.

I also wonder if the 2002 realignment is a possible factor. Perhaps division winners tend to be weaker on average, the wild cards tend to be stronger (relative to the division winners). And if nothing else, the conferences (and Super Bowl matchups) seem to be more evenly matched now than they have been historically.

Two good points. Assuming Vegas hasn’t gotten worse at predicting games (probably a safe assumption), the average point spread is basically a determination of how much information is available in picking a winner, so this is the real “signal” of randomness or order. And the realignment is a plausible cause for more evenly matched games (i.e., worse teams getting home games and ~6 points vs. being on the road).

It’s worth pointing out, though, that picking right 53.2% over 77 games instead of an expected 56.6% of games is only 0.6 standard deviations from expected. If you looked at any given 77 game stretch, you’d have about a 50% chance of that much deviation, so I’d hardly say it “cannot be explained away by chance alone”.
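That back-of-the-envelope check can be reproduced with a normal approximation to the binomial. This is a sketch under simplifying assumptions (77 independent picks, each with the same 56.6% expected hit rate), and the helper names are hypothetical.

```python
from math import erf, sqrt


def binomial_z(observed_rate, expected_rate, n_games):
    """Z-score of an observed hit rate against the expected rate."""
    standard_error = sqrt(expected_rate * (1 - expected_rate) / n_games)
    return (observed_rate - expected_rate) / standard_error


def two_sided_tail(z):
    """Probability that a standard normal lands at least |z| from zero."""
    return 1.0 - erf(abs(z) / sqrt(2.0))


z = binomial_z(0.532, 0.566, 77)
print(f"z = {z:.2f}, chance of a deviation this large = {two_sided_tail(z):.0%}")
```

Under those assumptions the deviation is about 0.6 standard errors, and a gap at least that large in either direction would show up in a little over half of all 77-game stretches.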

I definitely think that recently the playoffs have been a crapshoot.

The 2008 Steelers were the last team with the best SRS to win (or even make it to) the Super Bowl. The last team before that was the 2004 Patriots. I think SRS has done a pretty good job of recognizing the team that people would generally think is the best team of the regular season.

I don’t know if I have an explanation. A couple theories:

- random variance (it might be interesting to see if the unpredictability of the playoffs has been consistent EVERY season since 2005, or have there just been a couple years weighing down the average)

- new stadiums (are all the luxury boxes and stadium conveniences changing the demographics and/or behavior of the fans?)

All good questions. I also think teams seem to get “hot” and “cold” during the playoffs more frequently than they used to, but I suspect that’s just my subjective reaction and wouldn’t hold up to rigorous analysis.

I assume you mean for a team to get “hot” in the playoffs, that wasn’t hot at the end of the regular season?

Because all playoffs have one hot team, and a handful of other teams that were “hot” until they lost.

Due to the league-imposed parity between teams, wouldn’t it make sense that the outcomes of games between “almost-equal” teams are increasingly random?

The parity rules have been around for 40 years. The scheduling changed to a fixed rotation system in 2002, but I’m not sure that the 2002-2011 scheduling system is any more parity-ridden than pre-2002. The “worst team drafts first” system has been around for a long time. Up until 2011, there has been talk that having a high draft pick was more curse (for salary cap reasons) than cure. Although there is some debate there.

Are there any other obvious parity rules that have been implemented over the past decade that I have missed?

I’m not sure if I can see any logic for it, but something that has to do with realignment in 2002 would seem to make sense. A 2002 rule change might take a few years to really take effect, which would lead to a noticeable change beginning in 2005. But I just don’t see what it could be. If anything, the current playoff format seems (anecdotally) more likely to have an inferior team make the playoffs (with a home field game). This seems like it would make it MORE likely that the morally* “best” teams would make it to the Super Bowl.

* – not sure if that is accurate use of the term morally.

The one thing I thought of related to the 2002 realignment is that now with four divisions the worst division winner tends to be a pretty mediocre team. It faces the best wild card, but it gets the home field advantage. It’s just one game per conference, but I wonder if this matchup has a smaller point spread on average than the 3/6 or 4/5 matchups prior to 2002 when the “better” team usually got home field advantage in the wild card round.

Is it possible that schematic complexity and increasing specialization have led to more situations where Team A would beat Team B, B would beat C, but C would beat A because of particular matchups? Feels like the ’70s and ’80s were much more about our best 22 vs. your best 22, and that’s less the case nowadays. Don’t know if that’s true, but it’s my gut reaction. I will say I find this phenomenon frustrating–if my own team doesn’t win this year, I’ll definitely be rooting for the 2 most dominant teams to play in the Super Bowl and the most dominant team to win. I think we’ve been robbed of some truly epic Super Bowl matchups by all the playoff upsets over the past dozen years or so–especially in the NFC.

But how would the dumb fan fare if he simply picked every home team? I doubt the smart fan would do much better than a dumb fan who leans on HFA.

One thing that occurs to me is that 2005 — if that’s the 2004-2005 season — was the first year the Polian-inspired tighter pass interference rules were in place. Big gains from more pass interference may be an increased element of randomness in games after 2005. Obviously this could be simple correlation. Also, you’d expect to see the effect in the regular season as well as the playoffs.

Quick question about the Excel formula: So the first part of it (the 1- part) is looking at the side of the bell curve that’s to the right of the observed value (0.5), i.e. the probability that the Patriots outscore the Ravens by more than 0.5 points. What are the second and third parts of p(W)? Why are we subtracting the probability that the Patriots lose?

Thanks!

Also, from Seattle, it’s relieving to know that the 2005 Steelers won more by chance (or, Bill Leavy, as he’s better known) than by actual skill.

I don’t think you understand what “random” means. It doesn’t mean “predictable.” It means “random.”

Football is not a card game. Or a dice game. Or game of chance in any sense. The outcomes are not random.

You also might want to read up on “sensitive dependence to initial conditions” while you’re at it.

Statisticians can be funny at times. No, there is more PARODY than ever before meaning equal amounts of skill. With more parody there is more variance between two teams. If two teams played each other 100 times and one skilled team was far superior than the other, it might win 100/100 games (think NFL team vs peewee football). But because games are so closely matched and because teams model themselves after winning teams more often, and because of free agency and rules that favor higher scoring offenses (giving any team a chance to draw a defensive p.i. flag even if they are terrible and put themselves in fieldgoal range) very small changes and very close decisions by a single individual or a single mistake at a crucial time can make the difference between winning and losing. Teams play too similar. Neither team goes for it significantly more often on 4th down or onside kicks significantly more. Everyone plays conventional football. Every single individual variable has a CAUSE and every single player has a degree of control. there is ZERO amount of luck that cannot be controlled or overcome with enough of a skill advantage. However, when you tighten up the skill, the point spread tightens and then ANY individual variable will have a MUCH greater impact on the play. Out of 11 guys ONE could have a mental lapse. That does not mean that his event wasn’t controllable or caused by something or that it wasn’t skill on a given play, but effectively things appear random because humans aren’t consistent as one individual team unit, and they certainly aren’t consistently better than other humans to overcome having worse teammates (which happens because of salary cap and better players getting more money and parody and schedule)…

but there is no such thing as luck it is a myth. EVERYTHING that happens is a result of some cause before it. Even a random number generator is not random but based upon a table that was created. Even a roll of a dice has physics that could be calculated. A person rolling the dice could easily impact the force of the roll, the spin, the angle, the velocity. If he spent 10,000 hours he could perfect it and your odds of getting a 1 could be 99% rather than 1/6. But it’s supposed to be “random”!! It’s not.. it never is! There is NO such thing as random or luck, it is merely variables that we can’t properly model so we consider it luck because effectively it seems to appear in ways we don’t understand or haven’t learned to calculate all the variables or control.

Statistics is the study of correlation between things when we do not know or have the perfect model to explain it. That doesn’t mean there isn’t though! If you really wanted to, you could create a dice roll that is the exact same with every single variable the same in a controlled environment and there would be ZERO randomness.

lol, I think you are wrong. I don’t think there is very much parody in the NFL at all. Except, of course, in the Pro Bowl.

Idiot.

Chase,

I’ve argued for some time (well, since my playoff model went 7-2-2 last year) that common regular season offensive/defensive stats are overrated in playoff situations. You’re dealing with good teams, all with pretty good offenses and defenses, and thus you don’t get huge talent/skill gaps the way you do in the regular season. Therefore, trying to predict playoffs with those kinds of stats isn’t as useful as you might initially expect.

More to the point, when I do logistic regressions of common offensive and defensive stats versus playoff wins, the confidence of the resulting models tends to be p on the order of 0.15. Just not as predictive as people would like. The stats that in my hands are predictive to p < 0.05 (home field advantage, playoff experience, strength of schedule) are almost static, and can be vaguely dissatisfying.

What I'm finding a little surprising is how often my model results are similar to those of Brian Burke, whose approach is almost entirely different.

D-
