## We don’t know anything and we never will

Five weeks in, you start to hear NFL experts trade their preseason overconfidence for regular season overconfidence. It’s tempting to fall into the trap of thinking that, with over a quarter of the season in the books, we now have a good idea of how the rest of the regular season will unfold.

It’s tempting, but it’s not really true. The best way to measure whether someone knows what they’re talking about is to see if their predictions come true. Fortunately for us historians, a group of experts predicts what will happen in every game, every week: it’s called the point spread.

If we actually learn something each week, then the point spreads should track the actual results more closely as the leaves change colors. But do they?

I looked at the point spread for all regular season games from 1988 to 2011. Now the question becomes how do we measure if a point spread was “right”? If the Texans are favored by 7 to beat the Bengals, and they win by 10, is that a “good” projection?

The simplest way to test this is to see the difference between the actual result and the projected result. The line that was most “off” in the database came two years ago. Likely due to the genius of Josh McDaniels, the Raiders were 7-point underdogs in Denver in 2010, but won the game 59-14. So in that game, the line was off by 52 points. Last year, the 49ers were 3-point favorites at home against Tampa Bay, but won by 45 points; that line was “off” by 42 points.
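The "line off" figure described above can be written as a one-line calculation. This is a minimal sketch; the function name `line_off` and the convention of entering the line as a positive number of points for the favorite are assumptions made for illustration.

```python
def line_off(line, fav_margin):
    """Absolute gap between the spread and the favorite's actual margin.

    line: points the favorite was expected to win by (positive number).
    fav_margin: favorite's points minus underdog's points (negative if
    the favorite lost).
    """
    return abs(fav_margin - line)


# Denver favored by 7 in 2010, lost 59-14 (favorite margin of -45).
raiders_broncos = line_off(7, 14 - 59)

# 49ers favored by 3 against Tampa Bay, won by 45.
niners_bucs = line_off(3, 45)

# The hypothetical: Texans favored by 7, win by 10.
texans_bengals = line_off(7, 10)

print(raiders_broncos, niners_bucs, texans_bengals)
```

Using the absolute value treats a favorite that wins by too much the same as one that wins by too little, which matches how the post counts both kinds of miss.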

One interesting side note: you might think that big lines are more likely to be off by bigger margins than small lines. But that’s not really the case. The standard deviation of “how much a line is off by” is roughly 8 points regardless of the spread. It’s not exact, of course, but for our purposes we can assume that a line is about as likely to miss by a given amount whether the spread is large or small.
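One way to check a claim like this is to bucket games by spread and compute the standard deviation of the error within each bucket. The sketch below is illustrative only: the function name, the 3-point bucket width, and the choice of signed error (favorite's margin minus the line) are all assumptions, and the sample rows are made up rather than taken from the 1988-2011 data.

```python
from collections import defaultdict
from statistics import pstdev


def error_sd_by_spread(games, bucket_width=3.0):
    """Standard deviation of (favorite margin - line) within each spread bucket.

    games: iterable of (line, fav_margin) pairs, with the line entered as a
    positive number of points the favorite was expected to win by.
    Buckets with fewer than two games are skipped.
    """
    buckets = defaultdict(list)
    for line, fav_margin in games:
        bucket_start = (line // bucket_width) * bucket_width
        buckets[bucket_start].append(fav_margin - line)
    return {
        start: pstdev(errors)
        for start, errors in sorted(buckets.items())
        if len(errors) >= 2
    }


# Made-up rows for illustration: (line, favorite's actual margin).
sample_games = [(3, 3), (3, 13), (3.5, -6.5), (10.5, 10.5), (10.5, 20.5), (11, 1)]
sd_by_bucket = error_sd_by_spread(sample_games)
for start, sd in sd_by_bucket.items():
    print(f"spreads from {start:g}: sd of error = {sd:.1f}")
```

If the standard deviations come out similar across buckets on real data, that supports treating the size of the miss as independent of the size of the spread.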

Anyway, back to the point of the post. How accurate are lines early in the year? In week 1, the spreads are generally a little tighter than they are the rest of the way; perhaps the oddsmakers are just as unsure as the rest of us. But no matter what week it is, the lines are off by about 10 points per game on average. Take a look. The table below shows how much the lines were off by, on average, in weeks 1 through 17 from 1988 to 2011. In the last column, I’ve shown the percentage of games where the line was within 10 points of the actual result.

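| Week | Games | Avg spread | Avg points off | Within 10 points |
|---|---|---|---|---|
| 1 | 362 | 4.7 | 10.7 | 60% |
| 2 | 361 | 5.2 | 10.5 | 58% |
| 3 | 340 | 5.6 | 10.4 | 56% |
| 4 | 322 | 5.5 | 10.9 | 56% |
| 5 | 319 | 5.2 | 10 | 64% |
| 6 | 315 | 5.4 | 10 | 60% |
| 7 | 315 | 5.6 | 10.9 | 57% |
| 8 | 315 | 5.5 | 10.6 | 59% |
| 9 | 326 | 5.3 | 10.5 | 59% |
| 10 | 339 | 5.6 | 10.1 | 57% |
| 11 | 358 | 5.6 | 9.4 | 61% |
| 12 | 362 | 5.7 | 10 | 62% |
| 13 | 363 | 5.4 | 10.1 | 59% |
| 14 | 360 | 6 | 10.8 | 59% |
| 15 | 363 | 6 | 10.9 | 57% |
| 16 | 363 | 5.7 | 10.9 | 57% |
| 17 | 335 | 5.7 | 10.4 | 61% |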
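A weekly summary like the one in the table above could be produced along these lines. This is a rough sketch, not the method actually used for the post: the function name, the tuple layout, and counting a miss of exactly 10 points as "within 10" are assumptions. The sample rows are a handful of games from the two single-week tables later in the post.

```python
from collections import defaultdict


def weekly_summary(games, threshold=10):
    """Summarize spread accuracy by week.

    games: iterable of (week, line, fav_margin) tuples.
    Returns {week: (n_games, avg_line, avg_off, share_within_threshold)}.
    """
    by_week = defaultdict(list)
    for week, line, fav_margin in games:
        by_week[week].append((line, abs(fav_margin - line)))

    summary = {}
    for week in sorted(by_week):
        rows = by_week[week]
        lines = [line for line, _ in rows]
        offs = [off for _, off in rows]
        summary[week] = (
            len(rows),
            sum(lines) / len(lines),
            sum(offs) / len(offs),
            # Assumption: a miss of exactly `threshold` counts as "within".
            sum(off <= threshold for off in offs) / len(offs),
        )
    return summary


# (week, line, favorite's actual margin)
games = [
    (12, 3, 3), (12, 10.5, 10), (12, 3, -7), (12, 10.5, -9),   # 2003
    (17, 6, 38), (17, 3, -20), (17, 7, 2), (17, 3, 11),        # 1993
]
for week, (n, avg_line, avg_off, share) in weekly_summary(games).items():
    print(f"week {week}: {n} games, avg line {avg_line:.1f}, "
          f"avg off {avg_off:.1f}, within 10: {share:.0%}")
```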

It’s tempting to think we know more once we see more, but that’s unfortunately not the case. Of course, if it were, it would be easy to make money gambling on football.

For those curious, week 12 of the 2003 season was as close as Vegas has ever come to perfection. Look at how close these lines were to the actual results:

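| Game | Yr | Fav | Under | Line | FavPts | UnPts | LineOff |
|---|---|---|---|---|---|---|---|
| 2003 rav -3 vs. sea | 2003 | rav | sea | 3 | 44 | 41 | 0 |
| 2003 tam -6 vs. nyg | 2003 | tam | nyg | 6 | 19 | 13 | 0 |
| 2003 clt -3 vs. buf | 2003 | clt | buf | 3 | 17 | 14 | 0 |
| 2003 min -10.5 vs. det | 2003 | min | det | 10.5 | 24 | 14 | 0.5 |
| 2003 oti -6.5 vs. atl | 2003 | oti | atl | 6.5 | 38 | 31 | 0.5 |
| 2003 nyj -4 vs. jax | 2003 | nyj | jax | 4 | 13 | 10 | 1 |
| 2003 dal -3 vs. car | 2003 | dal | car | 3 | 24 | 20 | 1 |
| 2003 nwe -5 vs. htx | 2003 | nwe | htx | 5 | 23 | 20 | 2 |
| 2003 cin -3 vs. sdg | 2003 | cin | sdg | 3 | 34 | 27 | 4 |
| 2003 ram -7 vs. crd | 2003 | ram | crd | 7 | 30 | 27 | 4 |
| 2003 mia -7 vs. was | 2003 | mia | was | 7 | 24 | 23 | 6 |
| 2003 gnb -3.5 vs. sfo | 2003 | gnb | sfo | 3.5 | 20 | 10 | 6.5 |
| 2003 phi -5.5 vs. nor | 2003 | phi | nor | 5.5 | 33 | 20 | 7.5 |
| 2003 kan -11 vs. rai | 2003 | kan | rai | 11 | 27 | 24 | 8 |
| 2003 cle -3 vs. pit | 2003 | cle | pit | 3 | 6 | 13 | 10 |
| 2003 den -10.5 vs. chi | 2003 | den | chi | 10.5 | 10 | 19 | 19.5 |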

What about the other side of the coin? Vegas was really, really, really off in week 17 of the 1993 season. Just so you know, 1993 was the year the NFL tried the double-bye week approach, resulting in an 18-week season, so this was really like week 16 in most years:

| Game | Yr | Fav | Under | Line | FavPts | UnPts | LineOff |
|---|---|---|---|---|---|---|---|
| 1993 nwe -6 vs. clt | 1993 | nwe | clt | 6 | 38 | 0 | 32 |
| 1993 cle -2 vs. ram | 1993 | cle | ram | 2 | 42 | 14 | 26 |
| 1993 gnb -3 vs. rai | 1993 | gnb | rai | 3 | 28 | 0 | 25 |
| 1993 sdg -1 vs. mia | 1993 | sdg | mia | 1 | 45 | 20 | 24 |
| 1993 kan -3 vs. min | 1993 | kan | min | 3 | 10 | 30 | 23 |
| 1993 den -13.5 vs. tam | 1993 | den | tam | 13.5 | 10 | 17 | 20.5 |
| 1993 dal -17 vs. was | 1993 | dal | was | 17 | 38 | 3 | 18 |
| 1993 nyg -3 vs. crd | 1993 | nyg | crd | 3 | 6 | 17 | 14 |
| 1993 pit -3 vs. sea | 1993 | pit | sea | 3 | 6 | 16 | 13 |
| 1993 sfo -7.5 vs. oti | 1993 | sfo | oti | 7.5 | 7 | 10 | 10.5 |
| 1993 chi -3 vs. det | 1993 | chi | det | 3 | 14 | 20 | 9 |
| 1993 phi -3 vs. nor | 1993 | phi | nor | 3 | 37 | 26 | 8 |
| 1993 atl -3.5 vs. cin | 1993 | atl | cin | 3.5 | 17 | 21 | 7.5 |
| 1993 buf -7 vs. nyj | 1993 | buf | nyj | 7 | 16 | 14 | 5 |

• Great research, great stats, well written. However, the entire premise of this article is ridiculous and I don’t understand why I see so many sites offer similar analysis. Other than to disprove statements by people like Doug Gottlieb who constantly say “how does Vegas know?”, the entire conclusion can be reached with a little understanding and common sense.

The objective of the linemaker is not to be predictive of the final score; it never has been and never will be. The primary objective of the linemaker is to maximize profitability while limiting exposure, and lines are set accordingly. Bookmakers do not measure their success by how close the point spread gets to the final margin, so why should outside analysis? This isn’t KenPom, where predictions are graded by the outcome; bookmakers grade their lines by their profitability, which has more to do with the money that comes in on each side than it does with the final score.

College football last year is the perfect example of why this entire analysis is meaningless. Top teams kept crushing bookmakers by covering every week, no matter how high the spread was set. Getting the final line closer and closer to the game margin was totally meaningless, because bookmakers continued to lose money as public and sharp favorites covered week in and week out.

Good work and research, though. I appreciate well-researched articles that are this well written.

• Chase Stuart

Thanks for the kind words.

I’d disagree slightly with your conclusion. Maximizing profitability is highly, highly correlated with predicting the final score. And when those two objectives diverge, they do so in largely random ways, which makes them less meaningful in a study of thousands of games.

The point here, though, is not about Vegas but about our sense of knowledge. For example, many are shocked by how good the Vikings are. However, I’m quite confident that some team most people think stinks will go 4-1 over their next five games. The point being we don’t really “learn” as much as we think we do during a season.

• Richie

The point of a bookmaker might be to make money, but the number that the bookmaker uses is basically using “wisdom of crowds” to predict a final score (if you use the spread and the total).

• mrh

Good stuff as always. I’d be interested to see if the O/U shows any signs of “learning.”

If you’ve never read The National Football Lottery by Larry Merchant, it’s worth tracking down. A lot of insight into the line, gambling, and Vegas, plus a cool look at the ’72 season.

• George

Great read as ever and really good points. In some respects it sums up the conclusion of the Levitt paper (Why are Gambling Markets organised so differently from Financial Markets?), which I mentioned elsewhere. Generally I agree with you: the general person doesn’t know anything and never will, because they typically aren’t prepared to put the legwork in and do their “homework”. One of the conclusions of the Levitt paper is that “bookmakers are better at predicting game outcomes than the typical bettor. As a consequence the bookmakers are able to set prices in order to exploit their greater talent”. I believe that to be the case, and they will capitalise on the bias of the public. An off-the-cuff example would be the Florida vs. LSU line last week: the public felt LSU was the better team (even when the numbers from least squares, weighted least squares and SRS suggested otherwise) and they were willing to lay the points. That’s the general person, I guess. If you look a little deeper you can make systems work for you (e.g. least squares and SRS) and find trends (e.g. Baltimore 38-21-3 ATS over 10 years as a home favourite, and specifically 27-9-2 ATS vs. non-divisional rivals as a home favourite). Least squares on the NFL, with the numbers starting to become connective, is 13-11-2 over the last two weeks (the model is still short of the recognised standard deviation around the error, which would suggest it is still lacking in connectivity, but it is probably good enough to break even at level stakes with 5 weeks of data). What do we know, though? Probably not that much, but what we can do is minimise risks if we feel the need to gamble (and yes, I did back Arizona last week, in case anyone reminds me).

• Richie

> (e.g. Baltimore 38-21-3 ATS over 10 years as a home favourite,

See, I am skeptical whether something that Kyle Boller/Jamal Lewis/Brian Billick (vs. David Carr/Domanick Davis/Dom Capers) did in 2003 has any bearing on what Joe Flacco/Ray Rice/John Harbaugh will do in 2012 against Tony Romo/DeMarco Murray/Jason Garrett.

I guess the ONLY thing that might have some bearing is if the public perception of the Baltimore Ravens has had a specific bias over that decade.

• George

> I guess the ONLY thing that might have some bearing is if the public perception of the Baltimore Ravens has had a specific bias over that decade.

Super fair point, and I think it is a case of the above. What does the average person think of when they think of Baltimore? Ray Lewis and the D, and the D doesn’t score a great deal of points (directly, except for that year when Ed Reed returned a load). In terms of the splits on 38-21-3:

2002: 1-1
2003: 5-1-1
2004: 5-3
2005: 4-1
2006: 5-2
2007: 1-3
2008: 5-1
2009: 5-2
2010: 3-4-1
2011: 4-3-1

Given the Levitt study that I mentioned above, which found that home favourites in its sample only won 49.6% of the time (the sample wasn’t over 10 years, but the principle is relevant), you could argue that this is statistically significant (I haven’t worked this out yet; it’s a gut feeling). You’ve got to wonder why the linemakers haven’t caught up to this. You could possibly argue that they are taking advantage of the perceived public bias against the Ravens offense (and the fact that it is low scoring) and are setting the line too low when they are home favourites.
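The gut feeling about significance can be checked with an exact binomial tail probability: how often would a team go at least 38-21 against the spread if each game were a coin flip at the baseline rate? This is a minimal sketch under stated assumptions: the function name is made up, the three pushes are simply dropped, the test is one-sided, and both 50% and the 49.6% figure quoted from the Levitt paper are tried as baselines.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) when X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


wins, losses, pushes = 38, 21, 3
decided = wins + losses  # assumption: pushes are excluded from the sample

for baseline in (0.5, 0.496):
    tail = binomial_upper_tail(wins, decided, baseline)
    print(f"baseline cover rate {baseline}: "
          f"P(at least {wins} of {decided}) = {tail:.4f}")
```

Even a small tail probability would need discounting, since this record was picked out after looking across teams and situations, and some team is likely to show a streak like this by chance alone.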

• Aaron Josef Kelly

In order to get a decent understanding of whether betting lines are all just a bunch of garbage, we would need to compare them to a baseline expectation distribution. I would suggest running the same analysis with the class labels permuted, to determine whether random data produce a worse “line off” than the real data do. This would let you infer whether the effort that goes into setting betting lines leads to statistically significant results compared to random. Just looking at these raw data doesn’t really give us any kind of information. I don’t know if being 10 points off on average is good or bad unless I know what it would be like with randomly chosen betting lines.
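The permutation baseline suggested here could be sketched as follows: shuffle the lines so each is paired with a random game, recompute the average "line off", and see how often the shuffled lines do as well as the real ones. The function names, the number of shuffles, and the seed are assumptions. The sample is week 12 of 2003 from the post, which was chosen as the most accurate week on record, so it only illustrates the mechanics; a real test would use the full set of games.

```python
import random


def mean_line_off(lines, margins):
    """Average absolute gap between each line and the favorite's margin."""
    return sum(abs(m - l) for l, m in zip(lines, margins)) / len(lines)


def shuffled_baseline(lines, margins, n_shuffles=2000, seed=0):
    """Average line-off after randomly reassigning lines to games."""
    rng = random.Random(seed)
    shuffled = list(lines)
    results = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        results.append(mean_line_off(shuffled, margins))
    return results


# Week 12 of 2003: each line and the favorite's actual margin.
lines = [3, 6, 3, 10.5, 6.5, 4, 3, 5, 3, 7, 7, 3.5, 5.5, 11, 3, 10.5]
margins = [3, 6, 3, 10, 7, 3, 4, 3, 7, 3, 1, 10, 13, 3, -7, -9]

actual = mean_line_off(lines, margins)
baseline = shuffled_baseline(lines, margins)
share_as_good = sum(b <= actual for b in baseline) / len(baseline)
print(f"actual: {actual:.2f}, "
      f"shuffled mean: {sum(baseline) / len(baseline):.2f}, "
      f"share of shuffles at least as accurate: {share_as_good:.3f}")
```

Because every line here is stated from the favorite's side, the shuffled lines still "know" which team was favored. That makes this a fairly generous baseline; shuffling lines stated from the home team's side would be a tougher comparison.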