In Buffalo’s loss to Tennessee on Sunday, Chan Gailey faced an interesting decision. Buffalo trailed 28-27 in the final seconds of the third quarter when Ryan Fitzpatrick hit Steve Johnson for a 27-yard touchdown. Now up 33-28, Gailey chose to kick the extra point, and ultimately saw his team lose, 35-34.

Why did Gailey choose to go for 1? Bill Barnwell has his theory:

[The next mistake was] Gailey’s decision to kick an extra point on a touchdown at the end of the third quarter, which created the margin of victory. By going for one with seconds left in the third and a five-point lead (pending the extra point), Gailey paid tribute to the long-standing rule that teams shouldn’t go for two and try to create a seven-point lead before the fourth quarter. It’s an absurd rule, of course, that breaks down when you ask anybody to explain at any length why it makes sense. The two-point conversion chart at footballcommentary.com suggests that the Bills should have tried to tack a two-pointer onto their 33-28 lead if their chances of converting were better than 24 percent. Because the clock hadn’t ticked for 10 additional seconds and bumped the decision into the fourth quarter, though, the Bills kicked and ended up losing by one.

When I read that, my reaction was “yep, that sounds about right.” Up 5 with just over 15 minutes left, it seems like the “stats-geek” move is to go for two while the “conservative old school train of thought” says it’s “too early” to go for two. Of course, if that’s all there was to the story, you wouldn’t be reading this post right now. Take it away, Jason Lisk:

When I look at the game winning probabilities at Advanced NFL Stats, though, Gailey’s decision was different [than Mike Tomlin's]. It pains me to say that conventional wisdom is right here, but it is. With 15 minutes left, being up 5 is more costly than up 7 is beneficial with all the permutations. There are enough possessions that you can get beat by two field goals gained, or not extend the lead with another field goal.

When is it too late to go for one point in either of these situations, though? As it turns out, the answer is roughly between the 6 and 7 minute mark of the fourth quarter. That’s when possessions become more limited and you must try to tie, or make it where a touchdown doesn’t beat you.

A little surprised, I went over to Advanced NFL Stats and entered the numbers into Brian Burke’s Win Probability Calculator. Up 5 at the start of the 4th quarter, with the opponent having 1st and 10 at the 22-yard line, yields a 72% win probability for the leading team. Up 6 translates to a 77% win probability, and up 7 increases it to 80%. That’s what Lisk meant when he said that the difference between being up 5 and up 6 (5 percentage points) is greater than the difference between being up 6 and up 7 (3 percentage points).

Nerd Fight! Brian is a good friend of the site and one of the smartest minds out there, but he’d be the first to tell you that his Win Probability model is not perfect. So the question we have to ask is, is this a situation where his Win Probability Model breaks down?

Let’s not forget what Barnwell noted: according to footballcommentary.com, going for 2 is the obvious call here. And let’s use my tried-and-true method for making any football decision. If you were a Titans fan, now trailing by 5 at the end of the 3rd quarter, would you have been happy to see Buffalo’s kicking team run onto the field, or would you have wished that they went for two instead? My gut tells me (and let’s stipulate that the Bills would have had a 50% chance of converting the 2-point attempt) that as a hypothetical Titans fan, I’d want Buffalo to kick the extra point. Being down 7 sounds really bad, while the difference between 5 and 6 seems pretty negligible to my Nashville gut.

The fact that this score happened at the end of the quarter is very convenient, because we can measure how frequently teams win games when leading after the third quarter by 5, 6, or 7 points.^{1} But instead of just giving you the answer, I want you to take a second and guess as to what each answer is.

- In 54 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 5 after the 3rd quarter ultimately win the game?


- In 129 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 6 after the 3rd quarter ultimately win the game?


- In 323 games from 2000 to 2011 (including playoffs), what percentage of the time did teams leading by 7 after the 3rd quarter ultimately win the game?


Pretty interesting results, eh? And they would certainly support Lisk’s and Burke’s idea that the difference between 5 and 6 is much greater than the difference between 6 and 7. If you’ve made it this far, I only have bad news for you: it’s about to get more complicated.

There are several ways we can go from here if we want to argue the Barnwell/Footballcommentary/gut theory: we could say that the data above undervalues teams leading by 5, overvalues teams leading by 6, or undervalues teams leading by 7. I’m going to choose door #2.

To me, the surprise in the data isn’t that “5” is so low but rather that “6” is so high. In fact, teams had a higher winning percentage when leading by 6 than by 7. We can write that off as small-sample noise given the minuscule difference, but what of the larger point that 6 is significantly closer to 7 than it is to 5?

There are two notable things going on here.

**Sample size problems**

Burke’s data is from 2000 to 2011, the era for which we have play-by-play logs, which form the backbone of his Win Probability Model. From 2000 to 2011, in regular season games, teams leading by 6 after three quarters won 78.4% of the time (98 out of 125). But from 1994 (the start of the two-point conversion era) to 1999, teams up by 6 after three quarters won only 62.5% of the time (45 out of 72). The table below shows the success rate of teams leading by 6 after three quarters for each year since 1990:

Year | Games | Wins | Win% |
---|---|---|---|
2011 | 8 | 4 | 50% |
2010 | 9 | 8 | 88.9% |
2009 | 10 | 8 | 80% |
2008 | 13 | 10 | 76.9% |
2007 | 7 | 5 | 71.4% |
2006 | 14 | 10 | 71.4% |
2005 | 7 | 6 | 85.7% |
2004 | 12 | 12 | 100% |
2003 | 10 | 9 | 90% |
2002 | 17 | 11 | 64.7% |
2001 | 11 | 10 | 90.9% |
2000 | 7 | 5 | 71.4% |
1999 | 14 | 11 | 78.6% |
1998 | 8 | 6 | 75% |
1997 | 18 | 8 | 44.4% |
1996 | 15 | 10 | 66.7% |
1995 | 7 | 5 | 71.4% |
1994 | 10 | 5 | 50% |
1993 | 20 | 18 | 90% |
1992 | 12 | 10 | 83.3% |
1991 | 14 | 13 | 92.9% |
1990 | 11 | 6 | 54.5% |
Total | 254 | 190 | 74.8% |

The results vary from year to year, as would be expected with tiny sample sizes, but on average, teams up by 6 entering the 4th quarter win 75% of the time. Since Burke was using data from games where teams won 79% of the time, he was essentially using a deck of games where teams didn’t come back as frequently as they are apt to do. As a result, I have to assume his win probability model overstates the likelihood of success when up by 6 entering the 4th quarter by a couple of percentage points.
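For anyone who wants to check the era splits, here is a minimal sketch that re-adds the yearly rows from the table above. The `ROWS` list and the `win_rate` helper are names made up for this illustration; the numbers are simply copied from the table.

```python
# (year, games, wins) for teams leading by 6 after three quarters,
# copied row by row from the table above.
ROWS = [
    (2011, 8, 4), (2010, 9, 8), (2009, 10, 8), (2008, 13, 10),
    (2007, 7, 5), (2006, 14, 10), (2005, 7, 6), (2004, 12, 12),
    (2003, 10, 9), (2002, 17, 11), (2001, 11, 10), (2000, 7, 5),
    (1999, 14, 11), (1998, 8, 6), (1997, 18, 8), (1996, 15, 10),
    (1995, 7, 5), (1994, 10, 5), (1993, 20, 18), (1992, 12, 10),
    (1991, 14, 13), (1990, 11, 6),
]


def win_rate(rows, first_year, last_year):
    """Return (games, wins, win_pct) for seasons first_year..last_year inclusive."""
    selected = [(g, w) for (y, g, w) in rows if first_year <= y <= last_year]
    games = sum(g for g, _ in selected)
    wins = sum(w for _, w in selected)
    return games, wins, 100.0 * wins / games


for label, first, last in [
    ("2000-2011", 2000, 2011),
    ("1994-1999", 1994, 1999),
    ("1990-2011", 1990, 2011),
]:
    games, wins, pct = win_rate(ROWS, first, last)
    print(f"{label}: {wins}/{games} = {pct:.1f}%")
```

The three lines printed should match the 98-of-125, 45-of-72, and 190-of-254 figures quoted in the text.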

Is that significant? Recall that the win probabilities when up by 5, 6, and 7 points were 72, 77 and 80. If that 77 was really, say, 74, that changes the analysis significantly: the difference between 5 and 6 (2 points) would in fact be much smaller than the difference between 6 and 7 (6 points). Therefore, I feel pretty comfortable siding with Barnwell and footballcommentary over Lisk and Burke. Both theory and my gut tell me that the difference between being up by 6 vs. 7 is larger than the difference between being up by 5 vs. 6. And while some data says otherwise, I think a larger sample size presents a clearer picture.
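To see how much that one number matters, here is a rough sketch of the break-even conversion rate implied by the win probabilities. It assumes the extra point is automatic (the `xp_rate` parameter defaults to 1.0) and that a failed two-point try simply leaves the lead at 5; the function name is made up for this illustration, and the 74 is my hypothetical from above, not a measured value.

```python
def break_even_two_point_rate(wp_up5, wp_up6, wp_up7, xp_rate=1.0):
    """Two-point conversion rate at which going for two equals kicking.

    Kicking is worth:      xp_rate * wp_up6 + (1 - xp_rate) * wp_up5
    Going for two is worth: p * wp_up7 + (1 - p) * wp_up5
    Setting the two equal and solving for p gives the break-even rate.
    """
    kick_value = xp_rate * wp_up6 + (1 - xp_rate) * wp_up5
    return (kick_value - wp_up5) / (wp_up7 - wp_up5)


# Win probabilities from the calculator: 72 / 77 / 80.
print(f"{break_even_two_point_rate(0.72, 0.77, 0.80):.1%}")

# Same, but with the 6-point figure hypothetically lowered to 74.
print(f"{break_even_two_point_rate(0.72, 0.74, 0.80):.1%}")
```

Under those assumptions, the calculator's numbers require a conversion rate of about 62.5% to justify going for two, while the adjusted numbers require only about 25%, which is in the neighborhood of the 24% threshold Barnwell cites from footballcommentary.com.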

*[Update: Per pt's comment, I went back and checked how teams up by 5 and 7 after 3 quarters fared going back to 1990, too. The data looks largely the same when looking at 7, but from 1994 to 1999, teams up by 5 after 3 quarters won 77.3% of the time, which is a much higher rate than from 2000 to 2011. From 1990 to 2011, teams up by 5 after 3 quarters won 69.2% of the time, so the data from 2000 to 2011 may slightly undervalue being up by 5, too. If that's the case, it decreases the difference between 5 and 6 even further, and makes it clearer that going for 2 is the correct call.]*

But I discovered something else in the data that I can’t quite understand.

**Does the type of 6-point deficit matter?**

When leading after three quarters by a score of 6-0, 13-7, 20-14, or 27-21, teams won **77.6%** of the time since 1994 (90 out of 116). When leading after three quarters by a score of 9-3, 16-10, 23-17, or 30-24, teams won just **61.9%** of the time (39 out of 63). You might think that just means that lower-scoring games are less likely to see a comeback, but that’s not the case. In reality, the former set of games (call them the “opponent has zero field goals” games, or “0FG”) simply *correlates* better with winning than the latter group, the “opponent has one field goal” games (“1FG”). Being up 27-21 (100% in 8 games) is better than being up 23-17 (53.8%), being up 20-14 (78.1%) is better than being up 16-10 (69.7%), and being up 13-7 (75.4%) is better than being up 9-3 (50%).

Now, why in the world would this be the case? Small sample size is always a possible answer here, as our N is pretty small. But the fact that the winning percentage of the leading teams in the 0FG games holds consistent across a sliding scale makes me at least consider the possibility that something is actually going on; the p-value is less than 3%, which makes the results statistically significant, although that is far from conclusive.
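One way to sanity-check that p-value is a pooled two-proportion z-test on the 0FG record (90 of 116) against the 1FG record (39 of 63). This is only a sketch of a standard test, not necessarily the calculation behind the figure quoted above, and the function name is made up for this illustration.

```python
import math


def two_proportion_z_test(wins_a, games_a, wins_b, games_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    rate_a = wins_a / games_a
    rate_b = wins_b / games_b
    # Pooled win rate under the null hypothesis that both groups are the same.
    pooled = (wins_a + wins_b) / (games_a + games_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 0FG games: 90 wins in 116; 1FG games: 39 wins in 63.
z, p = two_proportion_z_test(90, 116, 39, 63)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With these counts, the test lands a little under the 3% mark. Keep in mind that the 0FG/1FG split was noticed after looking at the data, so a p-value at that level is weaker evidence than it would be for a hypothesis stated in advance.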

Note that Gailey in this case was in one of those 0FG games, so obviously that means he should have won. So far in 2012, in 0FG games, teams are 3-3 when leading by 6, with the three losses coming in memorable games (Bills-Titans, Patriots-Ravens, and Dolphins-Cardinals). In 1FG games, teams up by 6 are 2-0. So maybe there is nothing to the data. But I can’t help but wonder if there is.

Do you remember my Snake Eyes Post, where I noted that teams trailing by 8 win less frequently than teams trailing by 9? I wonder if the same effect could possibly be going on here, where teams up 23-17 feel like they’re in a better position to win than teams up 20-14 (although that doesn’t really pass the smell test). The reason leading teams in the 0FG games have a higher winning percentage is mostly that they score more points in the 4th quarter: on average, 5.6 points (while allowing 6.3 points) vs. 4.6 points (while allowing 6.5 points) in the 1FG games. Leading teams in 0FG games were shut out in the 4th quarter 26% of the time, while those in the 1FG games were shut out 35% of the time. Who knows, there may be nothing there, but it’s yet another odd quirk in the fun annals of NFL history.

- To be clear, this isn’t perfect; in these examples, we don’t know who has possession, what down it is, or where the ball is placed.

{ 11 comments… read them below or add one }

Can you go back to 1994 with 5 and 7 point leads? Why did you only do that with the one number you thought was off? It’s possible Burke’s data overvalues those as well.

Yes, that’s the right question to ask. That was my plan, and then I got very distracted by the 0FG/1FG effect, and forgot to do that.

Looking at 7-point games, regular season only, the win% was 79.5% from 2000 on, 79.2% from 1994 to 1999, and 77.8% from 1990 to 2011. No real change. But looking at 5-point games, regular season only, the win% was 77.3% during the period from 1994 to 1999, far ahead of the winning percentage of teams up by 6 during that era (62.5%). If this blog had been started in the summer of 2000, there’s a good chance I would have written a post that said “Hey, this is odd. Over the last 6 years, teams up by 5 entering the 4th quarter have won 77% of their games, while teams up by 6 have won only 63% of their games. That’s really weird.” Of course, it would have been a case of a small sample size. In any event, from 1990 to 2011, teams up by 5 won 69% of their games.

Hypothesis for type of 6-pt deficits: FGs could indicate the ability to sustain drives but not punch it in, while TDs might occur on a fluke play or two.

I think that makes some sense. If you’re down by 6 and you have a FG, you may have played a higher quality game (relative to your opponent) than being down by 6 and having no FG’s. Perhaps you’ve made some questionable coaching decisions to settle for FG’s, but you’ve probably moved the ball more on offense (relative to your opponent), thus would be more likely to be able to make up the deficit.

True, but the counter is the other team also then would have had one more scoring drive. And the real issue seems to be that the team with the lead in 23-17 games is less likely to score than the team with the lead in 20-14 games.

But honestly this is starting to strike me as so absurd that even if the results are significant at the 3% level, I am inclined to say it is still just random.

I wasn’t watching the game, so had no opinion at the time. Generally, I agree with the numbers guys and think teams don’t go for it enough (going for 2 and/or going for first downs, etc.).

But I also generally think that it’s better to just take the “guaranteed” extra point earlier in the game. Even right at the beginning of the 4th quarter, I think I’d rather take the point, because there are many possessions left and a lot can still happen.

It seems like I have seen plenty of times where a team goes for 2 in the first 3 quarters, in order to get the score back on the 3/7/10 type scoring cycle, fail, and then end up having to go for 2 again, and fail and lose by a point or two.

Richie, does your opinion change if you stick with the stipulated rule that the conversion rate on the 2-pt attempt would be 50%?

No, I think I’d rather just take the point for the first 3+ quarters.

Again, I generally like to think of myself as an objective person in these regards, but for the 1-point/2-point thing it just seems like you can get in trouble trying to “chase” the score early on.

I’ve noticed that the Broncos have gone down by 24 on three occasions this season. In every instance, John Fox has gone for the extra point as opposed to the two-point conversion. Thoughts?

You are not allowed to disagree with me, Chase.

I’ve got 68.6% up 5 at start of 4th since 1990

74.3% up 6 at start of 4th

77.6% up 7 at start

I would have probably gone for it. It’s debatable given the remaining possessions. I also wonder what impact 2 point conversions have at these numbers. In past, up 5 meant opponent couldn’t go up 3 if they scored a TD, and you could still take lead back with FG.

Yeah, I suppose that’s another way to look at it to show 5<<<<6<<7, although I think we’d both agree that margin of error is too great here to get any definitive answer. If you choose different cutoffs, you get different results.

So it comes down to theory. I think being down 7 is pretty serious. A touchdown only ties, and a field goal isn’t that helpful. Down 6, a touchdown wins, and a field goal is helpful. Down 5, a touchdown wins, and a FG is really, really helpful.

Agree that with 15 mins left, it’s a little tricky. One could also argue that you don’t want to incentivize aggressive play by your opponent.