Those are some clutch shirts.

Eight years ago, almost to the day, our old PFR colleague Doug Drinen wrote a Sabernomics post about “The Manning Index”, a metric designed to roughly gauge the clutchness (or chokeitude) of a given quarterback by looking at how he did relative to expectations (he revived this concept in version two, six years ago). In a nutshell, Doug used the location of the game and the win differential of the two teams involved to establish an expected winning percentage for each quarterback in a given matchup. He then added those expected wins up across all of a quarterback’s playoff starts and compared the total to the number of wins the quarterback actually had. Quarterbacks who frequently exceeded expectations in playoff games could therefore be considered “clutch”, while those who often fell short (like the Index’s namesake, Peyton Manning) might just be inveterate chokers.
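To make the arithmetic concrete, here is a minimal sketch of the index as described above: actual wins minus the sum of per-game expected win probabilities. The function name and the three-game career are hypothetical, made up purely for illustration.

```python
def manning_index(games):
    """Return (wins, expected_wins, index) for a list of (won, win_prob) pairs.

    won is True/False for the game result; win_prob is the pregame
    probability that the quarterback's team wins.
    """
    wins = sum(1 for won, _ in games if won)
    expected = sum(p for _, p in games)
    return wins, expected, wins - expected


# Hypothetical three-start playoff career (illustrative numbers, not real data)
example = [(True, 0.60), (True, 0.45), (False, 0.70)]
wins, expected, index = manning_index(example)
print(f"{wins} wins vs {expected:.2f} expected -> index {index:+.2f}")
```

A positive index means the quarterback's teams won more often than expected; a negative one means they fell short.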

Doug ran that study in the midst of the 2004-05 playoffs, so it shouldn’t be surprising that Tom Brady (who was at the time 8-0 as a playoff starter and would run it to 10-0 before ever suffering a loss) came out on top, winning 3.5 more games than you’d expect from the particulars of the games he started. Fast-forward eight years, though, and you get this list of quarterbacks who debuted after 1977:

Flacco, whom Chase wrote about yesterday, predictably finishes near the top of the list. As usual, the table is fully sortable (the table includes 130 quarterbacks, and you can change the number of quarterbacks displayed or use the search function to find other quarterbacks), and you can click on the “Index” column to bring the “biggest chokers” to the top. A few other notes: I generated the expected wins for this Manning Index not from Doug’s formula, but from the pregame win probabilities we can derive from the Vegas lines. Also, since I don’t have a list of all QB starts since 1978, I’m considering a player to be the QB of record for his team if he led them in pass attempts in a game (with yards as the tiebreaker). [1]
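The post doesn't spell out how the win probabilities were derived from the lines, so the following is only a sketch of one common approach, not necessarily the method used here. It assumes the final margin is roughly normally distributed around the point spread; the 13.5-point standard deviation is an illustrative assumption. The second function sketches the QB-of-record rule described above, with hypothetical dictionary keys.

```python
from statistics import NormalDist

# Assumption: final margin ~ Normal(-spread, MARGIN_SD). The 13.5-point
# standard deviation is an illustrative value, not taken from the post.
MARGIN_SD = 13.5


def win_prob_from_spread(spread, sd=MARGIN_SD):
    """Pregame win probability for a team given its point spread.

    spread is negative for a favorite (e.g. -3.0 means favored by 3).
    """
    # Expected margin is -spread, so P(margin > 0) = Phi(-spread / sd).
    return NormalDist(0.0, sd).cdf(-spread)


def qb_of_record(passers):
    """Pick a team's QB for one game: most pass attempts, yards as tiebreaker.

    passers is a list of dicts with hypothetical keys 'name', 'att', 'yds'.
    """
    return max(passers, key=lambda p: (p["att"], p["yds"]))["name"]


# Hypothetical box score: the two passers tie on attempts, so yards decide.
box = [
    {"name": "Starter", "att": 20, "yds": 150},
    {"name": "Backup", "att": 20, "yds": 180},
]
print(qb_of_record(box), round(win_prob_from_spread(-3.0), 3))
```

Summing `win_prob_from_spread` over every game where a player is the QB of record would give the expected-wins figure that the index subtracts from his actual wins.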

By now, you’ve probably noticed that the Manning Index has reached its full potential, with a Manning brother at the top and at the bottom! But I also want to focus on the strange career arc of Brady, and how much his Manning Index has changed since the end of the 2004 season.

Here are Brady’s career playoff stats as the Patriots’ primary QB in a game, year-by-year:

Year | Age | Gm | tmPts | opPts | Cmp | Att | Yds | TD | Int | Sk | SkYd | Rush | rYds | rTD | Won | Lost | Exp W | Index

The big thing that jumps out is the difference between his Manning Index early (through age 27 it’s +3.5; through age 29 it’s +3.8) and late in his career (starting in 2007, it’s -2.5). As an aside, two years ago, Jason Lisk wrote about Brady’s career in reverse (and Bill Barnwell did the same earlier this week), showing how easy it is to flip the narrative about him depending on which direction you view his playoff record. It’s almost as though perceived early-career clutchness has no predictive value for the latter half of a QB’s career… How shocking!

But it’s worth noting that Brady isn’t alone — here are the QBs who had the most playoff games before and after age 27:

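| Player | Gm Thru 27 | W Thru 27 | eW Thru 27 | Index Thru 27 | Gm After 27 | W After 27 | eW After 27 | Index After 27 |
|---|---|---|---|---|---|---|---|---|
| Brett Favre | 10 | 7 | 5.0 | 2.0 | 14 | 6 | 8.2 | -2.2 |
| Tom Brady | 8 | 8 | 4.5 | 3.5 | 15 | 8 | 10.1 | -2.1 |
| Troy Aikman | 7 | 6 | 4.5 | 1.5 | 9 | 5 | 5.9 | -0.9 |
| John Elway | 7 | 4 | 3.9 | 0.1 | 14 | 10 | 7.8 | 2.2 |
| Donovan McNabb | 9 | 5 | 4.6 | 0.4 | 7 | 4 | 3.7 | 0.3 |
| Peyton Manning | 6 | 2 | 2.9 | -0.9 | 14 | 7 | 9.0 | -2.0 |
| Dan Marino | 6 | 3 | 4.0 | -1.0 | 12 | 5 | 5.0 | 0.0 |
| Steve McNair | 5 | 3 | 2.3 | 0.7 | 5 | 2 | 2.4 | -0.4 |
| Joe Montana | 5 | 4 | 2.6 | 1.4 | 16 | 12 | 10.5 | 1.5 |
| Mark Brunell | 4 | 2 | 1.0 | 1.0 | 5 | 2 | 2.4 | -0.4 |
| Dave Krieg | 4 | 3 | 1.7 | 1.3 | 5 | 0 | 2.1 | -2.1 |
| Eli Manning | 7 | 4 | 2.7 | 1.3 | 4 | 4 | 1.7 | 2.3 |
| Ben Roethlisberger | 10 | 8 | 5.5 | 2.5 | 4 | 2 | 2.3 | -0.3 |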

There are some exceptions, of course, but the majority of QBs who had playoff success (as measured by the Manning Index) through age 27 saw their late-career performance fall off a cliff. In fact, if we run a linear regression attempting to predict a QB’s Manning Index from age 28 onward using his Manning Index through age 27, we get a statistically significant negative coefficient:

Post27ManningIndex ~ -0.3997 * Thru27ManningIndex

In other words, the better you do relative to expectations (i.e., the “clutcher” you are) early in your career, the more you can expect to choke later in your career. The same trend also holds up, albeit with less significance, if we split quarterbacks’ Manning Indices at age 29 (through which Brady was +3.8) instead of 27. It’s unclear what causes this “Brady Effect” — whether it be Vegas and/or the betting public becoming overconfident in a QB who had success early in his career, the effect of opponents having more tape on a QB as he ages, or just a matter of QBs peaking as playoff performers at an early age — but it’s a real historical phenomenon.
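For readers who want to poke at this themselves, here is a rough sketch of the same kind of fit using only the 13 quarterbacks listed in the table above. The post doesn't say exactly which quarterbacks went into its regression, so this small subset should not be expected to reproduce the -0.3997 coefficient; it only illustrates the technique with an ordinary least-squares line (the helper name is hypothetical).

```python
# (Index through age 27, Index after age 27) for the 13 QBs in the table
rows = [
    (2.0, -2.2),   # Brett Favre
    (3.5, -2.1),   # Tom Brady
    (1.5, -0.9),   # Troy Aikman
    (0.1, 2.2),    # John Elway
    (0.4, 0.3),    # Donovan McNabb
    (-0.9, -2.0),  # Peyton Manning
    (-1.0, 0.0),   # Dan Marino
    (0.7, -0.4),   # Steve McNair
    (1.4, 1.5),    # Joe Montana
    (1.0, -0.4),   # Mark Brunell
    (1.3, -2.1),   # Dave Krieg
    (1.3, 2.3),    # Eli Manning
    (2.5, -0.3),   # Ben Roethlisberger
]


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


early = [r[0] for r in rows]
late = [r[1] for r in rows]
slope, intercept = ols_slope_intercept(early, late)
print(f"Post27 ~ {intercept:+.3f} {slope:+.3f} * Thru27")
```

On these 13 rows the slope also comes out negative, which matches the direction of the effect described above, though a sample this small says nothing on its own about significance.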

While you could argue that quarterbacks lose their clutchness over time, maybe this just means clutch quarterbacking doesn’t exist — or at least isn’t detectable to the point that it could be used to predict future outcomes. Here’s a final bit of evidence for the latter camp: the correlation between a quarterback’s Manning Index in even-numbered years and his Manning Index in odd-numbered ones is just 0.05, meaning there’s really no relationship there at all.

If a quarterback’s Manning Index/clutchness was a persistent skill, we would expect to be able to predict it in one random subset of a player’s career from another random subset. But we see here that it’s impossible to predict the “even” half of a player’s playoff career from the “odd” half… and if we try to predict future performance from a quarterback’s Manning Index at an early age, we end up with a negative correlation! In other words, this basically means we should avoid forming any opinion of a quarterback’s clutchness from his playoff W-L record. Period.
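The even/odd check can be sketched as follows. The careers below are hypothetical placeholders (the real game-by-game data isn't reproduced in this post), and the function names are made up for illustration; the idea is to compute each quarterback's index separately in even-numbered and odd-numbered seasons and then correlate the two columns.

```python
from math import sqrt


def parity_indices(games):
    """Split one QB's games by season parity; return (even_index, odd_index).

    games is a list of (season_year, won, win_prob) with won as 1 or 0.
    """
    even = sum(won - p for year, won, p in games if year % 2 == 0)
    odd = sum(won - p for year, won, p in games if year % 2 == 1)
    return even, odd


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical careers, for illustration only (not real playoff data)
careers = {
    "QB A": [(2001, 1, 0.55), (2002, 0, 0.60), (2003, 1, 0.40), (2004, 0, 0.70)],
    "QB B": [(2005, 0, 0.50), (2006, 1, 0.65), (2007, 1, 0.45)],
    "QB C": [(2008, 1, 0.30), (2009, 0, 0.75), (2010, 0, 0.50), (2011, 1, 0.60)],
}
splits = [parity_indices(games) for games in careers.values()]
r = pearson([s[0] for s in splits], [s[1] for s in splits])
print(f"even/odd correlation: {r:.2f}")
```

If clutchness were a persistent skill, a correlation computed this way over real careers should sit well above zero; the 0.05 reported above is what you would expect from noise.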

  1. [Chase Note: Scanning the old comments to Doug’s post, Jason Lisk chimed in with this still-applicable comment: Steve Young is unfairly penalized for throwing more than 10 pass attempts in the 1987 loss to Minnesota at home (13-2 vs 8-7). Montana threw more passes that game, and was the starter, so Young got mop-up duty in the loss. Take that game out when he was not the starter and did not throw the most attempts in the game, and Young moves up the list and is closer to zero.]
  • Chase Stuart

    • Richie

      Unfortunately, the majority of the “Peyton Manning is a playoff choker” crowd would just think this is stat nerd babble and/or wouldn’t even understand.

      Although, I wonder if using the Vegas line is the best method. Let’s say that Peyton really is inherently a playoff choker, and that Vegas has figured it out. Aren’t they going to move the point spread away from him to reflect this, therefore reducing his “ExpW”?

      • Chase Stuart

        You are theoretically correct, but I can say with 99.9% certainty that Vegas isn’t lowering the lines of Peyton Manning games because he’s a choker.

      • Neil

        That’s true, although it would actually benefit his Manning Index. If he’s not a choker and Vegas thinks he is, his M.I. will be higher than it would otherwise be; if he is a choker, then Vegas’ expectations are accurate. Either way, he’s not getting shortchanged.

  • Danish

    Actually, I had not noticed that Neil had started contributing. I hardly ever glance at the byline – maybe make it a little bigger or emphatic? It’s nice to know who writes, if nothing else so that I can adjust for Chase’s Jets homerism (kidding).

  • HW3

    Could be that clutch-y QBs help a bad team exceed its limitations but Manning-y QBs are on good teams that are then difficult to outperform once you get to the playoffs. Also it is possible that players who go from clutch to Manning could start w/ undistinguished teams that they can help outperform but later in life get good teams built up around them and/or choose what looks like good teams through free agency, and then there is less scope to outperform the expectations of the good team.

  • Ryan

    Has nothing to do with Brady’s performance early vs. late. He was on a better team with a better defense early in his career vs. late in his career.

    11 times in his 23 playoff games his teams have scored 21 or fewer points. During the 3 SB runs, “he” was 4-0 in those games. Since their last SB win, “he” is 1-6 in those games.

    They had a better defense then. They were able to win those low scoring games. Now they can’t.

    • Neil

      I’ll grant you the part about defense, but was he really on a better team early in his career? During the Super Bowl years (2001-2004), the Pats went 48-16 (.750) with a +6.6 PPG differential and a +5.6 YPG margin. Since then they’ve gone 98-30 (.766) with a +10.5 PPG differential and a +48.6 YPG margin. By any statistical standard, these are better teams later in Brady’s career — and besides, Vegas is taking into account team strength anyway. The expected W-L records already have team quality baked into them.

      • Richie

        I don’t know. Is net PPG or YPG the best way to measure/predict overall success? A team can be +6.6 PPG in 3 main ways:
        1) Great offense that outscores the opposition.
        2) Great defense, that limits the scores of the opposition.
        3) Good on both sides of the ball.

        I believe the 2010, 2011 and 2012 Patriots rank as some of the best offenses ever. But their defenses ranked 8,15,9 in points and 25,31,25 in yards. So a lot of the +48.6 YPG margin late in Brady’s career is due to the success of the offense.

        But the 03 and 04 Patriots were stronger on defense. They were getting the +5.6 YPG margin because they were good on both sides of the ball. (But not historically great on either side.)

        Gregg Easterbrook pointed out that the top 9 (I think) highest-scoring teams in NFL history all failed to win the Super Bowl. I think having a less balanced team leaves a lower margin for error in the short run (playoffs).

        • Ryan

          I don’t know what your point is in all that.

          What matters is PPG. You can say the 2001 team was not great by yardage, but in the latter part of the season and the playoffs, after starting I believe 6-5, the most points they gave up to any team was 24 in the regular season against NE. Brady didn’t have to do a whole lot in that first SB run. He didn’t complete the AFC championship game that year and Bledsoe threw their only TD. In the SB they had 7 points on an INT return and 13 on offense. The D gave up 13, 17 and 17 points during those 3 games. So yes, they weren’t a great team overall in 2001, but they still held teams to few points scored and the often quoted metric of them being 24th or whatever in total D by yardage is meaningless. In fact, it wasn’t until the fifth playoff game Brady played in that he completed the game and needed to score more than 21 points to win.

          It’s pretty clear if you just look at things by points scored and points allowed over his career. Between 2001 and 2005 in the playoffs he had a D to win low scoring games, now he doesn’t. It’s PPG that matters.

          • Richie

            My point is this:

            Team A: +5.0 PPG (Offense averages 25.0 PPG, Defense allows 20.0 PPG)
            Team B: +5.0 PPG (Offense averages 14.9 PPG, Defense allows 9.9 PPG)

            They both have the same NET PPG, but achieve it in different ways.

            Are those two teams equally good? Are they equally likely to succeed in the playoffs?

          • Neil

            Re: Ryan – So I guess the metric you’d be interested in seeing would be Bill Barnwell’s “Flacco Index” (or whatever he calls it):


            That’s the QB equivalent of Support-Neutral W-L in baseball (http://www.baseballprospectus.com/current/snwlreport03.html). Basically saying, “When a team allows x points, how often does it win?”, and comparing that to the QB’s actual W-L.

            That’s a post I can also do in the next few weeks, if you want.

          • Neil

            Re: Richie – probably not. The more accurate way to measure it would be according to the pythagorean expectation, which would consider the defense-oriented team to be better because each of those 5 PPG is worth more in a lower scoring environment.

          • herewegobrowniesherewego

            They gave up 24 points against themselves? (Sorry, couldn’t resist. :) I figured you made an honest typo. Wiki shows you meant they gave up 16 points against the Browns and Jets [regular season] and 17 against the Steelers and Rams [playoffs] during that streak after starting 6-5.)

    • Chase Stuart

      Yeah, I’m with Neil on this one. The 2001 team wasn’t great by any measure, and while the ’03 and ’04 teams were really good, the Patriots have been heavy favorites in every playoff game the last six years.

  • Ryan

    Also, Manning and his teams are 2-9 in games scoring 21 points or fewer. That’s 2-9 vs. 5-6 for Brady over their careers.

  • http://cacc90.wordpress.com Sal

    Very interesting stuff. My one issue is that perhaps it would be more useful to frame this in terms of which teams have exceeded expectations or under-performed in the playoffs, as opposed to centering it on the quarterbacks. Since (correct me if I’m wrong) the index is calculated using the two teams’ winning percentages, perhaps it should be a team stat, rather than one we can use to label individuals.

    For example, in the case of P. Manning, there were four playoff games in which his defense blew a lead in a game’s final minute. Had the Colts/Denver defense done its job in half of these instances, Manning’s actual postseason record would be closer to what the index expects it to be.

    • Sal

      Disregard, hadn’t read the last paragraph. My fault!

  • Chase Stuart

    Sal brings up a good point. Doug put this disclaimer in his post, which I am sure Neil would endorse here:

    Just to be clear, I believe that teams — not quarterbacks — win football games, so I’m not claiming this is the One True Measure Of Clutchness. Whether I like it or not though, wins are credited to quarterbacks in virtually every discussion about quarterback greatness. This is merely a way of putting a quarterback’s win-loss record into perspective.

    • Neil

      Yeah, I’m the last person to say we should use QB W-L records — this is actually a way of showing how B.S. that practice really is. Even after taking into account how good a QB’s teams are (via the Vegas line), there’s no predictive value in his W-L. If anything, being “clutch” early in your career is correlated with choking later!

    • Ryan

      That I agree with and always have. If you look at just Brady and Manning and their stats, not record but passing stats, in the postseason, overall, they aren’t much different. Even the year the Colts won the SB, that team sucked on D in the regular season, but in the playoffs they played great and he played mediocre to good outside of the AFC championship game. He played better than the 3 TD-7 INT metric that year measures, but not great every game by any means. The difference was the D did its job finally. Measuring QBs by postseason record by itself is misleading. Especially when you consider you can go 0-1 and lose a divisional game with a bye, or go 1-1 by winning a WC game and losing the divisional game; it looks like the latter QB performed better, but he didn’t really. He still lost in the divisional round.

  • Matt W.

    Interesting take, but it’s striking to me that the words “small sample size” never make an appearance. You’re applying some sophisticated techniques to a dataset where the largest sample size is a tiny 24 games. Every breakdown in the article — before 27 vs. after, odd vs even, correlation — behaves like you would expect it to if the data were just basically statistical noise. Guys who got “lucky” early in their careers stopped getting lucky later, etc. You could roll different colored dice 100 times in different proportions and come up with similar narratives to what football fans try to attach to playoff performance.

    • Neil

      That’s kind of the point: the Manning Index isn’t really worth anything in terms of prediction. Basically, almost all of what we observe in QB W-L records is noise; nobody plays enough games for any observed differences to be statistically significant.

      And although the pre-27 vs post-27 split is actually significant at the .05 level, I’m pretty confident it’s a case of bias in the Vegas line — the betting public puts too much stock in the idea of the clutch QB, pushing up the expected Ws of a QB who merely got lucky early in his career. The QB regresses to the mean, but the line is built expecting his luck to continue, leading to worsening Manning Indices.

  • Matt W.

    Yes, totally agree. I would have preferred that insight be the main focus of the post and called out more explicitly. All in all, good stuff, though.

  • J

    What’s the rationale for using modified Vegas lines over regular season pythagorean wins probability (perhaps similarly only including games the starter in question was the primary passer)? Lines are a reasonable point-in-time indicator, but are still a prediction and subject to bias whereas pythagoras is a measure of a team’s relative skill set based on historical data. You even reference this bias in one of the comment replies, so I’m curious why not switch to a different data point?

    • Daniel

      Vegas lines are more accurate than the pythagorean probability theorem. If not, there would be no such thing as football betting.

      • Richie

        Isn’t the point of a Vegas line to get approximately half the money bet on each side? (As opposed to correctly predicting the final score.)

        • Neil

          Yes, the lines have their issues but are, as a general rule, the most accurate predictor of future results — even more so than the SRS or pythagoras. Opening lines actually are grounded in the mathematical model, but they also are equipped to account for extra information and the wisdom of crowds. Sometimes that wisdom can be exploited (i.e., sharps “fading” the public), but over a large enough sample the error between the final margin of a game and its line follows a normal distribution centered around zero.

          Re: Richie – My understanding is that they are aiming for even action on each side, in order to guarantee themselves profit on the vig. That’s not the same as predicting the final score, but amazingly, it works out to be the same thing in the long run.

  • Mike

    99.9% sure that the Vegas line isn’t affected by a quarterback’s rep? Seems kind of silly.

    A pretty big portion of the public is convinced Manning is less effective in the postseason than in the regular season. Why wouldn’t this move the Vegas line at least a little bit? Conversely, Brady gained a strong reputation for being clutch early in his career. This obviously might impact the Vegas line a bit.