## How to Project the Number of Passing Yards in a Game

Spoiler: the quarterback plays a big role in passing yards.

In May, I wrote that the scoring team is responsible for roughly 60% of the points it scores, while the opponent is responsible for 40% of those points. In other words, offense and defense both matter, but offense tends to matter more.

I was wondering the same thing about passing yards. When Team A plays Team B, how many passing yards should we expect? As we all know, Team A can look very different when it has Dan Orlovsky instead of Peyton Manning, so I instead chose to look at Quarterback A against Team B. Here’s the fine print:

1) I limited my study to all quarterbacks since 1978 who started at least 14 games for one team. Then, I looked at the number of passing yards averaged by each quarterback during that season, excluding the final game of every year. I also calculated, for his opponent, that team’s average passing yards allowed per game in their first 15 games of the season.

2) I then calculated the number of passing yards averaged by each quarterback in his games that season excluding the game in question. This number, which is different for each quarterback in each game, is the “Expected Passing Yards” for each quarterback in each game. I also calculated the “Expected Passing Yards Allowed” by his opponent in each game, based upon the opponent’s average yards allowed total in their other 14 games.

3) I then subtracted the league average from the Expected Passing Yards and Expected Passing Yards Allowed, to come up with era-adjusted numbers.

4) I performed a regression analysis using Era-Adjusted Expected Passing Yards and Era-Adjusted Expected Passing Yards Allowed as my inputs. My output was the actual number of passing yards produced in that game.
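The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the study. The function names are hypothetical, NumPy's least-squares solver stands in for whatever regression tool was actually used, and the sketch assumes the game's actual passing yards are also measured relative to league average so that a zero constant is sensible.

```python
import numpy as np

def leave_one_out_mean(values):
    """For each game, average the values from all the *other* games (step 2)."""
    values = np.asarray(values, dtype=float)
    return (values.sum() - values) / (len(values) - 1)

def era_adjust(expected, league_average):
    """Subtract the league average to get an era-adjusted number (step 3)."""
    return np.asarray(expected, dtype=float) - league_average

def fit_through_origin(qb_adj, def_adj, actual_adj):
    """Least-squares fit with the constant forced to zero (step 4).

    Assumption: actual_adj is the game's passing yards minus league average.
    Returns (quarterback coefficient, defense coefficient).
    """
    X = np.column_stack([qb_adj, def_adj])  # no column of ones, so no intercept
    y = np.asarray(actual_adj, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[0], coefs[1]
```

For a made-up three-game log of 250, 300, and 200 yards, `leave_one_out_mean` returns 250, 225, and 275: each game's "expected" figure is the average of the other two.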

Below is the best-fit equation. I forced the constant to be zero, since we don’t care what the constant is in this regression; we just want to understand the ratio between the two variables:

0.704 * Era-Adjusted Expected Passing Yards + 0.255 * Era-Adjusted Expected Pass Yards allowed by the Defense

The key number in that equation isn’t even in the equation: it is the ratio between the two coefficients. The quarterback coefficient is 2.76 times as large as the defense coefficient. In other words, 73% of the passing yards in the game can be attributed to the quarterback (and his offensive line, wide receivers, tight ends, and running backs), and 27% to the defense.
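The ratio and the percentage split both come straight from the two coefficients. A small sketch of that arithmetic (the function name is hypothetical):

```python
def attribution_shares(qb_coef, def_coef):
    """Rescale the two coefficients so that they sum to 100%."""
    total = qb_coef + def_coef
    return qb_coef / total, def_coef / total

qb_share, def_share = attribution_shares(0.704, 0.255)
ratio = 0.704 / 0.255

print(f"ratio: {ratio:.2f}")                 # 2.76
print(f"quarterback share: {qb_share:.1%}")  # 73.4%
print(f"defense share: {def_share:.1%}")     # 26.6%
```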

Let’s say we think Drew Brees is a 320-yards-per-game passer in an environment where the average team throws for 230 yards. If he faces a team that allows 200 yards per game passing, this formula would project Brees to throw for 288 yards.[1] Put Brees against a defense that allows 300 yards per game through the air, and his projection bumps up to 315 yards.
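Here is a rough sketch of how those two projections can be reproduced. Following the footnote, it rescales the coefficients so they sum to 100%, as if the quarterback plays the entire game. The function and constant names are hypothetical.

```python
QB_COEF, DEF_COEF = 0.704, 0.255  # coefficients from the best-fit equation

def project_passing_yards(qb_avg, def_avg_allowed, league_avg):
    """Project one game's passing yards from the two era-adjusted inputs."""
    total = QB_COEF + DEF_COEF
    qb_share, def_share = QB_COEF / total, DEF_COEF / total  # about 73.4% / 26.6%
    above_average = (qb_share * (qb_avg - league_avg)
                     + def_share * (def_avg_allowed - league_avg))
    return league_avg + above_average

print(round(project_passing_yards(320, 200, 230)))  # 288
print(round(project_passing_yards(320, 300, 230)))  # 315
```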

That’s a bit higher than the 60/40 breakdown from before, but not entirely unexpected. For starters, the 60/40 breakdown lumps together all teams regardless of changes in quarterback play: if we restricted that study to all games with the same quarterback, I suspect the numbers would diverge even more.

Then I did the same thing but used only seasons since 2000. The best-fit formula became:

0.748 * Era-Adjusted Expected Passing Yards + 0.247 * Era-Adjusted Expected Pass Yards allowed by the Defense

That moves the quarterback’s share from 73.4% to 75.1%. I also ran the numbers just since 2008, and the trend reversed, with the quarterback being responsible for 72.1% of the passing yards in a game.

One other note: The R^2 was 0.14 on the original equation, which is pretty low. That means a whole lot more goes into how many passing yards a player will have against a team than the average production of the player and the team. Perhaps something like Game Scripts? That’s food for thought for another day, but I did run a few regressions, with no particularly interesting results.[2] In any event, I think we can safely conclude that the number of passing yards a quarterback produces is roughly three parts quarterback, one part defense.

1. Brees is 90 yards above average, and 90 * 73.4% is 66 yards. The defense is 30 yards better than average, and -30 * 26.6% is -8 yards. Therefore, we project Brees at 58 yards above average, which is equal to 288 yards. Note that in the regression equation, the coefficients add up to “only” 0.959. That, I think, reflects the chance that the quarterback won’t play the entire game due to injury or a blowout. I think it makes more sense to project the quarterback as if he is going to play the entire game.
2. Okay, here they are:

On all games since 1978: 0.252 * Defense + 0.696 * Quarterback + 0.25 * Game Script. This implies that teams who are leading in games tend to throw for more yards. But I think this is a function of historical data.

On all games since 2000, the coefficient on the Game Scripts variable was -0.04 and nowhere close to being statistically significant. On the data over the past five years, the coefficient was -0.01, and nowhere close to significant.

• Sunrise089

“Put Brees against a defense that allows 300 yards per game through the air, and his projection bumps up to 315 yards.”

Typo?

I know you don’t claim otherwise, but the 73.4 -> 75.1 -> 72.1 QB percentages seem like noise relative to one another. Doesn’t change the conclusion that the QB matters about 3x as much as the defense, but no way do we know the ratio in any predictive sense to three significant figures.

• Chase Stuart

No typo. With an outlier like Brees, you are bound to get some results that feel off using a linear regression model for every quarterback.

And yes, I agree that the difference in those numbers is noise.

• Red

My first thought is that SoS adjustments for QBs need to be heavily regressed, or more conveniently just ignored altogether. If the passer makes up 75% of the variance and the defense only 25%, shouldn’t we multiply SoS by .33 to properly account for the discrepancy in magnitude? In other words, if a given team allows 0.9 ANY/A less than average, the SoS adjustment for a QB facing that team would only be +0.3.

• Chase Stuart

To be clear, this study looked at just passing yards, not ANY/A.

• Guru

I take it that when you say passing yards is “three parts quarterback”, you mean “three parts quarterback and his wide receiver support”, correct?

• Guru

Ah, missed where you stated that in the 8th paragraph. Never mind.

• Kibbles

If Drew Brees is a 320 yard passer, and he’s facing a well below average defense, why is he expected to average FEWER than 320 yards? If 320 yards is Drew Brees’s “true talent level”, shouldn’t we expect him to average fewer than that against great defenses and more than that against terrible defenses?

• Chase Stuart

Good question, Kibbles. The short answer is, “it’s complicated.” A longer answer would say that a tool used to project all quarterbacks might not be the best one to project Drew Brees.

But to get to the specific math portion of the question, I used poor shorthand in the original post. When I wrote:

“Let’s say we think Drew Brees is a 320-yards-per-game passer in an environment where the average team throws for 230 yards.”

I should have written:

“Let’s say that Drew Brees averages 320 yards per game in his other 15 games in an environment where the average team throws for 230 yards.”

Those are two very different statements, and what I really meant was the latter. In that case, our best guess as to Brees’ true ability is 90 * 0.704 = 63 yards above average. So his true ability is more like 293 yards per game. In that case, the 315 makes more sense. It’s above his true ability.

The first statement I wrote made it sound like it was an unimpeachable fact that Brees is a 320-yard/game passer. The more accurate statement would be that if Brees is a 320 y/g passer in 15 other games in an environment where the average quarterback throws for just 230 yards per game, then Brees’ true ability is probably lower than 320 yards per game. It’s probably more like 293.
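As a quick sketch of that adjustment (the function name is hypothetical, and treating the quarterback coefficient as a regression-to-the-mean factor is the reading described above, not a separate result):

```python
def estimated_true_average(observed_avg, league_avg, qb_coef=0.704):
    """Shrink an observed per-game average toward the league average."""
    return league_avg + qb_coef * (observed_avg - league_avg)

print(round(estimated_true_average(320, 230)))  # 293
```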

• Kibbles

That makes sense: you were saying Brees was a 320 y/g passer in a descriptive sense (“this is the level he has performed at so far”), whereas I was reading it in a predictive sense (“this is the level his performance will approach if given a sufficiently large sample size”).

• Ty

It would be interesting to see the percentage a QB’s influence has for Y/A, NY/A, and ANY/A.