**adjusts for how frequently a player’s team passes**, adjusts for the league passing environment, and adjusts for the number of games scheduled for the team that season.

The bolded adjustment caused some consternation among commenters. Some folks feel that if Team X passes twice as often as Team Y, we shouldn’t just expect the top WR on Team X to gain twice as many receiving yards as the top WR on Team Y. There’s some merit to that argument: in general, high-pass teams probably have more receivers on the field on a given play, and low-pass teams are often passing in more favorable situations. Being the only wide receiver on the field on a play-action pass is probably an easier way to gain yards than being one of four wide receivers in an obvious passing situation.

In other words, some feel that we shouldn’t expect a one-to-one increase in receiving yards (or Adjusted Catch Yards) relative to team pass attempts. Is there a way to test that? Neil and I came up with three such methods.

**Year-to-Year Case Study**

Neil looked at all wide receivers since 1970, ages 23-30, who started every game for the same team in back-to-back seasons (Years Y and Y+1). The sample was 245 pairs of player-seasons, after removing 2 extreme outliers: Drew Hill 1985 (went from 335 True Receiving Yards without the team attempt adjustment in 1984 to 1048 in 1985) and Roger Carr 1976 (went from 591 to 1438).

The experiment was set up as follows: for each player-season in the pair, Neil recorded how many True Receiving Yards they had *before* including the team passing attempts adjustment (i.e., the only adjustment to Adjusted Catch Yards was based on the league passing environment as a whole and the number of games in the season). Neil also recorded the player’s team’s ratio of dropbacks to the average team’s dropbacks. If a 1-to-1 effect exists with respect to number of team dropbacks and receiver production (which is assumed in the first version of True Receiving Yards), then you could predict the percentage change in True Receiving Yards from Year Y to Y+1 by looking at the percentage change in the ratio of Team-to-League Dropbacks from Year Y to Y+1.

As an example, in 1979, Alfred Jenkins had 831 TRY before the team attempts adjustment, and the Falcons’ ratio of dropbacks to the NFL average was 1.063 that season. In 1980, Jenkins again started 16 games for Atlanta, but this time, the team’s ratio of dropbacks to league average was 0.954. That means we’d expect Jenkins’ production to decline by 10.3%^{1}; in reality, Jenkins actually saw his True Receiving Yards increase by 14%. That’s one data point; do this for all 245 pairs, and we might learn something.
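The expected-change arithmetic from the Jenkins example is straightforward; here it is as a quick sketch (the function name is my own, and the ratios are the ones quoted above):

```python
# Sketch of the expected-change arithmetic from the Alfred Jenkins
# example; expected_pct_change is a hypothetical helper name.
def expected_pct_change(ratio_year_y, ratio_year_y1):
    """% change in production implied by a 1-to-1 team-attempts effect."""
    return (ratio_year_y1 / ratio_year_y - 1) * 100

# Falcons' dropbacks vs. league average: 1.063 in 1979, 0.954 in 1980.
print(round(expected_pct_change(1.063, 0.954), 1))  # -10.3, i.e. a 10.3% expected decline
```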

Neil ran a regression on those 245 pairs of seasons, and found the coefficient on the expected difference to be 0.403 (and it was statistically significant). That basically means that if there’s a 30% increase in team dropbacks, we should predict roughly a 12% increase in productivity (40% of that 30%).

That means Hines Ward, for instance, should only be getting a 15.7% productivity boost in 2004 when Pittsburgh threw 39% less frequently than the average team. And Calvin Johnson 2012 should only be dinged by 9.2% instead of 23%. Johnson’s record-breaking 2012 ranked 183rd in Neil’s original formula due to the heavy discount applied to the Lions offense: after all, Detroit did break the record for pass attempts in a season. By making the team pass attempts adjustment equal to 40% instead of 100%, Johnson moves up to 37th. Neil, like everyone else, likes Johnson, so he approves of this change.
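Scaling the adjustment by the regression coefficient instead of applying it one-to-one is simple arithmetic; a sketch, using the 39% figure for the 2004 Steelers from the text (the helper name is my own):

```python
# Scaling the team-attempts adjustment by the 0.403 coefficient instead
# of applying it one-to-one; the 39% figure comes from the post.
COEF = 0.403

def scaled_boost(pct_below_league_average):
    """Productivity boost for a receiver on a low-attempt team."""
    return COEF * pct_below_league_average

# Pittsburgh threw 39% less often than average in 2004:
print(round(scaled_boost(39.0), 1))  # 15.7 -> Hines Ward's boost shrinks to 15.7%
```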

**The Individual Games Approach**

While Neil was in Philadelphia monkeying around on his laptop trying to figure out the marginal value of a pass attempt to a receiver, I was doing the same thing in New York. Only I went about it in a completely different way.

I looked at all receivers since 1970 to play in at least 12 games and to have at least 400 receiving yards. I then separated the receivers, by game, into their three highest attempt games and their three lowest attempt games. My goal was to see how many Adjusted Catch Yards and Adjusted Catch Yards per Attempt these receivers were recording in high-attempt and low-attempt games. We’re looking at the same receiver, playing for the same team, in the same season: this is as close to a controlled experiment as you can get in football, even if it admittedly isn’t very close.

In the **high attempt games**, these receivers played for teams that passed 45.7 times, and the receivers recorded 91.7 Adjusted Catch Yards, giving them an average of 2.01 ACY/Att.^{2}

In the **low attempt games**, the teams passed 24.6 times, and the receivers recorded 63.6 ACY, giving them a 2.58 ACY/Att average.

So, in low-attempt games, the wide receivers did average more ACY/Attempt. Part of the reason for this, undoubtedly, is because the receivers were doing a better job in low-attempt games than high-attempt games. A few big plays early can lead to a low-attempt game, while a bunch of bad routes or drops in the first half can lead to a lot of second-half passes. Putting that issue aside, what can we take from this data?

When you increase pass attempts by 85.5%, you increase Adjusted Catch Yards by 44.3%; that implies a 52% ratio, not the 40% Neil got. I also ran the numbers on simply the single (instead of top three) highest- and lowest-attempt games. The results:

High: 49.8 Att, 98.0 ACY, 1.97 ACY/Att;

Low: 21.8 Att, 60.0 ACY, 2.75 ACY/Att.

This implies that for a 128% increase in pass attempts, you only see a 63% increase in ACY, which implies a 50% ratio.
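Both implied ratios fall out of the same arithmetic, sketched here with the split averages quoted above (`implied_ratio` is a hypothetical helper, not anything from the original analysis):

```python
# The implied-ratio arithmetic behind both game splits; the inputs are
# the average attempts and ACY figures quoted in the post.
def implied_ratio(att_hi, att_lo, acy_hi, acy_lo):
    """Ratio of the % increase in ACY to the % increase in attempts."""
    return (acy_hi / acy_lo - 1) / (att_hi / att_lo - 1)

print(round(implied_ratio(45.7, 24.6, 91.7, 63.6), 2))  # top-three split: ~0.52
print(round(implied_ratio(49.8, 21.8, 98.0, 60.0), 2))  # single-game split: ~0.49
```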

**Half-Season Approach**

One final study, courtesy of Neil. He took the same sample from his year-to-year study and re-ran the same experiment as before, except he used in-season splits instead of back-to-back seasons. Neil calculated the ratio of team dropbacks in odd-numbered games to dropbacks in even-numbered games to try to predict the change in Adjusted Catch Yards from even to odd-numbered games.

This time, the statistically significant coefficient on expected change in ACY was 0.779, a big jump from the 40% figure in the year-to-year study.

As you can see, the results are a bit all over the place. Using the 50% number is certainly convenient, and it does split the difference between all of the different results we’ve seen. In True Receiving Yards version 2.0, perhaps that’s the right approach: make all the adjustments as before, but only take half the difference between the player’s team pass attempts and the league average. This will still help Calvin Johnson (it will bring him up to 47th), and might limit some of the outliers who played on really low-attempt teams.
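One way to implement “take half the difference” (assuming the attempts adjustment works as a multiplier of the league-to-team dropback ratio; that form is my reading, not the confirmed TRY formula) is to move the team’s dropbacks halfway toward the league average before taking the ratio:

```python
# Sketch of the proposed 50% attempts adjustment: move the team's
# dropbacks halfway toward the league average before taking the
# league-to-team ratio. The multiplicative form of the TRY adjustment
# is an assumption here, not the confirmed formula.
def attempt_adjustment(team_dropbacks, league_avg_dropbacks, weight=0.5):
    """Multiplier on a receiver's production; weight=0.0 is the full
    one-to-one adjustment, weight=0.5 takes half the difference."""
    blended = team_dropbacks + weight * (league_avg_dropbacks - team_dropbacks)
    return league_avg_dropbacks / blended

# A team dropping back 20% more often than average (600 vs. 500):
print(round(attempt_adjustment(600, 500), 3))              # 0.909 with the half adjustment
print(round(attempt_adjustment(600, 500, weight=0.0), 3))  # 0.833 with the full adjustment
```

Note that a high-attempt team still gets discounted under the half adjustment, just less severely.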

But I’m opening this one up to the crowd. What are your thoughts? And, of course, there are other applications of this to consider: after all, how would this impact our decision to use Yards per Attempt (or NY/A or ANY/A or passer rating), which implicitly assumes a one-to-one ratio of productivity to attempts?


FWIW, if we use Bayes’ Theorem on the means and standard errors of my two very different coefficients, you get:

Updated Mean = ((mean_prior/stderr_prior^2)+(mean_obs/stderr_obs^2))/((1/stderr_prior^2)+(1/stderr_obs^2))

=((.403/.146^2)+(.779/.165^2))/((1/.146^2)+(1/.165^2))

=0.568

Combine that with Chase’s multiple findings of 50%, and that’s why I think ballparking it at 50% is the best estimate we have right now.

(Btw, Calvin Johnson 2012 ranks 47th since 1970 — and still #2 in 2012 — if we set the ratio to 50% instead of 40%. Cliff Branch ’74 is #1 no matter what we do; Harold Jackson ’73 is 6th; Marvin Harrison ’02 just edges ’99 as his best season; and Hines Ward’s career numbers might still be vexing to Chase.)
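The precision-weighted average above can be reproduced in a few lines; a sketch using the two means and standard errors quoted in the comment:

```python
# Precision-weighted average of the two regression coefficients, using
# the means and standard errors quoted above.
def precision_weighted_mean(estimates):
    """estimates: list of (mean, standard error) pairs."""
    num = sum(m / se ** 2 for m, se in estimates)
    den = sum(1 / se ** 2 for _, se in estimates)
    return num / den

print(round(precision_weighted_mean([(0.403, 0.146), (0.779, 0.165)]), 3))  # 0.568
```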

1) What does a scatterplot of the X and Y variables in these regressions look like?

2) What does a scatterplot of the regression residuals look like?

3) What does a histogram of the Y variable look like?

If I understood the previous thread correctly, the issue being raised was that the relationship between team pass attempts and receiving yards (or ACY in this particular case) isn’t linear. If it isn’t, then a linear regression is the wrong method right out of the box, so the previous analysis and these three are prone to be screwy (for lack of a better term) from jump street. The scatterplots would tell you if it’s a nonlinear phenomenon or not.

The histogram would tell you if you need to do some transformation to make the dependent variable normally distributed, which is another assumption of linear regression.

Incidentally, all of these points are things about regression with which I’m sure you’re both aware. Just throwing s*** out there to eliminate the most obvious potential issues, so we can really get down to brass tacks.

Thanks, Danny. These are good questions, probably best fielded by Neil since I didn’t end up using a regression model.

I ran a Q-Q plot on each dataset, and they’re pretty normal. Now, the R^2 on these regressions is practically 0, so the scatterplot looks like you’d expect for that kind of relationship — no discernible shape, just points scattered in a “blob” around the axes (like the two on either side of top-middle here: http://upload.wikimedia.org/wikipedia/commons/d/d4/Correlation_examples2.svg).

Without a relationship, you might be asking if it’s appropriate to even use regression in this instance. I think it still is, because the coefficients for expected production increase/decrease are highly significant. Take a look at this post for a very detailed rationale:

http://blog.philbirnbaum.com/2009/05/regression-equation-versus-r-squared.html

Chase,

I might be mischaracterizing this somewhat, but I think you’ve said before that you view WR targets as a positive indicator of quality even if receptions don’t follow since a target indicates the receiver was the best option on that play. If I’m remembering that right, would it make any sense to not discount for pass attempt totals at all since they indicate the WRs were doing a better job than the RBs at producing yardage?

I don’t actually believe this myself, but I also suspect I rate NFL coaches’ game-strategy skills significantly lower than you do.

Generally, it makes sense to discount for pass attempt totals since I think pass attempt totals are largely a function of game script. In that way, if we assume the best receivers are on the best teams, and the worst receivers are on the worst teams, then the worst receivers would play on the teams that pass the most, and the best receivers would play on the teams that pass the least. Hence the need to adjust for attempts.

However, you did just make me think of something.

In this post, I looked at Game Scripts and Pass Identity of each team: http://www.footballperspective.com/game-scripts-the-best-teams-of-2012/

After adjusting for Game Script, the Falcons were the most pass-happy team last year. Not coincidentally, they have the best 1-2-3 set of receivers in the league. Kansas City had the most run-heavy identity, and the Chiefs wide receivers are pretty weak. So perhaps after adjusting for Game Script, pass attempts *would* be an indicator of quality. Of course, I suspect that the talent of the QB and the OC would be more relevant factors than the quality of the WRs, but it’s an interesting thought to contemplate.

I like (A)(N)YPA because it’s a simple, easily calculated statistic that does a good job of giving you a big-picture overview. It’s not a complete picture, by any means. To put it in FO parlance, ANYPA is DVOA, but you need both DVOA and DYAR to analyze a player’s season.

If you have the necessary data, I would try the following:

1. Use ACY to separate skill position players on a team into legitimate receiving threats and those who are mainly blockers/runners.

2. Separate team pass attempts into bins based on how many receiving threats are on the field.

3. Within each bin repeat your studies of the relationship between ACY and team pass attempts.
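The binning in step 2 might look something like this sketch, assuming play-level data with the players on the field recorded; the field names and the 400-ACY “receiving threat” threshold are hypothetical:

```python
# Rough sketch of binning team pass attempts by the number of
# legitimate receiving threats on the field. Data layout and the
# 400-ACY threshold are hypothetical assumptions.
from collections import defaultdict

def bin_attempts_by_threats(plays, acy_by_player, threat_threshold=400):
    """plays: dicts with 'players_on_field' (list of names) and 'acy' keys.
    acy_by_player: season ACY totals, used to label threats vs. blockers."""
    bins = defaultdict(lambda: {"attempts": 0, "acy": 0.0})
    for play in plays:
        threats = sum(
            1 for p in play["players_on_field"]
            if acy_by_player.get(p, 0) >= threat_threshold
        )
        bins[threats]["attempts"] += 1
        bins[threats]["acy"] += play["acy"]
    return dict(bins)
```

Within each bin, the high-attempt vs. low-attempt comparison from the Individual Games Approach could then be repeated.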