Rushing EPA and Yards per Carry

by Chase Stuart on July 15, 2014

Today I want to look at how traditional rushing statistics compare to rushing Expected Points Added, one of the main stats used over at Advanced Football Analytics. In my analysis, I used the EPA numbers for each team in each season from 2002 to 2013.

Stickiness from year to year

Yards per carry is not a sticky metric; that is, it is not very consistent from year to year. The correlation coefficient between a team’s yards per carry in Year N and its yards per carry in Year N+1 was just 0.31. The square of the correlation coefficient is sometimes described in terms of “explanatory power”: loosely speaking, 0.31 squared is about 0.10, so roughly 10% of the variation in teams’ YPC averages in Year N+1 can be explained by their YPC averages in Year N.
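For anyone who wants to check stickiness on their own data, here is a minimal sketch of the calculation in Python. The function names `pearson_r` and `explained_share` are mine, not anything from AFA, and the inputs are assumed to be two equal-length lists holding each team’s value in Year N and in Year N+1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

def explained_share(r):
    """Square of the correlation coefficient ('explanatory power')."""
    return r * r

# The 0.31 figure quoted above works out to just under 10%.
print(round(explained_share(0.31), 3))  # 0.096
```

Pairing each team-season with the same team’s following season and passing the two lists to `pearson_r` would be one way to reproduce a year-to-year figure like the 0.31 above.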

Now, a lot of metrics aren’t sticky from year to year, because the NFL is a highly competitive league. In fact, Rushing EPA per play has an even lower year-to-year correlation coefficient (CC), at just 0.30. That’s a strike against EPA. On the other hand, Burke’s success rate metric has a CC of 0.39, which is more impressive. For comparison, the year-over-year CC for Net Passing Yards per Attempt is 0.43.

Correlations between EPA and Other Metrics

Let’s switch from the same metric in different years to different metrics in the same year. The CC between EPA and rushing yards per carry is 0.77. That sounds pretty high, but the CC between EPA and rushing yards — that is, comparing an efficiency statistic to a gross one — is 0.70.

The CC between EPA and success rate is 0.80; I’m not quite sure what to think about that. If we add a 20-yard bonus for every touchdown, and calculate the CC between Rush EPA and this adjusted form of YPC, the CC is 0.83. But if we also provide a 9-yard bonus for first downs, that vaults the CC to 0.87. In other words, this provides support for the great work Brian Burke produced a couple of weeks ago. Using Adjusted Yards per Carry, with a 20-yard bonus for touchdowns and a 9-yard bonus for first downs, corresponds very well with AFA’s EPA metric.
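To make the adjusted stat concrete, here is a small sketch of Adjusted Yards per Carry as described above: rushing yards plus a 20-yard bonus per touchdown and a 9-yard bonus per first down, divided by carries. The function name and the sample team line are made up for illustration. The function simply uses whatever first down count it is given, so whether touchdown runs are also counted as first downs is left to the caller.

```python
def adjusted_ypc(yards, carries, touchdowns, first_downs,
                 td_bonus=20.0, fd_bonus=9.0):
    """Adjusted Yards per Carry with yardage bonuses for TDs and first downs."""
    if carries <= 0:
        raise ValueError("carries must be positive")
    bonus_yards = td_bonus * touchdowns + fd_bonus * first_downs
    return (yards + bonus_yards) / carries

# Hypothetical team season (not real data): 2,000 yards on 450 carries,
# with 15 rushing touchdowns and 110 rushing first downs.
plain = 2000 / 450
adjusted = adjusted_ypc(2000, 450, 15, 110)
print(round(plain, 2), round(adjusted, 2))  # 4.44 7.31
```

Keeping the bonuses as parameters makes it easy to try other weights and see how little or how much the result moves.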

Regression Analysis

What if we use three variables (yards per carry, touchdowns per carry, and first downs per carry) to predict EPA per carry? We get the following formula using a linear regression:

EPA/Carry = -0.49 + 0.058 * YPC + 1.29 * TD/Carry + 0.90 * FD/Carry

The R^2 is 0.87, but the more important thing we can tease out of the results is the relationships among the coefficients. For example, the weight on the TD/Carry variable is 22.3 times as large as the weight on the Yards/Carry variable. And the weight on the First Downs/Carry variable is 15.5 times as large as the weight on the Yards/Carry variable.
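Here is a rough sketch of that formula in Python, along with the ratio arithmetic. The coefficients are the rounded ones printed above, so the ratios come out close to, but not exactly on, the figures quoted in the text. The function names and the sample inputs are hypothetical.

```python
# Rounded coefficients as printed in the regression formula above.
INTERCEPT = -0.49
W_YPC = 0.058
W_TD = 1.29
W_FD = 0.90

def predict_epa_per_carry(ypc, td_per_carry, fd_per_carry):
    """Apply the fitted linear formula to one team's rushing rates."""
    return (INTERCEPT + W_YPC * ypc
            + W_TD * td_per_carry + W_FD * fd_per_carry)

def implied_yard_bonuses():
    """Express the TD and first down weights in units of yards per carry."""
    return W_TD / W_YPC, W_FD / W_YPC

td_bonus, fd_bonus = implied_yard_bonuses()
print(round(td_bonus, 1), round(fd_bonus, 1))  # 22.2 15.5

# Hypothetical team: 4.0 YPC, a TD on 3% of carries, a first down on 22%.
print(round(predict_epa_per_carry(4.0, 0.03, 0.22), 4))  # -0.0213
```

Dividing each weight by the YPC weight is what puts touchdowns and first downs on the same yardage scale as the 20 and 9 bonuses.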

Does this mean the 20 and 9 weights are wrong? Perhaps, although note that the correlation coefficient is the same using either set of weights.^[1] This means that while the regression implies that different weights should be used, changing the weights doesn’t seem to make much of a difference. I suspect that the regression weights are higher because first downs and touchdowns are correlated with other types of good play, so I’m okay for now keeping the weights at 20 and 9.

What if we perform the same regression, but use success rate as our dependent variable instead of EPA?

Success Rate = 16.8 + 0.92 * YPC + 45.9 * TD/Carry + 81.2 * FD/Carry

Look at how unimportant the YPC variable is to predicting success rate: going from 3.5 YPC to 5.5 YPC predicts an increase in success rate of only about 1.8 percentage points, assuming first down and touchdown rates remain the same. The coefficient on the first down variable is 92.4 times as large as the coefficient on yards, while the coefficient on TDs is 49.8 times as large as the coefficient on yards. Presumably, the TD coefficient is the smaller of the two because touchdowns are much less frequent than first downs, so knowing the number of touchdowns a team scores won’t tell you much about the team’s overall success rate.
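The YPC effect is easy to verify with a quick sketch. This assumes success rate is measured on a 0 to 100 scale, which the 16.8 intercept suggests but the formula itself does not state. The touchdown and first down rates below are hypothetical and are held fixed, so only the YPC term changes.

```python
def predict_success_rate(ypc, td_per_carry, fd_per_carry):
    """Apply the success rate regression above (rounded coefficients)."""
    return 16.8 + 0.92 * ypc + 45.9 * td_per_carry + 81.2 * fd_per_carry

# Hold TD and first down rates fixed at made-up values; vary only YPC.
low = predict_success_rate(3.5, 0.03, 0.22)
high = predict_success_rate(5.5, 0.03, 0.22)
delta = high - low
print(round(delta, 2))  # 1.84
```

Because the model is linear, the change is just 0.92 times the two-yard difference, no matter which touchdown and first down rates are plugged in.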

I don’t have too much else to add, so let me know your thoughts in the comments. I have a lot of reservations about using YPC, but I think AYPC is a big step in the right direction. And for those curious, the correlation coefficient between AYPC in Year N and AYPC in Year N+1 is 0.35.

References
↑1	Okay, it’s slightly higher: 0.874 to 0.870.
