
As most of you know, I also write for Footballguys.com, which I consider to be the best place around for fantasy football information. If you’re interested in fantasy football or like reading about regression analysis, you can check out my article over at Footballguys on how to derive a better starting point for running back projections:

Most people will use last year’s statistics (or a three-year weighted average) as the starting point for their 2013 projections. From there, fantasy players modify those numbers up or down based on factors such as talent, key off-season changes, player development, risk of injury, etc. But in this article, I’m advocating that you use something besides last year’s numbers as your starting point.

There is a way to improve on last year’s numbers without introducing any subjective reasoning. When you base a player’s fantasy projections on his fantasy stats from last year, you are implying that all fantasy points are created equal. But that’s not true: a player with 1100 yards and 5 touchdowns is different from a runner with 800 yards and 10 touchdowns.

Fantasy points come from rushing yards, rushing touchdowns, receptions, receiving yards, and receiving touchdowns. Since some of those variables are more consistent year to year than others, your starting fantasy projections should reflect that fact.

The Fine Print: How to Calculate Future Projections

There is a method that allows you to use certain metrics (such as rush attempts and yards per carry) to predict a separate variable (like future rushing yards). It’s called multivariate linear regression. If you’re a regression pro, great. If not, don’t sweat it; I won’t bore you with any details. Here’s the short version: I looked at the 600 running backs to finish in the top 40 in each season from 1997 to 2011. I then eliminated all players who did not play for the same team in the following season. I chose to use per-game statistics (pro-rated to 16 games) instead of year-end results to avoid having injuries complicate the data set (but I have removed from the sample every player who played in fewer than 10 games).
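For readers who want to see the mechanics, here is a minimal sketch of the kind of data preparation and regression described above. It is not the code behind the article: every number is invented, the helper names (`prorate`, `solve`, `fit_ols`) are hypothetical, and applying the 10-game cutoff to the first season of each pair is an assumption.

```python
# Illustrative sketch only: all numbers are invented, not real player data.

def prorate(total, games, season_games=16):
    """Scale a season total to a 16-game pace."""
    return total / games * season_games

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations.
    Returns [intercept, coefficient_1, coefficient_2, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Year N: (games, rush attempts, rush yards); year N+1: (games, rush yards).
pairs = [
    ((16, 300, 1300), (16, 1200)),
    ((12, 200, 900), (15, 1000)),
    ((8, 150, 700), (16, 1100)),   # dropped: fewer than 10 games in year N
    ((16, 250, 1000), (14, 900)),
    ((14, 280, 1150), (16, 1250)),
    ((16, 180, 850), (16, 800)),
]

rows, target = [], []
for (g, att, yds), (g_next, yds_next) in pairs:
    if g < 10:  # assumed cutoff: at least 10 games in year N
        continue
    rows.append([prorate(att, g), yds / att])  # attempts per 16 games, yards per carry
    target.append(prorate(yds_next, g_next))   # next-year rushing yards per 16 games

intercept, b_att, b_ypc = fit_ols(rows, target)
print(f"next-year rush yards ~ {intercept:.1f} + {b_att:.2f}*att + {b_ypc:.1f}*ypc")
```

With only five invented rows the fitted coefficients mean nothing; the point is the shape of the calculation: pro-rate, filter, then regress next year's value on this year's inputs.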

So what did the regression tell us about the five statistics that yield fantasy points? A regression informs you about both the “stickiness” of the projection (i.e., how easy it is to predict the future variable using the statistics we fed into the formula) and the best formula to make those projections. Loosely speaking, the R^2 number below tells us how easy that metric is to predict, and a higher number means that statistic is easier to predict. Without further ado, in order from least to most random, here is how to predict 2013 performance for each running back based on his 2012 statistics:

You can read the full article here.
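To make the R^2 idea above a little more concrete, here is a small sketch of how that number can be computed from a set of predictions. The values are invented, and `r_squared` is a hypothetical helper, not anything taken from the article. It uses the standard definition, one minus the ratio of unexplained variation to total variation.

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Invented next-year values and two invented sets of projections.
actual = [900.0, 1100.0, 1000.0, 1300.0]
sticky = [950.0, 1050.0, 1000.0, 1250.0]   # projections that track reality closely
noisy = [1000.0, 1000.0, 1100.0, 1150.0]   # projections that track it loosely

print(round(r_squared(actual, sticky), 3))
print(round(r_squared(actual, noisy), 3))
```

The closer the projections sit to what actually happened, the closer R^2 gets to 1, which is the sense in which a higher number means a statistic is easier to predict.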

  • mrh

    Why does the rushing yards formula include yards/rush but the receiving yards formula use total receiving yards? Did you run two separate rushing yards regressions for yards/rush and total yards and find the R^2 was higher when using yards/rush? And similarly that total receiving yards yielded a higher R^2 than using yards/reception?

    • Chase Stuart

      Good question.

      Yards per reception for RBs was nowhere near statistically significant, IIRC, so I chose to just use receiving yards.

      As for yards/rush over rushing yards, whether I used both yards per rush and rush attempts or rushing yards, the correlation coefficient was roughly the same. But it seemed preferable to split up rushing yards in this manner; even if it doesn’t make a big difference when you look at hundreds of running backs, it felt like the right thing to do.

      For what it’s worth, the formula spits out these projections:

      300 carries for 100 yards = 988

      200 carries for 100 yards = 915

      I think that makes sense: I would imagine that extreme YPC averages on both sides are likely to regress to the mean, whereas number of carries is a very telling statistic.