
Guest Post: Putting Lipstick on the YPC pig

Brian Malone, a writer for dynastyleaguefootball.com, has put together a great guest post today. You can follow him on Twitter at @julesdynasty. Thanks to Brian for today’s article!

Putting lipstick on the YPC pig

We all know that yards per carry is, as Danny Tuccitto puts it, nearly a “bunkum stat” in terms of predictive power. Even as a descriptive tool, YPC is tolerable but unsatisfying. Matt Forte (4.12) and Chris Johnson (4.15) had nearly identical YPC in 2015, but their paths to these numbers were notably different. Forte rarely got stuffed behind the line of scrimmage, and he was well above average at posting four-yard gains. Johnson, in contrast, was a home run hitter, padding his YPC with runs longer than 20 yards.

Painting a better picture

We could supplement YPC with the standard deviation of a player’s runs.  Or, as Jeff Levy suggests, we could include confidence intervals to define a player’s “true” YPC.  Both supplements add useful information, but neither smacks the reader in the face with the contrast between Forte and Johnson.  For that, we may need a visual.
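For reference, here is a rough sketch of the two numeric supplements mentioned above. The carry logs are made up, the function name `ypc_summary` is hypothetical, and the interval is a plain normal approximation, which is an assumption on my part and not necessarily the method Jeff Levy uses. Run distributions are skewed, so treat the interval as a ballpark.

```python
import math
import statistics

def ypc_summary(carries, z=1.96):
    """Return YPC, sample standard deviation, and a normal-approximation CI."""
    n = len(carries)
    if n < 2:
        raise ValueError("need at least two carries")
    mean = statistics.fmean(carries)
    sd = statistics.stdev(carries)          # sample SD of individual runs
    half_width = z * sd / math.sqrt(n)      # z = 1.96 gives roughly a 95% interval
    return {"ypc": mean, "sd": sd, "ci": (mean - half_width, mean + half_width)}

# Made-up carry logs that both average exactly 4.0 yards per carry.
grinder = [3, 4, 5, 4, 3, 5, 4, 4, 5, 3]
boom_or_bust = [0, 1, -2, 1, 0, 2, 1, 0, 36, 1]

for name, log in (("grinder", grinder), ("boom_or_bust", boom_or_bust)):
    print(name, ypc_summary(log))
```

Both toy backs have the same YPC, but the second has a far larger standard deviation and a far wider interval. That tells you the runs are spread out, though not where along the distance scale the difference lives.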

[Figure: Forte vs. Johnson]

This shows what portion of a player’s runs went for at least x yards, compared to the average portion among all running backs with at least 25 carries in the same season.  So Forte was roughly one standard deviation better than average at getting past -2, 1, and 4 yards, but average to below average at any distance beyond 8 yards.  Johnson, on the other hand, was middling across the board.  His biggest strength was a half-standard deviation advantage in breaking long runs.
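The calculation behind a chart like this can be sketched as follows. The thresholds are the distances named in this post (-2, 1, 4, 8, 14, 20, 30), but treating them as the complete bucket set is an assumption, as are the function names, the unweighted average across qualifying backs, and the use of the population standard deviation.

```python
import statistics

# Distances mentioned in the post; the exact bucket set is an assumption.
THRESHOLDS = (-2, 1, 4, 8, 14, 20, 30)

def at_least_rates(carries, thresholds=THRESHOLDS):
    """Portion of runs that gained at least x yards, for each threshold x."""
    n = len(carries)
    return {t: sum(1 for yards in carries if yards >= t) / n for t in thresholds}

def z_scores(player_carries, league, thresholds=THRESHOLDS, min_carries=25):
    """Standard deviations above or below the league-average rate at each threshold.

    league maps a player name to that player's list of run lengths for one season.
    """
    qualified = [at_least_rates(c, thresholds)
                 for c in league.values() if len(c) >= min_carries]
    if not qualified:
        raise ValueError("no players meet the carry minimum")
    player = at_least_rates(player_carries, thresholds)
    scores = {}
    for t in thresholds:
        rates = [r[t] for r in qualified]
        mean = statistics.fmean(rates)
        sd = statistics.pstdev(rates)
        # If every back has the same rate, there is no spread to measure against.
        scores[t] = (player[t] - mean) / sd if sd > 0 else 0.0
    return scores
```

Calling `z_scores(forte_carries, league_2015)` with real play-by-play data (not included here) would return one number per threshold, which is what gets plotted on the vertical axis.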

One of 2015’s biggest home run hitters was Todd Gurley (4.83 YPC).  He didn’t have a “YPC twin” like Forte and Johnson, but a quick search showed that Alfred Morris – decidedly not a home run hitter – put up 4.81 YPC in 2012.  Here’s how they each posted those numbers:

[Figure: Gurley vs. Morris]

Gurley was met in the backfield often, but this also suggests he wasn’t very successful getting through the second level of the defense.  He made hay on long runs, but relying on those big gains is a dangerous proposition.

Gurley’s struggles behind the line of scrimmage prompted me to look at fellow rookie Melvin Gordon (3.48 YPC). Contrary to the “Chargers don’t have an offensive line” narrative, Gordon wasn’t obviously worse than Gurley at getting a few yards past the line of scrimmage. Gordon’s low YPC stems from his failure to break off runs longer than 4 yards.

[Figure: Gurley vs. Gordon]

Looking at runs of less than 8 yards, it’s tough to tell Gordon and Gurley apart. But Gurley turned those 4-yard runs into 8-, 14-, 20-, and 30-yard runs at a much better rate.

Building a better metric

Now that we’ve created something that illustrates rushing production better than YPC, maybe we can use it to predict rushing production better than YPC.  It’s a low bar, after all.

Indeed, we can clear the YPC bar, but maybe not by much. Using data from 2000 to 2009 and the buckets included in the above visualizations, I data-snooped my way to a model[1] that predicted season n+1 YPC better than season n YPC does (for running backs with at least 100 carries in each season). I then tested that model on 2010 to 2014 data, and the results were positive: the model actually improved when applied to the testing data. Here are the results across all seasons (n = 2000-2014).

[Figure: season xYPC]

Nothing groundbreaking here. Each additional yard of YPC predicts an extra 0.24 YPC the following season, while each additional yard of xYPC predicts an extra 0.33 YPC the following season. The good news is we can probably build a better model than this one. My main goal was to illustrate running back performance; any predictive power is a bonus.
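To make the fit-then-test step concrete, here is a minimal sketch of how one might compare two predictors out of sample. It is not the model from this post. It fits a one-variable least-squares line on seasons before 2010 and scores it on 2010 and later. The row layout (`season`, `ypc`, `xypc`, `next_ypc`), the function names, and the choice of RMSE as the yardstick are all assumptions for illustration. The fitted slope is the kind of number quoted above: extra next-season YPC per yard of the predictor.

```python
import math
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def evaluate(rows, predictor, first_test_season=2010):
    """Fit on earlier seasons, then report the slope and RMSE on later seasons.

    rows is a list of dicts, one per qualifying player-season (hypothetical layout).
    """
    train = [r for r in rows if r["season"] < first_test_season]
    test = [r for r in rows if r["season"] >= first_test_season]
    intercept, slope = fit_line([r[predictor] for r in train],
                                [r["next_ypc"] for r in train])
    errors = [r["next_ypc"] - (intercept + slope * r[predictor]) for r in test]
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    return {"slope": slope, "test_rmse": rmse}
```

With a real table of player-seasons, comparing `evaluate(rows, "ypc")` against `evaluate(rows, "xypc")` would show which predictor holds up better on seasons the fit never saw. Lower `test_rmse` is better.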

Speculating a better speculation

Even if this model isn’t optimal, I can’t not use it.  Let’s see what it predicts for 2016.

The model doesn’t like Gurley (4.8 YPC; 4.2 xYPC) or Lamar Miller (4.5 YPC; 4.1 xYPC).  That’s probably not a good sign for the model.  For what it’s worth, Lamar Miller’s 2015 looks much like, say, Charcandrick West’s (4.0 YPC; 4.1 xYPC):

[Figure: Miller vs. West]

The model prefers mostly low-sample rushers.  Perhaps most interestingly, it likes Forte (4.1 YPC; 4.4 xYPC), Jeremy Langford (3.6 YPC; 4.2 xYPC), and Ka’Deem Carey (3.7 YPC; 4.5 xYPC).  That oddity prompted me to make one more visualization.  I’ll let you decide how to interpret it.

[Figure: Langford vs. Carey]

  1. Without getting too deep into the weeds, the model assigns the following weights to standard deviations from the mean of the portion of a player’s runs gaining at least x yards: [Image: table of model weights]
  • sacramento gold miners

    Barry Sanders was on both sides of this coin. He lost yardage or was stuffed on a number of carries, but those were offset by his longer gains. He was often taken out of the game inside the opponent’s ten-yard line, which must have been very frustrating to such a great RB.

    • sn0mm1s

      He wasn’t often taken out – that is a myth.

      • sacramento gold miners

        I guess it depends on your definition of often. I saw many Detroit games in which that happened, and the announcers commented on it. But in other Lions games, it may not have happened as much. It’s extremely rare to see other HOF backs pulled in that kind of situation.

      • Richie

        I don’t know whether it’s true or not.

        However, from 1994-1998, Sanders had 70 rush attempts in “goal-to-go” situations. The Lions ran 245 total such plays. Sanders carried the ball on 29% of the Lions’ goal-to-go plays. (He also caught 6 passes.)

        By comparison, from 94-98 Emmitt Smith carried 145 times (48% of Dallas’ goal-to-go plays).

        During the period 1994-98, Sanders ranks 13th in carries in goal-to-go situations, behind guys like Karim Abdul-Jabbar, Natrone Means and Bam Morris.

        But, Sanders ranked second in TOTAL rush attempts during those years, only slightly behind Emmitt Smith, but way more than Means, Morris and Abdul-Jabbar.

        So, maybe he was in the game, but the team definitely wasn’t giving him the ball like other bell cow RB’s of the day. Not sure if this is more an indictment of Sanders or Wayne Fontes and Bobby Ross.

        • sn0mm1s

          From 1989-1992 he was the only non-QB on the Lions to rush for a TD. He also was the only non-QB in 1996. No other RB in history is close to doing that. He scored pretty much the exact same % of non-QB rushing TDs as Emmitt Smith. I recall an interview with Fontes in which he basically said he started subbing out Barry because he had an injury in 1993 and, since any RB can plunge into the endzone, he thought it would eliminate some of the pounding on Barry but not the ability to score.

          • Richie

            It looks like both sides may be correct.

            I was using 1994-1998 data because that’s as far back as the play finder goes.

            But we can look up just the touchdowns. From 1989-1992, Sanders scored 31 rushing TD’s of 1-10 yards, which led the league during that timeframe. http://pfref.com/tiny/7nPmf
            Smith and Okoye were right behind with 29 TD’s, but Smith played one fewer season.

            Then, from 1994-1998 (one season more than the previous data set), http://pfref.com/tiny/r69QT Sanders had only 16 rushing TD’s of 1-10 yards, which was 23rd-best and half as many as he scored at the beginning of his career. (Emmitt Smith took off, with 64 <10-yard TD's!!!)

            So, it looks like Sanders was getting used near the goal line early in his career, but not as much at the end of his career.

            • sn0mm1s

              There is also the issue of opportunity. Excluding QB carries (since handing the ball off to someone else is what we are really debating here), from 1994-1998 Barry took 65% of the GtG carries, Smith took 77%, TD took 63%, Karim (1996-1998) took 68%, Faulk 70%, Marcus Allen (to 1997) 65%, Thurman Thomas (to 1997) 58%. I didn’t tally the guys that played for multiple teams but from those #’s I wouldn’t say the Lions are handing off the ball to someone other than Barry much differently than other RBs (aside from Smith).

              Here is why it seemed like such a big deal. The Lions didn’t have a FB. I don’t know the exact percent of plays that they ran with a single back but it was very high. When there are two RBs on the field and you don’t give it to your star it doesn’t seem like a big deal. However, it is very noticeable when you remove your star RB and put in someone else.

              Also, Barry took the highest % of his team’s non-QB carries in NFL history. He took the handoff 86% of the time which dwarfed all other players.

  • Richie

    It looks like you touched on 2 topics here: predictability and description.

    One thing I am more interested in is description. I can’t remember if anybody has ever done any work with median carry length. In the case of Forte vs Johnson, I would be curious to know what their median length of carry was. Does that help tell us which guys are breaking off long runs?

    • sn0mm1s

      IIRC, median carry for almost all RBs is 3, mode is 2, average is usually low 4s. Not sure where I read that but that info has been out for a while. I have argued that YPC is a good stat for measuring how good an RB is. It isn’t a predictive stat because it is influenced heavily by long runs – and long runs are hard to break. The model above isn’t big on Gurley – but I am curious as to what it would say regarding the other RBs on the Rams. At a quick glance, Mason averaged 2.8 YPC on 75 carries and Cunningham averaged 3.8 on 37.

      • BK

        Right, median carry will be the same for almost everyone. The best substitute, I think, is rate of 4+ yard runs. As the rate of 4+ yard runs increases, either YPC increases or long runs decrease.

      • BK

        I’ve got Mason at 3.9 xYPC, which is about what you’d expect. Cunningham is 4.6 xYPC by dint of his above-average rate of 4+ yard runs.

        • sn0mm1s

          That is interesting. So, the metric actually likes Cunningham more than Gurley.

          • BK

            In terms of YPC, yes. But it was created and tested on guys with 100+ carries.

  • Cromulent

    Wait. The weights flip sign at 20? Not bueno.

    • BK

      Agreed. I suspect it’s because the thresholds are too close together, especially on long runs. I chose these thresholds for the visual, not a model. If I do serious work on a model, the setup must be refined.

      — Brian

  • Danny Tuccitto

    Brian, just wanna say two related things that make this piece get a big thumbs up from me. First, thanks for shouting out my previous research on YPC. Quick google searches and proper citations tend to be far too underutilized in NFL analytics (as anyone who knows or follows me knows is a Top 3 pet peeve of mine). Relatedly, nice job actually improving on previous research (i.e., advancing our knowledge base) rather than just reinventing the wheel.