
Moral Margin of Victory

Debating whether Moral Margin of Victory is the most superior model ever devised

Our Moral Margin of Victory is what's important, Bill.

Suppose you watch an entire football game. Your job is to put a single number on the degree to which the winning team beat the losing team. Qualitatively, the scale runs from “had any number of things gone differently at the end, the winning team would have lost” to “the winning team was in control for most of the game” to “this game was never in question.”

I want to quantify that qualitative scale. And I want to do it in a retrodictive way. In other words, I’m not as interested in the degree to which the winning team outplayed the losing team as I am in the degree to which the winning team was in control of the game. To see the difference, imagine a game where one team opens up a 14-0 lead on a kickoff return touchdown and a fluke turnover that leads to a score, then cruises to an uneventful 31-17 win. The advanced stats might even show that the losing team was more efficient. The predictive measures might give the losing team a better grade, because the reasons the winning team won were not things that are likely to carry over to future games. I don’t care about any of that. The kick return happened, and the turnover happened, and the result was that the game was never in any serious doubt.

The easiest way to do this is to use margin of victory, and that works well in most cases, but there are obvious outliers. Consider the Green Bay – Washington game from week two, which was 24-0 midway through the second quarter and never really got any closer, and the Colts-49ers game, which was a one-score game with five minutes remaining. The latter game finished with a larger margin of victory. Again, if you’re interested in predictive measures, you probably do want to record that Robert Griffin III was able to generate a couple of late TDs and that the Colts were able to put away the 49ers so quickly and thoroughly. But I’m not interested in that here.

Another natural answer would be to use Chase’s game scripts. Or, if you wanted to fancy up the same concept, you could compute the average win probability throughout the game. This too would work in the majority of cases, but not always. If a game is tied with two minutes left, that’s really all I need to know: the game should be graded as “could’ve gone either way.” But game scripts (or average WP) would be sensitive to how the game progressed for its first 58 minutes. Whether one team went up 21-0 and then the other team came back to tie it, or the game was a seesaw affair, all that really matters is that the game was still very much in question at the end.

In 2008, I borrowed an idea that the great Matt Hinton called Time of Knockout. Chase later refined the idea with these two posts. Those were a couple more attempts to get at what I’m trying to get at above. These are fun, but they are flawed in ways similar to margin of victory and game script. The comments to Chase’s posts contain a lot of the ideas in the discussion above.

Now I’m going to tell you my answer. Then you’ll use the comments to tell me how to improve it.

First we’ll define time_1 to be the point in the game (measured in minutes elapsed since the opening kickoff) when the eventual losing team last took an offensive snap while trailing by 8 or fewer points. And we’ll define time_2 to be the point in the game when the eventual losing team last took an offensive snap while trailing by 16 or fewer points. Then we declare that

Moral Margin of Victory = 39.17 - .1947*time_1 - .3681*time_2

Hot, huh?

You can probably guess where this came from. I computed time_1 and time_2 for every NFL game since 2001, then I ran a regression of margin of victory on time_1 and time_2. I spent a lot of time debating whether straight linear regression was the right tool for the job, eventually deciding that I like it. But I’m certainly listening to your improvement suggestions. I’ll put a table of all the 2013 NFL games through October 27th at the end of the post, so you can find the ones you watched and determine if the MMOVs feel right.
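For readers who want to see what a regression of this shape looks like in practice, here is a minimal Python sketch of an ordinary least-squares fit with two predictors. It is not the code used for the post: the function name `fit_two_predictor_ols` is made up for this example, and the input rows are synthetic values generated from the published coefficients rather than real games, so the fit simply recovers those coefficients.

```python
def fit_two_predictor_ols(rows):
    """Least-squares fit of y = a + b*x1 + c*x2.

    `rows` is a list of (x1, x2, y) tuples. Solves the 3x3 normal equations
    with Gauss-Jordan elimination and returns (a, b, c).
    """
    # Build the augmented normal-equation matrix [X'X | X'y].
    m = [[0.0] * 4 for _ in range(3)]
    for x1, x2, y in rows:
        x = (1.0, x1, x2)
        for i in range(3):
            for j in range(3):
                m[i][j] += x[i] * x[j]
            m[i][3] += x[i] * y
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Synthetic (time_1, time_2) pairs; the "margins" come from the formula itself.
points = [(5.0, 10.0), (20.0, 30.0), (40.0, 55.0), (60.0, 60.0), (12.0, 45.0)]
rows = [(t1, t2, 39.17 - 0.1947 * t1 - 0.3681 * t2) for t1, t2 in points]
a, b, c = fit_two_predictor_ols(rows)
print(round(a, 2), round(b, 4), round(c, 4))
```

With real data, each row would hold a game's time_1, time_2, and actual margin of victory, and the fitted values would not match any single game exactly.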

The formula and the explanation make the idea seem more complicated than it is. Essentially, the winning team gets points for every minute after which the game was not in imminent danger, and they get bonus points for every minute after which it wasn’t in imminent danger of being in imminent danger. Pretty simple. Of course, there’s some arbitrariness here. I’m using two levels of danger, and I could have used one or three. There are probably other reasonable definitions of danger. Again, I’d be interested in hearing if it feels about right.
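As a rough illustration of the formula, here is a minimal Python sketch. The snap representation (a list of `(minutes_elapsed, deficit)` pairs for the eventual loser's offensive snaps), the function names, and the choice to return 0 when no snap qualifies are all assumptions made for this example, not the format of the play-by-play data behind the table. A snap taken while the eventual loser is ahead has a negative deficit here, so it counts as being within one score.

```python
def last_snap_within(snaps, max_deficit):
    """Minutes elapsed at the loser's last offensive snap with deficit <= max_deficit.

    `snaps` is a hypothetical list of (minutes_elapsed, deficit) pairs, where
    deficit = winner's score minus loser's score at the snap (negative if the
    eventual loser was ahead). Returns 0.0 if no snap qualifies (an assumption).
    """
    times = [t for t, deficit in snaps if deficit <= max_deficit]
    return max(times, default=0.0)


def mmov(time_1, time_2):
    """Moral Margin of Victory from the regression formula in the post."""
    return 39.17 - 0.1947 * time_1 - 0.3681 * time_2


def mmov_from_snaps(snaps):
    time_1 = last_snap_within(snaps, 8)    # last snap within one score
    time_2 = last_snap_within(snaps, 16)   # last snap within two scores
    return mmov(time_1, time_2)


# Made-up game: the loser's last one-score snap comes 10 minutes in,
# and its last two-score snap comes 20 minutes in.
example_snaps = [(3.0, 0), (10.0, 7), (20.0, 14), (35.0, 21), (58.0, 24)]
print(round(mmov_from_snaps(example_snaps), 2))
```

Plugging in time_1 = time_2 = 60 gives the smallest value the formula can produce for a game decided in regulation.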

Why do we need this?

For the NFL, we don’t. But I believe that a rating system incorporating this kind of data is exactly what college football needs. I am aware that the computer component will no longer be an official part of the formula determining who gets ranked where, but I would like to hope that the new committee will at least consider some set of algorithmic rankings (in the same way that the basketball people consult RPI). For many years, margin of victory was forbidden in the official BCS ranking systems for fear that it would lead to poor sportsmanship. I can understand that. But if you use the above MMOVs in place of raw margin of victory in the SRS (or any other margin-based system), you’ve got yourself a much more informative ranking and there are absolutely no sportsmanship concerns. Your goal as the coach of a national title-contending team against an overmatched opponent is to make it a three possession game as quickly as possible, and then keep it there. There is nothing unsportsmanlike about that, nor does it alter anyone’s incentives in any meaningful way.

A ranking system based on MMOV would be neither predictive nor retrodictive, rather inhabiting a middle ground that might be referred to as “resume-dictive,” which is exactly what the NCAA’s committee needs as a guide. If Florida State beats Clemson (which we’ll presume for now to be a very good team) by 37, the degree of that victory most definitely should be a part of Florida State’s resume. The fact that 37 is a large number is not what made the victory impressive; what made it impressive is the fact that the game was never in any real doubt after the first few minutes.

My goal is to publish this ranking system for college football in the coming weeks. While I’m beating the data into a usable format, tell me how I can improve the system.

And one more thing: Chase hates the name. Tell him how wrong he is.

  • Arif Hasan

    Can’t wait to see how the Vikings-Packers game shapes out in this metric. Would give some context to the Vikings’ 31 points.

    • Chase Stuart

      Well, Doug gave you the formula, so…. 😉

      • Arif Hasan

        Damn. It comes out to 0, because the Vikings finished within 7 despite the game not being in question.

        • Chase Stuart

          Looks to me like Minnesota last had the ball trailing by <9 with 2 minutes left in the first half, or 28 minutes elapsed. They last had the ball in a two-score game with 5:26 left in the 3Q, or 39.43 minutes elapsed. Assuming both of those numbers are correct, Doug's formula would put the MMOV of this game at 19.2.

  • I’m with Chase on the name, sorry. 🙂

    Very, very, very excited to see you writing, Doug, but I am too tired to offer thoughts for now and have an early workday tomorrow, so I will be back to comment more specifically. I just couldn’t skip commenting to say how glad I was!

    • Chase Stuart

      Me too!

  • Sunrise089

    Hooray for Doug. This sounds like a great concept to discuss during a podcast…

  • AS

    Neat concept, but one question: In the DAL-STL game, how is time_2 only 13.4? Dallas didn’t go up by 17-0 until the second quarter. Similar thing for the GB-WAS game ranked second. Am I missing something?

    • Chase Stuart

      The time is based on when the losing team last had the ball, not when the winning team scored. In Green Bay-Washington, the Packers got the ball first, scored a FG, the Redskins punted, the Packers punted, the Redskins punted, the Packers scored a TD (10-0), the Redskins punted, the Packers scored a TD (17-0), and the lead was never less than 17 again.

      So the last time Washington had the ball when it was a one-score game was on their second drive. The third down play on that drive began with 3:22 left in the first quarter, which means there was 3.37 minutes left in the first quarter, or 11.63 minutes had elapsed.

      The last time Washington had the ball when it was a two-score game was on their third drive. The third down play there began with 12 seconds left, which would be with 14.8 minutes elapsed. I’m not sure where Doug got 13.9, but I assume that’s just a small error in pulling PBP files.

      Using Doug’s formula and 11.63/14.8 would give you a MMOV of 31.5 instead of the 31.8. Not a big deal.

      In the Rams game, they last had the ball in a one score game with 6:24 left in the first quarter, or 8.6 minutes into the game. They last ran an offensive play trailing by <17 with 16 seconds left in the 1Q, so that was with 14.73 minutes elapsed. Again, I'm getting slightly different numbers than Doug here (I suppose this is where I say 'this is why we workshop this stuff'), but my #s would produce a score of 32.1.
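The clock arithmetic in this comment can be sketched in a few lines of Python. The helper name `minutes_elapsed` and the "M:SS" string format are assumptions for illustration, and the sketch covers regulation time with 15-minute quarters only.

```python
def minutes_elapsed(quarter, clock):
    """Convert a quarter (1-4) and an 'M:SS' game clock to minutes elapsed.

    Assumes 15-minute quarters and regulation time only.
    """
    minutes, seconds = clock.split(":")
    remaining = int(minutes) + int(seconds) / 60.0
    return 15.0 * quarter - remaining


def mmov(time_1, time_2):
    """Moral Margin of Victory from the formula in the post."""
    return 39.17 - 0.1947 * time_1 - 0.3681 * time_2


# Green Bay-Washington, using the clock readings quoted in the comment.
t1 = minutes_elapsed(1, "3:22")
t2 = minutes_elapsed(1, "0:12")
print(round(t1, 2), round(t2, 2), round(mmov(t1, t2), 1))

# Dallas-St. Louis, again with the quoted clock readings.
r1 = minutes_elapsed(1, "6:24")
r2 = minutes_elapsed(1, "0:16")
print(round(r1, 2), round(r2, 2), round(mmov(r1, r2), 1))
```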

      • AS

        Makes perfect sense. Thanks for the explanation.

      • Doug

        Yeah, something is funny (though only slightly so) with the numbers there. I will look into it. TWWWTS.

      • The third down play there began with 12 seconds left, which would be with 14.8 minutes elapsed. I’m not sure where Doug got 13.9

        13.9 would be 1:06 which is when Washington ran its 2nd-down play.

        The other day I noticed a glitch on the PFR drive charts: if a drive begins with under 1 minute remaining, the drive is listed as having begun in the previous quarter. Look at that Was-GB game. If you go to the “Washington Redskins Drive” chart it lists Washington’s Drive #6 as having begun in Quarter 1.

        Perhaps this is where the difference between Chase and Doug’s calculations is coming from?

        • BTW, this is another reason that I am in favor of “Metric Time”. It would make database and spreadsheet designers much more happy if they didn’t have to convert HH:MM:SS into decimals.

  • Kibbles

    I like the name “Moral Margin of Victory”, but not for this metric. To me, MMoV seems like it should be a tool to grade “moral victories”, which would suggest that it somehow incorporated the pre-game spread or money line. For instance, the Denver/Jacksonville game was the largest spread in history, but Jacksonville kept it surprisingly close, which ranked as a huge “moral victory” for them. How big of a moral victory? Well, that sounds like exactly the sort of thing Moral Margin of Victory was designed to measure.

    • Doug

      That’s Margin of Moral Victory, not Moral Margin of Victory. 🙂

  • Dan

    What about using a variation on win probability which you might call “Retroactively Downgraded Win Probability”? A team’s RDWP at time t is equal to the lowest win probability that they have from time t until the end of the game. Then you just integrate RDWP over the full game to get a measure of how much the winning team was in control of the game, on a scale from 0 to 1.

    This captures the idea that if you had an advantage early, but the other team then erases it and forces you to come back, then your earlier advantage was all for naught. Your team had so little control of the game that you were forced to mount a comeback; your earlier apparent advantage was just a mirage.

    That is reflected in the RDWP numbers by downgrading the win probability early in the game to the level where it was at the team’s lowest point. If the low point of the game for your team comes at the end of the third quarter, when you have a 22% win probability, then your RDWP counts the full first three quarters as all being at a 22% (retroactively downgraded) win probability. Only when your win probability goes up for good does your RDWP go up with it.

    (I’m not thrilled about the name RDWP.)

    • Doug

      That’s a really excellent idea, Dan. I suppose the only questions would be:

      1. do I have what it takes to program it up?

      2. assuming so, would the extra accuracy be worth any objections Occam might raise?

      • Dan

        RE Occam, the math is a bit fancier but I think that RDWP is actually simpler in an important sense. There are no arbitrary numbers to set.

        Another possible stat you could use is a variation of time of knockout based on win probability, which is the amount of time remaining when win probability goes above a certain cutoff for good (say, 0.90). I believe that RDWP is equivalent to time of knockout based on win probability, averaged across all possible cutoffs. Which, to me at least, has a kind of elegance that I associate with simplicity.
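The claimed equivalence between RDWP and win-probability knockout time averaged over cutoffs can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the win probabilities are made up, the function names are hypothetical, and the "all possible cutoffs" are approximated by ten evenly spaced values (0.05, 0.15, ..., 0.95), which makes the agreement exact here because every sample is a multiple of 0.1.

```python
def knockout_fraction(w, cutoff):
    """Fraction of the game remaining once w stays above `cutoff` for good."""
    remaining = 0
    for p in reversed(w):
        if p <= cutoff:
            break
        remaining += 1
    return remaining / len(w)


def mean_rdwp(w):
    """Mean over samples of the lowest win probability from that sample onward."""
    total, low = 0.0, 1.0
    for p in reversed(w):
        low = min(low, p)
        total += low
    return total / len(w)


w = [0.5, 0.7, 0.4, 0.9, 0.8, 1.0]             # made-up win probabilities
cutoffs = [(k + 0.5) / 10 for k in range(10)]  # 0.05, 0.15, ..., 0.95
avg_knockout = sum(knockout_fraction(w, c) for c in cutoffs) / len(cutoffs)
print(avg_knockout, mean_rdwp(w))
```

The two printed values agree up to floating-point rounding: a sample counts toward the knockout fraction at a given cutoff exactly when its downgraded win probability exceeds that cutoff.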

        • Dan Swenson


          Hope I’m not too late to the discussion. Dan, I like the idea of your RDWP function (in my head I’m pronouncing it R-Dip).

          Doug, I don’t think it would be hard to implement this idea. First, when Dan (earlier Dan, not me) says “integrate”, he really just means “average”. I think the thing to do is just break the game into 3600 one-second intervals, and then measure the RDWP function at the end of each interval. That is, find RDWP after the first second, the second second, etc….or in other words, find RDWP at the 14:59 mark of the first quarter, the 14:58 mark of the first quarter, and so on, up to the 0:00 mark of the fourth quarter.

          Now just add all 3600 of these values together, and divide by 3600 to get an average RDWP over the entire game. A computer could easily add 3600 values together and divide by 3600, so that’s not a problem…but how do you calculate the RDWP function at each of these points?

          Well, let’s let W(t) be the win probability at time t, and let R(t) be the RDWP function at time t. Then R(t) is the minimum of W(u), over all u greater than or equal to t. But then I believe it turns out that:

          R(3600) = 1 (for the moment, let’s forget about overtime games), and
          R(t) = min( W(t), R(t+1) ), for all t 0:
          total = total + r
          t = t-1
          r = min(w[t], r)
          total = total / 3600.0

          It seems to me that overtime games could be handled similarly, but you’d probably want to start at a higher t-value (i.e., whenever the game actually ended), and divide by that same number at the end, rather than 3600. Of course, I suspect that most overtime games will probably fit into the “had any number of things gone differently at the end, the winning team would have lost” category.

          • Dan Swenson

            Very sorry about the end of my last comment: I used a “less than” sign and a “greater than” sign, and it took it as formatting. I can’t seem to edit my comment. Here’s what I meant to say:

            …But then I believe it turns out that:

            R(3600) = 1 (for the moment, let’s forget about overtime games), and
            R(t) = min( W(t), R(t+1) ), for all t less than 3600.

            So you can start at t=3600, and proceed backward: at every time t, the value of R(t) is the smaller of W(t) and R(t+1). You can also do your “integration” from right to left, since addition is commutative! So as you find each value of R(t), add it to your running total. What follows is some pseudo code:

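            # w[t] is the win probability W(t) after t seconds, for t = 0..3599
            total = 0
            t = 3600
            r = 1
            while t > 0:
                total = total + r      # add R(t) to the running total
                t = t - 1
                r = min(w[t], r)       # R(t) = min(W(t), R(t+1))
            total = total / 3600.0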

            Again, sorry I screwed up the comment. Thanks for an interesting discussion!
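The pseudo code above translates almost directly into a runnable Python function. This is a minimal sketch, not a tested implementation: the name `average_rdwp` and the input format (one win-probability sample per second for the eventual winner) are assumptions. It divides by the actual number of samples rather than a fixed 3600, along the lines suggested above for overtime games.

```python
def average_rdwp(w):
    """Average Retroactively Downgraded Win Probability for the winning team.

    `w` is a hypothetical list of the winner's win probability sampled once
    per second, with w[t] the value after t seconds, so len(w) is the game
    length in seconds. R at the final whistle is taken to be 1.
    """
    n = len(w)
    r = 1.0       # R(n): the winner has won
    total = 0.0
    for t in range(n, 0, -1):
        total += r              # add R(t)
        r = min(w[t - 1], r)    # R(t-1) = min(W(t-1), R(t))
    return total / n


# Tiny made-up "game" of four samples, just to show the downgrading.
print(average_rdwp([0.5, 0.7, 0.4, 0.9]))
```

In the toy example the early 0.5 and 0.7 are both downgraded to 0.4, the low point that follows them.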

  • I’m working on some other stuff, so I’ll have to offer my comments later.

    Good to see that Doug is back, even if the name for this stinks (may have said that just to keep the flames fanned).

  • James

    Assuming my math is right, if the winning team trails in overtime they could get a negative MMOV score! Maybe there should be a floor on the minimum value?

    And speaking of a minimum value, if the losing team ran a play as time expired down one score the MMOV is 5.402. That’s an awful number! You should change the constant to make the minimum 1, since that’s the actual lowest margin of victory and would appropriately represent the game. Along those lines I think it’d be nice if MMOV scaled to actual MOV, which would require some extra math steps but it would help put the MMOV in context (similar to how FIP is scaled to ERA in baseball).

    • James

      Wait, last second comebacks can really screw this up. If I’m doing this right the MMOV for the Cowboys-Lions game yesterday hinged on the Cowboys’ final play with 7 seconds left, giving the game an MMOV of 5.47. However, if time had run out on the kickoff, then the last offensive play on which the Cowboys trailed would be when they scored a TD with 46 seconds left in the first half, for an MMOV of 22.3.

      That game probably represents the most extreme edge case you can get, but it certainly seems wrong that one meaningless hook and lateral play had such a large impact on the MMOV.

      • Doug

        I think you might have a point, James, but I don’t think the Dallas-Detroit game illustrates it.

        Ah, I think I see the issue. I worded things incompletely. A lead of any amount would be considered “a deficit of 8 or fewer points.” In other words, instead of this:

        “we’ll define time_1 to be the point in the game (measured in minutes elapsed since the opening kickoff) when the eventual losing team last took an offensive snap while trailing by 8 or fewer points.”

        I should have said this:

        “we’ll define time_1 to be the point in the game (measured in minutes elapsed since the opening kickoff) when the eventual losing team last took an offensive snap while leading or trailing by 8 or fewer points.”

        So, for the Detroit/Dallas game, both time_1 and time_2 were 59.9ish. Had time expired on the kick, they both would have been about 58.75 (corresponding to Dallas’ previous drive). So the MMOV difference is about a half a point. I’m OK with that. Either way, MMOV says this one is about as close as you can get.

        Now, the Pittsburgh/Oakland game from yesterday is one where the point you raise is a concern. Pittsburgh, trailing comfortably for virtually the entire game, got the ball back down three with 12 seconds left at their own 5 yard line. That swung the MMOV pretty substantially. MMOV calls this a very close game. Is that appropriate? I don’t know. In some sense, it doesn’t capture the “feel” of the game. On the other hand, Oakland never put this game away. If you’re giving the other team a chance to return a punt to win the game (DeSean Jackson, anyone?), then you can’t really say you won the game convincingly.

        • James

          Thanks for the response, that makes a lot more sense.

          The Pitt/Oak example is a good one too. I had thought about those types of scenarios as well (down 14 all game, but score TD and recover onside kick with 5 seconds left), but if you want to keep your formula simple that’s just the way it’s going to be since that play COULD have been in field goal or Hail Mary range. If you want better than that, well that’s what WP is for and then it gets much more complicated very quickly.

          I still think you should lower the constant to 34.77 so close games don’t have 11 point MMOV swings depending who wins.

          • Doug

            “I still think you should lower the constant to 34.77 so close games don’t have 11 point MMOV swings depending who wins.”

            The 11-point swing is in keeping with the axiom that winning a tossup game looks much different on your resume than losing a tossup game does. It’s very similar to Chase’s version of SRS, in fact, which counts (almost) all wins as 7-point (or more) wins.

            From a predictive standpoint, we’d want to record these games as “ties,” but from a resume standpoint, winning vs losing is a big distinction.

  • Nate

    > Your goal as the coach of a national title-contending team against an overmatched opponent is to make it a three possession game as quickly as
    > possible, and then keep it there.

    If you want to encourage that sort of behavior, then, maybe you’d be better off taking the average of (min(lead,21)) over the course of the game. If a team holds a 21 point lead for the first 45 minutes and then has a brief lapse before reestablishing it to the end of the game, that seems like it should be as valuable as establishing the lead, and having the same lapse around 15 minutes in.

    It also seems a little strange to only credit the winning team.
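The capped-lead average suggested in this comment is easy to sketch in Python. The function name, the once-per-minute sampling, and the two example games are all made up for illustration. The point is that a lapse of the same size and length costs the same no matter when it happens.

```python
def average_capped_lead(leads, cap=21):
    """Average of min(lead, cap) over equally spaced samples of the game.

    `leads` is a hypothetical list of one team's lead (negative when trailing)
    sampled at regular intervals, for example once per minute.
    """
    return sum(min(lead, cap) for lead in leads) / len(leads)


# Two made-up 60-sample games with the same five-minute lapse to a 14-point
# lead, once late (minutes 45-49) and once early (minutes 15-19).
late_lapse = [21] * 45 + [14] * 5 + [28] * 10
early_lapse = [21] * 15 + [14] * 5 + [21] * 40
print(average_capped_lead(late_lapse), average_capped_lead(early_lapse))
```

Because leads above the cap are clipped, the 28-point stretch in the first game earns no more credit than a 21-point one, and since the lead can be negative, the same function can be run from the losing team's side.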

  • He lives!

  • Doug

    The table is now updated to include yesterday’s games. A couple of bugs were also fixed: (1) the one mentioned by Richie about plays starting with less than a minute left being ignored or misclassified, and (2) I fixed the overtime games. I didn’t mention this, but I treat all OT wins as having time_1 = time_2 = 60.

  • Chase Stuart

    Seattle totally had a tiny MMOV against St. Louis last night.

  • SK

    Good stuff. I do think the “moral” in the name isn’t right. Adjusted Victory Margin?