*(I originally posted this at the S-R Blog, but I thought it would be very appropriate here as well.)*

**WARNING:** Math post.

PFR user Brad emailed over the weekend with an interesting question:

“Wondering if you’ve ever tracked or how it would be possible to find records vs. records statistics….for instance a 3-4 team vs. a 5-2 team…which record wins how often? but for every record matchup in every week.”

That’s a cool concept, and one that I could answer historically with a query when I get the time. But in the meantime, here’s what I believe is a valid way to estimate that probability…

- Add eleven games of .500 ball to the team’s current record (at any point in the season). So if a team is 3-4, their “true” wpct talent is (3 + 5.5) / (7 + 11) = .472. If their opponent is 5-2, it would be (5 + 5.5) / (7 + 11) = .583.
- Use the following equation to estimate the probability of Team A beating Team B at a neutral site:

p(Team A Win) = (Team A true_win% × (1 – Team B true_win%)) / (Team A true_win% × (1 – Team B true_win%) + (1 – Team A true_win%) × Team B true_win%)

- You can even factor in home-field advantage like so:

p(Team A Win) = (Team A true_win% × (1 – Team B true_win%) × HFA) / (Team A true_win% × (1 – Team B true_win%) × HFA + (1 – Team A true_win%) × Team B true_win% × (1 – HFA))

In the NFL, home teams win roughly 57% of the time, so HFA = 0.57.

This means in Brad’s hypothetical matchup of a 5-2 team vs. a 3-4 team, we would expect the 5-2 team to win .583 × (1 – .472) / (.583 × (1 – .472) + (1 – .583) × .472) = 61% of the time at a neutral site.
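The whole recipe fits in a few lines of Python (the function names are mine, not from the post):

```python
def true_wpct(wins, losses):
    """Regress a W-L record toward .500 by adding 11 games of .500 ball."""
    return (wins + 5.5) / (wins + losses + 11)

def log5(p_a, p_b, hfa=0.5):
    """Bill James' log5 estimate of Team A beating Team B.

    hfa=0.5 is a neutral site; pass hfa=0.57 when Team A is at home.
    """
    num = p_a * (1 - p_b) * hfa
    den = num + (1 - p_a) * p_b * (1 - hfa)
    return num / den

p_a = true_wpct(5, 2)   # 5-2 team -> .583
p_b = true_wpct(3, 4)   # 3-4 team -> .472
print(round(log5(p_a, p_b), 2))             # neutral site: ~0.61
print(round(log5(p_a, p_b, hfa=0.57), 2))   # 5-2 team at home: ~0.67
```

Note that setting `hfa=0.5` makes the home-field version collapse to the neutral-site formula, which is why one function covers both steps.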

**Really Technical Stuff:**

Now, you may be wondering where I came up with the “add 11 games of .500 ball” part. That comes from this Tangotiger post about true talent levels for sports leagues.

Since the NFL expanded to 32 teams in 2002, the yearly standard deviation of team winning percentage is, on average, 0.195. This means var(observed) = 0.195^2 = 0.038. The random standard deviation of NFL records in a 16-game season would be sqrt(0.5*0.5/16) = 0.125, meaning var(random) = 0.125^2 = 0.016.

var(true) = var(observed) – var(random), so in this case var(true) = 0.038 – 0.016 = 0.022. The square root of 0.022 is 0.15, so 0.15 is stdev(true), the standard deviation of true winning percentage talent in the current NFL.

Armed with that number, we can calculate the number of games a season would need to contain in order for var(true) to equal var(random) using:

0.25/stdev(true)^2

In the NFL, that number is 11 (more accurately, it’s 11.1583, but it’s easier to just use 11). So when you want to regress an NFL team’s W-L record to the mean, at any point during the season, take eleven games of .500 ball (5.5-5.5), and add them to the actual record. This will give you the best estimate of the team’s “true” winning percentage talent going forward.
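The variance bookkeeping above can be replayed directly, using the numbers from the post:

```python
import math

sd_observed = 0.195                    # avg yearly stdev of NFL team win%, 2002-on
var_observed = sd_observed ** 2        # ~0.038
var_random = 0.5 * 0.5 / 16            # binomial noise in a 16-game season, ~0.016
var_true = var_observed - var_random   # ~0.022
sd_true = math.sqrt(var_true)          # ~0.15

# Season length at which var(true) = var(random); this is the
# number of .500 games to add when regressing to the mean.
games_to_add = 0.25 / var_true
print(round(games_to_add, 1))          # ~11.2, so use 11 in practice
```

The small gap between this output and the post's 11.1583 comes only from rounding the input standard deviation.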

That’s why you use the “true” wpct number to plug into Bill James’ log5 formula (see step 2 above), instead of the teams’ actual winning percentages. Even a 16-0 team doesn’t have a 100% probability of winning going forward — instead, their expected true wpct talent is something like (16 + 5.5) / (16 + 11) = .796.

(For more info, see this post, and for a proof of this method, read what Phil Birnbaum wrote in 2011.)


What’s really great is that you can prove this using Bayes’ Theorem…

Say we have a 3-4 team. Their observed (mean) record is 3 / 7 = 0.429, and the binomial standard deviation of that is sqrt(WPct × (1 – WPct) / (W + L)) = sqrt(0.429 × (1 – 0.429) / 7) = 0.187. Since we’re regressing toward the league mean, we’ll use a 0.500 WPct as the Bayesian prior mean, with a standard deviation of 0.15 (aka the standard deviation of true NFL winning percentage talent that we derived in the post).

Bayes’ Theorem states that:

Result_mean = ((prior_mean/prior_stdev^2)+(observed_mean/observed_stdev^2))/((1/prior_stdev^2)+(1/observed_stdev^2))

Plugging in the means and standard deviations we found above, we get:

Result_mean = ((0.5/0.15^2)+(0.429/0.187^2))/((1/0.15^2)+(1/0.187^2))

Which equals… 0.472. Or, the same “true” WPct talent (to three decimal places) that we found via (W + 5.5) / (G + 11).
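The agreement between the Bayesian update and the shortcut can be verified in a few lines (variable names are mine):

```python
import math

W, L = 3, 4
G = W + L
prior_mean, prior_sd = 0.5, 0.15       # true-talent distribution from the post
obs_mean = W / G
obs_sd = math.sqrt(obs_mean * (1 - obs_mean) / G)   # binomial sd of observed wpct

# Precision-weighted (normal-normal) posterior mean from Bayes' Theorem
posterior = ((prior_mean / prior_sd**2 + obs_mean / obs_sd**2)
             / (1 / prior_sd**2 + 1 / obs_sd**2))

shortcut = (W + 5.5) / (G + 11)        # "add 11 games of .500 ball"
print(round(posterior, 3), round(shortcut, 3))   # both ~0.472
```

Both routes land on 0.472 for a 3-4 team, which is the point of the proof: the "add 11 games" trick is just this Bayesian update in disguise.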

Pretty cool, right?

The term “Bayes’ Theorem” gives me flashbacks to a class I couldn’t pass in college a decade ago. Armed with better knowledge now, I wouldn’t mind going back and trying again.

Totally this – certain aspects of maths didn’t seem wonderfully relevant a decade or so ago (certain things I got e.g. structural mechanics and certain things I didn’t e.g. advanced engineering maths) but now I can increasingly see the point of some of the things I didn’t specifically get. Practical applications of things such as the above make things a lot more relevant. Great post.

Does this mean you’d add 11 games of 0.500 ball to last season’s record to get the best projection of this season’s record?

If you wanted to use what the “best guess” of their talent was last season (from strictly W-L, so no pt diff, no SOS adj) to predict this season, the number above would be what you would use. I haven’t looked into predicting subsequent seasons, though. This post strictly applies to the current team in the current season. Because of player movement, I have a feeling you would need to regress to the mean even more than 11 games of .500 if you want to predict Year Y+1 from Year Y’s record. That’s my guess.

Then there is the more substantial case: how does one incorporate LY during the season? Let’s suppose that 3-4 team was 9-7 LY? Do we add 11 to that record too and average LY with this season’s number?

In terms of the NFL, do you find that “true” winning percentage of adding 11 games of .500 ball is more accurate than point differential? Or what about adding 261.1 points to the points scored and points allowed when calculating a Pythagorean winning percentage [step 2 above]? (261.1 = 23.4 [average score] x 11.1583)
