Monday, July 22, 2013

Are Underdogs Winning the Super Bowl More Often than they Should?

Introduction
In the 2004 Super Bowl the first-seeded New England Patriots beat the third-seeded Carolina Panthers by three points to win their second NFL championship. The 2010 Super Bowl saw the top-seeded New Orleans Saints earn their franchise's first Super Bowl win.

In the 10 seasons since the NFL's last re-alignment (before the 2002 season) these are the only two times a #1 seed has won the big game. It seems pretty odd that the top seeds, teams which only have to win two home games to make it to the big game, are only batting .200.

There are obviously many potential reasons for this discrepancy. One that gets mentioned frequently is the first-round bye given to the top two seeds. The logic goes that the week off, rather than helping a team rest up and prepare for the Divisional round, somehow hurts them, possibly by disrupting the natural rhythm of the week.
I've shown that on average the home team wins about 57% of the time during meaningful games of the regular season. If the bye week is the cause of this Super Bowl drought, then we should find that the first and second seeds are winning their home playoff games less often than expected.


Data
A list of the seeding of the teams in the last 10 Super Bowls is all that's necessary for this experiment, so I simply made the list by hand from Wikipedia, which has fairly comprehensive coverage of each year's playoffs.

Wikipedia also has a comprehensive page on the Monte Carlo method, but in short it works by repeatedly generating random realizations of the problem at hand and comparing the results of the randomized trials to the real data. Given enough runs, the Monte Carlo method should converge to a stable result, allowing us to see if the assumptions that went into the Monte Carlo simulation are valid statistical representations of reality.

The Monte Carlo algorithm was set up to predict the expected number of Super Bowl appearances for each seed, under the assumptions that home field advantage was a flat 57% and that the different rankings of the teams had no bearing on game outcomes. No additional advantage from the bye week was programmed into the model.
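The post does not include its code, so here is a minimal Python sketch of a simulation along these lines. The bracket rules are my assumption of the 2002-era format (#3 hosts #6 and #4 hosts #5 in the Wild Card round, #1 hosts the lowest surviving seed, the better seed always hosts), and the names `play`, `conference_champion`, `simulate_decade` and `predicted_appearances` are made up for illustration. Its numbers should land near the model predictions but need not match them exactly.

```python
import random
import statistics

HOME_WIN_PROB = 0.57  # flat home-field advantage quoted in the post
SEASONS = 10          # one decade of Super Bowls
SEEDS = range(1, 7)


def play(home, away, rng, p_home):
    """Return the winning seed; the seed numbers have no effect on the odds."""
    return home if rng.random() < p_home else away


def conference_champion(rng, p_home=HOME_WIN_PROB):
    """Simulate one conference bracket (assumed format) and return the champion's seed."""
    # Wild Card round: #1 and #2 have the bye
    wc = [play(3, 6, rng, p_home), play(4, 5, rng, p_home)]
    lowest, other = max(wc), min(wc)  # larger seed number = lower-ranked team
    # Divisional round: #1 hosts the lowest surviving seed, #2 hosts the other
    d1 = play(1, lowest, rng, p_home)
    d2 = play(2, other, rng, p_home)
    # Conference championship: the better seed hosts
    return play(min(d1, d2), max(d1, d2), rng, p_home)


def simulate_decade(rng, p_home=HOME_WIN_PROB, seasons=SEASONS):
    """Count Super Bowl appearances per seed, AFC and NFC combined."""
    counts = {seed: 0 for seed in SEEDS}
    for _ in range(seasons):
        for _conference in range(2):
            counts[conference_champion(rng, p_home)] += 1
    return counts


def predicted_appearances(trials=10000, seed=0, p_home=HOME_WIN_PROB):
    """Return {seed: (mean, standard deviation)} over many simulated decades."""
    rng = random.Random(seed)
    decades = [simulate_decade(rng, p_home) for _ in range(trials)]
    result = {}
    for s in SEEDS:
        values = [d[s] for d in decades]
        result[s] = (statistics.mean(values), statistics.pstdev(values))
    return result


if __name__ == "__main__":
    for s, (mean, sd) in predicted_appearances().items():
        print(f"seed {s}: {mean:.1f} +/- {sd:.2f}")
```

Passing `p_home=0.5` to `predicted_appearances` switches the home-field advantage off, which is a quick way to see how much of the top seeds' edge comes from hosting alone.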

Results
The number of Super Bowl appearances for each seed (AFC and NFC seeds combined) is shown in Table 1. Note that even a home-field advantage of 7 percentage points (57% versus 50%) gives the #1 seeds roughly 1.5 more Super Bowl appearances per decade than they would get with no home-field advantage at all (in that case the #1 and #2 seeds would each be expected to make 5 appearances over 10 Super Bowls).
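Table 1: Playoff Model Predictions

| Seed | Predicted # of Appearances | Actual # of Appearances | Predicted # of Wins | Actual # of Wins |
|------|----------------------------|-------------------------|---------------------|------------------|
| 1    | 6.5 +/- 2.09               | 8                       | 3.2 +/- 1.48        | 2                |
| 2    | 5.6 +/- 2.00               | 5                       | 2.8 +/- 1.42        | 3                |
| 3    | 2.5 +/- 1.48               | 2                       | 1.2 +/- 1.01        | 1                |
| 4    | 2.2 +/- 1.39               | 3                       | 1.1 +/- 1.01        | 2                |
| 5    | 1.9 +/- 1.29               | 1                       | 0.9 +/- 0.91        | 1                |
| 6    | 1.7 +/- 1.24               | 1                       | 0.8 +/- 0.88        | 1                |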

The standard deviations of all the results are listed next to each predicted value; the relatively small sample of Super Bowls results in fairly large margins of error in the simulation.
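For the top two seeds the predictions can be cross-checked without any random numbers. The sketch below assumes the better seed always hosts, so a #1 seed needs two home wins, while a #2 seed needs a home win followed by either a road win at the #1 seed or a home win if the #1 seed was upset. Treating the 20 conference-champion slots in a decade as independent trials gives a binomial mean and standard deviation. The function names are hypothetical.

```python
from math import sqrt

SLOTS = 2 * 10  # two conference champions per season, ten seasons


def top_seed_odds(p_home):
    """Per-conference chance that the #1 and the #2 seed reach the Super Bowl."""
    p_seed1 = p_home ** 2                     # two straight home wins
    p_seed2 = 2 * p_home ** 2 * (1 - p_home)  # home win, then at #1 or hosting after an upset
    return p_seed1, p_seed2


def mean_and_sd(prob, slots=SLOTS):
    """Binomial mean and standard deviation for `slots` independent trials."""
    return slots * prob, sqrt(slots * prob * (1 - prob))


for p_home in (0.50, 0.57):
    for label, prob in zip(("#1", "#2"), top_seed_odds(p_home)):
        mean, sd = mean_and_sd(prob)
        print(f"p_home={p_home:.2f}  {label} seed: {mean:.2f} +/- {sd:.2f}")
```

With a 57% home win rate this gives about 6.5 and 5.6 appearances for the #1 and #2 seeds, and exactly 5 each when home field is worth nothing.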

Regardless, it's pretty clear that the extra bye week isn't hampering the first two seeds from getting to the championship. The #2 seed has made almost as many appearances as predicted, while the #1 seed is, if anything, reaching the Super Bowl more often than they should be.

Because I was interested, I also computed the number of times each seed wins the Super Bowl. For this calculation I made the additional assumption that there is no home field advantage in the Super Bowl, which seemed reasonable given that the game is held on neutral ground. Those results are also presented in Table 1.
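One way to sketch the win calculation exactly, rather than by random sampling, is to enumerate all 32 possible outcomes of a conference's five playoff games. This is a swapped-in technique, not necessarily what was run for the post, and it assumes the same bracket rules as before (#3 hosts #6, #4 hosts #5, #1 hosts the lowest surviving seed, the better seed hosts the conference title game). With a coin-flip Super Bowl, a seed's chance of winning it in a given season equals its chance of winning its conference, so wins over a decade are binomial.

```python
from itertools import product
from math import sqrt

P_HOME = 0.57  # flat home-field advantage in the conference playoffs
SEASONS = 10


def champion_probabilities(p_home=P_HOME):
    """Exact P(seed wins its conference), enumerating all 2**5 game outcomes."""
    probs = {seed: 0.0 for seed in range(1, 7)}
    for outcome in product((True, False), repeat=5):  # True = home team wins
        weight = 1.0
        for home_won in outcome:
            weight *= p_home if home_won else 1 - p_home
        wc_a = 3 if outcome[0] else 6                 # #3 hosts #6
        wc_b = 4 if outcome[1] else 5                 # #4 hosts #5
        lowest, other = max(wc_a, wc_b), min(wc_a, wc_b)
        div_a = 1 if outcome[2] else lowest           # #1 hosts lowest survivor
        div_b = 2 if outcome[3] else other            # #2 hosts the other one
        home, away = min(div_a, div_b), max(div_a, div_b)
        probs[home if outcome[4] else away] += weight
    return probs


def predicted_wins(p_home=P_HOME, seasons=SEASONS):
    """Return {seed: (expected wins, standard deviation)} for a neutral-site final."""
    result = {}
    for seed, q in champion_probabilities(p_home).items():
        # Two conferences, each champion wins the final half the time: 2 * q * 0.5 = q
        result[seed] = (seasons * q, sqrt(seasons * q * (1 - q)))
    return result


for seed, (mean, sd) in predicted_wins().items():
    print(f"seed {seed}: {mean:.1f} +/- {sd:.2f}")
```

Because the enumeration is exact, any gap between its output and a Monte Carlo estimate reflects sampling noise or a difference in the assumed bracket rules.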

Discussion and Conclusions
The errors are fairly large, but the overall match between the model and the data indicates that there is neither an extra advantage nor a disadvantage to having the bye (although there is tantalizing, but not quite significant, evidence that #1 seeds aren't winning as many Super Bowls as they should). Without a larger sample size, however, any firm conclusions would be premature.
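A rough way to put numbers on "not quite significant" is to express each gap between prediction and reality in units of the model's standard deviation. The sketch below uses only the values in Table 1; reading the result as a z-score assumes the spread is roughly normal, which is a loose approximation for counts this small.

```python
# (predicted, standard deviation, actual) copied from Table 1
APPEARANCES = {1: (6.5, 2.09, 8), 2: (5.6, 2.00, 5), 3: (2.5, 1.48, 2),
               4: (2.2, 1.39, 3), 5: (1.9, 1.29, 1), 6: (1.7, 1.24, 1)}
WINS = {1: (3.2, 1.48, 2), 2: (2.8, 1.42, 3), 3: (1.2, 1.01, 1),
        4: (1.1, 1.01, 2), 5: (0.9, 0.91, 1), 6: (0.8, 0.88, 1)}


def z_score(predicted, sd, actual):
    """Number of standard deviations the actual count sits from the prediction."""
    return (actual - predicted) / sd


for name, table in (("appearances", APPEARANCES), ("wins", WINS)):
    for seed, row in table.items():
        print(f"seed {seed} {name}: z = {z_score(*row):+.2f}")
```

Every entry comes out within one standard deviation of its prediction, with the #1 seeds' win total sitting at roughly -0.8.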

Unfortunately, when it comes to the Super Bowl you only get one new data point a year, so it's going to be quite a while before any signal stands out from the noise. One interesting note to mull over while waiting for more data: in the first five of the post-realignment playoffs, a #1 seed reached the Super Bowl all five years. Since then, only three top seeds have made it to the big game, while the last three Super Bowl winners were all ranked 4th or lower.

1 comment:

  1. As a paper referee, I would ask couldn't you also test the impact of a bye week by looking at the regular season game following a bye week? Larger sample size! But maybe you could add the lack of refs as an advantage to your blog format. ;)

    PS, loving these!
