Monday, July 22, 2013

Are Underdogs Winning the Super Bowl More Often than they Should?

In the 2004 Super Bowl the first-seeded New England Patriots beat the third-seeded Carolina Panthers by three points to win their second NFL championship. 2010 featured the top-ranked New Orleans Saints earning their franchise's first Super Bowl win.

In the 10 seasons since the NFL's last re-alignment (before the 2002 season) these are the only two times a #1 seed has won the big game. It seems pretty odd that the top seeds, teams which only have to win two home games to make it to the big game, are only batting .200.

There are obviously a lot of potential reasons for this discrepancy. One that gets mentioned frequently is the first-round bye given to the top two seeds. The logic goes that the week off, rather than helping a team rest up and prepare for the Divisional round, somehow hurts them, possibly by disrupting the natural rhythm of the week.

I've shown that on average the home team wins about 57% of the time during meaningful games of the regular season. If the bye week is the cause of this Super Bowl drought, then it seems reasonable that we should find the first and second seeds winning their home playoff games less often than expected.

A list of the seeding of the teams in the last 10 Super Bowls is all that's necessary for this experiment, so I simply made the list by hand from Wikipedia, which has fairly comprehensive coverage of each year's playoffs.

Wikipedia also has a comprehensive page on the Monte Carlo method, but in short it works by repeatedly generating random realizations of the problem at hand and comparing the results of the randomized trials to the real data. Given enough runs, the Monte Carlo method should converge to a stable result, allowing us to see if the assumptions that went into the Monte Carlo simulation are valid statistical representations of reality.

The Monte Carlo algorithm was set up to predict the expected number of Super Bowl appearances for each seed, under the assumptions that home field advantage was a flat 57% and that the different rankings of the teams had no bearing on game outcomes. No additional advantage from the bye week was programmed into the model.
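A minimal sketch of how such a simulation could be set up is below. It is not the code used for this post. The bracket layout (six seeds per conference, #3 hosting #6 and #4 hosting #5 in the wild-card round, re-seeding so that #1 hosts the lowest remaining seed) is my assumption about the format in use during these seasons, and every function name is made up for illustration.

```python
import random
from statistics import mean, pstdev

HOME_WIN_PROB = 0.57  # flat home-field advantage quoted in the post
SEEDS = [1, 2, 3, 4, 5, 6]


def play(home, away, rng, p_home=HOME_WIN_PROB):
    """Return the winning seed; the seeds themselves have no effect on the odds."""
    return home if rng.random() < p_home else away


def conference_champion(rng, p_home=HOME_WIN_PROB):
    """Simulate one conference bracket (assumed six-team format)."""
    # Wild-card round: #3 hosts #6 and #4 hosts #5; #1 and #2 have byes.
    wild_card = sorted([play(3, 6, rng, p_home), play(4, 5, rng, p_home)])
    # Divisional round: #1 hosts the lowest remaining seed, #2 hosts the other.
    d1 = play(1, wild_card[1], rng, p_home)
    d2 = play(2, wild_card[0], rng, p_home)
    # Conference championship: the better (lower-numbered) seed hosts.
    home, away = sorted([d1, d2])
    return play(home, away, rng, p_home)


def simulate_decade(rng, n_seasons=10, p_home=HOME_WIN_PROB):
    """Count Super Bowl appearances per seed, AFC and NFC combined."""
    counts = {s: 0 for s in SEEDS}
    for _ in range(n_seasons):
        for _conference in ("AFC", "NFC"):
            counts[conference_champion(rng, p_home)] += 1
    return counts


def expected_appearances(n_trials=20000, seed=0):
    """Return {seed: (mean appearances, standard deviation)} over many decades."""
    rng = random.Random(seed)
    trials = [simulate_decade(rng) for _ in range(n_trials)]
    result = {}
    for s in SEEDS:
        values = [t[s] for t in trials]
        result[s] = (mean(values), pstdev(values))
    return result
```

Calling `expected_appearances()` returns a mean and a standard deviation for each seed. As a sanity check under these assumptions, a #1 seed has to win exactly two home games, so its mean should sit near 20 × 0.57² ≈ 6.5 appearances per decade.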

The number of Super Bowl appearances for each seed (AFC and NFC seeds combined) is shown in Table 1. Note that even a 7% home-field advantage gives the #1 seeds ~1.5 more Super Bowl appearances per decade than they would have with no home field advantage (in which case the #1 and #2 seeds would each make it to 5 out of 10 Super Bowls, as would be expected).
Table 1: Playoff Model Predictions
Seed | Predicted # of Appearances | Actual # of Appearances | Predicted # of Wins | Actual # of Wins

The standard deviations of all the results are listed next to each predicted value; the relatively small sample of Super Bowls results in fairly large margins of error in the simulation.

Regardless, it's pretty clear that the extra bye week isn't hampering the first two seeds from getting to the championship. The #2 seed has made almost as many appearances as predicted, while the #1 seed is, if anything, reaching the Super Bowl more often than they should be.

Because I was interested, I also computed the number of times each seed wins the Super Bowl. For this calculation I made the additional assumption that there is no home field advantage in the Super Bowl, which seemed reasonable given that the game is held on neutral ground. Those results are also presented in Table 1.
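Because each conference bracket has only five games, the same model can also be worked out exactly instead of by random trials. The sketch below swaps in that exact enumeration as a cross-check; it is not the Monte Carlo code used for the post, and the bracket layout and function names are my own assumptions.

```python
from itertools import product


def seed_probabilities(p_home=0.57):
    """Exact probability that each seed wins its conference (assumed bracket)."""
    probs = {s: 0.0 for s in range(1, 7)}
    # Five games per conference; True means the home team won that game.
    for games in product([True, False], repeat=5):
        weight = 1.0
        for home_won in games:
            weight *= p_home if home_won else 1 - p_home
        wc_a = 3 if games[0] else 6      # wild card: #3 hosts #6
        wc_b = 4 if games[1] else 5      # wild card: #4 hosts #5
        better, worse = sorted([wc_a, wc_b])
        d1 = 1 if games[2] else worse    # #1 hosts the lowest remaining seed
        d2 = 2 if games[3] else better   # #2 hosts the other wild-card winner
        home, away = sorted([d1, d2])    # better seed hosts the title game
        champion = home if games[4] else away
        probs[champion] += weight
    return probs


def expected_wins(n_super_bowls=10, p_home=0.57):
    """Expected Super Bowl wins per seed, treating the game as a coin flip."""
    # Two conference champions meet on neutral ground, so each wins half the time.
    return {
        s: n_super_bowls * 2 * p * 0.5
        for s, p in seed_probabilities(p_home).items()
    }
```

With a neutral-site Super Bowl, the expected number of wins for a seed is simply half its expected number of appearances, so the #1 seeds come out at roughly 3.2 wins per decade in this sketch.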

Discussion and Conclusions
The errors are fairly large, but the overall match between the model and the data indicates that there is neither an extra advantage nor a disadvantage to having the bye (although there is tantalizing, but not quite significant, evidence that #1 seeds aren't winning as many Super Bowls as they should). Without a larger sample size, however, any firm conclusions would be premature.

Unfortunately, when it comes to the Super Bowl you only get one new data point a year, so it's going to be quite a while before any signal stands out from the noise. One interesting note to mull over while waiting for more data: in the first five of the post-realignment playoffs, a #1 seed reached the Super Bowl all five years. Since then, only three top seeds have made it to the big game, while the last three Super Bowl winners were all ranked 4th or lower.

Monday, July 8, 2013

Home Field Advantage II: The Cold Weather Edge

To investigate the effect that weather has on home field advantage, I've compared the average temperature difference between home and visiting teams over more than a decade's worth of games. I find that when the temperature differential is larger than 20° F the team coming from the colder city always has an advantage against the warmer-weather franchise compared to the overall home team win percentage, even when the cold-weather team is the visitor. This result persists even after the data are corrected for the effect of teams which have played against each other multiple times, and indicates that there may be some persistent advantage gained by teams which become acclimatized to poor playing conditions, although why this should be is unclear.

A few posts ago I investigated the effect that distance has on home field advantage, and found that teams traveling East had a much more difficult time playing on the road than visiting franchises coming from the West (or traveling North/South). However, as I noted in that post, distance is but one of many possible components of home field advantage.

Because NFL teams are scattered all over the country, many games (especially toward the end of the season) happen between teams used to dramatically different climates. Along with distance, the temperature differential is extensively discussed in the lead-up to a big game. The most notable example of this trend is the coverage of the Tampa Bay Buccaneers' longstanding cold-weather futility. (This coverage, interestingly, largely ceased after the Bucs beat the Eagles in Philadelphia in the NFC championship game, only their second-ever win in temperatures below 40° Fahrenheit, en route to winning Super Bowl XXXVII.)

Of course, just because pundits and announcers like to talk about the weather doesn't mean it actually has any impact on the outcome. And the Buccaneers were a historically bad franchise for over a decade before their Super Bowl win. Let's dig in and find out exactly what (if any) impact the weather really has.

While my other home-field advantage study used game results I personally downloaded, that data did not include any temperature information. The Armchair Analysis database, however, has plenty of information on game conditions. From this database I obtained game results as well as weather information for every regular season game between 2000 and 2011.

Before digging into the temperature data I first computed the home team win percentage for the entire Armchair Analysis sample. Overall, the home team wins 56.9% of the time, only 1.1% less than for my data. This consistency is very encouraging, and indicates that results obtained with one data set can be accurately compared with the other.

To integrate the temperature data into the win-loss results I first computed the average temperature for every stadium in the league for each week of the regular season (Figure 1). Because the sample size for a given week is fairly small (roughly 5 games per week per field) I included the temperatures for the weeks immediately before and after as well, which helped to smooth out the 'wrinkles' and should provide more accurate averages.
Figure 1: Average home-field temperature for each team over the course of the regular season. Hotter temperatures are red, while colder temperatures are blue.
For teams playing in a dome I set the temperature at 72° F. For stadiums with a retractable roof I used the ambient temperature when the roof was open and 72° when the roof was closed.
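The averaging step can be sketched roughly as follows. The record layout (`home`, `week`, `temp_f`, `roof`) is hypothetical and is not the Armchair Analysis schema; the 72° indoor value and the one-week window on either side come from the description above.

```python
from collections import defaultdict

DOME_TEMP_F = 72.0  # indoor temperature assumed for domes and closed roofs


def effective_temp(game):
    """Temperature counted for one game: 72 F indoors, ambient otherwise."""
    # 'roof' is a hypothetical field holding "open", "closed", or "dome".
    if game["roof"] in ("dome", "closed"):
        return DOME_TEMP_F
    return game["temp_f"]


def weekly_home_averages(games, window=1):
    """Average temperature per (home team, week), pooling neighbouring weeks."""
    by_team_week = defaultdict(list)
    for g in games:
        by_team_week[(g["home"], g["week"])].append(effective_temp(g))

    weeks = sorted({g["week"] for g in games})
    teams = sorted({g["home"] for g in games})
    averages = {}
    for team in teams:
        for week in weeks:
            pooled = []
            # Include the weeks immediately before and after to smooth the curve.
            for w in range(week - window, week + window + 1):
                pooled.extend(by_team_week.get((team, w), []))
            if pooled:
                averages[(team, week)] = sum(pooled) / len(pooled)
    return averages
```

Pooling three weeks roughly triples the number of games behind each average, at the cost of slightly blurring rapid changes late in the season.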

Most of Figure 1 makes sense — Green Bay gets frighteningly cold in December and January, while all three Floridian teams play in fairly warm conditions. Kansas City is colder than I would have thought, however, and Pittsburgh has comparable weather to icy Buffalo. But overall it seems as though there is enough data in the sample to produce reasonable weekly averages.

With temperature averages established, I next determined the expected temperature differential between the home and away teams for every game in my sample. Note that in the following analysis I am not using the actual temperatures for each game but rather the averages for that week for the two teams. While somewhat more abstracted than using the real temperatures, sticking with the averages significantly simplifies things — if I used the specific game time temperatures for each game I would have to compare the expected temperatures for both the home and away teams to the actual conditions. Seeing how really extreme weather affects teams would be interesting, but that analysis is for another post.
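In code, the expected differential is just a lookup and a subtraction. The sketch below assumes a dictionary of weekly averages keyed by `(team, week)` and a home-minus-away sign convention; both are illustrative choices of mine rather than anything fixed by the analysis.

```python
def temperature_differential(game, averages):
    """Expected home-minus-away temperature for one game, in degrees F.

    `averages` maps (team, week) to that team's average home temperature.
    Returns None when either team has no average for that week.
    """
    week = game["week"]
    home_avg = averages.get((game["home"], week))
    away_avg = averages.get((game["away"], week))
    if home_avg is None or away_avg is None:
        return None
    # Negative values mean the home city is colder than the visitor's city.
    return home_avg - away_avg
```

Because only the weekly averages are used, a freak warm day in Green Bay or a cold snap in Miami has no effect on the differential assigned to that game.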

Figure 2 shows the home team's win percentage as a function of the average temperature differential. The overall home team winning percentage is also shown, as are 1-sigma bootstrapped error bars.
Figure 2: Home team win percentage broken up by average temperature differential. The red line shows the home team's win percentage for the entire sample.
When the visiting team comes from a city whose temperature differs from the home city's by less than 20°, there is essentially no change in home field advantage (although there is weak evidence that the away team does better when visiting cities with similar weather). However, there is a dramatic shift for temperature differentials larger than ±20°: when a warm-weather team travels to the frozen North, they are almost 10% less likely to win than on average, while the situation reverses completely when a team used to duking it out in the cold road-trips to more tropical climes.
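For readers who want to reproduce this kind of plot, here is a rough sketch of the binning and the bootstrapped 1-sigma errors. The bin edges are placeholders I picked around the ±20° threshold, not the bins used in Figure 2, and the function names are hypothetical.

```python
import random
from statistics import pstdev


def bin_index(diff, edges):
    """Return i such that edges[i] <= diff < edges[i + 1], or None if outside."""
    for i in range(len(edges) - 1):
        if edges[i] <= diff < edges[i + 1]:
            return i
    return None


def bootstrap_sigma(outcomes, n_resamples=1000, rng=None):
    """1-sigma error on a win fraction, from resampling games with replacement."""
    rng = rng or random.Random(0)
    n = len(outcomes)
    fractions = [sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)]
    return pstdev(fractions)


def win_pct_by_bin(games, edges=(-60, -20, 0, 20, 60)):
    """games: iterable of (differential, home_won) pairs, with home_won 0 or 1."""
    bins = [[] for _ in range(len(edges) - 1)]
    for diff, home_won in games:
        i = bin_index(diff, edges)
        if i is not None:
            bins[i].append(home_won)
    # One (win fraction, bootstrap sigma) pair per bin; empty bins give None.
    return [
        (sum(b) / len(b), bootstrap_sigma(b)) if b else (None, None)
        for b in bins
    ]
```

The bootstrap treats the games in each bin as the population and asks how much the win fraction moves when that population is redrawn, which is why sparsely filled bins end up with wider error bars.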

Discussion and Conclusions
Before digging into the provocative results in Figure 2, some caution is advised. While each bin has several hundred individual games, it is possible that a few specific matchups between divisional rivals could be biasing the results. For instance, the NFC North has two teams with some of the coldest weather in the league (Green Bay and Chicago) as well as two teams which play in domes (Detroit and Minnesota). Depending on how the League draws up the schedule this division could contribute up to four games a year in the most extreme temperature differential bins, exactly the ones which show a significant change in home field advantage.

So how do we know that the apparent trend with temperature isn't merely the result of the Packers and Bears beating up on the Lions and Vikings over the past decade (or Patriots-Dolphins, or Chiefs-Chargers, etc.)? Controlling for this potential source of bias is actually fairly simple: just give every distinct matchup the same weight in the computation.

Basically, for every combination of home and away teams present in a bin, I've computed the total home team winning percentage instead of treating each game as a separate event. These matchup winning percentages are added together in the same way that the original games were in Figure 2 to produce a corrected temperature differential histogram — Figure 3.
Figure 3:
Despite all of my concern, Figure 3 shows only a slight reduction in the trends when teams with multiple matchups are taken into account. Now it's possible to evaluate these results with at least some confidence that they aren't being dominated by just a few teams.
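The matchup correction can be sketched as follows for the games in a single temperature bin. This is an illustration of the idea rather than the code behind Figure 3, and the tuple layout is an assumption.

```python
from collections import defaultdict


def matchup_weighted_win_pct(games):
    """Home win fraction with every distinct (home, away) pairing weighted equally.

    games: iterable of (home, away, home_won) tuples from one bin,
    where home_won is 1 for a home win and 0 otherwise.
    """
    record = defaultdict(lambda: [0, 0])  # (home, away) -> [home wins, games]
    for home, away, home_won in games:
        record[(home, away)][0] += home_won
        record[(home, away)][1] += 1
    if not record:
        return None
    # Collapse each pairing to one win fraction, then average the pairings.
    per_matchup = [wins / played for wins, played in record.values()]
    return sum(per_matchup) / len(per_matchup)
```

For example, if Green Bay beat Detroit at home four times and Miami lost its only home game against New England, the raw home win fraction is 4/5 while the matchup-weighted value is (1.0 + 0.0) / 2 = 0.5, so one lopsided rivalry no longer dominates the bin.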

And the results are certainly interesting, especially if you root for a cold weather team! Not only does playing up North make it tough on visitors, with the home team winning ~7% more games than teams in moderate climates and nearly 65% of the time overall, but the advantages provided by a frigid home environment appear to persist even when traveling.

It's not too surprising to find that rough weather can be difficult for a team which isn't used to it, but I wouldn't have predicted that fair-weather franchises would have just as much trouble when hosting teams used to the cold. I couldn't say how — perhaps teams used to playing in unpleasant conditions simply become extra excited about games where they know they won't need to worry about wearing gloves and sleeves!
