Monday, February 3, 2014

Do Defenses Get Tired?

One of the hallmarks of good science is reproducibility – the ability for other researchers to repeat (and thus verify or disprove) your work. While I hope I have laid out enough details in each of my posts for anyone interested to check my analyses, I am happy to report that I will now be uploading my code for each post to GitHub. Check it out!

Abstract
One of NFL announcers' favorite statistics is the time of possession, which is usually discussed in the context of how tired the defense must be when they've been on the field for a long time. But do defenses actually get fatigued over the course of the game? To answer that question I used the raw number of plays a defense is on the field (rather than the less accurate time of possession) and computed the probability that the offense will score as a function of this number. Ultimately, even after 70+ plays there is no increase in the offense's point production – a clear indication that defensive players have plenty of endurance to make it through even the longest games.

Introduction
A common statistic to see quoted during a game is time of possession (lazily referred to as ToP in the rest of this post). Usually referenced between quarters or near the end of the game, commentators generally talk about ToP in the context of noting how long one team's defense has spent on the field. (Offenses generally have more flexibility in keeping their players fresh through skill package substitutions.) The not-very-subtle implication is that the defense is getting worn down by the amount of time they've been playing and will therefore be more likely to allow points.

This is, of course, largely bullshit. Since so much more time is spent between plays than actually ticks by while the football is in motion, ToP is really only a good indicator of how much standing around the teams are doing. Additionally, since the game clock stops for an incomplete pass (and pass-heavy offenses tend to pick up yards in chunks and have shorter drives as a result), ToP is naturally skewed in favor of rushing offenses. If ToP were only collected during a play it might have some value; better yet, just strap some pedometers onto the players and figure out how much they're really running around on the field.

The idea at the core of ToP, however – that a defense spending more energy on the field may eventually show signs of fatigue and therefore allow more points – is not unreasonable. The current ToP statistic is just a terrible way of measuring it. This question is especially interesting because if defenses do get tired over the course of a game it would add more value to a strong rushing attack, a facet of the offense that has come under significant fire in recent years as being strictly inferior to the passing game.

While perhaps not quite as good as my earlier pedometer suggestion, the raw number of plays run should be a much better proxy than ToP for investigating whether defenses get tired. Comparing the results of drives as a function of the number of plays already run should therefore indicate whether defenses ever become fatigued enough to affect play.

Data
I started with all the play-by-play data in the Armchair Analysis database, and computed the beginning and end of each drive as well as whether any points were scored. By separating this data out between the home and away teams for each game I constructed a running tally of the number of plays run by the offense at the start of each drive.
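As a rough illustration of the running tally described above, here is a minimal sketch in Python. The record layout (game id, offense, plays in the drive, whether it scored) is a simplified stand-in I made up for the example; it is not the Armchair Analysis schema, and the sample drives are invented.

```python
from collections import defaultdict

# Hypothetical, simplified drive records in game order:
# (game_id, offense, n_plays, scored). Invented data, not the real schema.
drives = [
    ("g1", "home", 3, False),
    ("g1", "away", 8, True),
    ("g1", "home", 6, True),
    ("g1", "away", 3, False),
    ("g1", "home", 5, False),
]

def tally_plays_before_each_drive(drives):
    """Pair each drive's result with the plays its offense had already run."""
    plays_so_far = defaultdict(int)  # keyed by (game_id, offense)
    rows = []
    for game_id, offense, n_plays, scored in drives:
        key = (game_id, offense)
        # Record the tally *before* this drive, then add this drive's plays.
        rows.append((game_id, offense, plays_so_far[key], scored))
        plays_so_far[key] += n_plays
    return rows

rows = tally_plays_before_each_drive(drives)
for row in rows:
    print(row)
```

Keeping one counter per (game, offense) pair is what separates the home and away tallies; the opposing defense has faced exactly that many plays when the drive begins.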

Before getting into the results it's important to note that for this analysis the devil really is in the details. The data can be biased in many ways, some subtle and some not. First and perhaps most obvious is that while all games can be expected to start in a similar way, a drive in the 4th quarter of a blowout is going to look much different than one in a close game. To avoid this problem I restricted my sample only to games where the final margin is within one score (8 points). I also throw out special teams plays, as I am most focused on how the defense plays as a unit (although note that on most teams at least some special teams players will see snaps on the offense or defense).

Another issue is penalties. Most infractions are only enforced after the play is over, and even though (if the penalty is accepted) the original play doesn't count for statistical purposes I still want to count it for this analysis, since the defense still had to play the down. Some penalties, however, result in the refs immediately blowing the play dead (the most notable examples being false starts and encroachment). These penalties I strip out of the final play counts. Occasionally the foul itself occurs after the play is over (e.g. many unsportsmanlike conduct calls). Such dead-ball fouls should also be purged from the data; unfortunately (as far as I can tell) there is no indication in the database whether a penalty is a dead-ball infraction or not, so I choose to leave all of these penalties in my sample. Fortunately these types of penalties are relatively infrequent, and therefore shouldn't significantly affect the results.

Lastly, drives near the end of halves create significant additional bias as well, since many of them are kneel-downs or result in unusual play-calling (Hail Mary passes, record-setting field goals, etc). I cut out the result of any drive that starts within the 2-minute warning of either half, although I include the plays run on those drives in the running totals of plays run during the game.
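The selection rules from the last few paragraphs can be summarized in a short sketch. The field names and penalty labels below are hypothetical placeholders (the database's real codes may differ), and the exact boundary at the two-minute warning is my assumption.

```python
# Hypothetical labels; the real database's penalty codes may differ.
DEAD_AT_SNAP = {"false start", "encroachment"}

def counts_toward_tally(play):
    """Special teams plays and penalties that blow the play dead are not counted.
    Every other play, including one wiped out by an accepted penalty, is."""
    if play["special_teams"]:
        return False
    return play.get("penalty") not in DEAD_AT_SNAP

def is_one_score_game(home_points, away_points, max_margin=8):
    """Keep only games decided by one score (8 points) or less."""
    return abs(home_points - away_points) <= max_margin

def drive_result_is_used(seconds_left_in_half, two_minute_warning=120):
    """Drives starting inside the two-minute warning are dropped from the results.
    Their plays should still be added to the running tally."""
    return seconds_left_in_half > two_minute_warning
```

Note that the last rule only removes the drive's outcome from the scoring sample; the plays themselves still count as work the defense has done.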

It is also worth noting that occasionally there are errors in the database, where the down sequence counter I use to determine the length of each drive is not reset between possessions. This issue is most obvious in the existence of some unusually long (20+ play) drives, although it likely affects shorter drives as well. Generally the incidence of these errors is very low (there are only ~10 of these very long drives in the entire sample, for instance), so I do not believe they will bias the results – especially not for shorter drives, where the sheer number of actual drives should drown out the few erroneous ones.

Results
Before diving into the full analysis, I think it's interesting to look at some raw numbers about NFL drives that aren't usually discussed. Take a look at the distribution of drive lengths in Figure 1, and the distribution of drives per game in Figure 2. The plurality of drives take 3 plays, which makes sense as these are 3-and-out possessions. The occurrence of drives longer than ~6 plays is fairly well described by a power law (a straight line on this log-normal plot) with a cutoff at 21 plays (the few drives above this threshold are likely all spurious results, as mentioned above). Note that, assuming a team would punt on any 4th down and ignoring penalties, the maximum number of plays an NFL drive could take would be 30.

Figure 1: Distribution of drive lengths. After about 5 plays the frequency of drive length decreases quickly, and very few drives take more than ~15 plays.
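The 30-play ceiling can be checked with a quick worst-case count. This is a sketch under simplifying assumptions of my own: the drive starts at the offense's own 1-yard line, every series uses all three downs and gains exactly 10 yards, and no penalties occur.

```python
def max_plays_without_fourth_down(start_yardline=1, plays_per_series=3,
                                  yards_for_first=10):
    """Longest possible drive if the offense never runs a play on 4th down.

    Yard lines run from 1 (own goal line side) to 99; 100 is the end zone.
    """
    ball = start_yardline
    plays = 0
    while True:
        plays += plays_per_series
        # Once the line to gain reaches the goal line it is goal-to-go,
        # so no further first down can be earned by yardage.
        if ball + yards_for_first >= 100:
            break
        ball += yards_for_first
    return plays

print(max_plays_without_fourth_down())  # 10 series of 3 plays = 30
```

Nine first downs move the ball from the 1 to the opposing 9, and the final goal-to-go series adds three more plays, giving ten series in total.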

While Figure 1 has home and away drives lumped together, I've left them separate in Figure 2 – it's pretty clear that there's no significant difference in the number of drives per game between the home team and the visitors. The distributions are well fit by a Gaussian distribution with a mean of almost exactly 10 drives and a standard deviation of a little less than two drives. This indicates that in a normal game a team will have fewer than 12 chances to score points – not a lot of opportunities! (It also implies that a team scoring 40+ points in a game is reaching the endzone on at least half of their possessions.)
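To put rough numbers on those two claims, here is a small sketch using Python's standard library. The mean of 10 and standard deviation of 2 are round values taken from the text rather than the actual fit parameters, drives are treated as a continuous quantity, and the 40-point example assumes touchdowns with extra points only.

```python
from statistics import NormalDist

# Assumed round numbers from the text: mean 10 drives, standard deviation 2.
drives_per_game = NormalDist(mu=10, sigma=2)

# Fraction of team-games with fewer than 12 drives (mean plus one sigma).
frac_under_12 = drives_per_game.cdf(12)

# Touchdowns (7 points each) needed to reach 40 points, as a share of
# a 10-drive and a 12-drive game.
touchdowns_for_40 = -(-40 // 7)  # ceiling division
share_of_drives = [touchdowns_for_40 / n for n in (10, 12)]

print(f"P(fewer than 12 drives) ~ {frac_under_12:.2f}")
print(f"{touchdowns_for_40} touchdowns on 10 to 12 drives: "
      f"{share_of_drives[1]:.0%} to {share_of_drives[0]:.0%} of possessions")
```

Under these assumptions roughly five out of six team-games have fewer than 12 drives, and a 40-point game means scoring a touchdown on half or more of them.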

Figure 2: Distribution of drives per game, for both home and away teams. There is very little difference between the home and away histograms. Solid lines show Gaussian fits to the data, which peak around 10 drives.

With the basics out of the way, now let's delve into the good stuff. I have the number of plays already run by the offense at the start of every drive lined up with the result of that drive. From there it's fairly straightforward to calculate the fraction of drives that end in scores as a function of the number of plays that have been run, which is shown in Figure 3.

Figure 3: Fraction of drives resulting in scores as a function of plays run. No trend is observed.

The errors on Figure 3 come from simple counting statistics, and the bin widths are adaptively chosen to have similar errors. If defenses really did fatigue as they spend more time running around on the field, the percentage of drives ending with points should increase as a function of the number of plays, but there is no evidence for this trend. If you look at touchdowns or field goals individually the picture remains the same – even if an offense runs 70+ plays the defense doesn't budge an inch.
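For anyone wanting to reproduce this kind of plot, one way to get bins with similar errors is to put the same number of drives in each bin. The sketch below does that and attaches a Poisson counting error (the square root of the number of scores, divided by the number of drives). Both choices are my assumptions about what "adaptive bins" and "counting statistics" could mean here, not a description of the exact procedure behind Figure 3, and the demo data is made up.

```python
import math

def adaptive_bins(rows, drives_per_bin):
    """rows: (plays_already_run, scored) pairs. Returns one summary per bin.

    Bins hold equal numbers of drives, so bins are narrow where drives are
    plentiful and wide where they are rare. The last bin may be smaller.
    """
    rows = sorted(rows)
    bins = []
    for start in range(0, len(rows), drives_per_bin):
        chunk = rows[start:start + drives_per_bin]
        n = len(chunk)
        k = sum(1 for _, scored in chunk if scored)
        bins.append({
            "min_plays": chunk[0][0],
            "max_plays": chunk[-1][0],
            "drives": n,
            "score_fraction": k / n,
            "error": math.sqrt(k) / n,  # Poisson counting error on k scores
        })
    return bins

# Made-up demo data: 30 drives, with a score whenever the tally is divisible by 3.
demo_rows = [(p, p % 3 == 0) for p in range(0, 60, 2)]
bins = adaptive_bins(demo_rows, drives_per_bin=10)
for b in bins:
    print(b)
```

A binomial error, sqrt(p(1-p)/n), would be a reasonable alternative; with a scoring rate near 35% the two give error bars of similar size.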

Discussion and Conclusions
It's pretty obvious from Figure 3 that defenses don't get fatigued during games. On a given drive the offense has a ~35% chance of scoring regardless of how much the defense has been on the field. If you look at how rushing averages change over the course of a game you reach the same essential conclusion, which is a good indication that my results are indeed accurate. While on the surface it seems totally reasonable that players would wear down as the game wears on, given the fact that the number of plays a team runs per game is a well known quantity it makes sense that players would have enough conditioning to make it well beyond even the longest of games. (It would be interesting to repeat this analysis for overtime games but my sample size is far too small.) 

So what are the implications of this result? Well, for one it means that announcers should stop talking about how long the defense has been on the field over the course of a game! More importantly, it means that there's one less reason for teams to rely on running the ball – if a coach feels that throwing deep every play best suits the talent on the offense, that coach should feel free to do so without consideration for the defense.
