## Monday, June 24, 2013

### Field Position and Scoring Probabilities: Half of the Red Zone is a Dead Zone (for Touchdowns)

Abstract
Any drive's scoring chances increase as the offense moves down the field, but exactly how much an additional X yards gained helps is not generally known (or at least not commonly discussed). In this post I've charted out a team's scoring chances for a first-down situation at any point on the field. In addition to a dramatic increase in touchdown percentage for all drives that have a first down within 10 yards of the end zone, there is a leveling off in the fraction of drives ending in touchdowns right outside of this zone. While the root causes of these features are not made clear by this analysis, they may be due to the necessity for different offensive and defensive tactics near the end zone.

Introduction
As a team drives down the field, excitement naturally builds. Each first down brings them closer to the end zone and a touchdown. At least, it should. How much does each first down improve your chances of scoring, and are there any parts of the field where having a first down closer to the goal line doesn't help matters?

Data
To obtain the necessary data I queried my copy of the Armchair Analysis database for all plays in the first three quarters. I ignored the final period so as not to bias the results with desperation drives from teams attempting a late rally. I then used a Python script to find all first-down plays and the end result of the drive they occurred on.
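
The filtering step described above can be sketched roughly as follows. This is only an illustration, not the original script: the field names (`drive_id`, `quarter`, `down`, `yards_to_goal`) and the `drive_results` mapping are hypothetical stand-ins rather than the actual Armchair Analysis schema, and the sketch assumes every first down (not just those on scoring drives) goes into the denominator.

```python
from collections import defaultdict


def first_down_outcomes(plays, drive_results):
    """Return (yards_to_goal, result) for each first-down play in quarters 1-3.

    plays: iterable of dicts with hypothetical keys
           'drive_id', 'quarter', 'down', 'yards_to_goal'.
    drive_results: dict mapping drive_id -> 'TD', 'FG', or anything else.
    """
    rows = []
    for play in plays:
        if play["quarter"] > 3:  # skip the final period (and anything after it)
            continue
        if play["down"] != 1:  # keep first-down plays only
            continue
        result = drive_results.get(play["drive_id"], "NONE")
        rows.append((play["yards_to_goal"], result))
    return rows


def scoring_fractions(rows):
    """Fraction of first downs at each yard line whose drive ended in a TD or FG."""
    counts = defaultdict(lambda: {"n": 0, "TD": 0, "FG": 0})
    for yards, result in rows:
        bucket = counts[yards]
        bucket["n"] += 1
        if result in ("TD", "FG"):
            bucket[result] += 1
    return {
        yards: {
            "TD": c["TD"] / c["n"],
            "FG": c["FG"] / c["n"],
            "any": (c["TD"] + c["FG"]) / c["n"],
        }
        for yards, c in counts.items()
    }
```

Calling `scoring_fractions(first_down_outcomes(plays, drive_results))` gives one entry per yard line, with the touchdown, field goal, and combined fractions kept separate so they can be plotted as separate series.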

This resulted in 63182 first downs over 17164 scoring drives. Roughly 60% of these plays were on touchdown drives, while the rest were on series that resulted in field goals (I completely ignored safeties, for the record). This uneven distribution is unsurprising, given that TD drives generally cover more of the field (and thus generate more first downs) than FG drives.

Results
A plot of how likely a drive is to end in points as a function of field position is shown in Figure 1. It shows the fraction of scoring drives that result from a first down at a given yard line, with the opponent's end zone denoted by zero. Errors were determined via bootstrapping, and due to the sheer number of samples in this data set they are small.
Figure 1: Probability that a drive ends in a score, plotted against the field position of a first down on that drive.
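
For readers unfamiliar with bootstrapping, here is a minimal sketch of how an error bar on a single yard line's scoring fraction could be estimated. It is not the code used for the figure; the function name, the number of resamples, and the 0/1 encoding of drive outcomes are all assumptions made for illustration.

```python
import random


def bootstrap_fraction_error(outcomes, n_resamples=1000, seed=0):
    """Estimate the standard error of a scoring fraction by bootstrapping.

    outcomes: list of 0/1 flags, one per first down at a given yard line
              (1 if the drive ended in a score, 0 otherwise).
    """
    rng = random.Random(seed)
    n = len(outcomes)
    estimates = []
    for _ in range(n_resamples):
        # Resample the same number of first downs, with replacement.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    mean = sum(estimates) / n_resamples
    variance = sum((e - mean) ** 2 for e in estimates) / (n_resamples - 1)
    return variance ** 0.5
```

The spread of the resampled fractions shrinks roughly as one over the square root of the number of first downs at that yard line, which is why a data set with tens of thousands of plays produces small error bars.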

As expected, the likelihood of scoring any points increases monotonically (aside from a couple of bumps and wiggles most likely due to statistical fluctuations) from the offense's end zone to the other team's goal line. On a team's own side of the field the relationship is linear, with a field position boost of ten yards resulting in roughly a 10% increase in scoring probability.

Once you cross midfield, however, the odds of scoring take a distinct upturn. Looking at the data split into the different types of scores (red and blue points in Figure 1) shows that this uptick is the result of field goals, which makes some sense given that a team starting at the 50 only needs a couple of first downs in order to be in field goal range.

Inside the opponent's 30, the percentage of drives ending in field goals levels off because the offense is already within field goal range — getting additional yardage doesn't make you more able to attempt a field goal. The likelihood of ending the drive with a touchdown, however, continues to increase.

After a leveling off between 10-20 yards away from the opposing team's goal, the TD percentage rockets upwards for first-and-goal situations at the expense of field goals. Ultimately, a first-and-goal at the 1-yard line gives the offense an 85% chance of scoring a touchdown and an almost 95% chance of getting any points.

Discussion and Conclusions
It's somewhat surprising to see the dramatic increase in TD% when the offense is within the opponent's 10-yard line. This implies that there's something different about that last 10 yards: either it becomes significantly easier to score a touchdown (doubtful; I think the opposite is probably true), or teams are more likely to go for it on all four downs when they're so close to scoring. It's also possible that there's a psychological shift, providing a boost of adrenaline to the offense. A full investigation of these possible explanations is beyond the scope of this post, but might be worth revisiting in the future.

Of further note is the lack of improvement in a team's touchdown chances inside the red zone but outside the 10-yard line. This is in stark contrast to the dramatic ramp-up of TD% once a team reaches a first-and-goal scenario. While the TD% in this region stagnates, however, FG% increases correspondingly, leaving a smooth increase in the total scoring probability.

On its own, the leveling off of the touchdown percentage wouldn't be inconsistent with random statistical fluctuations, such as the apparent increased scatter in the total scoring percentage around the 50-yard line. But the consistency of the feature around the opponent's 10-yard line, along with the corresponding increase in the frequency of field goals, indicates that this phenomenon is real.

So it seems like there is indeed a bottleneck effect when a team gets about 15 yards away from a touchdown, likely due to the difficulty of getting a first down very close to the goal line. This bottleneck disappears once a team gets into a first-and-goal situation, possibly as a result of teams' increased willingness to go for it on fourth and goal. So the next time your team has to settle for a field goal when they had first-and-10 from the 12, take a small comfort in knowing that they weren't in quite as good a spot as it seemed.

--A huge shout out to Kenny Rudinger for noticing that my preliminary results for this post were obviously in error, allowing me to sort out the bugs in my analysis code *before* subjecting my boneheaded mistakes to public scrutiny.

## Monday, June 10, 2013

### Quantity over Quality in the NFL Draft

Abstract
NFL teams live and die by the draft. A franchise that consistently drafts well can look forward to years of sustained success, but just a single year of bad evaluations can cripple a team for several seasons. Drafting will never be an exact science, and it's not obvious why some teams appear to be better at it. In this post I investigate what might give these teams their advantage, and find that while drafting better players is correlated with winning more games, simply acquiring more draft picks has a stronger effect on a team's success. This result indicates that teams should focus on obtaining as many selections as possible rather than staking their fortunes to a few highly rated prospects.

Introduction
One of the great things about the NFL is the level of parity. No matter how bad the previous year was, your favorite team is always 'just one year' away from turning it all around. Every year there seems to be one or two teams who dramatically improve their fortunes; look no further than the 2008 Miami Dolphins or 2012 Indianapolis Colts for examples.

Of course, these teams usually crash right back down to Earth (cf. the 2009 Dolphins). But some teams seem to be near the top of the pack year after year.

Quantitative evidence for the above statement comes from the postseason. While 28 different teams have made the playoffs at least once since 2006 (sorry Buffalo, Cleveland, Oakland, and St. Louis fans!), only 10 have made the playoffs in more than half of those seasons. If you want teams that have made 5+ out of 7, your sample drops to five — the Colts, Patriots, Steelers, Ravens, and Giants. So why are these five teams so consistently successful, while most of the rest of the NFL is so streaky?

One possibility is that these teams win so much because they draft better than the rest of the NFL. Teams constantly have to refresh their talent pool as players age; a team which is able to more accurately evaluate college talent should have a huge advantage over teams which can't.

But is that true? Is it even possible to draft well? Some evidence would argue that it is not: there have been plenty of high-profile draft busts in recent memory (e.g. Vernon Gholston), and undrafted stars like Arian Foster immediately tell you that good players are still falling through the cracks.

So the question becomes how to quantify drafting savvy, which is clearly a difficult thing to do. (If it weren't, teams would have already figured out how to draft better!)

Data
I downloaded a comprehensive list of draft results between 1990 and 2011 from Pro Football Reference, which (among other things) lists year, round, team, and when the player left the league. This data isn't perfect, as there are players who spend time outside of football before re-entering the league, but those players are outliers who shouldn't affect the results very much.

Coupled with the draft data I have team win-loss records for each of the aforementioned seasons. These records were compiled from individual game scores downloaded from NFL.com. To make things a little easier I will strip out individual teams from the equation and aggregate all teams together, and then look only at how prior drafts affect current win-loss records over all franchises.

Results
If teams are truly bad at picking talent, then every pick would be essentially equivalent to rolling the dice. Now, we know that this isn't quite true, as otherwise you'd have many more first-round busts and late-round diamonds. But what if it were?

If you assume that teams are totally incapable of evaluating talent, then the optimum strategy to build a winning team becomes clear: stockpile draft picks. In this scenario if you are drafting more warm bodies than other teams, by the laws of probability you will also acquire more talented players. Assuming you can separate the wheat from the chaff in training camp (a dubious assertion, I know, but let's not go down this rabbit hole now), you'll come out ahead in the long run.
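
The "laws of probability" argument can be made concrete with a small sketch. It assumes, purely for illustration, that every pick independently turns into a useful player with the same probability `hit_rate`; that parameter is hypothetical and is not estimated from any data in this post.

```python
def expected_hits(n_picks, hit_rate):
    """Expected number of useful players if each pick succeeds independently."""
    return n_picks * hit_rate


def prob_at_least_one_hit(n_picks, hit_rate):
    """Chance that at least one of n_picks independent picks succeeds."""
    return 1 - (1 - hit_rate) ** n_picks
```

Under this toy model the expected number of hits grows linearly with the number of picks, so a team holding ten picks expects more useful players than a team holding seven, whatever the (shared) hit rate happens to be.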

Even if you make a weaker assumption about a front office's ability to diagnose talent in the draft — maybe that coaches and GMs lack the ability to discriminate between talent levels within a single round of the draft — the logic of grabbing as many picks as possible still holds. This is especially true given the low value teams seem to place on draft picks in future years. If you can give up your first round draft pick this year in exchange for a team's first round draft pick next year plus a second rounder, you essentially get an extra chance at winning the second round 'lottery' by agreeing to wait one more year before trying to get a good first-rounder.

It's simple enough to compute how additional draft picks impact win percentage. Figure 1 shows the Spearman correlation coefficient between the number of draft picks above the NFL average and win percentage. Just looking at the current year's draft isn't enough, so Figure 1 covers several ranges of years: each point counts the picks made between Y and X years before the current season. For example, the point at (6,2) shows that the players drafted between 2 and 6 years ago have a (relatively) strong effect on how well a team is currently doing. It's not the most obvious plot to look at, but it conveys a lot of information in a compact way.
Figure 1: Correlation between the number of draft picks in prior seasons and win percentage. The X axis shows how far back in time we count draft picks, while the Y axis shows the minimum number of years before the current season a pick must be made to be counted. A higher Spearman coefficient indicates that a surplus of draft picks in that range of years is more strongly correlated with win percentage.
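
A rough sketch of the calculation behind each point in this figure is given below. It is not the original analysis code: the `picks` dictionary layout and the function names are hypothetical, the significance cut is not included, and the Spearman coefficient is computed by hand as the Pearson correlation of tie-averaged ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def pick_surplus(picks, team, season, min_back, max_back):
    """Picks made min_back..max_back years before `season`, minus the league mean.

    picks: dict mapping (team, draft_year) -> number of picks (hypothetical layout).
    """
    years = range(season - max_back, season - min_back + 1)
    teams = {t for (t, _) in picks}
    totals = {t: sum(picks.get((t, y), 0) for y in years) for t in teams}
    return totals[team] - sum(totals.values()) / len(totals)
```

Here `min_back` and `max_back` play the roles of the Y and X axes: computing `pick_surplus` for every team-season with `min_back=2, max_back=6`, and passing those values to `spearman` alongside the matching win percentages, would correspond to the point at (6,2).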

Only correlations with greater than 95% significance are plotted. The strongest correlation is only 0.123, for players drafted between 2 and 10 years ago. I can further break down the data with the strongest correlation by round, as shown in Figure 2.
Figure 2: Round-by-round analysis of the strongest correlation in Figure 1. Axes are the same as Figure 1, but with draft round numbers instead of years.

First off, if you only look in the first 1-3 rounds, there isn't any significant correlation. This is probably a function of teams' general reluctance to deal draft picks in the early rounds, which leads to a smaller sample size and therefore a weaker confidence level. The next interesting thing is the sudden pickup in significance when Round 4 is included in the calculation. And going later than Round 4 doesn't help out your win percentage. So having extra picks in Round 4 (and likely earlier) does much more for you than having extra picks in the last few rounds.

Alright, so now we know that additional draft picks can boost your win percentage by a small amount, but only if you look over several years and focus on earlier draft picks. Now we need to test how this compares to a measure of drafting skill.

Estimating how good a team is at drafting is much more difficult than just comparing win totals to the number of draft picks. While certainly not perfect, a decent proxy is the length of a player's tenure in the NFL; if, say, the Giants have drafted the same number of total players as the Cardinals over the last five years but have twice as many that are still in the league, logically it would seem that the Giants are doing a better job identifying talent.
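
As a sketch of this proxy, the count of each team's draftees still in the league could be computed along these lines. The record layout `(team, draft_year, last_year)` is a hypothetical simplification of the downloaded draft list, with `last_year` set to `None` for players who have not left the league.

```python
def still_active_counts(draftees, season, min_back, max_back):
    """Count each team's draftees from a window of years still active in `season`.

    draftees: iterable of (team, draft_year, last_year) tuples, where
              last_year is None for players who have not left the league.
    """
    counts = {}
    for team, draft_year, last_year in draftees:
        # Only consider players drafted min_back..max_back years before season.
        if not (season - max_back <= draft_year <= season - min_back):
            continue
        counts.setdefault(team, 0)
        if last_year is None or last_year >= season:
            counts[team] += 1
    return counts
```

Teams that drafted players in the window but kept none of them in the league appear with a count of zero, so they are not silently dropped from a later correlation.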

In Figures 3 and 4 I've plotted the same metrics as in Figures 1 and 2, but looking at the number of drafted players still in the league instead of the raw draft numbers.

Figure 3: Same as Figure 1, but computing the correlations between the number of players still in the NFL and win percentage.
Figure 4: Same as Figure 2, but for the strongest correlation in Figure 3.

Of course, the number of players still in the league is also dependent on the total number of draft picks a team has. So we expect a correlation at least as strong as for the raw draft picks.

The first thing to notice is that many more ranges of years produce statistically significant correlations. Many of these correlations are larger than the strongest correlation from Figure 1, although the peak correlation is still in roughly the same location. Looking at this peak correlation by round, however, the largest correlations are not much larger than when looking at raw draft picks.

Discussion and Conclusions
Before really jumping into the detailed analysis, it's important to note that none of these correlations are very large — that is to say that at best historical drafting ability plays only a small role in determining how well a team will do in a given year. This is perhaps not hugely surprising, given that there are many other variables (injuries, suspensions, contract holdouts, varying strength-of-schedule) which affect a team's fortunes but have nothing to do with drafting.

Despite this, however, many of the correlations are statistically significant, which means they are very likely to be real. It's always important when looking at correlations to remember that even significant correlations do not necessarily imply causation. But in this case, when it's fairly clear that a team's ability to draft well should directly impact their on-field success, it seems reasonable to assume a causal link.

Let's first discuss the intriguing results in the round-by-round breakdown. It's clear that considering later rounds in the analysis doesn't significantly improve the correlation. The broadness of this result implies that it is not a statistical aberration, which then means that adding extra late-round picks doesn't significantly help your team — the logical conclusion here is that it makes the most sense to package up your 5th, 6th, and 7th round picks and grab extra 4th and earlier selections.

The main conclusion, however, has to be that drafting players who survive in the league isn't much better than simply drafting extra players. It's possible (probable?) that the number of second- and third-string players who stick around the league for a long time (so-called 'career backups') is biasing the results. The best way to test this hypothesis would be to construct some way of comparing player skill and add this into the analysis, but of course such a statistic (which would have to accurately compare such disparate positions as quarterback and defensive tackle) would not be simple to create.

Looking at the data on the raw draft picks indicates that there is indeed some advantage to be gained just by stockpiling draft choices. Given how teams appear to undervalue their draft picks in future years, a forward-thinking team should be able to trade away picks in a current draft in exchange for extra picks in the next year. Repeating this strategy over several years would (in theory) lead to a large surplus of picks.

The correlations are small, but every little advantage in the NFL matters. As long as teams are willing to give up many future-year and/or late-round picks in order to move up just a few spots in the first couple of rounds, there will always be opportunities for a patient team to gobble up the extra selections. Bill Belichick's Patriots (10 playoff appearances in the last 13 years) are well-known for doing just that.
