Monday, August 19, 2013

Quarterback Rating I: Year-to-Year Progression

Using quarterback ratings I've charted out a QB's average improvement from his first season as the starter. On average a QB sees only a minor ~10-point rating boost in his second year, with his rating remaining flat (or lower) for the rest of his career. Additionally, very few (~20%) players will ever have a season with a QB rating more than 20 points higher than their first year. These results indicate that a quarterback's first season is a reliable indicator of their future success, and that passers who struggle in the early stages of their career are unlikely to show significant long-term improvement.

As the guy responsible for handling the ball on every single offensive play, the quarterback is unambiguously the most important player on a team. So when a team drafts a new quarterback the pressure is extremely high - both on the player to perform to expectations and on the management to ensure they're getting a good return on their (significant!) investment.

In recent years QBs have been asked to step in and start as rookies with increasing frequency. Last year saw a record 5 rookie signal-callers taking the majority of their team's snaps. While this year's draft appears to have a definite lack of QBs ready to start immediately, it's a virtual certainty that a few desperate teams will roll the dice on their shiny new gunslingers.

With the importance of the quarterback position and proliferation of young, untested starters, it's critical for teams to accurately evaluate QBs, not only as college prospects but even while they're playing in the NFL. While there is clearly value in exploring how quarterback talent is evaluated for the NFL draft, the sheer number of college teams, and the limited opportunities given for the best players to play against each other, make it very difficult to perform such an analysis without more advanced tools.

Fortunately, charting the progression of quarterbacks once they enter the NFL is also interesting, and somewhat easier due to the small number of teams and the high level of competition. A good manager is always watching how their players progress, and it's highly relevant to know whether a struggling QB is merely inexperienced or a hopeless cause. There are several ways to dig into this topic: for now I'll focus on computing the average year-to-year progression of NFL quarterbacks as a general barometer for how a QB should be expected to develop.

As usual the data come from the Armchair Analysis database. I first queried the database for the identifying information for all QBs, then fed that into a query which returned all game stats for each QB. From there season totals were computed.
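
The aggregation from game stats to season totals can be sketched roughly as follows. This is a minimal illustration, not the actual queries used for this post: the row layout, player IDs, and numbers are all made up, and the real Armchair Analysis schema will differ.

```python
from collections import defaultdict

# Hypothetical game-level rows (NOT the Armchair Analysis schema):
# (player_id, season, attempts, completions, yards, touchdowns, interceptions)
games = [
    ("QB01", 2011, 30, 18, 210, 1, 1),
    ("QB01", 2011, 35, 22, 260, 2, 0),
    ("QB01", 2012, 28, 17, 190, 1, 2),
    ("QB02", 2012, 40, 27, 310, 3, 1),
]


def season_totals(rows):
    """Sum per-game passing stats into one record per (player, season)."""
    totals = defaultdict(lambda: [0, 0, 0, 0, 0])
    for player, season, *stats in rows:
        for i, value in enumerate(stats):
            totals[(player, season)][i] += value
    return {key: tuple(values) for key, values in totals.items()}


totals = season_totals(games)
for (player, season), stats in sorted(totals.items()):
    print(player, season, stats)
```

Keying on the (player, season) pair keeps each season separate, which is what the per-season ratings need later on.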

Finally, the seasonal QB rating for each quarterback was determined. Because the QB rating can be highly biased if a passer only has a small number of attempts in a given year, I only took ratings from seasons in which the QB threw at least 150 passes.
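
For reference, here is a sketch of the rating calculation with the 150-attempt cut applied. It assumes that "QB rating" means the standard NFL passer rating; the function names and the example stat lines are my own illustrations.

```python
MIN_ATTEMPTS = 150  # seasons with fewer pass attempts are skipped


def passer_rating(att, comp, yards, tds, ints):
    """Standard NFL passer rating from season totals (assumed to be the rating used here)."""
    def clamp(x):
        # Each of the four components is limited to the range [0, 2.375].
        return max(0.0, min(x, 2.375))

    a = clamp((comp / att - 0.3) * 5)      # completion percentage
    b = clamp((yards / att - 3) * 0.25)    # yards per attempt
    c = clamp((tds / att) * 20)            # touchdown rate
    d = clamp(2.375 - (ints / att) * 25)   # interception rate
    return (a + b + c + d) / 6 * 100


def qualifying_rating(att, comp, yards, tds, ints):
    """Return the season rating, or None if the season is below the attempt cut."""
    if att < MIN_ATTEMPTS:
        return None
    return passer_rating(att, comp, yards, tds, ints)


# Hypothetical stat lines, chosen to hit the ceiling and to fail the cut.
best = qualifying_rating(200, 160, 2600, 25, 0)
too_few = qualifying_rating(100, 60, 700, 5, 3)
print(round(best, 1), too_few)
```

Because every component is capped, the rating can never exceed 158.3, which is part of why a handful of attempts can swing it so wildly.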

A (relatively) simple way to track a signal-caller's improvement over time is to compare their QB rating from a given season to earlier seasons. An aggregate plot comparing a passer's QB rating from later seasons to their first 'full' season (full being a season where the QB attempted at least 150 passes) is shown in Figure 1. The data are shown as black points, while the averages (and standard errors) are shown in red.
Figure 1: QB rating improvement from first season as a function of years in the league. Red points show average improvements.
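
The red points in a plot like this can be computed along the following lines. This is a sketch under a few assumptions: the ratings below are invented, each player's seasons have already passed the 150-attempt cut, and "years in the league" is measured from the first full season.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical qualifying-season ratings, keyed by player and then by season year.
ratings = {
    "QB01": {2005: 75.0, 2006: 85.0, 2007: 83.0},
    "QB02": {2008: 80.0, 2009: 88.0, 2010: 90.0},
    "QB03": {2010: 70.0, 2011: 76.0},
}


def progression(ratings_by_player):
    """Map years since first full season to (mean improvement, standard error)."""
    deltas = defaultdict(list)
    for seasons in ratings_by_player.values():
        first_year = min(seasons)
        for year, rating in seasons.items():
            if year != first_year:
                deltas[year - first_year].append(rating - seasons[first_year])
    summary = {}
    for years, values in sorted(deltas.items()):
        # The standard error needs at least two points; report None otherwise.
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else None
        summary[years] = (mean(values), sem)
    return summary


summary = progression(ratings)
for years, (avg, sem) in summary.items():
    print(years, avg, sem)
```

Each black point corresponds to one entry in a `deltas` list, and each red point to the mean and standard error of that list.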

While there is significant scatter, it is clear that on average a QB only shows improvement between their first and second full seasons. After that, performance stabilizes until the 7th season or so, where it begins to decrease (although the data appear to show that the few QBs who make it to their 10th season are able to maintain their improved performance).

This performance boost, at only 5-10 rating points, is moderate at best, and indicates that a quarterback's first full season is a strong indicator of their future success. Of course, this is only an average and as such somewhat of an abstraction - clearly not all QBs will follow exactly this trend.

To gain more insight into the maximum potential improvement over a quarterback's career I've plotted a histogram of peak QB rating improvement (or minimum reduction, the sad reality for some passers) in Figure 2. It's clear from this figure that the majority of signal-callers never progress beyond a 20-point improvement in QB rating, even during their best seasons, with only 20% of all passers in the sample beating this threshold (see footnote 1).
Figure 2: Histogram of peak QB performance compared to a player's first starting season.
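
The quantity being histogrammed, and the fraction of passers above the 20-point mark, can be sketched like this. The ratings are invented for illustration, and each list is assumed to hold a player's qualifying seasons in chronological order.

```python
# Hypothetical qualifying-season ratings, first full season listed first.
ratings = {
    "QB01": [75.0, 85.0, 83.0],
    "QB02": [80.0, 88.0, 90.0],
    "QB03": [70.0, 76.0, 95.0],
    "QB04": [82.0, 78.0, 80.0],
    "QB05": [65.0, 72.0],
}


def peak_improvement(season_ratings):
    """Best later-season rating minus the first-season rating (negative if never better)."""
    first, *later = season_ratings
    return max(later) - first


# Only players with at least one later season have a defined peak improvement.
peaks = {
    player: peak_improvement(seasons)
    for player, seasons in ratings.items()
    if len(seasons) > 1
}
fraction = sum(p > 20.0 for p in peaks.values()) / len(peaks)
print(peaks, fraction)
```

A passer who never matches his first season gets a negative peak, which is the "minimum reduction" case.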

Discussion and Conclusions
This analysis shows that most quarterbacks, even at their very best, shouldn't be expected to show dramatic improvement at any point during their careers, and only moderate improvement from their initial starting season. It also indicates that even a rookie QB's ceiling can be estimated with reasonable certainty, which has clear ramifications for evaluating quarterbacks. For instance, this is bad news for Andrew Luck (first year QB rating of 76.5), Ryan Tannehill (76.1), Jake Locker (74.0), and Brandon Weeden (72.6), who are all unlikely to ever see a triple-digit rating but are tabbed as the starters heading into 2013.

These results also lend credence to the arguments of impatient fans, who expect to see immediate results from new QBs and have no patience for any 'adjustment period', 'learning curve', or any other excuse offered by a team for a young passer's poor play. I had always assumed these fans were merely short-sighted, unwilling to wait and see how a player would develop. But now it's much more difficult to dismiss their concerns so easily.

1: The two players in the sample with a 40+ point QB rating improvement? Alex Smith and Eli Manning. 

Monday, August 5, 2013

Penalties I: Referee Bias

In addition to making the occasional blown call, multiple sources have noted that referees appear to have a subtle, pervasive, likely subconscious, home-team bias. Here I attempt to quantify that bias, using different categories of penalties to highlight any discrepancy between penalties that require no interpretation (and should not be subject to this sort of bias) and penalties that involve the judgement of the referees (and therefore would be prone to bias). I find that there is a small but statistically significant discrepancy between judgement-call penalties on the home and away teams, with the visitors getting flagged an average of ~0.1 more times per game. What is most striking about this result is not its statistical significance but how small it is, a testament to the (often overlooked) fact that NFL referees are generally quite good at their jobs.

If you watch football for long enough, eventually you'll see a play that makes you uncontrollably angry - specifically, angry at the refs. How could they have blown that call so badly? Were they even watching the play?

This outrage, however, usually fades fairly fast—you have some reluctant understanding that what's obvious to you from the super-slo-mo replay is not as crystal clear when seen at full speed, and most individual calls/non-calls have a small impact on the final score. (Of course, there are some notable exceptions).

Individual plays such as these are so infrequent that they are not well-suited to statistical analysis. However, it is also possible that referees can be biased by the location of the game, either because the refs are from the area or are subconsciously influenced by the cheering home crowd. The NFL mitigates the former issue by rotating crews between stadiums, but what about the latter?

Unfortunately, while some work has already been done on this very issue, actual numbers on any bias appear to be thin on the ground online. General assertions from non-open-access sources (see footnote 1) abound, as do people using studies of soccer(!) officiating to back up their claims about the NFL. I did run across an interesting article that attempted to quantify home/away bias in individual officiating crews, but it unfortunately suffers from a small (13 weeks) sample size and a lack of error estimates: is calling an average of 1.5 extra penalties on the away team a significant effect or have they just shown how noisy their data are? (The fact that the sum of each crew's 'bias' is close to zero is circumstantial evidence for the latter case).

Once again my data come from the thorough folks at Armchair Analysis. In addition to providing data on individual penalties, they also aggregate the calls into one of several helpful categories. Using their categories as jumping off points, I lumped almost all penalties in the entire data set into one of four categories:
  • Judgement: Penalties like holding, pass interference, and illegal use of hands, for both offense and defense.
  • Timing: False starts, offsides, encroachment, and neutral zone infractions.
  • Positioning: All kinds of illegal blocking penalties (e.g. blocks in the back, crackback blocks, tripping).
  • Dumb: Taunting, roughing the passer, giving him the business, etc.

I split up the penalty data into home and away bins, then computed the average number of penalties per game in each category. To get a sense of the uncertainties, I bootstrapped the data. These averages are shown in Figure 1.
Figure 1: Average penalties per game in each of the four categories discussed in the Data section.
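
For readers unfamiliar with bootstrapping, the idea is to resample the games with replacement many times and use the spread of the resampled averages as the uncertainty. Here is a minimal sketch; the per-game counts, the number of resamples, and the fixed seed are all illustrative choices, not the values used for the figure.

```python
import random
from statistics import mean, stdev


def bootstrap_mean(values, n_resamples=1000, seed=0):
    """Return (sample mean, bootstrap standard error of the mean)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample the games with replacement and record each resample's mean.
    resampled = [mean(rng.choices(values, k=n)) for _ in range(n_resamples)]
    return mean(values), stdev(resampled)


# Hypothetical per-game judgement-call penalty counts for away teams.
away_counts = [3, 2, 4, 1, 3, 2, 5, 2, 3, 2] * 20

estimate, error = bootstrap_mean(away_counts)
print(f"{estimate:.2f} +/- {error:.2f} penalties per game")
```

The same function would be run separately on the home and away counts in each penalty category.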

For both penalties relating to positioning (the illegal blocks) and dumb penalties there is statistically zero referee bias; both the home and away teams get flagged at the same rate (within the errors). This is not surprising, as these calls are fairly cut-and-dried, with little room for interpretation. Also not surprising is that the away team suffers more timing penalties (~0.2 more per game) - despite also being generally black and white, things like false starts and offsides are the fouls most likely to be affected by crowd noise.

For judgement call penalties like holding or pass interference, however, there is a small but statistically significant excess of penalties for the away team, with the visitor receiving an average of 2.70±0.03 penalties while the home team only gets called 2.59±0.03 times per game. These fouls should not be significantly affected by crowd noise, and thus indicate that referees do indeed hold a slight bias in favor of the home team.
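
A quick way to check the size of this discrepancy from the quoted numbers is to compare the difference to its combined uncertainty. This sketch assumes the home and away errors are independent and roughly Gaussian, so they add in quadrature; that is a simplification on my part rather than a formal test.

```python
from math import sqrt

# Averages and bootstrap errors quoted above (judgement-call penalties per game).
away, away_err = 2.70, 0.03
home, home_err = 2.59, 0.03

diff = away - home
# Independent errors add in quadrature (an assumption of this sketch).
diff_err = sqrt(away_err**2 + home_err**2)
n_sigma = diff / diff_err

print(f"difference = {diff:.2f} +/- {diff_err:.2f} ({n_sigma:.1f} standard errors)")
```

Under those assumptions the excess is about 0.11 penalties per game, roughly 2.6 standard errors from zero.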

Discussion and Conclusions
So it seems that NFL refs are indeed biased. But honestly, one tenth of a penalty per game is a pretty small bias. Since teams only play 8 away games during the regular season, this adds up to less than one extra penalty per season, and since each team also plays 8 home games things should average out over time. Even in the playoffs, where a #6 seed would have to play 3 away games to make it to the Super Bowl, this bias shouldn't play a large role. The real story here is how fair NFL officials are, even when calling fouls in front of 80,000 rabid, screaming, angry fans.

1: In an interview with Wired, one of this book's authors cites this sort of referee bias as the reason why the Seahawks lost Super Bowl XL. I find this frightening, as anyone who writes an entire book about statistics should know that you can't apply statistical trends to individual events. I assume (hope?) that he was just speaking off the cuff and was therefore not very thorough with his answer.

Shout out to Sonographer's Cup winner Andrew "Lulu" Schaffrinna, without whom this post (and indeed, any future studies of penalties) would almost certainly never have happened.
