If you remember back to Monday's update piece on 2010 Stuffs and the original post from '09, we talked about how the NCAA's rushing yardage is all screwy because of their inclusion of sacks. And tackles-for-loss (TFLs) include all negative yardage plays, even though we know that a sack occurs on a pass play while a run stuff occurs (surprise) on a running play. My simple way around this was to remove sacks from TFLs to get a Stuff metric (i.e., TFLs - Sacks = Stuffs). This is nice because it allows for easy calculation and - using CFBStats.com - we can easily calculate conference and national numbers with the nimble use of copy/paste into an Excel spreadsheet (note on execution: use "paste special" and choose Unicode Text).
For the 2010 season update, I converted Stuffs to a per-rushing-play-against rate to try to gauge which teams were stopping opponents in the backfield on rushing plays only. That was the "Stuffs.pP" percentage: the percentage of an opposing offense's running plays that a particular defense stuffed.
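For anyone who would rather skip the spreadsheet, here is a minimal sketch of the same arithmetic in R. The three teams and their totals are made up purely for illustration, and the column names (tfl, sacks, rush_against, stuffs_pp) are my own, not the ones in the data file. It also assumes the denominator is simply rushing plays against as supplied; whether sacks should be pulled out of that count too is a separate judgment call.

```r
# Hypothetical season totals for three made-up defenses (NOT real data).
defense <- data.frame(
  team         = c("Team A", "Team B", "Team C"),
  tfl          = c(90, 72, 60),    # tackles for loss, which include sacks
  sacks        = c(30, 24, 25),
  rush_against = c(400, 480, 500)  # opponent rushing plays faced
)

# Stuffs = TFLs - Sacks, leaving only run plays stopped behind the line.
defense$stuffs <- defense$tfl - defense$sacks

# Stuff rate: percentage of opponent rushing plays that ended in a stuff.
defense$stuffs_pp <- 100 * defense$stuffs / defense$rush_against

print(defense)
```

Sorting on stuffs_pp rather than raw stuffs keeps a defense from looking good just because it faced a lot of running plays.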
The ACC results were interesting. The top 8 teams featured an above-average rate, with 4 teams in the national top 12: BC (4th), Miami (8th), NC State (11th), and Clemson (12th). Note that each team finished the season in the top 25 of S&P rush defenses as well (2nd, 19th, 23rd, and 6th, respectively). FSU was 9th in the conference, yet fielded a 40th ranked rush defense.
This raises the question: Is there any correlation between a defense's Stuff rate and their rush defense ranking? TN poster orinole stated this:
...what are your thoughts about actually using this as a metric to measure defense improvement? Or toward indicating the overall defensive prowess if you will? I'm not sure how many conclusions can be drawn on stuffs...
Following Nolesos Locos' suggestion, I added FootballOutsiders' S&P rush defense ranking. Let's run some statistics using everyone's favorite open-source statistical software platform, R. The data we'll be using for this analysis is located here. Note: You don't need to download it to run the software - we'll make an http call to the needed file. You can copy-and-paste the code in the shaded boxes below if you'd like to reproduce the analysis.
#Read in data to R and display
stuffs = read.csv("http://myweb.fsu.edu/reh3682/data/fun/Stuffs2010.csv", header = TRUE); head(stuffs)
#Perform correlation test
cor.test( stuffs$Rush.D.Rk, stuffs$Stuffs.pP2)
Pearson's product-moment correlation
data: stuffs$Rush.D.Rk and stuffs$Stuffs.pP2
t = -4.9805, df = 118, p-value = 2.19e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.5545874 -0.2567115
sample estimates:
cor
-0.4167756
What we see here is a significant negative correlation between team S&P rush defense rank and Stuffs per rush against. That is, a team with a high Stuffs per rush against tends to have a numerically lower (i.e., better) S&P rush defense rank. That makes sense; maybe even obvious sense. But it's good to see that the relationship is there, and follows logic. (Note that Stuffs per rush against only explains about 17% of the total variance in a team's S&P rush defense rank, so it's not an end-all/be-all metric.)
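If you're wondering where the 17% comes from, it is just the correlation coefficient squared. Here is a quick sketch that plugs in the numbers from the output above and, as a sanity check, rebuilds the t statistic and p-value from the correlation and the degrees of freedom. The variable names are mine; only the two input values come from the test output.

```r
# Values copied from the cor.test() output above.
r  <- -0.4167756
df <- 118

# Share of the variance in rank explained by the Stuffs rate.
r_squared <- r^2

# Rebuild the t statistic and two-sided p-value from r and df.
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df)

print(round(c(r_squared = r_squared, t = t_stat), 4))
print(p_value)
```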
What about the top 30 defenses? Do the higher ranked run-stopping defenses (adjusted) still demonstrate a proclivity toward stuffing opposing runners?
#Perform correlation test for top 30 defenses
cor.test( stuffs$Rush.D.Rk[stuffs$Rush.D.Rk<31], stuffs$Stuffs.pP2[stuffs$Rush.D.Rk<31])
Pearson's product-moment correlation
data: stuffs$Rush.D.Rk[stuffs$Rush.D.Rk < 31] and stuffs$Stuffs.pP2[stuffs$Rush.D.Rk < 31]
t = -2.3166, df = 28, p-value = 0.02806
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.66520737 -0.04767016
sample estimates:
cor
-0.4010517
In short: Yes. Generally, the best rushing defenses stuff opposing running games.
Here is a scatterplot of all 120 teams' S&P rush defense rank and Stuffs per rush against (as a percentage of plays):
Note the 3 outliers on the right-most part of the plot. From top to bottom, they are: Miami (Ohio), Kent State, and Southern Mississippi. So Nolesos Locos' hunch pays off, to the benefit of the Stuffs ~ Rush D. relationship. We know these are smaller schools that aren't getting top-tier talent. So how are these teams posting such a high Stuffs rate? Perhaps they are run-blitzing. The Dr.-turned-king suggested as much:
Let's add 10+ and 20+ yard rushes to see if these or other teams are gambling with an aggressive scheme, hoping to produce splash plays but giving up big plays in the meantime. Ultimately, this could show who is gambling (all-or-nothing run coverages, below/average personnel) and who may have an incredible front 7 (vanilla run coverages, dominant personnel). Note that "10+pP" is the percentage of plays for which that defense gave up a run of 10 yards or more (and the same for "20+pP").
Boston College does an insanely good job at both (#5 in Stuffs.pP, #1 in lowest rate of 10+ yard rushes allowed). FSU is actually 17th in rate of 10+ yard rushes allowed.
We've already shown that there is a nice correlation between a team's Stuffs rate and their S&P rushing defense rank. So let's see if, through the same statistical test, there's a relationship between a team's S&P rushing defense rank and their 10+ and 20+ yard rushes-allowed rate:
cor.test(stuffs$Rush.D.Rk,stuffs$Ten..pP2)
Pearson's product-moment correlation
data: stuffs$Rush.D.Rk and stuffs$Ten..pP2
t = 7.2059, df = 118, p-value = 5.895e-11
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.4146318 0.6660386
sample estimates:
cor
0.5527903
cor.test(stuffs$Rush.D.Rk,stuffs$Twenty..pP2)
Pearson's product-moment correlation
data: stuffs$Rush.D.Rk and stuffs$Twenty..pP2
t = 5.8627, df = 118, p-value = 4.232e-08
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.3232234 0.6028675
sample estimates:
cor
0.4749486
Assuming you're not asleep by now, the above tests aren't very mind-blowing: Good rushing defenses limit big plays against. The 2009 FSU Seminole defense proves the corollary to that relationship.
What now? Well, we have two covariates that we know can help describe a team's rushing defense aptitude: Stuffs per rushing play against rate, and 10+ yard rushes allowed rate. Let's incorporate both into a multivariate model. From that, we can answer a question like: Which variable describes the change in a team's S&P rushing defense rank more strongly?
Maybe I'll demonstrate this model in the future, but I'm risking losing the rest of you; so I'll summarize the findings:
- It turns out that, while Stuffs rate is important, the ability of a defense to lower the rate of 10+ yard rushes against is roughly 2.5 times as important to a team's S&P rushing defense rank.
So it pays to stop the bleeding first by limiting the big rushes against. As FSU's defensive personnel increase in experience and girth, I would expect both of those covariates and their final S&P rushing defense rank to improve as well.
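For the curious, here is one way a comparison like that could be set up. This is only a sketch, not the model behind the 2.5 figure, since I haven't shown how that number was computed. It assumes a linear regression on z-scored variables, so the two slopes share a scale and can be compared, and it runs on simulated stand-in data rather than the real 2010 file. The helper name standardized_fit is hypothetical; the column names match the ones used in the tests above.

```r
# Fit rank on both rates after z-scoring every variable, so the two
# slopes are on a common scale and can be compared directly.
standardized_fit <- function(df) {
  z <- as.data.frame(scale(df[, c("Rush.D.Rk", "Stuffs.pP2", "Ten..pP2")]))
  lm(Rush.D.Rk ~ Stuffs.pP2 + Ten..pP2, data = z)
}

# Simulated stand-in data (NOT the real 2010 numbers): 120 made-up teams.
set.seed(2010)
n <- 120
fake <- data.frame(
  Stuffs.pP2 = rnorm(n, mean = 10, sd = 2),
  Ten..pP2   = rnorm(n, mean = 12, sd = 3)
)
# Build a score where big runs allowed matter more than stuffs, then rank it
# (rank 1 = lowest score = best defense).
score <- -1 * fake$Stuffs.pP2 + 2 * fake$Ten..pP2 + rnorm(n, sd = 3)
fake$Rush.D.Rk <- rank(score)

fit   <- standardized_fit(fake)
betas <- coef(fit)[c("Stuffs.pP2", "Ten..pP2")]
print(round(betas, 3))

# Ratio of absolute standardized slopes: one rough reading of how many
# times as important one covariate is relative to the other.
print(unname(abs(betas["Ten..pP2"]) / abs(betas["Stuffs.pP2"])))
```

To try it on the real data, pass the stuffs data frame from earlier to standardized_fit() in place of the simulated one. Keep in mind that rank is an ordinal outcome, so a linear model here is a convenience rather than a rigorous choice.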