Why Don’t NFL Teams Pass More Often?

Debunking common NFL myths in an analytical study on the true value of passing the ball

Background

Analytics are not used enough in the NFL. In a league with an abundance of money, intelligence, and skill, one may assume that the game we see teams play today is the most optimized and efficient it can be. However, the reality is that the league and its gurus still stick to many traditional aspects of the sport, not utilizing analytical techniques as much as they could. Our research reveals a significant flaw in the way the game is currently played.

Around the same time the three-point line was added to the NBA, the NFL experienced a similarly drastic change with the popularization of the Spread Offense. Many know it as the change that was designed to spread the opposing team’s defense horizontally across the field, exposing holes for the offense. A lesser-known fact is that the implementation of the playstyle caused the passing game to be more efficient than ever…

Figure 1

… and increasingly so over the years. Figure 1 shows that passing efficiency (which accounts for yards, first downs, touchdowns, interceptions, and sacks) has significantly exceeded rushing efficiency for at least the past forty years.

Initial Findings & Suspicions

Struck by this widening gap between the efficiencies of the two play types, we set out to explore how significant the discrepancy is and how it has affected modern NFL teams’ decision-making. As the data source, we aggregated NFL play-by-play data obtained with Ron Yurko’s nflscrapR package for R.

The initial findings supported the idea that passing has become more efficient: in 2019, the average yards gained per pass attempt (PYPA) was 6.73, while the average yards gained per carry (YPC) was 4.40. The Baltimore Ravens, who led the league in yards per carry (5.6), still gained fewer yards per carry than Mitch Trubisky and the Chicago Bears gained per pass attempt with their dismal 5.7 PYPA (worst in the 2019 season). Nonetheless, as seen in Figures 2 & 3, NFL teams have passed the football only 58.5% of the time (counting run and pass plays only) over the past eleven seasons, a share that one would logically expect to be higher given the above numbers.

Figure 2
Figure 3
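
The per-play averages and pass share discussed above can be reproduced with a few lines of code. The following is a minimal sketch in Python, not the original R analysis: the play records are made-up illustrative values, and the field layout `(play_type, yards_gained)` is an assumption rather than the nflscrapR column naming.

```python
from collections import defaultdict

# Hypothetical play records: (play_type, yards_gained). Not real NFL data.
plays = [
    ("pass", 12), ("run", 3), ("pass", 0), ("run", 5),
    ("pass", 8), ("pass", 5), ("run", 4), ("pass", 15),
]


def yards_per_play(plays):
    """Return the average yards gained for each play type."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for play_type, yards in plays:
        totals[play_type] += yards
        counts[play_type] += 1
    return {play_type: totals[play_type] / counts[play_type] for play_type in totals}


def pass_rate(plays):
    """Share of run-or-pass plays that were passes."""
    passes = sum(1 for play_type, _ in plays if play_type == "pass")
    return passes / len(plays)


averages = yards_per_play(plays)
print(f"yards per pass attempt: {averages['pass']:.2f}")
print(f"yards per carry:        {averages['run']:.2f}")
print(f"pass rate:              {pass_rate(plays):.1%}")
```

Applied to a full season of play-by-play rows filtered to run and pass plays, the same two functions would give league-wide or per-team figures comparable to the PYPA, YPC, and pass-share numbers quoted in the text.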

If teams pass only 58.5% of the time even though the league’s best rushing team averages fewer YPC than the worst passing team’s PYPA, then surely something must be missing. Though not necessarily flawed, the initial findings offered a rather one-dimensional analysis of the data. They showed that passing yields significantly more yards for any team, but not that passing is more strongly correlated with success. Therefore, the group sought to compare how strongly teams’ winning percentages (WP) in a given season correlate with rushing and with passing.

First, a metric was required to measure the success of each play type for all 352 team-seasons (32 teams × 11 seasons). The most logical choice was Success Rate over Average. A play is considered successful when it gains at least 40% of the yards-to-go on first down, 60% on second down, and 100% on third or fourth down. The metric was calculated by dividing a team’s success rate for a play type by the league-average success rate for that play type in the given season. Correlations between these success rates and the WPs were then computed and graphed, as seen below.

Figures 4 & 5
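
To make the metric concrete, here is a rough Python sketch of the success definition, the Success Rate over Average ratio, and a Pearson correlation. The thresholds come from the text; the sample plays, the league rate of 0.45, and the small team table are invented for illustration, and the function names are hypothetical.

```python
import math

# Percent of yards-to-go a play must gain to count as successful, by down
# (thresholds as stated in the text).
THRESHOLD_PCT = {1: 40, 2: 60, 3: 100, 4: 100}


def is_successful(down, yards_to_go, yards_gained):
    """Integer arithmetic avoids floating-point edge cases at the threshold."""
    return 100 * yards_gained >= THRESHOLD_PCT[down] * yards_to_go


def success_rate(plays):
    """plays: list of (down, yards_to_go, yards_gained) tuples."""
    return sum(is_successful(*play) for play in plays) / len(plays)


def success_rate_over_average(team_rate, league_rate):
    """Team success rate divided by the league average for that season."""
    return team_rate / league_rate


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pass plays for one team: two of the four meet the threshold.
team_pass_plays = [(1, 10, 4), (2, 6, 3), (3, 3, 3), (1, 10, 2)]
team_rate = success_rate(team_pass_plays)
sroa = success_rate_over_average(team_rate, league_rate=0.45)  # assumed league rate

# Hypothetical (SROA, winning percentage) pairs for four team-seasons.
sroa_values = [0.9, 1.0, 1.1, 1.2]
win_pcts = [0.25, 0.5, 0.5, 0.75]
r = pearson(sroa_values, win_pcts)
print(f"team success rate: {team_rate:.2f}, SROA: {sroa:.3f}, r: {r:.3f}")
```

In the study this calculation would be run once for passing and once for rushing across all 352 team-seasons, giving the two correlations compared in Figures 4 & 5.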

Although neither variable has a strong correlation (> .7) with WP, passing success is nearly twice as correlated with winning as rushing success. Even without knowing the correlations, one can infer that passing success is more correlated with winning merely from the spread of the data points on the graphs. While Figure 4 has outliers well beyond the reaches of a wide oval shape, Figure 5 contains all of its data within a football-shaped ellipse.

Creating a Win Percentage Model

The next step in the process was constructing a model that predicted a team’s winning percentage given certain passing and rushing attributes. This model would identify which aspects of a team matter more in building a successful team, which meant selecting rushing and passing variables and weighting them to optimize the model’s correlation with WP. The variables used were the following:

Passing Attributes:

  • Passing Success Rate (as described before)
  • Adjusted Sack Rate on pass plays (inversely proportional to a team’s success in pass protection)
  • Pass Touchdowns (directly proportional to pass scoring and success)

Rushing Attributes:

  • Rushing Success Rate
  • Adjusted Line Yards on rushing plays (directly proportional to a team’s success in rush blocking)
  • Rush Touchdowns

A team legend was created so the outliers could be identified.

Figure 6

With a correlation of about 0.67, the refined model in Figure 6 came close to a strong relationship with WP. More significantly, passing statistics had an effect on the model 1.5 times greater than rushing statistics; in other words, the pass variables were weighted 1.5 times as heavily. With this evidence, it was becoming more apparent that the passing game is far more critical to a team’s success than the running game. The increasingly supported hypothesis that teams do not pass enough was now raising more questions about NFL teams’ decision-making.
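
The article does not spell out the model’s functional form, so the following Python sketch is only one plausible reading: each attribute is standardized, the three passing attributes and the three rushing attributes are each summed (with sack rate sign-flipped because lower is better), and the passing total is weighted 1.5 times the rushing total. The attribute key names and the three example teams are hypothetical.

```python
from statistics import mean, pstdev

PASS_WEIGHT = 1.5  # ratio reported in the text
RUSH_WEIGHT = 1.0


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(value - mu) / sigma for value in values]


def composite_scores(teams):
    """Weighted pass/rush composite per team (assumed model form)."""
    keys = [key for key in teams[0] if key != "team"]
    z = {key: z_scores([team[key] for team in teams]) for key in keys}
    scores = {}
    for i, team in enumerate(teams):
        # A lower sack rate is better, so it enters with a negative sign.
        passing = z["pass_sr"][i] - z["adj_sack_rate"][i] + z["pass_td"][i]
        rushing = z["rush_sr"][i] + z["adj_line_yards"][i] + z["rush_td"][i]
        scores[team["team"]] = PASS_WEIGHT * passing + RUSH_WEIGHT * rushing
    return scores


# Hypothetical teams: A is pass-strong, B is rush-strong, C is average.
teams = [
    {"team": "A", "pass_sr": 1.10, "adj_sack_rate": 5.0, "pass_td": 35,
     "rush_sr": 0.95, "adj_line_yards": 4.0, "rush_td": 10},
    {"team": "B", "pass_sr": 0.95, "adj_sack_rate": 8.0, "pass_td": 20,
     "rush_sr": 1.10, "adj_line_yards": 4.8, "rush_td": 18},
    {"team": "C", "pass_sr": 1.00, "adj_sack_rate": 6.5, "pass_td": 27,
     "rush_sr": 1.00, "adj_line_yards": 4.4, "rush_td": 14},
]

scores = composite_scores(teams)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:+.2f}")
```

With the heavier pass weight, the pass-strong team ranks above the rush-strong team even though their profiles are roughly mirror images, which is the qualitative effect the 1.5x weighting describes.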

Busting Two Common Myths in the Modern NFL

Surely there was something still missing. A key factor not yet taken into account was the potential drawback of repetition and its effect on the value of a play. A common notion about play-calling is that repetition tends to decrease the value of a play, while the unpredictability of “changing it up” (running and passing almost the same number of times) keeps the defense on its toes. To measure the value of variability, or the loss of value with repetition, the run/pass proportions of past play calls in a game were compared with the EPA of the current play, as seen in Figures 7 & 8. EPA, or expected points added, is a popular metric that quantifies the value of a play in terms of the number of points it is predicted to yield for the team with the ball. As ESPN explains, “without going into technical details, the key is that the relationships in the EP formula encapsulate the basic tenets of football, including: being closer to the opposing goal line and farther from your own is better; earlier downs are better (first-and-10 is better than second-and-10, etc.); shorter distance to go is better; being at home is better.” To study this comparison, all NFL plays since the 2009 season were used.

Figures 7 & 8

To reiterate, the x-axes represent the percentage of previous plays in a given game that were the same play type (run or pass), and the y-axes represent the value (EPA) of the given play. Many data points sit at 100% because that value occurs mostly in the first few plays of a game, before any variation in play calls. It is also important to note that “blowout” games, won by more than four possessions, were excluded, since trailing teams tend to pass a lot in desperation, making those plays far less successful. The clear takeaway from Figures 7 & 8 is that repeating the same play type throughout a game has no meaningful impact on its EPA: the blue line of best fit has a slope of < 0.01. The individual data points were left on display for a reason: far more of them fall toward the middle of the graph, where about half of the past plays are the same play call. This reflects the flawed notion in the NFL that there should be an even mix of plays.
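
The x-axis quantity and the line of best fit can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors’ R code: the play sequence and EPA values are made up, the first play of a game is skipped because it has no prior plays, and the slope is an ordinary least-squares fit.

```python
def prior_same_type_share(play_types):
    """For each play after the first, return the share of earlier plays in
    the same game that had the same type as the current play."""
    shares = []
    counts = {"run": 0, "pass": 0}
    for index, play_type in enumerate(play_types):
        if index > 0:
            shares.append(counts[play_type] / index)
        counts[play_type] += 1
    return shares


def ols_slope(xs, ys):
    """Slope of the least-squares line of best fit for y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical play sequence and EPA values for one game.
game = ["pass", "pass", "run", "pass", "run"]
epa = [0.4, -0.2, 0.1, 0.6, -0.3]

shares = prior_same_type_share(game)  # one value per play from the 2nd onward
slope = ols_slope(shares, epa[1:])    # EPA of the plays that have a history
print(shares)
print(f"slope of best-fit line: {slope:.3f}")
```

To mirror Figures 7 & 8, the `(share, EPA)` pairs would be pooled across all non-blowout games and split by the current play’s type before fitting; a slope near zero then indicates that repetition does not erode a play’s value.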

Furthermore, the same plots were produced for the proportion of past play calls of the opposite play type: the percentage of previous runs compared with pass EPA, and vice versa.

Figures 9 & 10

The change in EPA was similarly negligible in these cases, and all trends were consistent throughout the past eleven seasons. The percentages are simply one minus the percentages from the previous two graphs; they are shown here to emphasize that the result holds from either direction. It can reasonably be concluded from Figures 7–10 that passing does not lose value even if a team has already called a high percentage of passes in a given game. This debunks the common myth of the importance of establishing the run: the idea that teams must run the ball to keep the defense honest. From the above analysis, there is no apparent reason why teams do not throw the ball more. But exactly how much would teams benefit if they were to call more pass plays? Quantifying this value for any team is difficult, since no team has experimented with throwing the ball far more often.

Quantifying the Value of Passing More Often

It was first necessary to compare teams’ mean pass EPAs with their run EPAs. Success rates (used earlier in the study) would not be useful in quantifying the value of passing because they only provide the percentage of times that teams are successful and not the actual value of a play.

Figure 11

Mean League Pass EPA = 0.0517 | Mean League Run EPA = -0.0229

In Figure 11, the discrepancies between teams’ mean pass and run EPAs are clear: on average, 2019 regular-season pass plays yielded 0.0746 more expected points than run plays. Not only is passing the ball far more effective, but the negative mean run EPA indicates that running it is costing teams potential points.

Furthermore, to see the actual benefit passing would have for teams, the bar plot in Figure 12 was created to display the total additional expected points each team would have earned in the 2019 season had it passed the ball every time (counting pass and run plays only). The y-axis (points added) was calculated by multiplying the number of run plays a team had in 2019 by the difference between its mean pass EPA and mean run EPA. This calculation can be made independently of teams’ previous plays in a game, since it was concluded above that the values of passing and rushing are not affected by the proportion of prior plays that were the same or the opposite play type.

Figure 12
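
The points-added calculation is simple enough to show directly. In this Python sketch the two league EPA means are the values reported with Figure 11; the figure of 400 run plays is a made-up example, not any team’s actual 2019 total, and the function name is hypothetical.

```python
# League means reported in the text (2019 regular season).
LEAGUE_PASS_EPA = 0.0517
LEAGUE_RUN_EPA = -0.0229


def points_added(run_plays, mean_pass_epa, mean_run_epa):
    """Expected points gained if every run play had been a pass instead:
    number of run plays times the pass-minus-run EPA gap."""
    return run_plays * (mean_pass_epa - mean_run_epa)


gap = LEAGUE_PASS_EPA - LEAGUE_RUN_EPA
example = points_added(400, LEAGUE_PASS_EPA, LEAGUE_RUN_EPA)  # 400 is hypothetical

print(f"league EPA gap per play: {gap:.4f}")
print(f"points added over 400 run plays: {example:.2f}")
```

The bars in Figure 12 use each team’s own run-play count and its own pass and run EPA means rather than the league averages, which is why individual teams land well above or below this league-level example.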

On average, teams would have produced 34.33 additional points in their 2019 season had they passed the ball every time. The San Francisco 49ers’ potential 113 additional points are striking, yet not surprising. The Niners were one of two teams that ran more than they passed, yet with a stellar QB in Jimmy Garoppolo, they averaged nearly twice as many yards per pass attempt (8.4 PYPA) as per carry (4.6 YPC) and had the fifth-highest average pass EPA. Why the 49ers decided to run the ball more is inexplicable. What is surprising, however, is that the more successful teams would have benefited even more from calling more pass plays. Six of the eight teams that made the divisional round of the playoffs and three of the four teams that made the conference championships were above the mean potential points added in Figure 12.

Conclusion

Teams have the opportunity to score more points simply by passing the ball more often. Also, better teams tend to have greater margins between their average pass and run EPAs (take a look at the margins of these successful teams in Figure 11: Ravens, 49ers, Cowboys, Chiefs, Saints, and Seahawks). Despite these facts, teams are unwilling to experiment with their pass/run ratio; after all, teams cannot run statistical experiments when their seasons are at stake. What the long-term trend suggests, however, is that fans will continue to see teams pass more and more in the coming years. Josh Hermsmeyer of FiveThirtyEight puts it best: “The NFL is a passing league that somehow doesn’t pass enough. NFL teams know the medicine works yet stubbornly refuse to take a clinically effective dose.”

Credits

Andrew Cramer, Atharv Karanjkar, and Ethan Schwimmer were also members of the initial project and instrumental in generating ideas, coding, and modeling.

Sources Used

https://github.com/ryurko/nflscrapR-data

https://www.footballoutsiders.com/

https://www.sharpfootballstats.com/rushing-success-rate-over-average--sroa-.html

https://fivethirtyeight.com/features/sorry-running-backs-even-your-receiving-value-can-be-easily-replaced/

https://www.footballoutsiders.com/stat-analysis/2020/finding-optimal-passrun-ratio

https://fivethirtyeight.com/features/is-running-the-ball-back

https://fivethirtyeight.com/features/for-a-passing-league-the-nfl-still-doesnt-pass-enough/

https://thepowerrank.com/2018/09/24/the-surprising-truth-about-passing-and-rushing-in-the-nfl/

https://www.pro-football-reference.com/years/NFL/index.htm

https://www.espn.com/nfl/story/_/id/8379024/nfl-explaining-expected-points-metric

https://www.sharpfootballstats.com/situational-run-pass-ratios--off-.html

https://codeandfootball.wordpress.com/2013/10/11/the-very-murky-world-of-offensive-srs-and-defensive-srs/

https://www.footballoutsiders.com/stats/nfl/offensive-line/2019

http://www.footballperspective.com/why-do-teams-run-the-ball-part-iii/

Translated from: https://medium.com/the-sports-scientist/why-dont-nfl-teams-pass-more-often-e51adc22efb6
