Data Wrangling 101: Using Python to Fetch, Manipulate and Visualize NBA Data

by Viraj Parekh | April 6, 2017

This is a basic tutorial using pandas and a few other packages to build a simple data pipeline for getting NBA data. Even though this tutorial uses NBA data, you don’t need to be an NBA fan to follow along. The same concepts and techniques can be applied to any project of your choosing.

This is meant to be used as a general tutorial for beginners with some experience in Python or R.

Step One: What data do we need?

The first step to any data project is getting an idea of what you want. We’re going to focus on getting NBA data at a team level on a game by game basis. From my experience, these team level stats usually exist in different places, making them harder to compare across games.

Our goal is to build box scores across a team level to easily compare them against each other. Hopefully this will give some insight as to how a team’s play has changed over the course of the season or make it easier to do any other type of analysis.

On a high level, this might look something like:

Next step: Where is the data coming from?

stats.nba.com has all the NBA data that’s out there, but the harder part is finding a quick way to fetch and manipulate it into the form that’s needed (and what most of this tutorial will be about).

Analytics is fun, but everything around it can be tough.

We’re going to use the nba_py package

Huge shoutout to https://github.com/seemethere for putting this together.

This is going to focus on team stats, so let’s play around a little bit to get a sense of what we’re working with.

Start by importing the packages we’ll need:

import pandas as pd
from nba_py import team

If you’re using Jupyter notebooks, you can pip-install any packages you don’t have straight from the notebook using:

!pip install nba_py

If you’re using Yhat’s Python IDE, Rodeo, you can install nba_py in the Packages tab.

Install packages in the Packages tab. No surprises here.

Referring to the docs, it looks like we’ll need some sort of roster ID to get data for each team. This API hits an endpoint on the NBA’s website, so the IDs are most likely in the URL:

(Unapologetic Knicks bias) Looking at the team page for the Knicks on stats.nba.com, here’s the URL: http://stats.nba.com/team/#!/1610612752/
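For what it’s worth, the number in a URL like that can be pulled out in code rather than copied by hand. Here is a minimal sketch using only the standard library; the helper name team_id_from_url is made up for this example, and it assumes the first all-digit piece of the URL fragment is the value we want.

```python
from urllib.parse import urlparse


def team_id_from_url(url):
    """Return the first all-digit piece of the URL fragment as an int."""
    # For the Knicks URL the fragment is '!/1610612752/'.
    fragment = urlparse(url).fragment
    digit_parts = [part for part in fragment.split('/') if part.isdigit()]
    if not digit_parts:
        raise ValueError('no numeric ID found in %r' % url)
    return int(digit_parts[0])


knicks_url = 'http://stats.nba.com/team/#!/1610612752/'
print(team_id_from_url(knicks_url))
```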

That number at the end looks like a team ID. Let’s see how the passing data works:

class nba_py.team.TeamPassTracking(team_id, measure_type='Base', per_mode='PerGame', plus_minus='N', pace_adjust='N', rank='N', league_id='00', season='2016-17', season_type='Regular Season', po_round='0', outcome='', location='', month='0', season_segment='', date_from='', date_to='', opponent_team_id='0', vs_conference='', vs_division='', game_segment='', period='0', shot_clock_range='', last_n_games='0')

passes_made()
passes_recieved()

knicks = team.TeamPassTracking(1610612752)

All the info is stored in the knicks object:

	TEAM_ID	TEAM_NAME	PASS_TYPE	G	PASS_FROM	PASS_TEAMMATE_PLAYER_ID	FREQUENCY	PASS	AST	FGM	FGA	FG_PCT	FG2M	FG2A	FG2_PCT	FG3M	FG3A	FG3_PCT
0	1610612752	New York Knicks	made	64	Rose, Derrick	201565	0.144	56.73	4.42	6.30	14.02	0.449	4.34	8.64	0.503	1.95	5.38	0.363
1	1610612752	New York Knicks	made	58	Jennings, Brandon	201943	0.111	48.22	4.93	7.09	15.47	0.458	5.31	10.50	0.506	1.78	4.97	0.358
2	1610612752	New York Knicks	made	66	Porzingis, Kristaps	204001	0.106	40.61	1.47	3.29	7.65	0.430	2.56	5.50	0.466	0.73	2.15	0.338
3	1610612752	New York Knicks	made	46	Noah, Joakim	201149	0.073	40.20	2.24	4.17	8.85	0.472	3.43	6.93	0.495	0.74	1.91	0.386
4	1610612752	New York Knicks	made	72	Anthony, Carmelo	2546	0.102	35.83	2.88	4.18	9.65	0.433	3.13	6.99	0.447	1.06	2.67	0.396
5	1610612752	New York Knicks	made	73	Lee, Courtney	201584	0.090	30.92	2.33	3.92	8.42	0.465	3.01	5.97	0.505	0.90	2.45	0.369
6	1610612752	New York Knicks	made	68	Hernangomez, Willy	1626195	0.076	28.26	1.25	2.32	5.50	0.422	1.74	3.93	0.442	0.59	1.57	0.374
7	1610612752	New York Knicks	made	46	Baker, Ron	1627758	0.045	24.93	1.87	2.61	5.72	0.456	1.93	3.80	0.509	0.67	1.91	0.352
8	1610612752	New York Knicks	made	46	Thomas, Lance	202498	0.042	23.24	0.76	1.93	4.67	0.414	1.70	3.78	0.448	0.24	0.89	0.268
9	1610612752	New York Knicks	made	75	O’Quinn, Kyle	203124	0.068	22.93	1.49	2.35	4.87	0.482	1.93	3.63	0.533	0.41	1.24	0.333

Did you know you can inspect, copy and save data frames in the History tab in Rodeo?

Referring back to the docs, this looks like per-game averages for passes. There’s definitely a lot that can be done with this, but let’s try to get it for a specific game. Going back to the docs, the last_n_games parameter looks like what we need:

knicks_last_game = team.TeamPassTracking(1610612752, last_n_games=1)
knicks_last_game.passes_made().head(10)

	TEAM_ID	TEAM_NAME	PASS_TYPE	G	PASS_FROM	PASS_TEAMMATE_PLAYER_ID	FREQUENCY	PASS	AST	FGM	FGA	FG_PCT	FG2M	FG2A	FG2_PCT	FG3M	FG3A	FG3_PCT
0	1610612752	New York Knicks	made	1	Baker, Ron	1627758	0.212	72.0	6.0	7.0	15.0	0.467	7.0	11.0	0.636	0.0	4.0	0.000
1	1610612752	New York Knicks	made	1	Ndour, Maurice	1626254	0.135	46.0	1.0	3.0	9.0	0.333	3.0	4.0	0.750	0.0	5.0	0.000
2	1610612752	New York Knicks	made	1	Anthony, Carmelo	2546	0.126	43.0	2.0	5.0	16.0	0.313	4.0	13.0	0.308	1.0	3.0	0.333
3	1610612752	New York Knicks	made	1	O’Quinn, Kyle	203124	0.118	40.0	5.0	5.0	6.0	0.833	4.0	4.0	1.000	1.0	2.0	0.500
4	1610612752	New York Knicks	made	1	Lee, Courtney	201584	0.118	40.0	3.0	6.0	8.0	0.750	2.0	4.0	0.500	4.0	4.0	1.000
5	1610612752	New York Knicks	made	1	Hernangomez, Willy	1626195	0.082	28.0	3.0	4.0	8.0	0.500	4.0	6.0	0.667	0.0	2.0	0.000
6	1610612752	New York Knicks	made	1	Holiday, Justin	203200	0.071	24.0	3.0	4.0	7.0	0.571	4.0	6.0	0.667	0.0	1.0	0.000
7	1610612752	New York Knicks	made	1	Kuzminskas, Mindaugas	1627851	0.059	20.0	2.0	2.0	6.0	0.333	2.0	5.0	0.400	0.0	1.0	0.000
8	1610612752	New York Knicks	made	1	Randle, Chasson	1626184	0.044	15.0	0.0	0.0	1.0	0.000	0.0	1.0	0.000	0.0	0.0	NaN
9	1610612752	New York Knicks	made	1	Vujacic, Sasha	2756	0.035	12.0	1.0	2.0	3.0	0.667	2.0	2.0	1.000	0.0	1.0	0.000

This looks clean enough to be wrangled into a form that can be worked with.

If we’re trying to create a team-level box score, we’re more than likely going to need to join tables together down the line. That’s just something to keep in mind.

Hitting the ShotTracking endpoint looks interesting:

	TEAM_ID	TEAM_NAME	SORT_ORDER	G	CLOSE_DEF_DIST_RANGE	FGA_FREQUENCY	FGM	FGA	FG_PCT	EFG_PCT	FG2A_FREQUENCY	FG2M	FG2A	FG2_PCT	FG3A_FREQUENCY	FG3M	FG3A	FG3_PCT
0	1610612752	New York Knicks	1	1	0-2 Feet – Very Tight	0.091	4.0	8.0	0.500	0.500	0.091	4.0	8.0	0.500	0.000	0.0	0.0	NaN
1	1610612752	New York Knicks	2	1	2-4 Feet – Tight	0.318	15.0	28.0	0.536	0.536	0.295	15.0	26.0	0.577	0.023	0.0	2.0	0.000
2	1610612752	New York Knicks	3	1	4-6 Feet – Open	0.409	16.0	36.0	0.444	0.500	0.250	12.0	22.0	0.545	0.159	4.0	14.0	0.286
3	1610612752	New York Knicks	4	1	6+ Feet – Wide Open	0.182	7.0	16.0	0.438	0.500	0.102	5.0	9.0	0.556	0.080	2.0	7.0	0.286

Following along in Rodeo? Your view should look something like this.

This looks interesting! We wanted EFG% (effective field goal percentage) in our original table, but it looks like we can get EFG% for open and covered shots. Let’s group ‘Open’ and ‘Wide Open’ together, along with ‘Tight’ and ‘Very Tight.’

Effective field goal percentage is a statistic that adjusts field goal percentage to account for the fact that three-point field goals count for three points while all other field goals count for only two.

This might help answer questions like “Do teams hit more open shots when they win?”

df_grouped = knicks_shots.closest_defender_shooting()

# Create a new boolean column OPEN, mapped from the 'CLOSE_DEF_DIST_RANGE' column:
# True for the 'Open' and 'Wide Open' ranges, False for 'Tight' and 'Very Tight'.
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html
df_grouped['OPEN'] = df_grouped['CLOSE_DEF_DIST_RANGE'].map(lambda x: 'Open' in x)

df_grouped

	TEAM_ID	TEAM_NAME	SORT_ORDER	G	CLOSE_DEF_DIST_RANGE	FGA_FREQUENCY	FGM	FGA	FG_PCT	EFG_PCT	FG2A_FREQUENCY	FG2M	FG2A	FG2_PCT	FG3A_FREQUENCY	FG3M	FG3A	FG3_PCT	OPEN
0	1610612752	New York Knicks	1	1	0-2 Feet – Very Tight	0.091	4.0	8.0	0.500	0.500	0.091	4.0	8.0	0.500	0.000	0.0	0.0	NaN	False
1	1610612752	New York Knicks	2	1	2-4 Feet – Tight	0.318	15.0	28.0	0.536	0.536	0.295	15.0	26.0	0.577	0.023	0.0	2.0	0.000	False
2	1610612752	New York Knicks	3	1	4-6 Feet – Open	0.409	16.0	36.0	0.444	0.500	0.250	12.0	22.0	0.545	0.159	4.0	14.0	0.286	True
3	1610612752	New York Knicks	4	1	6+ Feet – Wide Open	0.182	7.0	16.0	0.438	0.500	0.102	5.0	9.0	0.556	0.080	2.0	7.0	0.286	True

The last column ‘OPEN’ gives us the information we need. Now we can aggregate based off of it. Let’s get the total number of open shots.
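One way to do that aggregation is with a boolean mask on the new column. This is only a sketch: it rebuilds a few columns of the table by hand so it runs without hitting the API, and the variable name total_open_shots is mine.

```python
import pandas as pd

# A few columns from the table above, retyped by hand for illustration.
df_grouped = pd.DataFrame({
    'CLOSE_DEF_DIST_RANGE': ['0-2 Feet - Very Tight', '2-4 Feet - Tight',
                             '4-6 Feet - Open', '6+ Feet - Wide Open'],
    'FGA': [8.0, 28.0, 36.0, 16.0],
    'OPEN': [False, False, True, True],
})

# Boolean indexing keeps only the rows flagged as open, then sums the attempts.
total_open_shots = df_grouped.loc[df_grouped['OPEN'], 'FGA'].sum()
print(total_open_shots)
```

Negating the mask with ~df_grouped['OPEN'] selects the covered rows instead.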

That looks like it worked. Similarly, we can get the total number of “covered” shots taken (looks like it’s a lot higher…nothing surprising there.)

Keep in mind, this is a bit misleading, as layups and other shots near the basket are more likely to have a nearby defender.

Referring to the definition for EFG%:

$$EFG = \frac{FGM + 0.5 \times 3PM}{FGA}$$
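To make the formula concrete, here it is as a small function, checked against one row of the shot table above. The function name effective_fg_pct is just for illustration, and returning None when there are no attempts is my own choice, not something from the NBA’s definition.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Return eFG% = (FGM + 0.5 * 3PM) / FGA, or None when no shots were attempted."""
    if fga == 0:
        return None
    return (fgm + 0.5 * fg3m) / fga


# "4-6 Feet - Open" row from the table above: FGM=16, FG3M=4, FGA=36
open_row_efg = effective_fg_pct(16, 4, 36)
print(round(open_row_efg, 3))  # the table lists EFG_PCT as 0.500 for this row
```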

We definitely have all the information we need to compute this for open and covered shots:

# Apply the formula above to the open shots and to the covered shots:
is_open = df_grouped['OPEN']
open_efg = (df_grouped.loc[is_open, 'FGM'].sum() + 0.5 * df_grouped.loc[is_open, 'FG3M'].sum()) / df_grouped.loc[is_open, 'FGA'].sum()
covered_efg = (df_grouped.loc[~is_open, 'FGM'].sum() + 0.5 * df_grouped.loc[~is_open, 'FG3M'].sum()) / df_grouped.loc[~is_open, 'FGA'].sum()

print('%.3f' % open_efg)
print('%.3f' % covered_efg)


0.500
0.528

Interesting… shooting better when there’s a defender nearby makes it look like there’s more to the story. Then again, nothing about the Knicks ever seems to make sense.

Referring back to the original plan, it looks like we have most of the stats we set out to get. However, we still haven’t addressed:

1) Who was the game against? Who won?

2) How many days rest did each team have?

3) How are we going to get all this data together?

From the looks of it, there isn’t anything in the nba_py team modules we’re using that can be directly used as an identifier.

However, it looks like we can get stats for date ranges. To test this, let’s look at a single game the Knicks played on Sunday, January 29th:

	TEAM_ID	TEAM_NAME	SORT_ORDER	G	CLOSE_DEF_DIST_RANGE	FGA_FREQUENCY	FGM	FGA	FG_PCT	EFG_PCT	FG2A_FREQUENCY	FG2M	FG2A	FG2_PCT	FG3A_FREQUENCY	FG3M	FG3A	FG3_PCT
0	1610612752	New York Knicks	1	1	0-2 Feet – Very Tight	0.156	6.0	20.0	0.300	0.300	0.148	6.0	19.0	0.316	0.008	0.0	1.0	0.000
1	1610612752	New York Knicks	2	1	2-4 Feet – Tight	0.344	23.0	44.0	0.523	0.591	0.258	17.0	33.0	0.515	0.086	6.0	11.0	0.545
2	1610612752	New York Knicks	3	1	4-6 Feet – Open	0.320	13.0	41.0	0.317	0.390	0.156	7.0	20.0	0.350	0.164	6.0	21.0	0.286
3	1610612752	New York Knicks	4	1	6+ Feet – Wide Open	0.180	9.0	23.0	0.391	0.522	0.039	3.0	5.0	0.600	0.141	6.0	18.0	0.333
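One sanity check on a table like this is to add up the attempts and confirm that FGA_FREQUENCY is each bucket’s share of the total. The sketch below retypes the FGA column by hand (the short index labels are mine) rather than calling the API.

```python
import pandas as pd

# FGA per defender-distance bucket, retyped from the single-game table above.
fga = pd.Series(
    [20.0, 44.0, 41.0, 23.0],
    index=['Very Tight', 'Tight', 'Open', 'Wide Open'],
)

total_fga = fga.sum()
# Each bucket's share of all attempts, rounded like the FGA_FREQUENCY column.
fga_frequency = (fga / total_fga).round(3)

print(total_fga)
print(fga_frequency)
```

The rounded shares line up with the FGA_FREQUENCY values in the table (0.156, 0.344, 0.320, 0.180).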

A quick check of the box score confirms that the Knicks attempted a total of 128 shots, so it looks like adding a date field will work out. We’ll just need to figure out which dates to pass in.

We still don’t know what the outcome was, so let’s jump back into the docs to see if another module will help out.

# Hitting another endpoint
knicks_id = 1610612752  # the Knicks team ID from the URL above
knicks_log = team.TeamGameLogs(knicks_id)

knicks_log.info()

	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	FGM	…	FT_PCT	OREB	DREB	REB	AST	STL	BLK	TOV	PF	PTS
0	1610612752	0021601160	APR 04, 2017	NYK vs. CHI	W	30	48	0.385	240	42	…	0.625	16	37	53	26	5	7	15	22	100
1	1610612752	0021601145	APR 02, 2017	NYK vs. BOS	L	29	48	0.377	240	33	…	0.840	8	24	32	20	12	2	11	20	94
2	1610612752	0021601133	MAR 31, 2017	NYK @ MIA	W	29	47	0.382	240	38	…	0.941	8	31	39	25	9	5	14	18	98
3	1610612752	0021601115	MAR 29, 2017	NYK vs. MIA	L	28	47	0.373	240	33	…	0.810	17	35	52	19	2	6	14	16	88
4	1610612752	0021601098	MAR 27, 2017	NYK vs. DET	W	28	46	0.378	240	45	…	0.923	4	33	37	26	13	5	12	16	109
5	1610612752	0021601085	MAR 25, 2017	NYK @ SAS	L	27	46	0.370	240	41	…	0.867	12	33	45	24	6	5	16	16	98
6	1610612752	0021601071	MAR 23, 2017	NYK @ POR	L	27	45	0.375	240	36	…	0.900	9	31	40	23	5	9	11	20	95
7	1610612752	0021601066	MAR 22, 2017	NYK @ UTA	L	27	44	0.380	240	38	…	0.889	9	27	36	19	5	1	11	26	101
8	1610612752	0021601050	MAR 20, 2017	NYK @ LAC	L	27	43	0.386	240	40	…	0.792	14	34	48	24	6	1	12	19	105
9	1610612752	0021601016	MAR 16, 2017	NYK vs. BKN	L	27	42	0.391	240	41	…	0.962	5	29	34	20	6	4	7	26	110
10	1610612752	0021601001	MAR 14, 2017	NYK vs. IND	W	27	41	0.397	240	35	…	0.615	11	41	52	21	8	4	14	15	87
11	1610612752	0021600986	MAR 12, 2017	NYK @ BKN	L	26	41	0.388	240	39	…	0.813	11	32	43	22	5	8	9	20	112
12	1610612752	0021600975	MAR 11, 2017	NYK @ DET	L	26	40	0.394	240	36	…	0.636	8	36	44	26	4	7	18	18	92
13	1610612752	0021600952	MAR 08, 2017	NYK @ MIL	L	26	39	0.400	240	39	…	0.667	10	33	43	22	4	6	15	20	93
14	1610612752	0021600935	MAR 06, 2017	NYK @ ORL	W	26	38	0.406	240	40	…	0.964	12	33	45	26	6	1	9	23	113
15	1610612752	0021600928	MAR 05, 2017	NYK vs. GSW	L	25	38	0.397	240	39	…	0.800	12	35	47	18	5	6	15	20	105
16	1610612752	0021600909	MAR 03, 2017	NYK @ PHI	L	25	37	0.403	240	33	…	0.879	9	32	41	14	10	3	10	20	102
17	1610612752	0021600895	MAR 01, 2017	NYK @ ORL	W	25	36	0.410	240	34	…	0.806	13	37	50	21	9	3	11	16	101
18	1610612752	0021600882	FEB 27, 2017	NYK vs. TOR	L	24	36	0.400	240	33	…	0.842	8	32	40	17	10	6	17	19	91
19	1610612752	0021600868	FEB 25, 2017	NYK vs. PHI	W	24	35	0.407	240	43	…	0.783	10	34	44	21	6	7	11	22	110
20	1610612752	0021600853	FEB 23, 2017	NYK @ CLE	L	23	35	0.397	240	42	…	0.706	16	34	50	24	4	7	12	19	104
21	1610612752	0021600845	FEB 15, 2017	NYK @ OKC	L	23	34	0.404	240	41	…	0.857	6	33	39	19	8	12	15	21	105
22	1610612752	0021600817	FEB 12, 2017	NYK vs. SAS	W	23	33	0.411	240	34	…	0.810	5	39	44	18	5	8	19	19	94
23	1610612752	0021600800	FEB 10, 2017	NYK vs. DEN	L	22	33	0.400	240	52	…	0.600	10	23	33	36	10	5	10	14	123
24	1610612752	0021600791	FEB 08, 2017	NYK vs. LAC	L	22	32	0.407	240	46	…	0.833	12	29	41	25	9	5	11	22	115
25	1610612752	0021600768	FEB 06, 2017	NYK vs. LAL	L	22	31	0.415	240	37	…	0.788	6	34	40	16	4	4	16	24	107
26	1610612752	0021600759	FEB 04, 2017	NYK vs. CLE	L	22	30	0.423	240	39	…	0.500	13	29	42	23	9	7	10	20	104
27	1610612752	0021600733	FEB 01, 2017	NYK @ BKN	W	22	29	0.431	240	35	…	0.613	21	37	58	23	16	7	13	18	95
28	1610612752	0021600724	JAN 31, 2017	NYK @ WAS	L	21	29	0.420	240	34	…	0.800	22	29	51	18	8	2	12	17	101
29	1610612752	0021600711	JAN 29, 2017	NYK @ ATL	L	21	28	0.429	340	51	…	0.826	15	48	63	32	9	11	12	39	139
…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…
48	1610612752	0021600456	DEC 25, 2016	NYK vs. BOS	L	16	14	0.533	240	41	…	0.889	17	32	49	11	5	6	17	23	114
49	1610612752	0021600438	DEC 22, 2016	NYK vs. ORL	W	16	13	0.552	240	41	…	0.882	18	34	52	26	9	9	15	18	106
50	1610612752	0021600421	DEC 20, 2016	NYK vs. IND	W	15	13	0.536	240	44	…	0.810	4	41	45	24	6	8	14	18	118
51	1610612752	0021600404	DEC 17, 2016	NYK @ DEN	L	14	13	0.519	240	35	…	0.923	9	26	35	18	6	4	11	27	114
52	1610612752	0021600388	DEC 15, 2016	NYK @ GSW	L	14	12	0.538	240	38	…	0.474	14	35	49	19	10	5	11	10	90
53	1610612752	0021600372	DEC 13, 2016	NYK @ PHX	L	14	11	0.560	265	38	…	0.737	11	32	43	23	10	4	13	27	111
54	1610612752	0021600360	DEC 11, 2016	NYK @ LAL	W	14	10	0.583	240	41	…	0.839	8	36	44	21	9	11	10	15	118
55	1610612752	0021600345	DEC 09, 2016	NYK @ SAC	W	13	10	0.565	240	36	…	0.840	12	42	54	22	3	6	16	25	103
56	1610612752	0021600327	DEC 07, 2016	NYK vs. CLE	L	12	10	0.545	240	35	…	0.867	13	30	43	22	6	3	16	22	94
57	1610612752	0021600316	DEC 06, 2016	NYK @ MIA	W	12	9	0.571	240	48	…	0.688	18	35	53	22	6	6	10	18	114
58	1610612752	0021600302	DEC 04, 2016	NYK vs. SAC	W	11	9	0.550	240	39	…	0.708	14	44	58	20	5	10	18	25	106
59	1610612752	0021600285	DEC 02, 2016	NYK vs. MIN	W	10	9	0.526	240	41	…	0.800	11	32	43	26	9	6	15	18	118
60	1610612752	0021600271	NOV 30, 2016	NYK @ MIN	W	9	9	0.500	240	41	…	0.733	12	27	39	24	8	3	13	26	106
61	1610612752	0021600255	NOV 28, 2016	NYK vs. OKC	L	8	9	0.471	240	36	…	0.893	12	28	40	20	9	11	5	16	103
62	1610612752	0021600241	NOV 26, 2016	NYK @ CHA	L	8	8	0.500	240	37	…	0.800	11	36	47	26	9	6	8	27	102
63	1610612752	0021600228	NOV 25, 2016	NYK vs. CHA	W	8	7	0.533	265	45	…	0.923	13	42	55	26	9	8	16	21	113
64	1610612752	0021600208	NOV 22, 2016	NYK vs. POR	W	7	7	0.500	240	45	…	1.000	10	33	43	26	8	5	13	23	107
65	1610612752	0021600193	NOV 20, 2016	NYK vs. ATL	W	6	7	0.462	240	42	…	0.714	11	39	50	21	8	1	15	23	104
66	1610612752	0021600169	NOV 17, 2016	NYK @ WAS	L	5	7	0.417	240	41	…	0.900	10	26	36	23	9	1	13	20	112
67	1610612752	0021600162	NOV 16, 2016	NYK vs. DET	W	5	6	0.455	240	42	…	0.632	19	33	52	24	8	9	9	11	105
68	1610612752	0021600146	NOV 14, 2016	NYK vs. DAL	W	4	6	0.400	240	34	…	0.889	14	37	51	18	5	5	17	16	93
69	1610612752	0021600131	NOV 12, 2016	NYK @ TOR	L	3	6	0.333	240	44	…	0.750	17	32	49	19	3	2	16	23	107
70	1610612752	0021600125	NOV 11, 2016	NYK @ BOS	L	3	5	0.375	240	33	…	0.882	21	36	57	19	7	11	25	26	87
71	1610612752	0021600106	NOV 09, 2016	NYK vs. BKN	W	3	4	0.429	240	44	…	0.706	9	41	50	25	11	5	14	21	110
72	1610612752	0021600087	NOV 06, 2016	NYK vs. UTA	L	2	4	0.333	240	42	…	0.895	10	29	39	18	8	5	12	26	109
73	1610612752	0021600073	NOV 04, 2016	NYK @ CHI	W	2	3	0.400	240	46	…	0.762	11	29	40	32	7	2	5	23	117
74	1610612752	0021600058	NOV 02, 2016	NYK vs. HOU	L	1	3	0.250	240	37	…	0.680	7	27	34	18	10	6	16	22	99
75	1610612752	0021600050	NOV 01, 2016	NYK @ DET	L	1	2	0.333	240	35	…	0.800	8	35	43	18	6	9	11	20	89
76	1610612752	0021600028	OCT 29, 2016	NYK vs. MEM	W	1	1	0.500	240	40	…	0.641	6	35	41	24	4	4	12	25	111
77	1610612752	0021600001	OCT 25, 2016	NYK @ CLE	L	0	1	0.000	240	32	…	0.750	13	29	42	17	6	6	18	22	88

78 rows × 27 columns
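The MATCHUP and WL columns look like they answer the “who was the game against, and who won?” question. Here is a rough sketch of pulling those out, using three rows retyped from the log above. The column names HOME, OPPONENT and WON are made up for this example, and it assumes “vs.” marks a home game while “@” marks an away game.

```python
import pandas as pd

# Three rows retyped from the game log above.
log = pd.DataFrame({
    'GAME_DATE': ['APR 04, 2017', 'MAR 31, 2017', 'JAN 29, 2017'],
    'MATCHUP': ['NYK vs. CHI', 'NYK @ MIA', 'NYK @ ATL'],
    'WL': ['W', 'W', 'L'],
})

# Assumption: "vs." marks a home game and "@" an away game.
log['HOME'] = log['MATCHUP'].str.contains('vs.', regex=False)
# The opponent abbreviation is the last whitespace-separated token.
log['OPPONENT'] = log['MATCHUP'].str.split().str[-1]
log['WON'] = log['WL'] == 'W'

print(log[['GAME_DATE', 'OPPONENT', 'HOME', 'WON']])
```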

Looks like this can be manipulated to get rest days:

	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	FGM	…	OREB	DREB	REB	AST	STL	BLK	TOV	PF	PTS	DAYS_REST
0	1610612752	0021601160	2017-04-04	NYK vs. CHI	W	30	48	0.385	240	42	…	16	37	53	26	5	7	15	22	100	2 days
1	1610612752	0021601145	2017-04-02	NYK vs. BOS	L	29	48	0.377	240	33	…	8	24	32	20	12	2	11	20	94	2 days
2	1610612752	0021601133	2017-03-31	NYK @ MIA	W	29	47	0.382	240	38	…	8	31	39	25	9	5	14	18	98	2 days
3	1610612752	0021601115	2017-03-29	NYK vs. MIA	L	28	47	0.373	240	33	…	17	35	52	19	2	6	14	16	88	2 days
4	1610612752	0021601098	2017-03-27	NYK vs. DET	W	28	46	0.378	240	45	…	4	33	37	26	13	5	12	16	109	2 days

5 rows × 28 columns

df_game_log.dtypes




Team_ID                int64
Game_ID               object
GAME_DATE     datetime64[ns]
MATCHUP               object
WL                    object
W                      int64
L                      int64
W_PCT                float64
MIN                    int64
FGM                    int64
FGA                    int64
FG_PCT               float64
FG3M                   int64
FG3A                   int64
FG3_PCT              float64
FTM                    int64
FTA                    int64
FT_PCT               float64
OREB                   int64
DREB                   int64
REB                    int64
AST                    int64
STL                    int64
BLK                    int64
TOV                    int64
PF                     int64
PTS                    int64
DAYS_REST    timedelta64[ns]
dtype: object

Team_ID               int64
Game_ID              object
GAME_DATE    datetime64[ns]
MATCHUP              object
WL                   object
W                     int64
L                     int64
W_PCT               float64
MIN                   int64
FGM                   int64
FGA                   int64
FG_PCT              float64
FG3M                  int64
FG3A                  int64
FG3_PCT             float64
FTM                   int64
FTA                   int64
FT_PCT              float64
OREB                  int64
DREB                  int64
REB                   int64
AST                   int64
STL                   int64
BLK                   int64
TOV                   int64
PF                    int64
PTS                   int64
DAYS_REST           float64
dtype: object
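
The two listings differ only in `DAYS_REST`, which goes from `timedelta64[ns]` to `float64`; the conversion step itself is not shown here. A minimal sketch of one way to do it, assuming a toy frame with the same newest-first ordering as `df_game_log` (the frame and its values are illustrative only):

```python
import pandas as pd

# Toy game log, newest game first, mirroring the ordering of df_game_log.
df = pd.DataFrame({
    "GAME_DATE": pd.to_datetime(["2017-04-04", "2017-04-02", "2017-03-31"]),
})

# Rest days: this game's date minus the previous game's date (the next row down).
df["DAYS_REST"] = df["GAME_DATE"] - df["GAME_DATE"].shift(-1)

# Turn the timedelta into a plain number of days. The oldest game has no
# earlier game to compare against, so it becomes NaN and the column is float.
df["DAYS_REST"] = df["DAYS_REST"].dt.days

print(df)
```

The `.dt.days` accessor is used here instead of `astype('timedelta64[D]')` because the latter behaves differently across pandas versions.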

This looks like we’ll get all the info for all games. We’ll start by appending the information for a single game and then try to do it for all dates:

	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	FGM	…	OREB	DREB	REB	AST	STL	BLK	TOV	PF	PTS	DAYS_REST
0	1610612752	0021601160	2017-04-04	NYK vs. CHI	W	30	48	0.385	240	42	…	16	37	53	26	5	7	15	22	100	2.0
1	1610612752	0021601145	2017-04-02	NYK vs. BOS	L	29	48	0.377	240	33	…	8	24	32	20	12	2	11	20	94	2.0
2	1610612752	0021601133	2017-03-31	NYK @ MIA	W	29	47	0.382	240	38	…	8	31	39	25	9	5	14	18	98	2.0
3	1610612752	0021601115	2017-03-29	NYK vs. MIA	L	28	47	0.373	240	33	…	17	35	52	19	2	6	14	16	88	2.0
4	1610612752	0021601098	2017-03-27	NYK vs. DET	W	28	46	0.378	240	45	…	4	33	37	26	13	5	12	16	109	2.0

5 rows × 28 columns

#Get the dates from the game logs and pass them into the other functions:

dates = df_game_log['GAME_DATE']

print(len(dates))


78

game_info

	TEAM_ID	TEAM_NAME	PASS_TYPE	G	PASS_FROM	PASS_TEAMMATE_PLAYER_ID	FREQUENCY	PASS	AST	FGM	FGA	FG_PCT	FG2M	FG2A	FG2_PCT	FG3M	FG3A	FG3_PCT	GAME_DATE
0	1610612752	New York Knicks	made	1	Baker, Ron	1627758	0.238	67.0	6.0	7.0	18.0	0.389	5.0	14.0	0.357	2.0	4.0	0.5	2017-03-31
1	1610612752	New York Knicks	made	1	Vujacic, Sasha	2756	0.149	42.0	7.0	8.0	15.0	0.533	5.0	9.0	0.556	3.0	6.0	0.5	2017-03-31
2	1610612752	New York Knicks	made	1	Hernangomez, Willy	1626195	0.131	37.0	2.0	3.0	5.0	0.600	3.0	5.0	0.600	0.0	0.0	NaN	2017-03-31
3	1610612752	New York Knicks	made	1	Porzingis, Kristaps	204001	0.113	32.0	3.0	4.0	9.0	0.444	4.0	7.0	0.571	0.0	2.0	0.0	2017-03-31
4	1610612752	New York Knicks	made	1	Lee, Courtney	201584	0.106	30.0	1.0	4.0	6.0	0.667	4.0	5.0	0.800	0.0	1.0	0.0	2017-03-31
5	1610612752	New York Knicks	made	1	O’Quinn, Kyle	203124	0.096	27.0	3.0	3.0	5.0	0.600	3.0	5.0	0.600	0.0	0.0	NaN	2017-03-31
6	1610612752	New York Knicks	made	1	Holiday, Justin	203200	0.082	23.0	2.0	4.0	5.0	0.800	3.0	3.0	1.000	1.0	2.0	0.5	2017-03-31
7	1610612752	New York Knicks	made	1	Randle, Chasson	1626184	0.057	16.0	0.0	0.0	2.0	0.000	0.0	2.0	0.000	0.0	0.0	NaN	2017-03-31
8	1610612752	New York Knicks	made	1	Ndour, Maurice	1626254	0.028	8.0	1.0	1.0	2.0	0.500	1.0	2.0	0.500	0.0	0.0	NaN	2017-03-31

df_sum.reset_index(level =  0,  inplace =  True)
df_sum

	GAME_DATE	TEAM_ID	G	PASS_TEAMMATE_PLAYER_ID	FREQUENCY	PASS	AST	FGM	FGA	FG_PCT	FG2M	FG2A	FG2_PCT	FG3M	FG3A	FG3_PCT
0	2017-03-31	14495514768	9	7321056	1.0	282.0	25.0	34.0	67.0	4.533	28.0	52.0	4.984	6.0	15.0	1.5

When we merge this row back up to the bigger dataframe, we can drop the columns we don’t need.
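
Summing every column also adds up things that are not additive: the `TEAM_ID` in the row above is just nine copies of 1610612752 added together, and the summed `FG_PCT` values are not real percentages. As a hedged sketch (the three-player frame and the `additive` list are illustrative, not the author's code), one could sum only the counting stats and recompute the percentage afterwards:

```python
import pandas as pd

# Three of the passers from the table above, trimmed to a few columns.
game_info = pd.DataFrame({
    "TEAM_ID": [1610612752] * 3,
    "PASS_FROM": ["Baker, Ron", "Vujacic, Sasha", "Hernangomez, Willy"],
    "PASS": [67.0, 42.0, 37.0],
    "FGM": [7.0, 8.0, 3.0],
    "FGA": [18.0, 15.0, 5.0],
    "FG_PCT": [0.389, 0.533, 0.600],
    "GAME_DATE": pd.to_datetime(["2017-03-31"] * 3),
})

# Only counting stats are safe to add up across players.
additive = ["PASS", "FGM", "FGA"]
team_totals = game_info.groupby("GAME_DATE", as_index=False)[additive].sum()

# Percentages are recomputed from the totals rather than summed.
team_totals["FG_PCT"] = team_totals["FGM"] / team_totals["FGA"]

print(team_totals)
```

Selecting columns by name before the sum keeps ID and text columns out of the result entirely, so nothing has to be dropped later.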

shot_info

	TEAM_ID	TEAM_NAME	SORT_ORDER	G	CLOSE_DEF_DIST_RANGE	FGA_FREQUENCY	FGM	FGA	FG_PCT	EFG_PCT	FG2A_FREQUENCY	FG2M	FG2A	FG2_PCT	FG3A_FREQUENCY	FG3M	FG3A	FG3_PCT
0	1610612752	New York Knicks	1	1	0-2 Feet – Very Tight	0.200	8.0	16.0	0.500	0.500	0.200	8.0	16.0	0.500	0.000	0.0	0.0	NaN
1	1610612752	New York Knicks	2	1	2-4 Feet – Tight	0.450	15.0	36.0	0.417	0.417	0.425	15.0	34.0	0.441	0.025	0.0	2.0	0.0
2	1610612752	New York Knicks	3	1	4-6 Feet – Open	0.263	11.0	21.0	0.524	0.619	0.138	7.0	11.0	0.636	0.125	4.0	10.0	0.4
3	1610612752	New York Knicks	4	1	6+ Feet – Wide Open	0.088	4.0	7.0	0.571	0.714	0.038	2.0	3.0	0.667	0.050	2.0	4.0	0.5

df_sum

	GAME_DATE	TEAM_ID	G	PASS_TEAMMATE_PLAYER_ID	FREQUENCY	PASS	AST	FGM	FGA	FG_PCT	FG2M	FG2A	FG2_PCT	FG3M	FG3A	FG3_PCT	OPEN_SHOTS	OPEN_EFG	COVERED_EFG
0	2017-03-31	14495514768	9	7321056	1.0	282.0	25.0	34.0	67.0	4.533	28.0	52.0	4.984	6.0	15.0	1.5	28.0	0.642857	0.442308
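
The three new columns can be reproduced from the `shot_info` table above. The sketch below retypes the four defender-distance rows by hand and applies the effective field goal formula used later in `custom_boxscore`, (FGM + 0.5 × FG3M) / FGA; the helper name `efg` is made up for this illustration.

```python
import pandas as pd

# The four closest-defender buckets from shot_info (hyphens typed as ASCII).
shot_info = pd.DataFrame({
    "CLOSE_DEF_DIST_RANGE": [
        "0-2 Feet - Very Tight",
        "2-4 Feet - Tight",
        "4-6 Feet - Open",
        "6+ Feet - Wide Open",
    ],
    "FGM": [8.0, 15.0, 11.0, 4.0],
    "FGA": [16.0, 36.0, 21.0, 7.0],
    "FG3M": [0.0, 0.0, 4.0, 2.0],
})


def efg(rows):
    """Effective field goal percentage over a group of rows."""
    return (rows["FGM"].sum() + 0.5 * rows["FG3M"].sum()) / rows["FGA"].sum()


# "Open" and "Wide Open" both contain the word "Open".
is_open = shot_info["CLOSE_DEF_DIST_RANGE"].str.contains("Open")

open_shots = shot_info.loc[is_open, "FGA"].sum()
open_efg = efg(shot_info[is_open])
covered_efg = efg(shot_info[~is_open])

print(open_shots, round(open_efg, 6), round(covered_efg, 6))
# 28.0 0.642857 0.442308
```

These match the `OPEN_SHOTS`, `OPEN_EFG`, and `COVERED_EFG` values in the row above.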

Now we append the columns we need back onto the game log. This works like a SQL left join: every game-log row is kept, and the custom stats are filled in wherever the dates match.
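
A minimal sketch of that join, assuming two small stand-in frames (`game_log` and `custom` are illustrative names, with values copied from the tables in this section):

```python
import pandas as pd

# Left side: one row per game.
game_log = pd.DataFrame({
    "GAME_DATE": pd.to_datetime(["2017-04-04", "2017-04-02", "2017-03-31"]),
    "PTS": [100, 94, 98],
})

# Right side: custom stats for the single date fetched so far.
custom = pd.DataFrame({
    "GAME_DATE": pd.to_datetime(["2017-03-31"]),
    "PASS": [282.0],
    "OPEN_SHOTS": [28.0],
})

# how="left" keeps all three games; dates missing on the right become NaN.
merged = pd.merge(game_log, custom, how="left", on="GAME_DATE")

print(merged)
```

Only the 2017-03-31 row picks up values; the other two games show `NaN` until their dates are fetched as well.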

df_custom_boxscore.head(10)

	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	FGM	…	BLK	TOV	PF	PTS	DAYS_REST	PASS	FG2M	OPEN_SHOTS	OPEN_EFG	COVERED_EFG
0	1610612752	0021601160	2017-04-04	NYK vs. CHI	W	30	48	0.385	240	42	…	7	15	22	100	2.0	NaN	NaN	NaN	NaN	NaN
1	1610612752	0021601145	2017-04-02	NYK vs. BOS	L	29	48	0.377	240	33	…	2	11	20	94	2.0	NaN	NaN	NaN	NaN	NaN
2	1610612752	0021601133	2017-03-31	NYK @ MIA	W	29	47	0.382	240	38	…	5	14	18	98	2.0	282.0	28.0	28.0	0.642857	0.442308
3	1610612752	0021601115	2017-03-29	NYK vs. MIA	L	28	47	0.373	240	33	…	6	14	16	88	2.0	NaN	NaN	NaN	NaN	NaN
4	1610612752	0021601098	2017-03-27	NYK vs. DET	W	28	46	0.378	240	45	…	5	12	16	109	2.0	NaN	NaN	NaN	NaN	NaN
5	1610612752	0021601085	2017-03-25	NYK @ SAS	L	27	46	0.370	240	41	…	5	16	16	98	2.0	NaN	NaN	NaN	NaN	NaN
6	1610612752	0021601071	2017-03-23	NYK @ POR	L	27	45	0.375	240	36	…	9	11	20	95	1.0	NaN	NaN	NaN	NaN	NaN
7	1610612752	0021601066	2017-03-22	NYK @ UTA	L	27	44	0.380	240	38	…	1	11	26	101	2.0	NaN	NaN	NaN	NaN	NaN
8	1610612752	0021601050	2017-03-20	NYK @ LAC	L	27	43	0.386	240	40	…	1	12	19	105	4.0	NaN	NaN	NaN	NaN	NaN
9	1610612752	0021601016	2017-03-16	NYK vs. BKN	L	27	42	0.391	240	41	…	4	7	26	110	2.0	NaN	NaN	NaN	NaN	NaN

10 rows × 33 columns

Looks like everything joined correctly for exactly the date we chose. Let’s make some modifications and then work on a script to join the rest of the dates.

We should be good to go!

Put all the steps above into a function:

def custom_boxscore(roster_id):
    """Build a per-game box score for one team, with passing and shot-quality stats added."""

    game_logs = team.TeamGameLogs(roster_id)

    df_game_logs = game_logs.info()
    df_game_logs['GAME_DATE'] = pd.to_datetime(df_game_logs['GAME_DATE'])
    df_game_logs['DAYS_REST'] = df_game_logs['GAME_DATE'] - df_game_logs['GAME_DATE'].shift(-1)
    ## .dt.days gives the rest days as numbers (NaN for the first game of the season)
    df_game_logs['DAYS_REST'] = df_game_logs['DAYS_REST'].dt.days

    ## Just like before, that should get us the game logs we need and the rest days column

    ## Now to loop through the list of dates for our other stats

    ## This builds up one row of custom stats per game, which we then join to the game logs
    frames = []

    dates = df_game_logs['GAME_DATE']

    for date in dates:

        game_info = team.TeamPassTracking(roster_id, date_from=date, date_to=date).passes_made()
        game_info['GAME_DATE'] = date  ## We need to append the date to this so we can join back

        ## numeric_only=True keeps text columns such as TEAM_NAME out of the sum
        temp_df = game_info.groupby(['GAME_DATE']).sum(numeric_only=True)
        temp_df.reset_index(level=0, inplace=True)

        ## Now to get the shot info. For the most part, we're just reusing code we've already written
        open_info = team.TeamShotTracking(roster_id, date_from=date, date_to=date).closest_defender_shooting()
        open_info['OPEN'] = open_info['CLOSE_DEF_DIST_RANGE'].map(lambda x: 'Open' in x)
        is_open = open_info['OPEN']

        temp_df['OPEN_SHOTS'] = open_info.loc[is_open, 'FGA'].sum()
        temp_df['OPEN_EFG'] = (open_info.loc[is_open, 'FGM'].sum() + (.5 * open_info.loc[is_open, 'FG3M'].sum())) / (open_info.loc[is_open, 'FGA'].sum())
        temp_df['COVERED_EFG'] = (open_info.loc[~is_open, 'FGM'].sum() + (.5 * open_info.loc[~is_open, 'FG3M'].sum())) / (open_info.loc[~is_open, 'FGA'].sum())

        ## Collect this game's row
        frames.append(temp_df)

    ## DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    df_all = pd.concat(frames, ignore_index=True)

    ## Left join on the date so every game-log row is kept
    df_boxscore = pd.merge(df_game_logs, df_all[['GAME_DATE', 'PASS', 'FG2M', 'FG2_PCT', 'OPEN_SHOTS', 'OPEN_EFG', 'COVERED_EFG']], how='left', on='GAME_DATE')
    df_boxscore['PASS_AST'] = df_boxscore['PASS'] / df_boxscore['AST']
    df_boxscore['RESULT'] = df_boxscore['WL'].map(lambda x: 1 if 'W' in x else 0)

    return df_boxscore

Let’s see if this worked:

df_knicks_box_scores.head(10)

	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	FGM	…	PTS	DAYS_REST	PASS	FG2M	FG2_PCT	OPEN_SHOTS	OPEN_EFG	COVERED_EFG	PASS/ASSIST	RESULT
0	1610612752	0021600845	2017-02-15	NYK @ OKC	L	23	34	0.404	240	41	…	105	3.0	339.0	26.0	4.070	38.0	0.500000	0.572917	17.842105	0
1	1610612752	0021600817	2017-02-12	NYK vs. SAS	W	23	33	0.411	240	34	…	94	2.0	261.0	22.0	4.882	28.0	0.750000	0.437500	14.500000	1
2	1610612752	0021600800	2017-02-10	NYK vs. DEN	L	22	33	0.400	240	52	…	123	2.0	313.0	31.0	5.733	46.0	0.652174	0.638298	8.694444	0
3	1610612752	0021600791	2017-02-08	NYK vs. LAC	L	22	32	0.407	240	46	…	115	2.0	336.0	36.0	4.981	47.0	0.542553	0.544444	13.440000	0
4	1610612752	0021600768	2017-02-06	NYK vs. LAL	L	22	31	0.415	240	37	…	107	2.0	316.0	30.0	4.501	35.0	0.571429	0.445652	19.750000	0
5	1610612752	0021600759	2017-02-04	NYK vs. CLE	L	22	30	0.423	240	39	…	104	3.0	308.0	21.0	3.457	46.0	0.510870	0.486842	13.391304	0
6	1610612752	0021600733	2017-02-01	NYK @ BKN	W	22	29	0.431	240	35	…	95	1.0	305.0	23.0	4.150	37.0	0.297297	0.435484	13.260870	1
7	1610612752	0021600724	2017-01-31	NYK @ WAS	L	21	29	0.420	240	34	…	101	2.0	293.0	24.0	5.245	41.0	0.317073	0.460784	16.277778	0
8	1610612752	0021600711	2017-01-29	NYK @ ATL	L	21	28	0.429	340	51	…	139	2.0	479.0	32.0	4.137	64.0	0.437500	0.500000	14.968750	0
9	1610612752	0021600699	2017-01-27	NYK vs. CHA	W	21	27	0.438	240	46	…	110	2.0	350.0	31.0	6.167	37.0	0.594595	0.483051	15.909091	1

10 rows × 36 columns

I’m going to throw in a safeguard against divide by 0 errors just in case. This is a really janky, ugly fix, but it’ll get the job done for the time being:
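
The safeguard itself does not appear here. One possible guard, offered only as a sketch on a made-up three-game frame, is to treat a zero denominator as missing before dividing. With pandas, dividing by zero usually yields `inf` rather than raising an error, so the goal is mostly to keep `inf` out of later averages and plots.

```python
import numpy as np
import pandas as pd

# Hypothetical games; the middle one has zero assists.
df = pd.DataFrame({
    "PASS": [282.0, 300.0, 250.0],
    "AST": [25, 0, 20],
})

# Replace 0 with NaN in the denominator so the ratio is NaN instead of inf.
df["PASS_AST"] = df["PASS"] / df["AST"].replace(0, np.nan)

print(df)
```

`NaN` values are skipped by `mean()` and by most plotting libraries, which is usually what is wanted for a game with no valid ratio.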

df_knicks_box_scores = custom_boxscore(knicks_id)

Awesome! Looks like everything came out okay. With a team_id, we can do this with every team. We just need to get a list of team_ids and team names.

From the documentation:

http://nba-py.readthedocs.io/en/0.1a2/nba_py/

df_teams.head()

	TEAM	TEAM_ID
0	Boston	1610612738
1	Cleveland	1610612739
2	Toronto	1610612761
3	Washington	1610612764
4	Milwaukee	1610612749

Now we can pass in the team IDs to create custom boxscores for all teams.

teams = df_teams['TEAM']
roster_ids = df_teams['TEAM_ID']

Just feed these two arrays into the function and we should be good to go.

I went ahead and did this for a few teams. The NBA’s website cuts you off if you make too many requests too quickly (hence all the sleep statements above).
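
The loop with the sleep statements is not reproduced here. A minimal sketch of the idea, assuming a hypothetical helper `build_all` and a stand-in `fetch` function in place of `custom_boxscore` so it runs without network access:

```python
import time


def build_all(roster_ids, fetch, pause_seconds=1.0):
    """Call fetch(roster_id) for each team, pausing between calls."""
    results = {}
    for i, roster_id in enumerate(roster_ids):
        if i > 0:
            # Back off before the next request so the site is not hammered.
            time.sleep(pause_seconds)
        results[roster_id] = fetch(roster_id)
    return results


# Stand-in fetch function; in the tutorial this would be custom_boxscore.
demo = build_all(
    [1610612738, 1610612739],
    fetch=lambda rid: f"box score for {rid}",
    pause_seconds=0.1,
)
print(demo)
```

The right pause length is a guess; since `custom_boxscore` itself makes two requests per game, a pause inside its date loop may also be needed.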

After fiddling with it for a while, I was finally able to get the data for each team. You might have to run the code above piece by piece, or just use the CSVs here:

Visualization

Let’s see if we can visually represent anything about each team’s offense:

These visualizations are going to be done in Plotly because I think it’s the best visualization library out there for quickly and easily making graphs that are both visually appealing and interactive, but feel free to use something else (although I can’t imagine why you would).

This is going to break my heart a bit…but let’s compare some of the stats fetched between the Knicks and teams that aren’t the Knicks.

Something about not counting another man’s money, right?

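# On plotly 4 and later, plotly.plotly moved to the separate chart-studio package (import chart_studio.plotly as py)
import plotly.plotly as py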
import plotly.graph_objs as go

trace0 = go.Box(
    y=knicks['PASS'],
    name='Knicks',
    boxmean='sd'
)
trace1 = go.Box(
    y=spurs['PASS'],
    name='Spurs',
    boxmean='sd'
)
trace2 = go.Box(
    y=warriors['PASS'],
    name='Warriors',
    boxmean='sd'
)
trace3 = go.Box(
    y=thunder['PASS'],
    name='Thunder',
    boxmean='sd'
)
trace4 = go.Box(
    y=celtics['PASS'],
    name='Celtics',
    boxmean='sd'
)

layout = go.Layout(
    title='Passing Box Plot',
)
data = [trace0, trace1, trace2, trace3, trace4]

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

Ignoring the outlier from the 4OT Knicks-Hawks game, this graph is pretty telling. Obviously this isn’t the full story, but it looks like the Spurs and Thunder play pretty consistent but different offenses. What’s really interesting is that despite the Spurs and Warriors having coaches and systems that emphasize ball movement, they throw FEWER passes than a team like the Knicks.

Let’s look at whether those passes translate to assists:

These two graphs in conjunction are pretty telling. On average, it takes the Knicks almost 2 more passes than the Spurs, and 5 more than the Warriors to get an assist.

Going by this graph, roughly every 10th pass the Warriors make results in an assist. From the previous graph, we see that they make an average of 313 passes a game. This almost lines up with their season average of roughly 31 assists/game.

The standard deviation of the above graph can be interpreted in a few ways. On one hand, it’s a loose metric of playstyle consistency; teams that play the same way through the entire game are probably going to have a lower standard deviation than teams who pass the ball for 3 quarters and forget to in the 4th (cough cough, New York, cough cough). On the other hand, teams might have different playstyles depending on the lineups they have on the floor, resulting in a higher standard deviation (Spurs).

The Thunder probably fall into this, most likely due to Russell Westbrook averaging over 10 of the team’s total 20 assists per game.

Obviously, there’s a lot more to the story. How many passes led to FTs? Is there any correlation between passes per assist and wins? If anything, stats like these tell you more about what kind of offense a team runs, not how effectively they run it.

Now let’s see if there’s any noticeable difference in wins vs losses:

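# On plotly 4 and later, plotly.plotly moved to the separate chart-studio package (import chart_studio.plotly as py)
import plotly.plotly as py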
import plotly.graph_objs as go

trace0 = go.Box(
    y=knicks.loc[knicks['WL'] == 'W']['PASS_AST'],
    name='Knicks Wins',
    boxmean='sd'
)
trace1 = go.Box(
    y=knicks.loc[knicks['WL'] == 'L']['PASS_AST'],
    name='Knicks Losses',
    boxmean='sd'
)
trace2 = go.Box(
    y=spurs.loc[spurs['WL'] == 'W']['PASS_AST'],
    name='Spurs Wins',
    boxmean='sd'
)
trace3 = go.Box(
    y=spurs.loc[spurs['WL'] == 'L']['PASS_AST'],
    name='Spurs Losses',
    boxmean='sd'
)
trace4 = go.Box(
    y=warriors.loc[warriors['WL'] == 'W']['PASS_AST'],
    name='Warriors Wins',
    boxmean='sd'
)
trace5 = go.Box(
    y=warriors.loc[warriors['WL'] == 'L']['PASS_AST'],
    name='Warriors Losses',
    boxmean='sd'
)
trace6 = go.Box(
    y=thunder.loc[thunder['WL'] == 'W']['PASS_AST'],
    name='Thunder Wins',
    boxmean='sd'
)
trace7 = go.Box(
    y=thunder.loc[thunder['WL'] == 'L']['PASS_AST'],
    name='Thunder Losses',
    boxmean='sd'
)
trace8 = go.Box(
    y=celtics.loc[celtics['WL'] == 'W']['PASS_AST'],
    name='Celtics Wins',
    boxmean='sd'
)
trace9 = go.Box(
    y=celtics.loc[celtics['WL'] == 'L']['PASS_AST'],
    name='Celtics Losses',
    boxmean='sd'
)
layout = go.Layout(
    title='Passes per Assist in Wins vs Losses',
)
data = [trace0, trace1, trace2, trace3, trace4, trace5, trace6, trace7, trace8, trace9]
fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

Apart from the Celtics, every team had to make at least 1 more pass to get an assist in games they lost compared to games they won. In one way, it’s almost like they have to “work harder” for assists.
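
The same win/loss comparison can be checked numerically alongside the box plot. The sketch below uses only the ten Knicks games printed in the box score table earlier, so it illustrates the method rather than reproducing the full-season result; `knicks_sample` is a name made up for this example.

```python
import pandas as pd

# WL and passes-per-assist for the ten Knicks games shown earlier.
knicks_sample = pd.DataFrame({
    "WL": ["L", "W", "L", "L", "L", "L", "W", "L", "L", "W"],
    "PASS_AST": [
        17.842105, 14.5, 8.694444, 13.44, 19.75,
        13.391304, 13.26087, 16.277778, 14.96875, 15.909091,
    ],
})

# Count, mean, and standard deviation of passes per assist, split by result.
summary = knicks_sample.groupby("WL")["PASS_AST"].agg(["count", "mean", "std"])

print(summary)
```

Running the same `groupby` on each team's full frame gives the means and standard deviations that `boxmean='sd'` draws on the plot.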

From the looks of this graph, the Warriors offense, when it’s firing on all cylinders, is in a league of its own.

Just to reiterate, the purpose of the visualizations above is to ask questions, not answer them.

But now, let’s see if we can get any team-specific insights from any of this:

The Clippers have played without Chris Paul and Blake Griffin, two of the best passers at their positions in the league.

Do the boxscores show how their offense has had to adjust?

/home/virajparekh/anaconda2/lib/python2.7/site-packages/ipykernel/__main__.py:2: FutureWarning:

sort(columns=....) is deprecated, use sort_values(by=.....)

	Unnamed: 0	Team_ID	Game_ID	GAME_DATE	MATCHUP	WL	W	L	W_PCT	MIN	…	DAYS_REST	PASS	FG2M	FG2_PCT	OPEN_SHOTS	COVERED_SHOTS	OPEN_EFG	COVERED_EFG	PASS_AST	RESULT
77	53.0	1610612746	21600017	2016-10-27	LAC @ POR	W	1	0	1.00	240	…	NaN	301.0	21.0	4.047	41.0	50.0	0.463415	0.440000	25.083333	1.0
76	52.0	1610612746	21600035	2016-10-30	LAC vs. UTA	W	2	0	1.00	240	…	3.0	275.0	22.0	4.757	34.0	48.0	0.558824	0.375000	16.176471	1.0
75	51.0	1610612746	21600045	2016-10-31	LAC vs. PHX	W	3	0	1.00	240	…	1.0	276.0	28.0	4.795	38.0	42.0	0.565789	0.511905	13.142857	1.0
74	50.0	1610612746	21600064	2016-11-02	LAC vs. OKC	L	3	1	0.75	240	…	2.0	302.0	24.0	2.893	33.0	54.0	0.424242	0.435185	13.727273	0.0
73	49.0	1610612746	21600074	2016-11-04	LAC @ MEM	W	4	1	0.80	240	…	2.0	300.0	17.0	2.681	41.0	44.0	0.451220	0.397727	15.789474	1.0

5 rows × 38 columns

clippers_rolling.head(10)

	GAME_DATE	PTS	TOV	PASS	OPEN_SHOTS	OPEN_EFG	AST	PASS_AST
77	2016-10-27	NaN	NaN	NaN	NaN	NaN	NaN	NaN
76	2016-10-30	NaN	NaN	NaN	NaN	NaN	NaN	NaN
75	2016-10-31	NaN	NaN	NaN	NaN	NaN	NaN	NaN
74	2016-11-02	NaN	NaN	NaN	NaN	NaN	NaN	NaN
73	2016-11-04	100.0	12.8	290.8	37.4	0.492698	18.2	16.783881
72	2016-11-05	100.4	13.0	293.6	37.4	0.514649	20.6	14.392215
71	2016-11-07	105.6	12.4	302.4	39.2	0.544745	22.8	13.435492
70	2016-11-09	104.6	10.6	303.0	38.8	0.537143	23.4	13.131921
69	2016-11-11	110.0	9.2	308.8	41.2	0.563405	22.6	14.064244
68	2016-11-12	114.0	9.8	306.6	40.8	0.591110	23.8	13.218349

If we want to see all of this on the same graph, we need to normalize it. This means we’re going to scale each value by subtracting the column mean from it and dividing by the column’s standard deviation.

This is a bit of a janky way to do so because it relies on the columns being in the same order.
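One way to avoid depending on column order is to pick the numeric columns by name and standardize each one separately. The sketch below is not the original notebook code: the `zscore_columns` helper and the toy values are assumptions made up for illustration.

```python
import pandas as pd

def zscore_columns(df, columns):
    """Return a copy of df with each named column rescaled to mean 0, std 1."""
    out = df.copy()
    for col in columns:
        # Subtract the column mean, then divide by the column standard deviation.
        # pandas skips NaN values (such as the first rows of a rolling window).
        out[col] = (df[col] - df[col].mean()) / df[col].std()
    return out

# Toy rolling averages; the NaN row mimics the start of a rolling window.
toy = pd.DataFrame({
    "GAME_DATE": ["2016-11-04", "2016-11-05", "2016-11-07", "2016-11-09"],
    "PTS": [float("nan"), 100.0, 104.0, 108.0],
    "TOV": [float("nan"), 12.0, 13.0, 14.0],
})

normalized = zscore_columns(toy, ["PTS", "TOV"])
print(normalized)
```

Because columns are selected by name, GAME_DATE is left untouched and the result does not change if the column order does.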

clippers_rolling.head(10)

	GAME_DATE	PTS	TOV	PASS	OPEN_SHOTS	OPEN_EFG	AST	PASS_AST
77	2016-10-27	NaN	NaN	NaN	NaN	NaN	NaN	NaN
76	2016-10-30	NaN	NaN	NaN	NaN	NaN	NaN	NaN
75	2016-10-31	NaN	NaN	NaN	NaN	NaN	NaN	NaN
74	2016-11-02	NaN	NaN	NaN	NaN	NaN	NaN	NaN
73	2016-11-04	-0.705388	0.033424	-0.403622	-0.504550	-0.714018	-0.878286	0.739290
72	2016-11-05	-0.672092	0.088895	-0.284434	-0.504550	-0.480817	-0.376102	0.083865
71	2016-11-07	-0.239256	-0.077516	0.090155	-0.202852	-0.161093	0.084234	-0.178321
70	2016-11-09	-0.322493	-0.576750	0.115695	-0.269896	-0.241857	0.209780	-0.261513
69	2016-11-11	0.126991	-0.965042	0.362583	0.132369	0.037146	0.042385	-0.006014
68	2016-11-12	0.459943	-0.798631	0.268936	0.065325	0.331470	0.293477	-0.237828

According to this, Blake’s absence definitely had an effect on how the team runs its offense. Passes per assist went up just as total points went down after Blake’s injury.

From the looks of it, the Clippers were starting to adjust to Blake being out just as CP3 got hurt, but for the most part, there’s too much noise here.

Blake gets injured almost every year, so I wonder how this graph looks for 2015-2016 data…

This is just the tip of the iceberg. Exploratory analysis like this is about finding interesting questions, not answering them.

Building the Pipe

Now that we can see what we can do with the data, let’s get everything piping into a database.

If you see yourself doing some of your own analysis or just want some more experience moving and cleaning data, this section is a high level overview of how to do that:

I stored the output from the previous function in CSVs in the same directory as this notebook so they can be easily imported.

Pandas has a built-in to_sql function that works with sqlalchemy:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html

import glob
import os
import time

# Move into the directory that holds the team CSVs.
os.chdir("YOUR_DIRECTORY_HERE")

# Collect the name of every CSV file in that directory.
files = glob.glob("*.csv")

Python’s sqlalchemy package supports most databases, so refer to its documentation for each database’s connection string:

http://www.sqlalchemy.org/
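As a rough sketch of how the engine and to_sql fit together, the example below writes a small frame to a database and reads it back. It assumes an in-memory SQLite database so that it runs without a server; the table name `toy_team` and the PTS values are made up for illustration. The `schema` argument used later in this post is left out because it depends on the target database.

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite keeps the sketch self-contained. For another database,
# swap in that database's connection string from the SQLAlchemy docs.
eng = create_engine("sqlite:///:memory:")

# Toy stand-in for one team's box scores (PTS values are made up).
boxscores = pd.DataFrame({
    "Game_ID": [21600017, 21600035],
    "PTS": [114, 88],
})

# Write the frame to a table, replacing the table if it already exists.
boxscores.to_sql(name="toy_team", con=eng, index=False, if_exists="replace")

# Read it back to confirm the round trip.
round_trip = pd.read_sql("SELECT * FROM toy_team ORDER BY Game_ID", eng)
print(round_trip)
```

Using `if_exists="replace"` mirrors the choice made in this post: the whole table is rewritten on each upload instead of being updated row by row.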

What would really make analysis easier is if the information updated after every game. Let’s use the info we already have to update one of the CSVs, and then apply the same function to all the other files:

def update_info(file):

    file_name =  str(file)
    name  =  file_name.partition('.csv')[0].replace(' ','_').replace('.','').lower()
    old_info = pd.read_csv(file_name)

    team_id = old_info['Team_ID'].max()
    new_logs = team.TeamGameLogs(team_id).info()

    old_info['GAME_DATE'] =  pd.to_datetime(old_info['GAME_DATE'])
    order = old_info.columns
    new_logs['GAME_DATE'] = pd.to_datetime(new_logs['GAME_DATE'])

    ## If there's no new  games for the team, return:
    if max(new_logs['GAME_DATE']) ==  max(old_info['GAME_DATE']):
        return

    new_logs['DAYS_REST']= new_logs['GAME_DATE'] - new_logs['GAME_DATE'].shift(-1) ##this gives us our days rest column
    new_logs['DAYS_REST']= new_logs['DAYS_REST'].dt.days ##whole days as a number; newer pandas versions reject astype('timedelta64[D]')

    ##keeping datatypes consistent
    new_logs['Game_ID'] = new_logs['Game_ID'].astype(str).astype(int)

    ##Combine the info from the previously saved CSV with the new game logs
    info =  pd.concat([old_info, new_logs], ignore_index = True)

    ##Drop the duplicates
    info = info.drop_duplicates(['Game_ID'], keep = 'first')

    ## Sort by date
    info = info.sort_values(['GAME_DATE'], ascending = False)

    ##Reset the axis
    info =  info.reset_index(drop = True)

    ##Find the dates where there's  no values for any of the stats we fetched. We can make requests for only those dates
    updates = info.loc[np.isnan(info['COVERED_EFG'])]

    ##If the team's boxscore is up to date, return.
    if len(updates) == 0:
        return
    dates  =  updates['GAME_DATE']

    df_passes =  pd.DataFrame()
    for d in dates:
        ##All exactly the same as before

        game_info = team.TeamPassTracking(team_id, date_from =d, date_to = d).passes_made()
        game_info['EVENT_DATE'] = d

        df_sum = game_info.groupby(['EVENT_DATE']).sum()
        df_sum.reset_index(level = 0, inplace =  True)

        open_info = team.TeamShotTracking(team_id, date_from =d, date_to = d).closest_defender_shooting()

        open_info['OPEN'] = open_info['CLOSE_DEF_DIST_RANGE'].map(lambda  x: True if 'Open' in x else False)
        df_sum['OPEN_SHOTS'] = open_info.loc[open_info['OPEN']== True, 'FGA'].sum()
        df_sum['COVERED_SHOTS'] = open_info.loc[open_info['OPEN']== False, 'FGA'].sum()

        if (open_info.loc[open_info['OPEN']== True, 'FGA'].sum() > 0):
            df_sum['OPEN_EFG']= (open_info.loc[open_info['OPEN']== True, 'FGM'].sum() + (.5 * open_info.loc[open_info['OPEN']== True, 'FG3M'].sum()))/(open_info.loc[open_info['OPEN']== True, 'FGA'].sum())
        else:
            df_sum['OPEN_EFG'] = 0

        if (open_info.loc[open_info['OPEN']== False, 'FGA'].sum() > 0):
            df_sum['COVERED_EFG']= (open_info.loc[open_info['OPEN']== False, 'FGM'].sum() + (.5 * open_info.loc[open_info['OPEN']== False, 'FG3M'].sum()))/(open_info.loc[open_info['OPEN']== False, 'FGA'].sum())
        else:
            df_sum['COVERED_EFG']=0

        df_passes = pd.concat([df_passes, df_sum])

    df_passes = df_passes.reset_index(drop = True)

    ##Join the new stats with the old information.
    info.update(df_passes[['PASS', 'FG2M', 'FG2_PCT', 'OPEN_SHOTS', 'OPEN_EFG', 'COVERED_EFG','COVERED_SHOTS']])

    ##Calculate these two post join in case there were any stat corrections
    info['PASS_AST'] = info['PASS'] /  info['AST']
    info['RESULT'] = info['WL'].map(lambda x: 1 if 'W' in x else 0 )

    ##Reorder the columns in the dataframe
    info = info[order]

    ##Save to csv
    info.to_csv(file_name, index = False)

    ##Upload, and replace the information already there. Since we're working with a relatively small volume of data,
    ##upserts aren't worth the time.
    info.to_sql(con =  eng,  index = False, name = name, schema  =  'nba', if_exists = 'replace')

    print(name)

    return info
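The merge step inside update_info can be tried in isolation. The sketch below uses toy data (the Game_ID values and the single COVERED_EFG column are simplifications, not real game logs) to show how stacking the saved rows on top of the fresh logs, dropping duplicates, and filtering on a missing stat picks out only the games that still need to be fetched.

```python
import pandas as pd

# Previously saved rows already carry the tracking stat.
old_info = pd.DataFrame({
    "Game_ID": [1, 2],
    "GAME_DATE": pd.to_datetime(["2016-10-27", "2016-10-30"]),
    "COVERED_EFG": [0.44, 0.375],
})

# A fresh game log repeats game 2 and adds game 3, without the tracking stat.
new_logs = pd.DataFrame({
    "Game_ID": [3, 2],
    "GAME_DATE": pd.to_datetime(["2016-10-31", "2016-10-30"]),
})

# Stack old on top of new so that keep="first" prefers the saved rows.
info = pd.concat([old_info, new_logs], ignore_index=True)
info = info.drop_duplicates(["Game_ID"], keep="first")
info = info.sort_values("GAME_DATE", ascending=False).reset_index(drop=True)

# Rows with a missing stat are the games that still need to be fetched.
updates = info.loc[info["COVERED_EFG"].isnull()]
print(updates["GAME_DATE"].tolist())
```

Only game 3 ends up in `updates`, so the per-date tracking requests run once per new game instead of once per game in the season.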

You can put the script above in a separate .py file in the same directory as all of the team CSVs and schedule it via crontab to run automatically, so your database and CSVs will always contain the latest information (assuming stats.nba.com doesn’t change anything on their end).
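A scheduled script also needs an entry point that walks over every CSV. The sketch below is one possible shape for it, not code from the original notebook: `run_updates` and `fake_update` are hypothetical names, and in the real script `update_info` would be passed in place of `fake_update`.

```python
import glob
import time

def run_updates(files, update_fn, pause_seconds=1.0):
    """Call update_fn on each file; return the files that raised an error."""
    failed = []
    for file_name in files:
        try:
            update_fn(file_name)
        except Exception as exc:
            # Keep going so one bad team does not stop the whole run.
            print("update failed for {}: {}".format(file_name, exc))
            failed.append(file_name)
        # Pause between teams to avoid hammering the stats endpoint.
        time.sleep(pause_seconds)
    return failed

if __name__ == "__main__":
    # Stand-in for update_info so the sketch runs on its own.
    def fake_update(file_name):
        print("would update", file_name)

    run_updates(sorted(glob.glob("*.csv")), fake_update, pause_seconds=0)
```

Catching errors per file means a failed request for one team still lets the remaining teams update, and the returned list shows which files to retry.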

Here’s a great tutorial on how to do so: https://www.youtube.com/watch?v=hDJ3XQzW8nk

Wrapping it all up

That should be everything you need to get started.

If this was interesting to you and you want to test what you learned, here are a few exercises (in roughly increasing difficulty):

1) The current boxscores only have a team_id in each table. Find a way to insert a column for a team name.

2) Throw in some data about contested/uncontested rebounding. This might be interesting when looking at different factors that contribute to wins.

3) There’s no player-specific data in any of this. Try adding a column to each team’s boxscores with each game’s leading scorer.

4) Compare data across seasons! Is it possible to visualize changes to the Thunder offense after KD left? How different is Tom Thibodeau’s offense from Sam Mitchell’s scheme in Minnesota?

5) Do some data science! I’d love to see what sort of interesting models can be drummed up using this infrastructure.

Thanks for reading! Send your suggestions, solutions, or anything else to [email protected]

Translated from: https://www.pybloggers.com/2017/04/data-wrangling-101-using-python-to-fetch-manipulate-visualize-nba-data/