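NumPy's dot function computes the dot product of two vectors.
Dot product: multiply the corresponding entries of the two vectors, then add up the products.
Example:
a = [1, 2, 3, 4]
b = [2, 3, 4, 5]
Then the dot product of a and b = 1×2 + 2×3 + 3×4 + 4×5 = 40
We can compute it with numpy.dot: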
a = [1,2,3,4]
b = [2,3,4,5]
numpy.dot(a, b)  # 40
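To make the definition concrete, here is a minimal sketch that computes the same dot product twice: once by hand and once with numpy.dot. The variable names manual and result are only illustrative.

```python
import numpy

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

# By hand: multiply matching entries, then add up the products.
manual = sum(x * y for x, y in zip(a, b))

# The same computation done by numpy.dot.
result = numpy.dot(a, b)

print(manual, result)  # 40 40
```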
Multiplying an array by a matrix:
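The original gives no example for this case, so the following is only a small sketch with made-up values (v and m are hypothetical names). When a one-dimensional array is multiplied by a matrix, the length of the array has to match the number of rows of the matrix.

```python
import numpy

v = numpy.array([1, 2])        # shape (2,)
m = numpy.array([[1, 2, 3],
                 [4, 5, 6]])   # shape (2, 3)

# Each output entry is the dot product of v with one column of m.
result = numpy.dot(v, m)

print(result)  # [ 9 12 15]
```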
Multiplying a matrix by a matrix:
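Again, the original has no example here, so this is a sketch with made-up values (left and right are hypothetical names). For two-dimensional inputs, numpy.dot performs ordinary matrix multiplication, so the column count of the first matrix has to equal the row count of the second.

```python
import numpy

left = numpy.array([[1, 2],
                    [3, 4]])
right = numpy.array([[5, 6],
                     [7, 8]])

# Entry (i, j) is the dot product of row i of left with column j of right.
result = numpy.dot(left, right)

print(result)
# [[19 22]
#  [43 50]]
```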
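Example: compute the score of every medal-winning country, with 4 points per gold medal, 2 per silver, and 1 per bronze. Output the result as a DataFrame containing each country's name and points: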
import numpy
from pandas import DataFrame, Series


def numpy_dot():
    countries = ['Russian Fed.', 'Norway', 'Canada', 'United States',
                 'Netherlands', 'Germany', 'Switzerland', 'Belarus',
                 'Austria', 'France', 'Poland', 'China', 'Korea',
                 'Sweden', 'Czech Republic', 'Slovenia', 'Japan',
                 'Finland', 'Great Britain', 'Ukraine', 'Slovakia',
                 'Italy', 'Latvia', 'Australia', 'Croatia', 'Kazakhstan']
    gold = [13, 11, 10, 9, 8, 8, 6, 5, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    silver = [11, 5, 10, 7, 7, 6, 3, 0, 8, 4, 1, 4, 3, 7, 4, 2, 4, 3, 1, 0, 0, 2, 2, 2, 1, 0]
    bronze = [9, 10, 5, 12, 9, 5, 2, 1, 5, 7, 1, 2, 2, 6, 2, 4, 3, 1, 2, 1, 0, 6, 2, 1, 0, 1]

    # YOUR CODE HERE

    return olympic_points_df
Fill in the correct code at '# YOUR CODE HERE':
- Collect the data and convert it into a DataFrame:
data = {'country_name': countries, 'gold': gold, 'silver': silver, 'bronze': bronze}
base_data_df = DataFrame(data)
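- Select the gold, silver, and bronze columns and take their dot product with [4, 2, 1]. This computes each country's points, which we store in a new points column:
base_data_df['points'] = base_data_df[['gold', 'silver', 'bronze']].dot([4, 2, 1])
Let's print base_data_df at this point: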
bronze country_name gold silver points
0 9 Russian Fed. 13 11 83
1 10 Norway 11 5 64
2 5 Canada 10 10 65
3 12 United States 9 7 62
4 9 Netherlands 8 7 55
5 5 Germany 8 6 49
6 2 Switzerland 6 3 32
7 1 Belarus 5 0 21
8 5 Austria 4 8 37
9 7 France 4 4 31
10 1 Poland 4 1 19
11 2 China 3 4 22
12 2 Korea 3 3 20
13 6 Sweden 2 7 28
14 2 Czech Republic 2 4 18
15 4 Slovenia 2 2 16
16 3 Japan 1 4 15
17 1 Finland 1 3 11
18 2 Great Britain 1 1 8
19 1 Ukraine 1 0 5
20 0 Slovakia 1 0 4
21 6 Italy 0 2 10
22 2 Latvia 0 2 6
23 1 Australia 0 2 5
24 0 Croatia 0 1 2
25 1 Kazakhstan 0 0 1
- Looking at the data above, all that remains is to select the country_name and points columns:
olympic_points_df = base_data_df[['country_name', 'points']]
Let's look at the result:
country_name points
0 Russian Fed. 83
1 Norway 64
2 Canada 65
3 United States 62
4 Netherlands 55
5 Germany 49
6 Switzerland 32
7 Belarus 21
8 Austria 37
9 France 31
10 Poland 19
11 China 22
12 Korea 20
13 Sweden 28
14 Czech Republic 18
15 Slovenia 16
16 Japan 15
17 Finland 11
18 Great Britain 8
19 Ukraine 5
20 Slovakia 4
21 Italy 10
22 Latvia 6
23 Australia 5
24 Croatia 2
25 Kazakhstan 1
OK, this is exactly what we want.
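Putting the steps above together, the completed function might look like the sketch below. It uses only the statements shown in the walkthrough, and it drops the numpy and Series imports because this version does not need them.

```python
from pandas import DataFrame


def numpy_dot():
    countries = ['Russian Fed.', 'Norway', 'Canada', 'United States',
                 'Netherlands', 'Germany', 'Switzerland', 'Belarus',
                 'Austria', 'France', 'Poland', 'China', 'Korea',
                 'Sweden', 'Czech Republic', 'Slovenia', 'Japan',
                 'Finland', 'Great Britain', 'Ukraine', 'Slovakia',
                 'Italy', 'Latvia', 'Australia', 'Croatia', 'Kazakhstan']
    gold = [13, 11, 10, 9, 8, 8, 6, 5, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    silver = [11, 5, 10, 7, 7, 6, 3, 0, 8, 4, 1, 4, 3, 7, 4, 2, 4, 3, 1, 0, 0, 2, 2, 2, 1, 0]
    bronze = [9, 10, 5, 12, 9, 5, 2, 1, 5, 7, 1, 2, 2, 6, 2, 4, 3, 1, 2, 1, 0, 6, 2, 1, 0, 1]

    # Step 1: collect the raw lists into a DataFrame.
    data = {'country_name': countries, 'gold': gold, 'silver': silver, 'bronze': bronze}
    base_data_df = DataFrame(data)

    # Step 2: select the medal columns in a fixed order, then weight them 4, 2, 1.
    base_data_df['points'] = base_data_df[['gold', 'silver', 'bronze']].dot([4, 2, 1])

    # Step 3: keep only the country name and its points.
    olympic_points_df = base_data_df[['country_name', 'points']]
    return olympic_points_df


print(numpy_dot().head())
```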
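Some people, when gathering the gold, silver, and bronze data, build a DataFrame directly from the raw lists:
data = {'gold': gold,
        'silver': silver,
        'bronze': bronze}
base_data_df = DataFrame(data)
and then call base_data_df.dot([4, 2, 1]) directly, which gives the wrong result.
Why?
Let's print the base_data_df built this way: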
bronze gold silver
0 9 13 11
1 10 11 5
2 5 10 10
3 12 9 7
4 9 8 7
5 5 8 6
6 2 6 3
7 1 5 0
8 5 4 8
9 7 4 4
10 1 4 1
11 2 3 4
12 2 3 3
13 6 2 7
14 2 2 4
15 4 2 2
16 3 1 4
17 1 1 3
18 2 1 1
19 1 1 0
20 0 1 0
21 6 0 2
22 2 0 2
23 1 0 2
24 0 0 1
25 1 0 0
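See the difference? The gold, silver, and bronze columns are not guaranteed to come out in the order they were written. Here they appear as bronze, gold, silver, so multiplying by [4, 2, 1] weights bronze by 4, gold by 2, and silver by 1, which gives the wrong result. (Newer pandas versions keep the dict's insertion order, but selecting the columns explicitly, as in base_data_df[['gold', 'silver', 'bronze']], does not depend on that.)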
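One way to avoid depending on column order at all is to give the weights labels. This is not from the original walkthrough; it is a sketch that assumes the weights are passed as a pandas Series indexed by column name, in which case DataFrame.dot aligns them with the columns by label. The names medals_df, weights, wrong, and right are hypothetical, and only the first three countries are used to keep it short.

```python
from pandas import DataFrame, Series

# Columns deliberately in the "wrong" order: bronze, gold, silver.
medals_df = DataFrame({'bronze': [9, 10, 5],
                       'gold': [13, 11, 10],
                       'silver': [11, 5, 10]})

# A plain list is applied by position: 4*bronze + 2*gold + 1*silver.
wrong = medals_df.dot([4, 2, 1])

# A labeled Series is matched to the columns by name: 4*gold + 2*silver + 1*bronze.
weights = Series([4, 2, 1], index=['gold', 'silver', 'bronze'])
right = medals_df.dot(weights)

print(wrong.tolist())  # [73, 67, 50]
print(right.tolist())  # [83, 64, 65]
```

The labeled version gives the same points as the column-selection approach used earlier, whatever order the columns happen to be in.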