Handy uses of True and False in pandas

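Boolean True/False columns make it easy to compute a sum (the number of True values) and a mean (the proportion of True values), because True is treated as 1 and False as 0.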
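
As a minimal sketch of this idea, the snippet below uses a small made-up Series (the name `flags` and its values are assumptions for illustration, not part of the original data) to show that summing booleans counts the True values and averaging them gives their share.

```python
import pandas as pd

# Hypothetical toy data, not taken from the original dataset
flags = pd.Series([True, False, True, True])

# True is treated as 1 and False as 0, so the sum is the number of True values
n_true = flags.sum()

# The mean of 0/1 values is the fraction of True values
share_true = flags.mean()

print(n_true, share_true)
```

The same two calls work on a boolean column of a DataFrame, which is what the grouped example in this note relies on.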

data['Comedy'] = data['genres'].str.contains('Comedy')
data['Drama'] = data['genres'].str.contains('Drama')

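## Note: do not use count here; count tallies every non-null value, so the False rows are counted too
## Select several columns with a list (double brackets); the tuple form ['Comedy','Drama'] fails on recent pandas versions
result = data.groupby('release_year')[['Comedy', 'Drama']].agg('sum')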
result
## output

Comedy  Drama
release_year        
1960    1   3
1961    2   1
1962    3   4
1963    6   4
1964    3   4
1965    3   4
1966    0   4
1967    0   2
1968    0   2
1969    0   1
1970    2   2
1971    0   2
1972    0   1
1973    0   3
1974    0   3
1975    1   1
1976    3   3
1977    1   2
1978    3   4
1979    2   2
1980    4   4
1981    4   1
1982    5   6
1983    5   2
1984    8   6
1985    5   6
1986    6   5
1987    7   2
1988    9   2
1989    3   3
1990    6   1
1991    3   4
1992    2   2
1993    3   6
1994    6   4
1995    2   6
1996    3   6
1997    2   1
1998    6   6
1999    7   10
2000    6   5
2001    8   5
2002    10  10
2003    9   3
2004    11  9
2005    12  15
2006    15  11
2007    11  13
2008    10  9
2009    20  15
2010    14  16
2011    12  14
2012    17  16
2013    13  16
2014    15  16
2015    16  21
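
To make the difference between `sum` and `count` concrete, here is a small sketch on made-up data. The five rows below are an assumption standing in for the original `data`; only the column names `release_year`, `genres`, `Comedy`, and `Drama` come from the note above.

```python
import pandas as pd

# Hypothetical toy rows standing in for the original `data`
data = pd.DataFrame({
    "release_year": [1960, 1960, 1961, 1961, 1961],
    "genres": ["Comedy|Drama", "Drama", "Comedy", "Action", "Drama|Romance"],
})

# Boolean flag columns, built the same way as in the note
data["Comedy"] = data["genres"].str.contains("Comedy")
data["Drama"] = data["genres"].str.contains("Drama")

grouped = data.groupby("release_year")[["Comedy", "Drama"]]

n_matches = grouped.sum()    # number of True values per year
n_rows = grouped.count()     # number of non-null values per year, True or False
share = grouped.mean()       # proportion of True values per year

print(n_matches)
print(n_rows)
print(share)
```

In this sketch `count` simply returns the number of rows in each year for both columns, while `sum` returns how many of those rows match the genre, and `mean` gives the matching share.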
