Filtering on Chinese text values:
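# Total positive ratings and total ratings per director (select the numeric columns before summing)
data_director = data_clean.groupby('导演')[['好评数', '评分人数']].sum()
# Positive-rating rate = positive ratings / number of ratings
data_director['好评率'] = data_director['好评数'] / data_director['评分人数']
# Rank directors from highest to lowest rate
data_director_new = data_director.sort_values(by='好评率', ascending=False)
# Rows whose director field contains 王静
data_director_wangjing = data_clean[data_clean.导演.str.contains('王静')]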
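A minimal sketch of the per-director aggregation above, run on a small made-up table. The sample rows and the `/` separator for co-directors are assumptions for illustration only; the column names are taken from the notes. It shows that summing the counts first and dividing afterwards weights each director's rate by the number of ratings, and that a co-directed entry forms its own group.

```python
import pandas as pd

# Hypothetical sample standing in for data_clean (values are invented).
data_clean = pd.DataFrame({
    '导演': ['王静', '王静', '李明', '王静/李明'],
    '整理后剧名': ['A', 'B', 'C', 'D'],
    '好评数': [80, 10, 45, 30],
    '评分人数': [100, 100, 50, 60],
})

# Sum the counts per director, then divide, so the rate is weighted
# by how many ratings each work received.
data_director = data_clean.groupby('导演')[['好评数', '评分人数']].sum()
data_director['好评率'] = data_director['好评数'] / data_director['评分人数']
data_director_new = data_director.sort_values(by='好评率', ascending=False)

print(data_director_new)
```

Note that '王静/李明' is ranked as a separate director here, because `groupby` treats the whole string as the key.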
Remove duplicates:
data_director_wangjing = data_clean[data_clean.导演.str.contains('王静')].drop_duplicates(subset=['整理后剧名'])
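The substring filter plus de-duplication can be tried on a toy table like the one below. The rows are hypothetical, and the `na=False` and `regex=False` arguments are additions not present in the notes: the first treats missing director values as non-matches, the second matches the name as literal text rather than as a regular expression.

```python
import pandas as pd

# Hypothetical rows: title 'A' appears twice and one director is missing.
data_clean = pd.DataFrame({
    '导演': ['王静', '王静', '王静/李明', '李明', None],
    '整理后剧名': ['A', 'A', 'B', 'C', 'D'],
})

# Boolean mask: True where the director field contains the name.
mask = data_clean['导演'].str.contains('王静', na=False, regex=False)

# Keep one row per title (drop_duplicates keeps the first occurrence by default).
data_director_wangjing = data_clean[mask].drop_duplicates(subset=['整理后剧名'])

print(data_director_wangjing)
```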
Using data_clean, which works have 王静 as the only director?
data_director_onlywangjing = data_clean[data_clean.导演 == '王静']
Remove duplicates:
data_director_onlywangjing = data_clean[data_clean.导演 == '王静'].drop_duplicates(subset=['整理后剧名'])
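To see how the exact-match filter differs from the substring filter, the following sketch compares the two on invented rows (the data and the names `any_credit`, `sole_credit`, and `co_directed` are hypothetical). Equality keeps only rows whose director field is exactly the name, so co-directed works drop out.

```python
import pandas as pd

# Hypothetical sample; 'B' is co-directed.
data_clean = pd.DataFrame({
    '导演': ['王静', '王静/李明', '李明', '王静'],
    '整理后剧名': ['A', 'B', 'C', 'A'],
})

# Substring match: every row that credits the name anywhere in the field.
any_credit = data_clean[data_clean['导演'].str.contains('王静', regex=False)]

# Equality: only rows where the field is exactly the name.
sole_credit = data_clean[data_clean['导演'] == '王静'].drop_duplicates(subset=['整理后剧名'])

# Rows matched by the substring filter but not by equality are the co-directed works.
co_directed = any_credit[any_credit['导演'] != '王静']

print(sole_credit['整理后剧名'].tolist())
print(co_directed['整理后剧名'].tolist())
```

Equality is strict about the stored string, so stray whitespace in the column would also cause a row to be missed.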
Using data_clean, which works belong to the 20 directors with the highest positive-rating rate?
data_directorTOP20 = data_clean[data_clean.导演.isin(data_director_new[:20].index.tolist())]
Remove duplicates:
data_directorTOP20 = data_clean[data_clean.导演.isin(data_director_new[:20].index.tolist())].drop_duplicates(subset=['整理后剧名'])
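The top-N lookup can be checked end to end on a small example. This is a sketch under assumptions: the rows are invented, `top_n` is set to 2 so the effect is visible on five rows (the notes use 20), and the names `ranked`, `top_directors`, and `data_director_top` are hypothetical.

```python
import pandas as pd

# Hypothetical sample; title 'A' is listed twice.
data_clean = pd.DataFrame({
    '导演': ['甲', '乙', '丙', '甲', '丙'],
    '整理后剧名': ['A', 'B', 'C', 'A', 'E'],
    '好评数': [90, 40, 70, 90, 30],
    '评分人数': [100, 100, 100, 100, 100],
})

# Rank directors by positive-rating rate, as in the notes.
ranked = data_clean.groupby('导演')[['好评数', '评分人数']].sum()
ranked['好评率'] = ranked['好评数'] / ranked['评分人数']
ranked = ranked.sort_values(by='好评率', ascending=False)

top_n = 2  # the notes use 20
top_directors = ranked.head(top_n).index

# isin keeps every row whose director is one of the top-ranked names;
# drop_duplicates then leaves one row per title.
data_director_top = (
    data_clean[data_clean['导演'].isin(top_directors)]
    .drop_duplicates(subset=['整理后剧名'])
)

print(data_director_top)
```

`isin` accepts the index directly, so the `.tolist()` call is optional.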