Datawhale team learning: pandas
http://datawhale.club/t/topic/579/4
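Task: US presidential election voting
[Problem statement] Two data tables give the population of each US county and the election votes cast in each county. Answer the following questions:
1. How many counties have a total vote count greater than half of the county's population?
2. Build a table with state as the row index and candidate as the column names, with columns ordered by each candidate's nationwide total votes from highest to lowest. Each cell holds the total votes that candidate received in that state.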
# This is only a sample; use the English state and candidate names from the original tables.
             Biden  Trump
Wisconsin        2      1
Texas            3      4
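3. Each state contains a number of counties. Define a county's BT indicator as Biden's vote share in that county minus Trump's vote share in that county.
If the median BT indicator over all counties in a state is greater than 0, that state is a Biden State. Find all Biden States.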
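To make questions 2 and 3 concrete, here is a minimal sketch on made-up numbers. It is not the solution on the real data: the column names `candidate` and `total_votes`, the county labels, and the vote counts are all assumptions for illustration, and vote share is taken to mean a candidate's votes divided by all votes cast in that county.

```python
import pandas as pd

# Toy data only. `candidate` and `total_votes` are assumed column names;
# the counties "A".."D" and all vote counts are invented.
votes = pd.DataFrame({
    "state": ["Wisconsin"] * 4 + ["Texas"] * 4,
    "county": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "candidate": ["Biden", "Trump"] * 4,
    "total_votes": [60, 40, 55, 45, 30, 70, 45, 55],
})

# Question 2: state x candidate table, columns ordered by national total (descending).
table = votes.pivot_table(index="state", columns="candidate",
                          values="total_votes", aggfunc="sum", fill_value=0)
order = table.sum().sort_values(ascending=False).index
table = table[order]

# Question 3: per-county BT = Biden share - Trump share, then the median per state.
county = votes.pivot_table(index=["state", "county"], columns="candidate",
                           values="total_votes", aggfunc="sum", fill_value=0)
bt = (county["Biden"] - county["Trump"]) / county.sum(axis=1)
median_bt = bt.groupby(level="state").median()
biden_states = median_bt[median_bt > 0].index.tolist()

print(table)
print(biden_states)
```

On the real tables the candidate labels will be whatever the `candidate` column actually contains, so the two column selections in the BT step would need to match those strings.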
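import numpy as np
import pandas as pd
# Raw strings (r"...") keep the backslashes in the Windows paths from being read as escape sequences.
df1 = pd.read_csv(r"D:\BaiduNetdiskDownload\county_population.csv")  # population per county
df1
df2 = pd.read_csv(r"D:\BaiduNetdiskDownload\president_county_candidate.csv")  # votes per county and candidate
df2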
df1['US County'].unique()
array(['.Autauga County, Alabama', '.Baldwin County, Alabama',
'.Barbour County, Alabama', ..., '.Uinta County, Wyoming',
'.Washakie County, Wyoming', '.Weston County, Wyoming'],
dtype=object)
df2['state'].unique()
array(['Delaware', 'District of Columbia', 'Florida', 'Georgia', 'Hawaii',
'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky',
'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan',
'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska',
'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York',
'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon',
'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota',
'Tennessee', 'Texas', 'Utah', 'Vermont', 'Virginia', 'Washington',
'West Virginia', 'Wisconsin', 'Wyoming', 'Alabama', 'Alaska',
'Arkansas', 'California', 'Colorado', 'Connecticut', 'Arizona'],
dtype=object)
df2['county'].unique()
array(['Kent County', 'New Castle County', 'Sussex County', ...,
'La Paz County', 'Maricopa County', 'Mohave County'], dtype=object)
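The outputs above show that the two tables key their rows differently: `df1['US County']` holds strings such as '.Autauga County, Alabama', while `df2` keeps `state` and `county` in separate columns. One possible way to make them comparable is sketched below on two rows copied from that output. The frame name `pop` is hypothetical, and the real column may contain irregular entries that this simple split does not handle.

```python
import pandas as pd

# Two sample values copied from the df1['US County'].unique() output;
# `pop` is a hypothetical stand-in for df1.
pop = pd.DataFrame({"US County": [".Autauga County, Alabama",
                                  ".Weston County, Wyoming"]})

# Drop the leading dot, then split "county, state" at the first comma.
parts = pop["US County"].str.lstrip(".").str.split(",", n=1, expand=True)
pop["county"] = parts[0].str.strip()
pop["state"] = parts[1].str.strip()

print(pop[["state", "county"]])
```

With `state` and `county` columns in both tables, a merge on those two keys becomes possible. It is worth checking how many rows fail to match before trusting the result.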