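df.sample() draws a random sample of rows from a DataFrame or of values from a Series. The method takes its name from the English word "sample" (US pronunciation ['sæmp(ə)l]): as a verb, to take a sample; as a noun, a specimen.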
DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)
import pandas as pd
df = pd.DataFrame({'name': ['zhao', 'qian', 'sun', 'wang'], 'mark': [150, 122, 155, 132], 'gender': ['female', 'female', 'male', 'male']})
df
   name  mark  gender
0  zhao   150  female
1  qian   122  female
2   sun   155    male
3  wang   132    male
df.sample(2)
   name  mark  gender
1  qian   122  female
0  zhao   150  female
df.sample(frac=0.75)
   name  mark  gender
2   sun   155    male
1  qian   122  female
0  zhao   150  female
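The frac argument can be checked with a short script. This is a minimal sketch that rebuilds the same df; the variable names part and shuffled are only illustrative. It shows that frac is applied to the row count, that frac=1 amounts to a shuffle, and that n and frac cannot be passed together.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['zhao', 'qian', 'sun', 'wang'],
    'mark': [150, 122, 155, 132],
    'gender': ['female', 'female', 'male', 'male'],
})

# frac is a fraction of the row count: 0.75 * 4 rows gives 3 rows
part = df.sample(frac=0.75)
print(len(part))

# frac=1 returns every row exactly once, in random order (a shuffle)
shuffled = df.sample(frac=1)
print(sorted(shuffled.index.tolist()))

# n and frac are mutually exclusive; passing both raises ValueError
try:
    df.sample(n=2, frac=0.5)
    both_rejected = False
except ValueError as exc:
    both_rejected = True
    print('ValueError:', exc)
```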
df.sample(3, replace=True)
   name  mark  gender
0  zhao   150  female
1  qian   122  female
0  zhao   150  female
# the sample contains duplicate rows
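Sampling with replacement also lifts the limit on the sample size. The sketch below, again on the same df, assumes nothing beyond the documented replace parameter; the names too_many_rejected and bigger are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['zhao', 'qian', 'sun', 'wang'],
    'mark': [150, 122, 155, 132],
    'gender': ['female', 'female', 'male', 'male'],
})

# Without replacement, asking for more rows than exist raises ValueError
try:
    df.sample(5)
    too_many_rejected = False
except ValueError as exc:
    too_many_rejected = True
    print('ValueError:', exc)

# With replacement, the sample may be larger than the frame
bigger = df.sample(5, replace=True)
print(len(bigger))

# 5 draws from 4 rows must repeat at least one index label
print(bigger.index.is_unique)
```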
df.sample(3, replace=True, weights=[1, 2, 3, 4])
   name  mark  gender
3  wang   132    male
3  wang   132    male
3  wang   132    male
# the row at index 3 has the largest weight (4 out of 10), so it is the most likely to be drawn
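The effect of weights is easier to see over many draws than over three. The following sketch draws 10,000 rows with replacement and compares the observed frequencies with the normalized weights 0.1, 0.2, 0.3 and 0.4 (pandas normalizes weights that do not sum to 1). The sample size and the seed 0 are arbitrary choices for the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['zhao', 'qian', 'sun', 'wang'],
    'mark': [150, 122, 155, 132],
    'gender': ['female', 'female', 'male', 'male'],
})

# Draw many rows so the empirical frequencies settle near the weights
draws = df.sample(10000, replace=True, weights=[1, 2, 3, 4], random_state=0)

# Share of each name among the draws; expected roughly 0.1, 0.2, 0.3, 0.4
freq = draws['name'].value_counts(normalize=True)
print(freq)
```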
df.sample(2, replace=True, random_state=3)
   name  mark  gender
2   sun   155    male
0  zhao   150  female
Once a random seed is specified, the same rows are selected on every run.
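Reproducibility can be confirmed by calling sample twice with the same seed and comparing the results. This is a minimal sketch; the names a, b and c are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['zhao', 'qian', 'sun', 'wang'],
    'mark': [150, 122, 155, 132],
    'gender': ['female', 'female', 'male', 'male'],
})

# Same seed, same arguments: both calls return identical rows in identical order
a = df.sample(2, replace=True, random_state=3)
b = df.sample(2, replace=True, random_state=3)
print(a.equals(b))

# Without a seed the result may change from one call to the next
c = df.sample(2, replace=True)
print(c)
```

A fixed seed is useful for tests and for tutorials where readers should be able to reproduce the same sample; the exact rows chosen for a given seed may still depend on the installed pandas and NumPy versions.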