Random sampling and stratified sampling of a DataFrame with pandas

Random sampling:

import pandas as pd
# Randomly draw 100 rows from the DataFrame.
# sample is a DataFrame method; there is no pd.sample function.
df.sample(n=100)
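
The call above assumes `df` already exists. A minimal self-contained sketch follows; the toy DataFrame and its column names are made up for illustration, and `random_state`, `frac`, and `replace` are optional parameters of `DataFrame.sample` that the original snippet does not use.

```python
import pandas as pd

# Hypothetical toy DataFrame, used only for illustration
df = pd.DataFrame({"id": range(1000), "value": [i % 7 for i in range(1000)]})

# Draw exactly 100 rows without replacement; random_state makes the draw reproducible
sample_n = df.sample(n=100, random_state=42)

# Draw a fraction of the rows (here 10%) instead of a fixed count
sample_frac = df.sample(frac=0.1, random_state=42)

# Draw with replacement, so the same row may appear more than once
sample_boot = df.sample(n=100, replace=True, random_state=42)

print(len(sample_n), len(sample_frac), len(sample_boot))
```

The sampled rows keep their original index labels; calling `reset_index(drop=True)` on the result gives a fresh 0..n-1 index if that is preferred.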

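Stratified sampling:
Use the train_test_split function from sklearn.model_selection: passing stratify=y keeps the class proportions of y in both splits, so the test split can serve as a stratified sample.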

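from sklearn.model_selection import train_test_split
# y is one of the attribute columns of X, used here as the stratification key
# test_size=0.1 puts a stratified 10% of the rows into X_test / y_test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y)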
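
A runnable sketch of the same idea applied to a whole DataFrame, under some assumptions: the toy data and the column names `feature` and `label` are hypothetical, and `random_state` is added only for reproducibility. It also shows a pandas-only alternative, `DataFrameGroupBy.sample`, which requires pandas 1.1 or later and is not part of the original post.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced toy data: 80% of rows labelled "a", 20% labelled "b"
df = pd.DataFrame({"feature": range(1000), "label": ["a"] * 800 + ["b"] * 200})

# Stratify on the label column; the 10% "test" split is the stratified sample
rest, strat_sample = train_test_split(
    df, test_size=0.1, stratify=df["label"], random_state=0
)

# pandas-only alternative: sample the same fraction within each label group
strat_sample_pd = df.groupby("label").sample(frac=0.1, random_state=0)

# Both samples should keep roughly the 80/20 label ratio of the full data
print(strat_sample["label"].value_counts())
print(strat_sample_pd["label"].value_counts())
```

`train_test_split` needs at least two rows in every class to stratify, so very rare classes may have to be merged or handled separately before sampling.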
