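In deep learning work, a dataset often needs to be split into three parts. The code below splits a dataset by ratio: df is a pandas DataFrame, and ratio_train, ratio_test, and ratio_val are the proportions of the training, test, and validation sets respectively. Just call the function directly.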
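from sklearn.model_selection import train_test_split

def train_test_val_split(df, ratio_train, ratio_test, ratio_val):
    # ratio_test is not used directly: the test set is whatever remains
    # after the training and validation rows are taken out.
    # First split off the training set; `middle` holds the test + validation rows.
    train, middle = train_test_split(df, test_size=1 - ratio_train)
    # Share of `middle` that goes to the validation set.
    ratio = ratio_val / (1 - ratio_train)
    test, validation = train_test_split(middle, test_size=ratio)
    return train, test, validation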
Example: split the dataset df into training, test, and validation sets in the ratio 0.6:0.2:0.2
train, test, val = train_test_val_split(df, 0.6, 0.2, 0.2)
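
The split above is random on every call and does not check that the three ratios are consistent. As a rough sketch of one way to tighten that, the variant below adds a ratio check and a seed. The name split_three_way, the random_state argument, and the toy DataFrame demo are assumptions made for illustration and are not part of the original code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_three_way(df, ratio_train, ratio_test, ratio_val, random_state=None):
    """Hypothetical variant of train_test_val_split with a ratio check and a seed."""
    total = ratio_train + ratio_test + ratio_val
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"ratios must sum to 1, got {total}")
    # First cut: training set versus everything else.
    train, middle = train_test_split(
        df, test_size=1 - ratio_train, random_state=random_state
    )
    # Second cut: the validation share is expressed relative to the remainder.
    val_share = ratio_val / (ratio_test + ratio_val)
    test, validation = train_test_split(
        middle, test_size=val_share, random_state=random_state
    )
    return train, test, validation


# Toy data, made up purely for this demonstration.
demo = pd.DataFrame({"x": range(100), "y": [i % 2 for i in range(100)]})
train, test, val = split_three_way(demo, 0.6, 0.2, 0.2, random_state=0)
print(len(train), len(test), len(val))
```

Passing the same random_state gives the same split on every run, which helps when comparing experiments. The three parts never share rows, because the second split only draws from what the first one left over.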