Re-splitting the MiniImageNet Dataset into Training, Test, and Validation Sets

Preface

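    It pays to understand how a dataset is organized before using it, to avoid pitfalls. In MiniImageNet the 100 classes themselves are divided among train, test, and val (64, 20, and 16 classes respectively), so no class appears in more than one file. That does not suit my image classification task, so I re-split the data. There are 100 classes with 600 images each, and for every class I assigned 400, 100, and 100 images to train, test, and val respectively:
						train.csv contains 40,000 images across 100 classes.
						val.csv contains 10,000 images across 100 classes.
						test.csv contains 10,000 images across 100 classes.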
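The split sizes above follow directly from the per-class counts. A minimal sketch of that arithmetic is shown here; the constant and function names are illustrative only and do not come from the original code.

```python
# Layout described in the text: 100 classes, 600 images per class.
NUM_CLASSES = 100
IMAGES_PER_CLASS = 600
# Images taken from every class for each new split.
PER_CLASS_SPLIT = {"train": 400, "test": 100, "val": 100}


def split_sizes(num_classes=NUM_CLASSES, per_class=PER_CLASS_SPLIT):
    """Return the total number of images in each new split."""
    return {name: count * num_classes for name, count in per_class.items()}


# The per-class counts must use up all 600 images of a class.
assert sum(PER_CLASS_SPLIT.values()) == IMAGES_PER_CLASS

sizes = split_sizes()
print(sizes)
```

Every class contributes to all three splits, which is what a standard classification setup needs.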

I. The Original MiniImageNet Dataset

For an introduction to the dataset, see:
https://blog.csdn.net/qq_37541097/article/details/113027489

Baidu Netdisk download:
Link: https://pan.baidu.com/s/1Uro6RuEbRGGCQ8iXvF2SAQ Password: hl31
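The re-splitting code in this post slices each CSV in fixed blocks of 600 rows, so it only works if every class occupies 600 consecutive rows. The sketch below is one way to check that before splitting. The column names `filename` and `label` and the helper `is_grouped_by_class` are assumptions for illustration; adjust them to match the header of your CSV files.

```python
import pandas as pd


def is_grouped_by_class(df, label_col="label", per_class=600):
    """Return True if the rows form consecutive blocks of `per_class` rows,
    each block holding a single label and no label spanning two blocks."""
    labels = df[label_col].tolist()
    if len(labels) % per_class != 0:
        return False
    seen = set()
    for start in range(0, len(labels), per_class):
        block = set(labels[start:start + per_class])
        if len(block) != 1:
            return False  # more than one class inside a block
        label = next(iter(block))
        if label in seen:
            return False  # the same class shows up in two blocks
        seen.add(label)
    return True


# Tiny synthetic stand-in: 3 rows per class instead of 600.
demo = pd.DataFrame({
    "filename": [f"img_{i}.jpg" for i in range(6)],
    "label": ["a", "a", "a", "b", "b", "b"],
})
grouped_ok = is_grouped_by_class(demo, per_class=3)
shuffled_ok = is_grouped_by_class(demo.iloc[[0, 3, 1, 4, 2, 5]], per_class=3)
print(grouped_ok, shuffled_ok)
```

On the real files, you would call `is_grouped_by_class(pd.read_csv(train_path))` with the default `per_class=600`.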

II. Re-splitting Steps

1. Import the library

The code is as follows (example):

import pandas as pd

2. Split

The code is as follows (example):

# Original files
train_path = r'D:\CsDataset\mini-imagenet\train.csv'
test_path = r'D:\CsDataset\mini-imagenet\test.csv'
val_path = r'D:\CsDataset\mini-imagenet\val.csv'
# Output files to save after processing
new_train_path = r'D:\CsDataset\mini-imagenet\new_train.csv'
new_test_path = r'D:\CsDataset\mini-imagenet\new_test.csv'
new_val_path = r'D:\CsDataset\mini-imagenet\new_val.csv'


def split_miniimagenet(train_path, test_path, val_path, new_train_path, new_test_path, new_val_path):
    train_data = pd.read_csv(train_path)
    test_data = pd.read_csv(test_path)
    val_data = pd.read_csv(val_path)


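    # New training set: the first 400 rows of each 600-row class block.
    # This assumes every CSV lists its classes in consecutive blocks of 600 rows.
    data11 = [train_data.iloc[600*i:600*i+400,:] for i in range(64)]  # 64 classes in train.csv
    df11 = pd.concat(data11)  # concatenate vertically (row-wise)
    data12 = [test_data.iloc[600*i:600*i+400,:] for i in range(20)]  # 20 classes in test.csv
    df12 = pd.concat(data12)
    data13 = [val_data.iloc[600*i:600*i+400,:] for i in range(16)]  # 16 classes in val.csv
    df13 = pd.concat(data13)
    data1 = [df11, df12, df13]
    df1 = pd.concat(data1)
    df1.to_csv(new_train_path, index=False, sep=',')  # write the new training CSV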


    # New test set: rows 400-499 of each 600-row class block.
    data21 = [train_data.iloc[600*i+400:600*i+500,:] for i in range(64)]
    df21 = pd.concat(data21)
    data22 = [test_data.iloc[600*i+400:600*i+500,:] for i in range(20)]
    df22 = pd.concat(data22)
    data23 = [val_data.iloc[600*i+400:600*i+500,:] for i in range(16)]
    df23 = pd.concat(data23)
    data2 = [df21, df22, df23]
    df2 = pd.concat(data2)
    df2.to_csv(new_test_path, index=False, sep=',')  # write the new test CSV

    # New validation set: rows 500-599 of each 600-row class block.
    data31 = [train_data.iloc[600*i+500:600*i+600,:] for i in range(64)]
    df31 = pd.concat(data31)
    data32 = [test_data.iloc[600*i+500:600*i+600,:] for i in range(20)]
    df32 = pd.concat(data32)
    data33 = [val_data.iloc[600*i+500:600*i+600,:] for i in range(16)]
    df33 = pd.concat(data33)
    data3 = [df31, df32, df33]
    df3 = pd.concat(data3)
    df3.to_csv(new_val_path, index=False, sep=',')  # write the new validation CSV



split_miniimagenet(train_path, test_path, val_path, new_train_path, new_test_path, new_val_path)
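The function above depends on each class occupying exactly 600 consecutive rows. As an alternative, the sketch below does the same 400/100/100 split per class by grouping on the label, so it does not depend on row order. This is not the original author's method. The column name `label` and the helper `split_by_label` are assumptions for illustration.

```python
import pandas as pd


def split_by_label(df, label_col="label", n_train=400, n_test=100):
    """Per class: first n_train rows -> train, next n_test -> test, rest -> val."""
    # Position of each row within its own class: 0, 1, 2, ...
    rank = df.groupby(label_col).cumcount()
    train = df[rank < n_train]
    test = df[(rank >= n_train) & (rank < n_train + n_test)]
    val = df[rank >= n_train + n_test]
    return train, test, val


# Tiny synthetic stand-in: 2 classes x 6 rows, deliberately interleaved.
demo = pd.DataFrame({
    "filename": [f"img_{i}.jpg" for i in range(12)],
    "label": ["a", "b"] * 6,
})
train, test, val = split_by_label(demo, n_train=4, n_test=1)
print(len(train), len(test), len(val))
```

On the real data, you would first combine the three original files, for example with `pd.concat([train_data, test_data, val_data], ignore_index=True)`, then pass the result to `split_by_label` with its defaults and write each part out with `to_csv(..., index=False)`. Calling `value_counts()` on the label column of each part is a quick way to confirm the 400/100/100 counts per class.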


