Splitting a Dataset in PyTorch

When you need to set aside part of the training set as a validation set, you can use SubsetRandomSampler (or another sampler), for example:

import torch
from torch.utils.data.sampler import SubsetRandomSampler

# Number of training samples
num_training_samples = 48000
train_sampler = SubsetRandomSampler(torch.arange(0, num_training_samples))

# Number of validation samples
num_val_samples = 12000
val_sampler = SubsetRandomSampler(torch.arange(num_training_samples, num_training_samples + num_val_samples))

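Then pass the samplers to DataLoader as follows. Note that shuffle must stay False here (its default): DataLoader treats sampler and shuffle as mutually exclusive and raises a ValueError if both are set. The random ordering comes from the sampler itself.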

train_dataloader = torch.utils.data.DataLoader(
    ...
    sampler=train_sampler,
    ...
)

val_dataloader = torch.utils.data.DataLoader(
    ...
    sampler=val_sampler,
    ...
)
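
To make the sampler-based split concrete, here is a minimal runnable sketch. It assumes a small synthetic TensorDataset (100 samples split 80/20) in place of the real data, and the batch size of 16 is an arbitrary choice; neither comes from the original example. It also checks that combining a sampler with shuffle=True is rejected.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SubsetRandomSampler

# Hypothetical toy dataset standing in for the real one: 100 samples, 4 features.
dataset = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))

num_training_samples = 80
num_val_samples = 20

# Each sampler draws only from its own index range, in a fresh random order per epoch.
train_sampler = SubsetRandomSampler(list(range(0, num_training_samples)))
val_sampler = SubsetRandomSampler(
    list(range(num_training_samples, num_training_samples + num_val_samples))
)

# Both loaders wrap the same dataset; shuffle is left at its default.
train_dataloader = DataLoader(dataset, batch_size=16, sampler=train_sampler)
val_dataloader = DataLoader(dataset, batch_size=16, sampler=val_sampler)

# Count how many samples each loader actually yields.
n_train = sum(x.size(0) for x, _ in train_dataloader)
n_val = sum(x.size(0) for x, _ in val_dataloader)
print(n_train, n_val)

# Passing shuffle=True together with a sampler is rejected by DataLoader.
conflict_rejected = False
try:
    DataLoader(dataset, batch_size=16, sampler=train_sampler, shuffle=True)
except ValueError as err:
    conflict_rejected = True
    print("sampler + shuffle=True rejected:", err)
```

Note that this split is positional: the first 80 indices always go to training and the last 20 to validation, so if the underlying dataset is ordered (for example by class), it may be worth permuting the indices before slicing them.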

Alternatively, you can split the dataset like this:

train_db, val_db = torch.utils.data.random_split(data, [48000, 12000])

train_dataloader = torch.utils.data.DataLoader(
    train_db,
    ...
)

val_dataloader = torch.utils.data.DataLoader(
    val_db,
    ...
)
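
For comparison, here is a minimal runnable sketch of the random_split approach. As an assumption for illustration, it uses a synthetic TensorDataset of 100 samples split 80/20 rather than the 48000/12000 split above, and it passes a seeded torch.Generator so the split is reproducible; the seed and batch size are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Hypothetical toy dataset standing in for `data`: 100 samples, 4 features.
data = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))

# A seeded generator makes the random partition repeatable across runs.
generator = torch.Generator().manual_seed(0)
train_db, val_db = random_split(data, [80, 20], generator=generator)

# random_split returns Subset objects, so shuffle can be used normally here
# because no sampler is passed.
train_dataloader = DataLoader(train_db, batch_size=16, shuffle=True)
val_dataloader = DataLoader(val_db, batch_size=16, shuffle=False)

print(len(train_db), len(val_db))
```

Unlike the sampler version, the indices here are randomly assigned to the two subsets, so the split does not depend on how the dataset is ordered.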
