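import torch
from torch.nn.utils.rnn import pad_sequence

class MyCollate:
    def __init__(self, pad_idx):
        self.pad_idx = pad_idx  # value used to pad short sequences

    def __call__(self, batch):
        # Each item is one sample: (image tensor, target tensor).
        imgs = [item[0].unsqueeze(0) for item in batch]
        imgs = torch.cat(imgs, dim=0)  # concatenate into [batch_size, ...]
        targets = [item[1] for item in batch]
        targets = pad_sequence(targets, batch_first=True, padding_value=self.pad_idx)
        return imgs, targets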
The pad_sequence used in this code is imported with
from torch.nn.utils.rnn import pad_sequence
targets = pad_sequence(targets, batch_first=True, padding_value=self.pad_idx)
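What it does: it takes targets, a list of tensors of differing lengths (not a nested Python list), and pads every tensor to the length of the longest one, using padding_value as the fill.
targets: a list of batch_size tensors, each with first dimension N, where N varies from item to item
batch_first: if True, the batch dimension comes first in the output; the default is False, which puts the sequence dimension first
padding_value: the value used for padding (default 0)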
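Returns
a tensor of shape [batch_size, M] (with batch_first=True and 1-D targets)
where M is the length of the longest sequence in the batch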
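The padding behavior described above can be checked with a small sketch. The three index sequences and pad_idx = 0 below are made-up values for illustration, not data from the original code; the sketch only shows how batch_first changes the output layout.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three hypothetical target sequences of lengths 4, 1, and 2.
targets = [
    torch.tensor([1, 5, 7, 2]),
    torch.tensor([1]),
    torch.tensor([1, 2]),
]
pad_idx = 0  # assumed padding index, chosen only for this example

# batch_first=True: one row per sequence, shape [batch_size, M].
batch_major = pad_sequence(targets, batch_first=True, padding_value=pad_idx)

# batch_first defaults to False: one column per sequence, shape [M, batch_size].
time_major = pad_sequence(targets, padding_value=pad_idx)

print(batch_major)
print(batch_major.shape, time_major.shape)
```

Here M is 4, the length of the longest sequence, so the shorter two sequences are filled with pad_idx on the right. Passing batch_first=True, as MyCollate does, keeps the targets aligned with the images, whose batch dimension is also first.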