Rewriting PyTorch in Paddle: DataLoader raises SystemError: (Fatal) Blocking queue is killed because the data reader ra

Introductory example

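While recently rewriting a PyTorch project in Paddle, I kept hitting errors from DataLoader. Judging by the API documentation, the two frameworks looked identical, so I simply reused the PyTorch code as-is and then lost a lot of time debugging. The lesson: read the API source code and compare the underlying implementations. I am recording the problem here: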

Reading data in Paddle mainly relies on two classes: paddle.io.Dataset and paddle.io.DataLoader.
The following example comes from the official documentation:

import numpy as np
from paddle.io import Dataset, DataLoader
import paddle

# define a random dataset
class RandomDataset(Dataset):
    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __getitem__(self, idx):
        image = np.random.random([784]).astype('float32')
        label = np.random.randint(0, 9, (1, )).astype('int64')
        return image, label

    def __len__(self):
        return self.num_samples

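BATCH_SIZE = 2  # not defined in the original snippet; value chosen here for illustration

dataset = RandomDataset(10)
loader = DataLoader(dataset,
                    batch_size=BATCH_SIZE,
                    shuffle=True,
                    drop_last=True,
                    num_workers=2)
# Inspect the data
for i in range(len(dataset)):
    print(dataset[i])

# Iterate over the loader to read batches for training
for i, (image, label) in enumerate(loader()):
    print('Got it!')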

Error 1

The data returned by the Dataset class's __getitem__(self, idx) is not of type numpy.ndarray.
For example, add this line before the return:

image = paddle.to_tensor(image)

This raises:
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:158)
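Because the error message itself does not say which field is at fault, it can help to inspect one sample in the main process before handing the dataset to DataLoader. The following is a minimal sketch that assumes only numpy; the helper name find_non_numpy_fields is hypothetical and not part of Paddle.

```python
import numpy as np

def find_non_numpy_fields(sample):
    """Return the positions of fields in `sample` that are not numpy arrays.

    `sample` is whatever `dataset[idx]` returns. A tuple or list of fields
    is expected; any other container (for example a dict) is reported as a
    single offending field at position 0.
    """
    if isinstance(sample, np.ndarray):
        return []
    if not isinstance(sample, (tuple, list)):
        return [0]
    return [i for i, field in enumerate(sample)
            if not isinstance(field, np.ndarray)]

# Illustrative samples shaped like the ones in the example above.
image = np.random.random([784]).astype('float32')
label = np.random.randint(0, 9, (1,)).astype('int64')

good = (image, label)
bad = (image.tolist(), label)  # a plain list stands in for any non-ndarray field

print(find_non_numpy_fields(good))
print(find_non_numpy_fields(bad))
```

Calling the helper on dataset[0] before building the loader points at the field to fix; in the case above, the remedy is to keep the data as numpy arrays inside __getitem__ and drop the paddle.to_tensor call.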

Error 2

The data returned by the Dataset class's __getitem__(self, idx) is a dictionary (dict).
For example, change the return statement to:

return {'input': image, 'lb': label}

This raises exactly the same error.
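One possible workaround, if the rest of the code expects named fields, is to return a tuple in a fixed key order from __getitem__ and rebuild the dict after the batch comes out of the loader. The sketch below assumes only numpy; FIELD_ORDER, dict_to_tuple and tuple_to_dict are hypothetical names introduced for illustration, not Paddle APIs.

```python
import numpy as np

# Hypothetical fixed order for the keys used in the dict above.
FIELD_ORDER = ('input', 'lb')

def dict_to_tuple(sample, keys=FIELD_ORDER):
    """Flatten a dict sample into a tuple of numpy arrays in a fixed key order."""
    return tuple(np.asarray(sample[k]) for k in keys)

def tuple_to_dict(fields, keys=FIELD_ORDER):
    """Rebuild a dict from a tuple of fields (one sample or one batch)."""
    return dict(zip(keys, fields))

image = np.random.random([784]).astype('float32')
label = np.random.randint(0, 9, (1,)).astype('int64')

# What __getitem__ would return instead of the dict.
sample = dict_to_tuple({'input': image, 'lb': label})

# What the training loop would do with the fields it receives.
restored = tuple_to_dict(sample)
print(sorted(restored.keys()))
```

In the training loop, the same tuple_to_dict call can be applied to each batch yielded by the loader, so the dict interface is kept without __getitem__ returning a dict.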

References

https://blog.csdn.net/qq_37668436/article/details/114336142
https://blog.csdn.net/qq_32097577/article/details/112385033
