Errors encountered while training YOLOv5

PyTorch problem (1): pinned (page-locked) memory: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

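While running PyTorch I hit the message Leaking Caffe2 thread-pool after fork. (function pthreadpool). The log marks it as a warning, and in my case it appeared because pin_memory was set to True in the DataLoader.
Host memory comes in two kinds: page-locked (pinned) and pageable. Pinned memory is never swapped out to the host's swap space on disk, whereas pageable memory can be moved to swap when the host runs short of RAM.
pin_memory refers to this page-locked memory. Creating a DataLoader with pin_memory=True means the Tensors it produces start out in pinned host memory, which makes copying them from host memory to GPU memory faster. On a well-equipped machine with plenty of RAM it is reasonable to set pin_memory to True. If host memory is tight, however, setting pin_memory to True can lead to this message.
Solution: pass pin_memory=False to DataLoader. Batches then live in pageable memory, which the OS may swap to disk under memory pressure. The trade-off is that feeding data to the GPU is somewhat slower.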

Before:
DataLoader(val_dataset, shuffle=True, batch_size=Batch_size, num_workers=4, pin_memory=True,
           drop_last=True, collate_fn=yolo_dataset_collate)
After:
DataLoader(val_dataset, shuffle=True, batch_size=Batch_size, num_workers=4, pin_memory=False,
           drop_last=True, collate_fn=yolo_dataset_collate)
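
Rather than hard-coding the flag, it can be chosen at runtime. The following is a minimal sketch, not the YOLOv5 training script: build_loader, the toy TensorDataset and num_workers=0 are assumptions made for illustration. It pins memory only when a CUDA device is present, unless the caller overrides the choice.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def build_loader(dataset, batch_size, pin_memory=None):
    """Hypothetical helper: decide pin_memory at runtime instead of hard-coding it."""
    if pin_memory is None:
        # Pinning only helps host-to-GPU copies, so skip it on CPU-only machines.
        pin_memory = torch.cuda.is_available()
    return DataLoader(dataset, batch_size=batch_size, shuffle=True,
                      num_workers=0, pin_memory=pin_memory, drop_last=True)


# Toy data standing in for the real dataset (assumption for the sketch).
dataset = TensorDataset(torch.randn(16, 3), torch.randint(0, 2, (16,)))

# Force pageable memory, as in the fix above.
loader = build_loader(dataset, batch_size=4, pin_memory=False)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for x, y in loader:
    # non_blocking can only overlap the copy when the source tensor is pinned.
    x = x.to(device, non_blocking=loader.pin_memory)
    y = y.to(device, non_blocking=loader.pin_memory)
```

Calling build_loader(dataset, batch_size=4) without the last argument lets the helper pick the value from the available hardware.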

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy

Hint: in the line that raises the error, change self.numpy() to self.cpu().numpy(). For example:

AAE_x_hat = AAE_x_hat.detach().cpu().numpy().squeeze()
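
The same conversion can be wrapped in a small helper so it works whether the tensor lives on the GPU or the CPU. This is a sketch only: to_numpy and the x_hat tensor are hypothetical names, not part of YOLOv5 or PyTorch.

```python
import numpy as np
import torch


def to_numpy(t):
    """Hypothetical helper: convert any tensor to a NumPy array."""
    # detach() drops the autograd graph; cpu() copies to host memory if the
    # tensor is on a GPU and is a no-op for tensors already on the CPU.
    return t.detach().cpu().numpy()


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_hat = torch.ones(1, 4, device=device, requires_grad=True)

# Calling x_hat.numpy() directly would fail here (requires grad, and possibly on GPU).
arr = to_numpy(x_hat).squeeze()
```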
	

RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.

Before:
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
            b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
After (the in-place updates are wrapped in torch.no_grad()):
def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
    # https://arxiv.org/abs/1708.02002 section 3.3
    # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
    m = self.model[-1]  # Detect() module
    for mi, s in zip(m.m, m.stride):  # from
        b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
        with torch.no_grad():
            b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
        mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
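
The error can be reproduced outside YOLOv5 with a single layer. The sketch below is an illustration under assumptions: the Conv2d layer, its sizes and na=2 are made up, and the first update is expected to fail on recent PyTorch versions, so the outcome is recorded rather than assumed.

```python
import math
import torch

# Hypothetical layer: 6 output channels = 2 anchors x 3 values each.
conv = torch.nn.Conv2d(8, 6, kernel_size=1)
na = 2

# conv.bias is a leaf tensor that requires grad; b is a view of it.
b = conv.bias.view(na, -1)

try:
    b[:, 0] += 1.0  # in-place write through the view, autograd enabled
    raised = False
except RuntimeError:
    raised = True

before = conv.bias.detach().clone()

# Same kind of update with autograd recording switched off, as in the fix above.
with torch.no_grad():
    b[:, 0] += math.log(2.0)

# Re-wrap the flattened view as a fresh parameter, mirroring the original code.
conv.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)
```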

AttributeError: module 'distutils' has no attribute 'version' (solution)

pip uninstall setuptools
pip install setuptools==59.5.0
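
To see whether the environment already matches the pin, the installed setuptools version can be compared with 59.5.0. This is a sketch using only the standard library. release_tuple and setuptools_at_or_below are hypothetical helper names, and the comparison looks only at the leading numeric components of the version string.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(v):
    """Turn "59.5.0" into (59, 5, 0); stop at the first non-numeric component."""
    nums = []
    for piece in v.split("."):
        m = re.match(r"\d+", piece)
        if not m:
            break
        nums.append(int(m.group()))
    return tuple(nums)


def setuptools_at_or_below(pinned="59.5.0"):
    """Return True/False, or None when setuptools is not installed."""
    try:
        installed = version("setuptools")
    except PackageNotFoundError:
        return None
    return release_tuple(installed) <= release_tuple(pinned)


print("setuptools at or below 59.5.0:", setuptools_at_or_below())
```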

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

apt-get update
apt-get install libgl1-mesa-glx

ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory

apt-get update
apt-get install libglib2.0-dev
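
Both this error and the libGL.so.1 one come from a shared library the dynamic loader cannot find. Before and after installing the packages, a quick check from Python can confirm whether the two libraries load. This is a sketch with the standard library only, and can_load is a hypothetical helper name.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False


# The two sonames named in the ImportError messages above.
for soname in ("libGL.so.1", "libgthread-2.0.so.0"):
    print(soname, "ok" if can_load(soname) else "missing")
```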
