Preface:
I have recently been learning YOLOv5, and this post records some of the errors I ran into.
Sizes of tensors must match except in dimension 1. Expected size 16 but got size 8 for tensor number 1 in the list.
The full error message is as follows:
Traceback (most recent call last):
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "" , line 1, in <module>
runfile('I:/GraduationProject/yolov5-5.0-sniperitf798/train.py', wdir='I:/GraduationProject/yolov5-5.0-sniperitf798')
File "I:\DevSoftware\Python\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "I:\DevSoftware\Python\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "I:/GraduationProject/yolov5-5.0-sniperitf798/train.py", line 543, in <module>
train(hyp, opt, device, tb_writer)
File "I:/GraduationProject/yolov5-5.0-sniperitf798/train.py", line 88, in train
model = Model(opt.cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device) # create
File "I:\GraduationProject\yolov5-5.0-sniperitf798\models\yolo.py", line 93, in __init__
m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward
File "I:\GraduationProject\yolov5-5.0-sniperitf798\models\yolo.py", line 123, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "I:\GraduationProject\yolov5-5.0-sniperitf798\models\yolo.py", line 139, in forward_once
x = m(x) # run
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "I:\GraduationProject\yolov5-5.0-sniperitf798\models\common.py", line 210, in forward
return torch.cat(x, self.d)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 8 for tensor number 1 in the list.
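My first guess was that something was wrong with the model's network structure.
Below is the network structure that triggered the error: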
backbone:
# [from, number, module, args]
# Shuffle_Block: [out, stride]
[[ -1, 1, conv_bn_relu_maxpool, [ 32 ] ], # 0-P2/4
[ -1, 1, Shuffle_Block, [ 128, 2 ] ], # 1-P3/8
[ -1, 3, Shuffle_Block, [ 128, 1 ] ], # 2
[ -1, 1, Shuffle_Block, [ 256, 2 ] ], # 3-P4/16
[ -1, 7, Shuffle_Block, [ 256, 1 ] ], # 4
[ -1, 1, Shuffle_Block, [ 512, 2 ] ], # 5-P5/32
[ -1, 3, Shuffle_Block, [ 512, 1 ] ], # 6
[ -1, 1, Shuffle_Block, [ 1024, 2 ] ], # 7-P6/64
# [ -1, 3, Shuffle_Block, [ 1024, 1 ] ], # 8
]
# YOLOv5 v5.0 head
head:
[[-1, 1, Conv, [512, 1, 1]], # 7
[-1, 1, nn.Upsample, [None, 2, 'nearest']],# 8
[[-1, 6], 1, Concat, [1]], # cat backbone P4 # 9
[-1, 3, C3, [512, False]], # 10
[-1, 1, Conv, [256, 1, 1]], # 11
[-1, 1, nn.Upsample, [None, 2, 'nearest']], # 12
[[-1, 4], 1, Concat, [1]], # cat backbone P3 # 13
[-1, 3, C3, [256, False]], # 14 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]], # 15
[[-1, 14], 1, Concat, [1]], # cat head P4 # 16
[-1, 3, C3, [512, False]], # 17(P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],# 18
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 20 (P5/32-large)
[[15, 18, 21], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]
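Solution: I appended Conv, SPP, and C3 layers to the end of the backbone. As far as I can tell, this works because the head is the stock YOLOv5 v5.0 head, whose Concat layers refer to absolute layer indices (6, 4, 14, 10) that only line up when the backbone has ten layers (0 to 9). With the shorter backbone, indices 14 and 10 pointed at the wrong layers, so the feature maps being concatenated had different sizes.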
# YOLOv5 backbone
backbone:
# [from, number, module, args]
# Shuffle_Block: [out, stride]
[[ -1, 1, conv_bn_relu_maxpool, [ 32 ] ], # 0-P2/4
[ -1, 1, Shuffle_Block, [ 128, 2 ] ], # 1-P3/8
[ -1, 3, Shuffle_Block, [ 128, 1 ] ], # 2
[ -1, 1, Shuffle_Block, [ 256, 2 ] ], # 3-P4/16
[ -1, 7, Shuffle_Block, [ 256, 1 ] ], # 4
[ -1, 1, Shuffle_Block, [ 512, 2 ] ], # 5-P5/32
[ -1, 3, Shuffle_Block, [ 512, 1 ] ], # 6
[ -1, 1, Conv, [ 1024, 3, 2 ] ], # 7-P6/64
[ -1, 1, SPP, [ 1024, [ 5, 9, 13 ] ] ],# 8
[ -1, 3, C3, [ 1024, False ] ], # 9
]
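To see why the extra backbone layers matter, here is a minimal sketch (not YOLOv5 code) that tracks only the cumulative stride of each layer and reports the first Concat whose inputs disagree. The function `first_concat_mismatch`, the `(from, scale)` tuples, and the scale values are my own simplification of the two configs above; the Detect layer is left out, and the layer that actually fails inside PyTorch may differ from what this toy model reports.

```python
def first_concat_mismatch(layers):
    """Return (layer_index, input_strides) for the first Concat whose inputs
    have different strides, or None if every Concat is consistent.

    Each layer is (from, scale): scale is the factor applied to the stride
    (2 = stride-2 layer, 0.5 = 2x upsample, 1 = unchanged, None = Concat).
    """
    strides = []  # cumulative stride of each layer's output

    for i, (frm, scale) in enumerate(layers):
        sources = frm if isinstance(frm, list) else [frm]
        inputs = []
        for f in sources:
            idx = i + f if f < 0 else f  # negative = relative, otherwise absolute
            inputs.append(1 if idx < 0 else strides[idx])  # idx < 0 is the image
        if scale is None:  # Concat: all inputs must share one resolution
            if len(set(inputs)) != 1:
                return i, inputs
            strides.append(inputs[0])
        else:
            strides.append(inputs[0] * scale)
    return None


# Layers 0-6 are shared by both backbones (strides taken from the comments).
SHARED = [(-1, 4), (-1, 2), (-1, 1), (-1, 2), (-1, 1), (-1, 2), (-1, 1)]

# Head without Detect; the absolute indices 6, 4, 14, 10 are copied as-is.
HEAD = [
    (-1, 1), (-1, 0.5), ([-1, 6], None), (-1, 1),
    (-1, 1), (-1, 0.5), ([-1, 4], None), (-1, 1),
    (-1, 2), ([-1, 14], None), (-1, 1),
    (-1, 2), ([-1, 10], None), (-1, 1),
]

BROKEN = SHARED + [(-1, 2)] + HEAD                   # 8-layer backbone
FIXED = SHARED + [(-1, 2), (-1, 1), (-1, 1)] + HEAD  # Conv, SPP, C3 appended

if __name__ == "__main__":
    print("broken:", first_concat_mismatch(BROKEN))
    print("fixed: ", first_concat_mismatch(FIXED))
```

Under these assumptions the 8-layer backbone produces a mismatch at layer 17 (input strides 32 and 16), while the 10-layer backbone passes every Concat.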
However, once training started, a new error appeared:
train: Scanning 'VOC\labels\train' images and labels... 16551 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 16551/16551 [00:11<00:00, 1491.29it/s]
train: New cache created: VOC\labels\train.cache
Traceback (most recent call last):
File "" , line 1, in <module>
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "I:\GraduationProject\yolov5-5.0-sniperitf798\train.py", line 12, in <module>
import torch.distributed as dist
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\__init__.py", line 124, in <module>
raise err
OSError: [WinError 1455] 页面文件太小,无法完成操作。 Error loading "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Traceback (most recent call last):
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "" , line 1, in <module>
runfile('I:/GraduationProject/yolov5-5.0-sniperitf798/train.py', wdir='I:/GraduationProject/yolov5-5.0-sniperitf798')
File "I:\DevSoftware\Python\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "I:\DevSoftware\Python\JetBrains\PyCharm 2020.2.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "I:/GraduationProject/yolov5-5.0-sniperitf798/train.py", line 545, in <module>
train(hyp, opt, device, tb_writer)
File "I:/GraduationProject/yolov5-5.0-sniperitf798/train.py", line 194, in train
image_weights=opt.image_weights, quad=opt.quad, prefix=colorstr('train: '))
File "I:\GraduationProject\yolov5-5.0-sniperitf798\utils\datasets.py", line 84, in create_dataloader
collate_fn=LoadImagesAndLabels.collate_fn4 if quad else LoadImagesAndLabels.collate_fn)
File "I:\GraduationProject\yolov5-5.0-sniperitf798\utils\datasets.py", line 97, in __init__
self.iterator = super().__iter__()
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
return self._get_iterator()
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
w.start()
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "I:\EnvVariable\ML\Anaconda3\envs\pytorch\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
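WinError 1455 means that the paging file is too small for the operation to complete. By default, Windows had not allocated any virtual memory (page file) to drive I:. Since my Python environment is installed on drive I:, running the program with no virtual memory allocated there produced the errors above. Allocating virtual memory to drive I: fixed it for me. If Python is installed on drive C:, increase the virtual memory size for drive C: instead.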
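Another mitigation that is often suggested for this error (I did not need it here) is to start fewer DataLoader worker processes. On Windows each worker is spawned as a new process that imports torch again, which is what the first traceback above shows. train.py in YOLOv5 v5.0 accepts a `--workers` option, and the sketch below approximates how the requested value is capped. The function name `effective_workers` and its parameters are hypothetical, and the real logic in utils/datasets.py may differ in detail.

```python
import os


def effective_workers(requested, batch_size, cpu_count=None, world_size=1):
    """Approximate the number of DataLoader workers that will be started.

    The result is capped by the CPU count per process and by the batch size;
    a batch size of 1 or a request of 0 disables worker processes entirely.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(cpus // world_size, batch_size if batch_size > 1 else 0, requested)


if __name__ == "__main__":
    # With --workers 0 no worker process is spawned, so torch is not
    # imported again in child processes.
    print(effective_workers(requested=0, batch_size=16, cpu_count=12))
    print(effective_workers(requested=8, batch_size=16, cpu_count=12))
```

Data loading is slower with fewer workers, so this trades speed for a smaller memory commitment.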
After solving the first two problems above, a new one appeared. The error output is as follows:
train: Scanning 'VOC\labels\train.cache' images and labels... 16551 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 16551/16551 [00:00<?, ?it/s]
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Scanning images: 0%| | 0/4952 [00:00<?, ?it/s]OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
val: Scanning 'VOC\labels\val' images and labels... 4952 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 4952/4952 [00:05<00:00, 846.03it/s]
val: New cache created: VOC\labels\val.cache
Plotting labels...
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
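Solution:
Add the following code at the very top of train.py, before the other imports. Note that the OMP hint above itself describes this as an unsafe workaround:
# Work around the OMP duplicate-library error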
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
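The OMP hint says the cleaner fix is to make sure only one OpenMP runtime is loaded. As a starting point, here is a small sketch that lists every copy of libiomp5md.dll under the active Python environment. The helper name `find_openmp_runtimes` is my own, and what to do with any duplicates it finds depends on how the packages were installed, so treat it as a diagnostic only.

```python
import sys
from pathlib import Path


def find_openmp_runtimes(root, name="libiomp5md.dll"):
    """Return every file called `name` below `root`, sorted by path."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


if __name__ == "__main__":
    # sys.prefix is the root of the currently active environment.
    for path in find_openmp_runtimes(sys.prefix):
        print(path)
```

If more than one path is printed, several packages each ship their own copy of the runtime, which matches the situation the OMP message describes.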
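Reference:
彻底解决pycharm中: OSError: [WinError 1455] 页面文件太小,无法完成操作的问题–亲测 (Completely solving "OSError: [WinError 1455] The paging file is too small for this operation to complete" in PyCharm, personally tested)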