I. AlphaPose
First up is AlphaPose. The problems encountered were:
1. The demo hangs once loading reaches 100%.
2. An OpenCV error: Failed to load OpenH264 library: openh264-1.8.0-win64.dll
Starting with the first problem. The console shows the warning UserWarning: Failed to load image Python extension: warn(f"Failed to load image Python extension: {e}")
This warning means torchvision ran into a problem loading its native image extension, which can affect image-loading performance or functionality.
(1) Check the Pillow / PIL installation:
torchvision uses Pillow (the successor to PIL) to handle images. Make sure Pillow is installed in your environment and up to date:
pip install --upgrade Pillow
If Pillow is already installed, make sure its version is within the range that torchvision supports, so the two do not conflict.
(2) Check the other dependencies:
pip install --upgrade torchvision
pip install --upgrade torch ...
The second warning says that the pretrained parameter used by the torchvision models is deprecated and may be removed in a future release; the replacement parameter is weights. It seems to have little practical impact.
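For reference, a minimal sketch of the old call versus the new one (the weights enum below is the standard torchvision >= 0.13 API):

import torchvision.models as models

# old style, now deprecated -- this is what triggers the warning:
# model = models.resnet50(pretrained=True)

# new style: pass an explicit weights enum instead
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)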
Those were my fixes for the warnings. I then found that the hang is related to the video format and to the PyTorch / CUDA / torchvision versions: all three must match each other. Also, after converting the mp4 video to avi it worked (so the video format does matter), but it then reported an error.
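A quick way to confirm the three versions agree (these are all standard PyTorch / torchvision attributes):

import torch
import torchvision

print(torch.__version__)         # PyTorch version, including the CUDA tag
print(torchvision.__version__)   # must be the build that matches this PyTorch
print(torch.version.cuda)        # CUDA version PyTorch was compiled against
print(torch.cuda.is_available()) # False here can indicate a mismatched install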
Download the dll from Releases · cisco/openh264 · GitHub and drop it into C:\Windows\System32; that fixes the OpenCV error.
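Alternatively, if you are converting mp4 to avi by hand anyway, re-encoding with OpenCV's built-in MJPG codec sidesteps the OpenH264 dependency entirely. This is my own sketch, not from the original post, and the helper name mp4_to_avi is made up:

import cv2

def mp4_to_avi(src, dst):
    # re-encode a video with the MJPG codec, which OpenCV ships with
    cap = cv2.VideoCapture(src)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*'MJPG'), fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(frame)
    cap.release()
    out.release()

mp4_to_avi('examples/demo/test_video.mp4', 'examples/demo/test_video.avi')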
Reference: the CSDN post "MotionBert论文解读及详细复现教程".
Test AlphaPose on the demo images:
python scripts/demo_inference.py --cfg configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/halpe26_fast_res50_256x192.pth --indir examples/demo/ --save_img
and then run it on a video:
python scripts/demo_inference.py --cfg configs/halpe_26/resnet/256x192_res50_lr1e-3_1x.yaml --checkpoint pretrained_models/halpe26_fast_res50_256x192.pth --video examples/demo/test_video.mp4 --save_video
II. MotionBERT
The problem MotionBERT then ran into was:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The likely cause: code that runs on Linux will not necessarily run on Windows. Windows starts child processes with spawn rather than fork, so each DataLoader worker re-imports the main module; without an if __name__ == '__main__': guard, that re-import tries to spawn workers all over again. From what I could find, there are two fixes.
References: the CSDN posts "RuntimeError: An attempt has been made to start a new process before the current pr…" and "解决RuntimeError: An attempt has been made to start a new process before...办法".
1. Add a main() function and call everything from it, keeping multi-process data loading for speed:
import os
import sys  # needed for sys.exit(main()) below
import numpy as np
import argparse
from tqdm import tqdm
import imageio
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from lib.utils.tools import *
from lib.utils.learning import *
from lib.utils.utils_data import flip_data
from lib.data.dataset_wild import WildDetDataset
from lib.utils.vismo import render_and_save

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", type=str, default="configs/pose3d/MB_ft_h36m_global_lite.yaml", help="Path to the config file.")
    parser.add_argument('-e', '--evaluate', default='checkpoint/pose3d/FT_MB_lite_MB_ft_h36m_global_lite/best_epoch.bin', type=str, metavar='FILENAME', help='checkpoint to evaluate (file name)')
    parser.add_argument('-j', '--json_path', type=str, help='alphapose detection result json path')
    parser.add_argument('-v', '--vid_path', type=str, help='video path')
    parser.add_argument('-o', '--out_path', type=str, help='output path')
    parser.add_argument('--pixel', action='store_true', help='align with pixel coordinates')
    parser.add_argument('--focus', type=int, default=None, help='target person id')
    parser.add_argument('--clip_len', type=int, default=243, help='clip length for network input')
    opts = parser.parse_args()
    return opts

def main(argv=None):
    opts = parse_args()
    args = get_config(opts.config)
    model_backbone = load_backbone(args)
    if torch.cuda.is_available():
        model_backbone = nn.DataParallel(model_backbone)
        model_backbone = model_backbone.cuda()
    print('Loading checkpoint', opts.evaluate)
    checkpoint = torch.load(opts.evaluate, map_location=lambda storage, loc: storage)
    model_backbone.load_state_dict(checkpoint['model_pos'], strict=True)
    model_pos = model_backbone
    model_pos.eval()
    testloader_params = {
        'batch_size': 1,
        'shuffle': False,
        'num_workers': 8,
        'pin_memory': True,
        'prefetch_factor': 4,
        'persistent_workers': True,
        'drop_last': False
    }
    vid = imageio.get_reader(opts.vid_path, 'ffmpeg')
    fps_in = vid.get_meta_data()['fps']
    vid_size = vid.get_meta_data()['size']
    os.makedirs(opts.out_path, exist_ok=True)
    if opts.pixel:
        # Keep relative scale with pixel coordinates
        wild_dataset = WildDetDataset(opts.json_path, clip_len=opts.clip_len, vid_size=vid_size, scale_range=None, focus=opts.focus)
    else:
        # Scale to [-1,1]
        wild_dataset = WildDetDataset(opts.json_path, clip_len=opts.clip_len, scale_range=[1,1], focus=opts.focus)
    test_loader = DataLoader(wild_dataset, **testloader_params)
    results_all = []
    with torch.no_grad():
        for batch_input in tqdm(test_loader):
            N, T = batch_input.shape[:2]
            if torch.cuda.is_available():
                batch_input = batch_input.cuda()
            if args.no_conf:
                batch_input = batch_input[:, :, :, :2]
            if args.flip:
                # average the predictions for the original and the flipped input
                batch_input_flip = flip_data(batch_input)
                predicted_3d_pos_1 = model_pos(batch_input)
                predicted_3d_pos_flip = model_pos(batch_input_flip)
                predicted_3d_pos_2 = flip_data(predicted_3d_pos_flip)  # Flip back
                predicted_3d_pos = (predicted_3d_pos_1 + predicted_3d_pos_2) / 2.0
            else:
                predicted_3d_pos = model_pos(batch_input)
            if args.rootrel:
                predicted_3d_pos[:,:,0,:] = 0  # [N,T,17,3]
            else:
                predicted_3d_pos[:,0,0,2] = 0  # put the first frame's root depth at 0
            if args.gt_2d:
                predicted_3d_pos[...,:2] = batch_input[...,:2]
            results_all.append(predicted_3d_pos.cpu().numpy())
    results_all = np.hstack(results_all)
    results_all = np.concatenate(results_all)
    render_and_save(results_all, '%s/X3D.mp4' % (opts.out_path), keep_imgs=False, fps=fps_in)
    if opts.pixel:
        # Convert back to pixel coordinates
        results_all = results_all * (min(vid_size) / 2.0)
        results_all[:,:,:2] = results_all[:,:,:2] + np.array(vid_size) / 2.0
    np.save('%s/X3D.npy' % (opts.out_path), results_all)

# def main():
#     # main logic here
#     print(1)
#     freeze_support()

if __name__ == '__main__':
    sys.exit(main())
    # freeze_support()
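With the guard in place, the script is launched the same way as before; following the MotionBERT README the call looks roughly like this (all three paths are placeholders):

python infer_wild.py --vid_path <your_video.avi> --json_path <alphapose-results.json> --out_path <output_dir>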
2. Set num_workers to 0 for single-process loading. (This did not work as-is.) It errors out immediately:
"Anconda\envs\MotionBERT\lib\site-packages\torch\utils\data\dataloader.py", line 236, in __init__
    raise ValueError('prefetch_factor option could only be specified in multiprocessing.')
ValueError: prefetch_factor option could only be specified in multiprocessing. let num_workers > 0 to enable multiprocessing.