1. Building OpenPose with Python support
During the CMake configuration you need to set:
-DBUILD_PYTHON=ON
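For reference, a minimal sketch of the full build sequence, assuming a fresh checkout on Ubuntu (folder names and job count are just the usual defaults):
cd openpose
mkdir build && cd build
cmake -DBUILD_PYTHON=ON ..
make -j`nproc`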
You can test and modify the Python examples directly under the OpenPose project folder:
# From command line
cd build/examples/tutorial_api_python
# Python 3 (default version)
python3 01_body_from_image.py
python3 02_whole_body_from_image.py
# python3 [any_other_python_example.py]
# Python 2
python2 01_body_from_image.py
python2 02_whole_body_from_image.py
# python2 [any_other_python_example.py]
There are two ways to write your own Python code:
1. Copy and create your own .py files inside examples/tutorial_api_python, but then OpenPose has to be rebuilt every time a .py file is modified.
2. Modify the .py files in build/examples/tutorial_api_python directly, but note that cleaning OpenPose deletes the entire build folder.
Both approaches work, but since I was worried that working inside the OpenPose tree would one day lead me to accidentally touch its files, I chose to export a callable library instead.
2. Exporting the OpenPose Python module
This step lets us move our own .py files out of the OpenPose folder.
sudo make install
Then, in your .py file, set the path where OpenPose was installed, which is usually:
/usr/local/python
After that you can call the API directly. Use the following .py file as a reference example:
build/examples/tutorial_api_python/01_body_from_image.py
One tutorial also claims that you need to run:
cd openpose/build/python/openpose
sudo make install
but I followed the OpenPose README on GitHub, installed only once, and the module imported and ran fine.
Alternatively, without installing OpenPose:
move the Python demo out of the OpenPose tree and add the following to your .py file:
sys.path.append('{OpenPose_path}/python')
# {OpenPose_path} points to the OpenPose build folder
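Put together, a minimal sketch of this variant (the checkout location below is a hypothetical placeholder; substitute your own build path):
import sys
sys.path.append('/home/user/openpose/build/python')  # hypothetical path to your build folder
from openpose import pyopenpose as op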
There is hardly anything online that explains how to actually use the OpenPose Python API, which is rather unfriendly to first-time users, and the C++ code is fairly hard to read. So I am writing down my own exploration as best I can to keep from forgetting it later; there may be mistakes, and corrections are welcome.
1. Importing
import sys
sys.path.append('/usr/local/python')
from openpose import pyopenpose as op
When running the examples you may get an error that the model cannot be found, in which case you need to change:
params["model_folder"] = "../openpose/models/"
# this line must point to your own models folder
2. Usage
The GitHub repo includes a file with the C++ code that creates the bindings via pybind11; to understand it you first need to know how pybind11 works. It took a while to work out which functions OpenPose wraps into the Python API, since "go to definition" does not work from the Python side.
https://blog.csdn.net/zhuikefeng/article/details/107224507
Since my goal is mainly to use OpenPose to recognise the user's pose and run teleoperation based on different poses, I only needed to adapt the demo into a program that reads every keypoint in real time:
# From Python
# It requires OpenCV installed for Python
import sys
import cv2
import os
from sys import platform
import argparse

try:
    # Import Openpose (Windows/Ubuntu/OSX)
    dir_path = os.path.dirname(os.path.realpath(__file__))
    try:
        # Windows Import
        if platform == "win32":
            # Change these variables to point to the correct folder (Release/x64 etc.)
            sys.path.append(dir_path + '/../../python/openpose/Release')
            os.environ['PATH'] = os.environ['PATH'] + ';' + dir_path + '/../../x64/Release;' + dir_path + '/../../bin;'
            import pyopenpose as op
        else:
            # Change these variables to point to the correct folder (Release/x64 etc.)
            # sys.path.append('../../python')
            sys.path.append('/usr/local/python')
            # If you run `make install` (default path is `/usr/local/python` for Ubuntu), you can also access the OpenPose/python module from there. This will install OpenPose and the python library at your desired installation path. Ensure that this is in your python path in order to use it.
            from openpose import pyopenpose as op
    except ImportError as e:
        print('Error: OpenPose library could not be found. Did you enable `BUILD_PYTHON` in CMake and have this Python script in the right folder?')
        raise e

    # Flags
    parser = argparse.ArgumentParser()
    parser.add_argument("--image_path", default="../../../examples/media/COCO_val2014_000000000192.jpg", help="Process an image. Read all standard formats (jpg, png, bmp, etc.).")
    args = parser.parse_known_args()

    # Custom Params (refer to include/openpose/flags.hpp for more parameters)
    params = dict()
    params["model_folder"] = "../openpose/models/"
    params["net_resolution"] = "-1x80"  # "320x320"
    # params["face"] = True
    params["hand"] = True

    # Add others in path?
    for i in range(0, len(args[1])):
        curr_item = args[1][i]
        if i != len(args[1])-1: next_item = args[1][i+1]
        else: next_item = "1"
        if "--" in curr_item and "--" in next_item:
            key = curr_item.replace('-','')
            if key not in params: params[key] = "1"
        elif "--" in curr_item and "--" not in next_item:
            key = curr_item.replace('-','')
            if key not in params: params[key] = next_item

    # Construct it from system arguments
    # op.init_argv(args[1])
    # oppython = op.OpenposePython()

    # Starting OpenPose
    opWrapper = op.WrapperPython()
    # WrapperPython is a class defined on the C++ side; its constructor takes a thread-management mode, which is asynchronous by default.
    opWrapper.configure(params)
    # configure() receives the user-defined parameters as a dict()
    opWrapper.start()

    datum = op.Datum()
    cap = cv2.VideoCapture(0)  # Open the camera; change the index to pick the built-in or an external USB camera (in my test, 2 selected the external camera)
    while True:
        ret, frame = cap.read()
        datum.cvInputData = frame
        opWrapper.emplaceAndPop(op.VectorDatum([datum]))
        cv2.imshow("capture", datum.cvOutputData)
        if cv2.waitKey(200) & 0xFF == ord('q'):  # waitKey also sets the display interval between frames
            break
        # print("Body keypoints: \n" + str(datum.poseKeypoints))
        print("hand keypoints: \n" + str(datum.handKeypoints))
    cap.release()
    cv2.destroyAllWindows()
except Exception as e:
    print(e)
    sys.exit(-1)
I also tried out the following GitHub project, which uses tf-pose, to benchmark human pose detection. Neither the laptop's built-in camera nor an endoscope-style external USB camera worked particularly well; I had to stand quite far away to fit my whole body into the frame.
https://github.com/Kamisama12/Human-Pose-Estimation-Benchmarking-and-Action-Recognition
While testing this project I ran into some errors, including Keras import statements that needed updating, probably because the project is relatively old. Most were fixed by pip-installing the missing libraries. One special case was:
KeyError: "The name 'TfPoseEstimator/image:0' refers to a Tensor which does not exist. The operation, 'TfPoseEstimator/image', does not exist in the graph."
Fix: in estimator.py inside tf_pose, add one line of code right below import tensorflow as tf:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
Next I plan to borrow the network architecture from this project to build my own dataset and classifier.
3. Building the dataset
The dataset has two parts: the first is raw_data, the image sets of every action to be learned; the second is the data obtained by running OpenPose over those images to extract the keypoint coordinates, stored as ndarrays.
Following the GitHub project, the raw data is produced by capturing 640 x 480 screenshots of each pose at 10 fps. To simplify things I start with just two poses: hands horizontal and hands vertical.
While writing the recording code I hit a pitfall: I used the line below to create a 2-D list of 25 empty lists, only to find that appending a single item made it appear 25 times.
Mydatacollecting=[[]]*25
Quoting the official Python documentation:
Note also that the copies are shallow; nested structures are not copied. This often haunts new Python programmers; consider:
>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
What has happened is that [[]] is a one-element list containing an empty list, so all three elements of [[]] * 3 are (pointers to) this single empty list. Modifying any of the elements of lists modifies this single list. You can create a list of different lists this way:
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
So use a list comprehension to create the 2-D list:
test = [[0 for i in range(m)] for j in range(n)]
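A quick check of the difference (plain Python semantics, runnable as-is):
# With *, the sublists are aliases of one and the same object;
# with a comprehension, each sublist is independent.
aliased = [[]] * 3
aliased[0].append(1)
print(aliased)      # [[1], [1], [1]] -- all three "change"
independent = [[] for _ in range(3)]
independent[0].append(1)
print(independent)  # [[1], [], []] -- only the first one changes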
An excerpt from my dataset-recording code:
import time
import pandas as pd
# cap, datum and opWrapper are created as in the demo above; objectPath, image_count,
# start and the cv2.putText arguments (org, font, fontScale, color, thickness) are
# defined earlier in the full script and omitted from this excerpt.
Mydata={ "Nose":[],
"Neck":[],
"RShoulder":[],
"RElbow":[],
"RWrist":[],
"LShoulder":[],
"LElbow":[],
"LWrist":[],
"MidHip":[],
"RHip":[],
"RKnee":[],
"RAnkle":[],
"LHip":[],
"LKnee":[],
"LAnkle":[],
"REye":[],
"LEye":[],
"REar":[],
"LEar":[],
"LBigToe":[],
"LSmallToe":[],
"LHeel":[],
"RBigToe":[],
"RSmallToe":[],
"RHeel":[]
}
Mydatacollecting = [[] for i in range(25)]
while True:
    ret, frame = cap.read()
    datum.cvInputData = frame
    opWrapper.emplaceAndPop(op.VectorDatum([datum]))
    image = cv2.resize(frame, (640, 480))
    # Save the raw images
    if time.time() - start >= 0.1:  # store one frame every 0.1 s
        if ret:
            for i in range(datum.poseKeypoints[0].shape[0]):  # 25 keypoints, indices 0-24
                Mydatacollecting[i].append(str(datum.poseKeypoints[0][i][0:2]))  # store each keypoint's x, y in its slot
                # print(str(datum.poseKeypoints[0][i][0:2]))
                # print("test data:", str(datum.poseKeypoints[0][i]))
            # print(Mydatacollecting)
            if not cv2.imwrite(objectPath+"pose_handsdown_%d.jpg"%image_count, image) or \
               not cv2.imwrite(objectPath+"pose_handsdown_pro%d.jpg"%image_count, datum.cvOutputData):
                raise Exception("Could not write image")
            image_count += 1
            start = time.time()
    image = cv2.putText(datum.cvOutputData, 'test', org, font,
                        fontScale, color, thickness, cv2.LINE_AA)
    cv2.imshow("capture", image)
    if cv2.waitKey(20) & 0xFF == ord('q'):  # waitKey also sets the display interval between frames
        break

# datum.poseKeypoints is a 3-D array:
#   the first dimension is the number of people in the frame,
#   the second is the number of keypoints,
#   the third is the number of coordinates per keypoint
'''
We use pandas to store the data of each frame and drop the confidence score.
'''
print("Body keypoints: \n" + str(datum.poseKeypoints[0][1][0:2]))
print("Body keypoints: \n" + str(datum.poseKeypoints.shape[1]))
# print("hand keypoints: \n" + str(datum.handKeypoints))  # this can be extracted as an ndarray

start = 0
for i in Mydata:
    Mydata[i] = Mydatacollecting[start]
    start += 1
df = pd.DataFrame(Mydata)
df.to_csv('./Mydata_handsdown.csv')
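To sanity-check the saved file, a minimal sketch of loading it back with pandas (same file name as above):
import pandas as pd

df = pd.read_csv('./Mydata_handsdown.csv', index_col=0)
print(df.shape)             # (number of recorded frames, 25 keypoint columns)
print(df["RWrist"].head())  # the per-frame "[x y]" strings for the right wrist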
4. Data processing
Since I need to collect my training dataset through OpenPose, I probably also have to understand the data structures OpenPose returns. I will keep updating this part; the structure printed so far looks like this:
hand keypoints:
[array([[[6.1635260e+02, 2.5399574e+02, 2.2469690e-01],
[6.9517633e+02, 1.7431525e+02, 2.1143548e-02],
[6.9089240e+02, 1.5803644e+02, 2.2210607e-02],
[6.6690259e+02, 2.4200082e+02, 1.7688887e-02],
[6.7975427e+02, 2.5056862e+02, 9.3517723e-03],
[6.8660852e+02, 1.7602881e+02, 2.1608334e-02],
[6.8832208e+02, 1.8973727e+02, 1.3314827e-02],
[6.9431952e+02, 2.0001863e+02, 1.1652143e-02],
[7.1402545e+02, 2.0344576e+02, 8.6863795e-03],
[6.8832208e+02, 1.8716695e+02, 1.2914335e-02],
[6.8746527e+02, 1.9659152e+02, 1.1416899e-02],
[6.8832208e+02, 2.0772964e+02, 1.0228495e-02],
[7.1488226e+02, 2.0772964e+02, 8.6638909e-03],
[6.8917883e+02, 1.8888049e+02, 1.1802982e-02],
[6.8917883e+02, 2.0173219e+02, 1.1355090e-02],
[7.1573901e+02, 2.2315169e+02, 1.0488740e-02],
[7.1916614e+02, 2.3343303e+02, 9.0386569e-03],
[7.0031702e+02, 1.9916185e+02, 1.1225181e-02],
[7.0288733e+02, 2.0772964e+02, 8.8226581e-03],
[7.1830939e+02, 2.2914914e+02, 9.5383683e-03],
[7.2173645e+02, 2.3343303e+02, 8.5081300e-03]]], dtype=float32), array([[[0., 0., 0.],
Each keypoint comes with three values; in 2D mode they should be x, y and a confidence score.
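Based on that layout, a small sketch of separating the coordinates from the confidence scores (datum comes from the loop above; the 0.1 threshold is just an illustrative value):
import numpy as np

left_hand = np.asarray(datum.handKeypoints[0])  # left hands, shape (num_people, 21, 3)
xy = left_hand[0, :, :2]   # x, y of the first person's 21 left-hand keypoints
conf = left_hand[0, :, 2]  # confidence score of each keypoint
reliable = xy[conf > 0.1]  # keep only reasonably confident points (illustrative threshold)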
Next I will study the reference project's data processing and network architecture and apply them to my own project. That will go into the next post.