Before I knew it, exactly 20 days had passed since my last post, so this one is a summary of what I have been working on during that time.
The environment was already set up last time, so the next step was the gesture recognition itself. My work over these days falls into the following areas:
I will only cover this part briefly and recommend some material I read and found useful.
Vision-based gesture recognition can currently be divided roughly into machine-learning and deep-learning approaches, although the two overlap in places; their pipelines are shown below.
For more detail, see the survey 《视觉动态手势识别综述》 (A Survey of Vision-Based Dynamic Gesture Recognition).
My own work does not go nearly as far as the survey; so far I have only reached hand detection and segmentation. The following resources are useful for that step:
A complete OpenCV-based gesture recognition project (Python 3.7)
Hand gesture detection and segmentation
https://github.com/mahaveerverma/hand-gesture-recognition-opencv
There are plenty of resources on CSDN and GitHub, but very little of it uses C#, so the code has to be rewritten. If your background here is not that strong, look at the C++ versions instead; C++ is much closer to C#, which makes porting the code easier.
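To give a concrete flavor of what the segmentation step in those projects does, here is a minimal pixel-level skin classifier in C#. This is a generic RGB rule of thumb often used for skin detection under uniform daylight, not the method from any of the linked projects, and the thresholds would need tuning for real lighting conditions:

```csharp
using System;

// A minimal per-pixel skin classifier using a common RGB heuristic.
// Illustration only; thresholds must be tuned for real lighting.
public static class SkinHeuristic
{
    // Returns true if the (r, g, b) pixel looks like skin under uniform daylight.
    public static bool IsSkin(byte r, byte g, byte b)
    {
        int max = Math.Max(r, Math.Max(g, b));
        int min = Math.Min(r, Math.Min(g, b));
        return r > 95 && g > 40 && b > 20
            && (max - min) > 15        // enough color spread
            && Math.Abs(r - g) > 15    // red clearly dominates green
            && r > g && r > b;         // red is the strongest channel
    }
}
```

In practice you would run this over every pixel of a frame to build a binary mask, then clean the mask up with morphological open/close before extracting the hand contour.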
Summary:
A few thoughts on hand segmentation. For a beginner, handling a cluttered background is genuinely hard. At one point I was frustrated enough to consider moving to a workspace with a plain background and wearing a plain white shirt, but I decided against it: whatever I build needs to generalize at least somewhat. So I worked through an OpenCvSharp tutorial on image processing; the results were only barely acceptable. When I tried more advanced algorithms, their time complexity was too high and everything stuttered at runtime, which did not meet my project requirements, so I gave that route up. Still, I learned a lot, and with better hardware it would be worth trying again.
Since that approach had stalled, I changed direction. I had spent good money on a Kinect; could it do more for me? Very nice! As mentioned in my previous post, installing the Kinect SDK adds three programs to your computer.
One of those three turned out to be gesture-related, so I gave it a try, and it really can do gesture recognition. Below I walk through building a complete gesture recognition project with it. The process uses Kinect Studio, Visual Gesture Builder, and Unity3D.
After unzipping, double-click Kinect.2.0.1410.19000 to import the base plugin, then drag the KinectView folder into the Assets folder.
Open MainScene and press Play; the Kinect should now be working. If you cannot see the skeleton, take a couple of steps; something may be blocking the sensor's view of your body.
Next, import the Kinect.VisualGestureBuilder.2.0.1410.19000 plugin; once that finishes, the environment is ready.
I will not go into detail on recording, trimming, tagging, building, and testing gestures with Kinect Studio and VGB; see Visual Gesture Builder姿态动作识别教程 and Unity5 利用Kinect Studio 和Gesture Builder建立自定义姿势分类器. The goal is simply to end up with a .gbd database file.
Note that the second resource fills the .gbd file into the Database Path field. That is a mistake: the resulting path ends up with slashes pointing in both directions. It is simpler to leave that field empty.
This is his way of doing it.
This is mine:
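To see why the mixed-slash path is a problem, note that Unity's Application.streamingAssetsPath always uses forward slashes, while a path typed into the inspector on Windows often uses backslashes. A minimal sketch of normalizing the combined path (the names here are my own, not from the Kinect samples):

```csharp
using System.IO;

// Sketch of fixing the mixed-separator problem: Path.Combine keeps whatever
// separators its inputs already contain, so we normalize explicitly.
public static class GesturePath
{
    public static string Normalize(string streamingAssetsPath, string databasePath)
    {
        string combined = Path.Combine(streamingAssetsPath, databasePath);
        // Force a single separator direction so the VGB database loader
        // never sees a path like "Assets/StreamingAssets\gestures".
        return combined.Replace('\\', '/');
    }
}
```

Leaving the inspector field empty, as I do above, sidesteps the issue entirely; the normalization is just a belt-and-braces alternative.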
The code:
using UnityEngine;
using System.Collections;
using System.Collections.Generic;
using Windows.Kinect;
using Microsoft.Kinect.VisualGestureBuilder;

public class GestureSourceManager_right : MonoBehaviour
{
    // Not used here; I prefer to define my own event payload, but that is a matter of taste.
    public struct EventArgs
    {
        public string name;
        public float confidence;

        public EventArgs(string _name, float _confidence)
        {
            name = _name;
            confidence = _confidence;
        }
    }

    // Gesture states
    private enum gestureState
    {
        Unknown = 0,
        moving = 1,
        gribing = 2
    }

    public BodySourceManager _BodySource;
    public string databasePath;

    private KinectSensor _Sensor;
    private VisualGestureBuilderFrameSource _Source;
    private VisualGestureBuilderFrameReader _Reader;
    private VisualGestureBuilderDatabase _Database;

    // Gesture detection events
    public delegate void GestureAction(EventArgs e);
    public event GestureAction OnGesture;

    private gestureState currentGesture;

    // Use this for initialization
    void Start()
    {
        _Sensor = KinectSensor.GetDefault();
        if (_Sensor != null)
        {
            if (!_Sensor.IsOpen)
            {
                _Sensor.Open();
            }

            // Set up the gesture source
            _Source = VisualGestureBuilderFrameSource.Create(_Sensor, 0);

            // Open the reader for the VGB frames
            _Reader = _Source.OpenReader();
            if (_Reader != null)
            {
                _Reader.IsPaused = true;
                _Reader.FrameArrived += GestureFrameArrived;
            }

            // Load the gesture database from StreamingAssets
            string path = System.IO.Path.Combine(Application.streamingAssetsPath, databasePath);
            Debug.Log("database path is " + path);
            _Database = VisualGestureBuilderDatabase.Create(path + "/right_action1.gbd");

            // Add every gesture in the database to the source
            IList<Gesture> gesturesList = _Database.AvailableGestures;
            for (int x = 0; x < gesturesList.Count; x++)
            {
                _Source.AddGesture(gesturesList[x]);
            }
        }
        currentGesture = gestureState.Unknown;
    }

    // Public setter for the body ID to track
    public void SetBody(ulong id)
    {
        if (id > 0)
        {
            _Source.TrackingId = id;
            _Reader.IsPaused = false;
            Debug.Log("id is " + id);
        }
        else
        {
            _Source.TrackingId = 0;
            _Reader.IsPaused = true;
        }
    }

    // Update loop: find a body to track if we do not have one
    void Update()
    {
        if (!_Source.IsTrackingIdValid)
        {
            FindValidBody();
        }
    }

    // Check the body manager and grab the first tracked body
    void FindValidBody()
    {
        if (_BodySource != null)
        {
            Body[] bodies = _BodySource.GetData();
            if (bodies != null)
            {
                foreach (Body body in bodies)
                {
                    if (body.IsTracked)
                    {
                        SetBody(body.TrackingId);
                        break;
                    }
                }
            }
        }
    }

    /// Handles gesture detection results arriving from the sensor for the associated body tracking ID
    private void GestureFrameArrived(object sender, VisualGestureBuilderFrameArrivedEventArgs e)
    {
        VisualGestureBuilderFrameReference frameReference = e.FrameReference;
        using (VisualGestureBuilderFrame frame = frameReference.AcquireFrame())
        {
            if (frame != null)
            {
                // Get the discrete gesture results that arrived with the latest frame
                IDictionary<Gesture, DiscreteGestureResult> discreteResults = frame.DiscreteGestureResults;

                // Per-frame dictionary of gesture confidences
                Dictionary<string, double> DataOnOneFrame = new Dictionary<string, double>();
                if (discreteResults != null)
                {
                    DiscreteGestureResult result = null;
                    foreach (Gesture gesture in _Source.Gestures)
                    {
                        print("G: " + gesture.Name);
                        if (gesture.GestureType == GestureType.Discrete)
                        {
                            discreteResults.TryGetValue(gesture, out result);
                            if (result != null)
                            {
                                // Store every gesture's confidence; used below to pick the current gesture
                                DataOnOneFrame.Add(gesture.Name, result.Confidence);
                            }
                        }
                    }
                    // Decide which gesture this frame shows
                    JudgeGesture(DataOnOneFrame);
                    DataOnOneFrame.Clear();
                }
            }
        }
    }

    // Decides the current gesture; the exact rule is up to you, including the state transitions
    private void JudgeGesture(Dictionary<string, double> OneFrameData)
    {
        // Skip frames where either gesture result is missing
        if (!OneFrameData.ContainsKey("grib_Right") || !OneFrameData.ContainsKey("move_Right"))
        {
            return;
        }
        if (OneFrameData["grib_Right"] < 0.1 && OneFrameData["move_Right"] < 0.1)
        {
            currentGesture = gestureState.Unknown;
            print("current stage is unknown!");
        }
        else if (OneFrameData["grib_Right"] >= OneFrameData["move_Right"])
        {
            if (currentGesture != gestureState.gribing)
            {
                currentGesture = gestureState.gribing;
                print("gesture stage has changed, current stage is gribing!");
            }
        }
        else
        {
            if (currentGesture != gestureState.moving)
            {
                currentGesture = gestureState.moving;
                print("gesture stage has changed, current stage is moving!");
            }
        }
    }
}
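The decision rule inside JudgeGesture can be pulled out into a pure function and tested without a sensor attached. A minimal sketch of that rule (state names mirror the snippet above; the type and method names here are my own):

```csharp
// The per-frame decision rule from JudgeGesture, isolated as a pure function
// so it can be unit-tested without a Kinect.
public enum HandState { Unknown, Moving, Gribing }

public static class GestureRules
{
    // Picks a state from the confidences of the two discrete gestures.
    public static HandState Classify(double gribConfidence, double moveConfidence)
    {
        // Both confidences near zero: no recognizable gesture this frame
        if (gribConfidence < 0.1 && moveConfidence < 0.1)
            return HandState.Unknown;
        // Otherwise the higher-confidence gesture wins; ties go to gribing,
        // matching the ">= " branch in the MonoBehaviour above
        return gribConfidence >= moveConfidence ? HandState.Gribing : HandState.Moving;
    }
}
```

The MonoBehaviour then only needs to compare the returned state against the previous one to decide whether a transition happened.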
As the number and complexity of gestures grows, a purely discrete gesture database becomes impractical for moving gestures, because the misclassification rate keeps climbing. This is where continuous gestures come in. In my experience so far, a continuous gesture only adds a completion parameter, progress, on top of the discrete one; perhaps I have simply not noticed the subtler differences yet, and my understanding will deepen with more practice. So when you have several moving gestures, you can use the discrete gesture's confidence and the continuous gesture's progress together to judge the current gesture, which greatly reduces misclassification. Before writing code, have a look at these two posts on building a continuous gesture database:
Kinect For Unity3D 利用Kinect Studio 和Visual Gesture Builder建立自定义姿势之录制连续动作,判断Progress
C#动手实践:Kinect V2 开发(3): V2中的大杀器——Visual Gesture Builder 手势识别+一站式解决方案
The code before GestureFrameArrived() is essentially unchanged, apart from a small adjustment to the database path; the changes start from GestureFrameArrived().
/// Handles gesture detection results arriving from the sensor for the associated body tracking ID
private void GestureFrameArrived(object sender, VisualGestureBuilderFrameArrivedEventArgs e)
{
    VisualGestureBuilderFrameReference frameReference = e.FrameReference;
    using (VisualGestureBuilderFrame frame = frameReference.AcquireFrame())
    {
        if (frame != null)
        {
            // Get both the discrete and the continuous results that arrived with the latest frame
            IDictionary<Gesture, DiscreteGestureResult> discreteResults = frame.DiscreteGestureResults;
            IDictionary<Gesture, ContinuousGestureResult> continuousResults = frame.ContinuousGestureResults;
            Dictionary<string, double> OneGroupOfGesture = new Dictionary<string, double>();
            if (discreteResults != null && continuousResults != null)
            {
                DiscreteGestureResult Dresult = null;
                ContinuousGestureResult Cresult = null;
                foreach (Gesture gesture in _Source.Gestures)
                {
                    if (gesture.GestureType == GestureType.Discrete)
                    {
                        discreteResults.TryGetValue(gesture, out Dresult);
                        if (Dresult != null)
                        {
                            OneGroupOfGesture.Add(gesture.Name, Dresult.Confidence);
                        }
                    }
                    else if (gesture.GestureType == GestureType.Continuous)
                    {
                        continuousResults.TryGetValue(gesture, out Cresult);
                        if (Cresult != null)
                        {
                            OneGroupOfGesture.Add(gesture.Name, Cresult.Progress);
                        }
                    }
                    // Pair up the discrete and continuous results of the same gesture,
                    // then judge them together (this assumes the two variants of each
                    // gesture are adjacent in _Source.Gestures)
                    if (OneGroupOfGesture.Count == 2)
                    {
                        JudgeGesture(OneGroupOfGesture);
                        OneGroupOfGesture.Clear();
                    }
                }
            }
        }
    }
}

// Timer callback: re-enable detection after the cooldown.
// Requires "using System.Threading;" at the top of the file and a
// "private bool IsDetecting = true;" field in the class.
private void stopToDo(object a)
{
    IsDetecting = true;
    print("Detecting!");
}

private void JudgeGesture(Dictionary<string, double> gesture)
{
    int stopTime = 5000;
    if (IsDetecting)
    {
        foreach (string key in gesture.Keys)
        {
            switch (key)
            {
                case "conSwipDown_Left":
                    // The first condition checks confidence, the second checks progress;
                    // both thresholds are tunable to improve the recognition rate
                    if (gesture["conSwipDown_Left"] > 0.8 && gesture["conSwipDownProgress_Left"] > 0.8)
                    {
                        print("swiping down!");
                        IsDetecting = false;
                        print("stop detecting, start waiting " + stopTime + "ms");
                        // The timer prevents the same gesture from being detected repeatedly:
                        // after one detection, wait stopTime ms before detecting the next
                        Timer stopDetecting = new Timer(new TimerCallback(stopToDo), null, stopTime, -1);
                    }
                    break;
                case "conSwipUp_Left":
                    if (gesture["conSwipUp_Left"] > 0.8 && gesture["conSwipUpProgress_Left"] > 0.8)
                    {
                        print("swiping up!");
                        IsDetecting = false;
                        print("stop detecting, start waiting " + stopTime + "ms");
                        Timer stopDetecting = new Timer(new TimerCallback(stopToDo), null, stopTime, -1);
                    }
                    break;
                case "conSwipLeft_Left":
                    if (gesture["conSwipLeft_Left"] > 0.5 && gesture["conSwipLeftProgress_Left"] > 0.3)
                    {
                        print("swiping left!");
                        IsDetecting = false;
                        print("stop detecting, start waiting " + stopTime + "ms");
                        Timer stopDetecting = new Timer(new TimerCallback(stopToDo), null, stopTime, -1);
                    }
                    break;
                case "conSwipRight_Left":
                    if (gesture["conSwipRight_Left"] > 0.8 && gesture["conSwipRightProgress_Left"] > 0.8)
                    {
                        print("swiping right!");
                        IsDetecting = false;
                        print("stop detecting, start waiting " + stopTime + "ms");
                        Timer stopDetecting = new Timer(new TimerCallback(stopToDo), null, stopTime, -1);
                    }
                    break;
                default:
                    break;
            }
        }
    }
}
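The "fire once, then ignore for stopTime ms" behaviour above can also be written around an explicit timestamp instead of System.Threading.Timer, which makes it deterministic and easy to test. This is an alternative sketch with my own names, not the code above:

```csharp
// A cooldown gate: accepts a detection only if enough time has passed since
// the last accepted one. Time is passed in explicitly so the logic can be
// tested without real waiting.
public class GestureCooldown
{
    private readonly double _cooldownMs;
    private double _lastFiredMs = double.NegativeInfinity;

    public GestureCooldown(double cooldownMs)
    {
        _cooldownMs = cooldownMs;
    }

    // Returns true (and restarts the cooldown) only if the cooldown has elapsed.
    public bool TryFire(double nowMs)
    {
        if (nowMs - _lastFiredMs < _cooldownMs)
            return false;
        _lastFiredMs = nowMs;
        return true;
    }
}
```

In Unity you would call it as `if (cooldown.TryFire(Time.time * 1000f)) { /* handle gesture */ }`, which also avoids the thread-safety question of a Timer callback flipping IsDetecting from a background thread.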
That is essentially all of my work from these 20 days. If you have any questions, feel free to reach out!
The next step is using the Kinect to locate the hand's position in 3D space; ideas on that are welcome too.