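Micro-expression recognition: an updated review of current trends, challenges and solutions (paper notes)

A survey of micro-expression recognition, published in The Visual Computer. It is quite detailed, so I am writing down my reading notes here.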

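Abstract:

Challenges in these areas remain relevant due to the nature of ME's split-second transition with minute intensity levels.
The difficulty of micro-expression recognition is that a micro-expression is a tiny change that happens within a very short time.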

1 Introduction

Micro-expression (ME) is a transitory motion of the human face that usually lasts between 1/25 and 1/5 s.
A micro-expression is a subtle change of the face that usually lasts between 1/25 and 1/5 of a second.
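To get a feel for how short this is, the duration can be converted into a frame count. A small sketch follows; the 25 fps and 200 fps frame rates are my own illustrative assumptions, not numbers from the text above, and `me_frame_span` is a made-up helper name.

```python
from fractions import Fraction

# Duration bounds quoted above: 1/25 s to 1/5 s.
ME_MIN_S = Fraction(1, 25)
ME_MAX_S = Fraction(1, 5)

def me_frame_span(fps):
    """Return (min_frames, max_frames) that a micro-expression covers at `fps`."""
    return float(fps * ME_MIN_S), float(fps * ME_MAX_S)

for fps in (25, 200):  # assumed example frame rates
    lo, hi = me_frame_span(fps)
    print(f"{fps} fps: {lo:g} to {hi:g} frames")
```

At an ordinary frame rate the shortest micro-expression may cover a single frame, which is one way to see why spotting and recognition are hard.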
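Contrasting from normal facial expressions, it is difficult to intentionally produce or neutralize ME which makes for effective evidence of lie detection that happens in scenarios where a person has something to lose or gain.
Unlike ordinary facial expressions, micro-expressions are hard to produce or suppress on purpose, which is why they are useful as evidence for lie detection.
Ekman has categorized human emotions into seven universal emotions, i.e. anger, happiness, sadness, disgust, surprise, fear, and contempt.
Ekman divides human emotions into seven categories: anger, happiness, sadness, disgust, surprise, fear, and contempt.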
Ekman and Friesen have also introduced Facial Action Coding System (FACS) to define facial expressions through action units (AUs).
Ekman and Friesen introduced the Facial Action Coding System (FACS), which defines facial expressions in terms of action units (AUs).
AU is an observable component of facial movement where distinct facial areas are used to detect fine-grained expression changes on faces.
An AU is an observable component of facial movement; distinct facial regions are used to detect fine-grained changes in expression.
There are currently a total of 44 AUs that happen independently or simultaneously with other AUs to express an emotion.
44 AUs are currently defined; they act alone or together with other AUs to express an emotion.
the accuracy of human recognition with AU is only about 40% due to short-lived ME occurrences.
Because micro-expressions are so short-lived, human recognition accuracy with AUs is only about 40%.
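current challenges such as environmental variation, spontaneous subtle motion, and imbalanced datasets that greatly impact detection and recognition accuracy.
Current challenges include environmental variation, spontaneous subtle motion, and imbalanced datasets, all of which affect detection and recognition accuracy.
Environmental variation is the most challenging issue in ME recognition which includes illumination variation and head-pose variation.
Environmental variation, which covers illumination variation and head-pose variation, is the most challenging issue in micro-expression recognition.
The low intensity of subtle and spontaneous facial movement is a major challenge for ME recognition which renders emotion recognition non-distinguishable through the naked eyes.
The low intensity of subtle, spontaneous facial movement is another major difficulty: the emotion cannot be told apart by the naked eye.
Although recommended in evaluating ME recognition system, their imbalanced data distribution across expressions may lead to biases in results.
Datasets with an imbalanced number of samples per expression can easily bias the results.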
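One common way to soften the effect of imbalance (my own addition, not something the notes above attribute to the paper) is to weight each class inversely to its frequency during training. A minimal sketch, where the label list and the helper name `inverse_frequency_weights` are hypothetical:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight for class c is N / (K * n_c): rare classes get larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * n_c) for c, n_c in counts.items()}

# Made-up, deliberately imbalanced label list (not from any real dataset).
labels = ["happiness"] * 2 + ["disgust"] * 6 + ["surprise"] * 4
print(inverse_frequency_weights(labels))
```

With these weights every class contributes the same total weight, so a classifier is not rewarded for simply predicting the majority expression.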

2 Context

2.1 Dataset

[Image 1: dataset comparison table]
The characteristics of all commonly used datasets are listed in the table.

2.2 General pipeline

The ME recognition process can be divided into image acquisition, face detection, pre-processing, ME spotting, feature extraction, and ME classification.
Micro-expression recognition can be split into the following steps: image acquisition, face detection, pre-processing, micro-expression spotting, feature extraction, and micro-expression classification.
[Image 2]
several pre-processing steps are implemented to overcome lighting variation or noise attack.
The pre-processing stage mainly deals with lighting variation and noise.
research on ME spotting from images or videos is a potential future research direction.
Micro-expression spotting is a promising direction for future research.

2.3 Pre-processing

Pre-processing in ME recognition usually involves face detection, face registration, motion magnification, and temporal normalization.
Pre-processing for micro-expression recognition includes face detection, face registration, motion magnification, and temporal normalization.

2.3.1 Face detection and registration

The face registration stage aligns a detected face onto a reference face.
Face registration uses landmarks to map the face image onto a standard (reference) pose.
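As an illustration of landmark-based registration, here is a minimal sketch that fits a least-squares similarity transform (scale, rotation, translation) between detected landmarks and reference landmarks. This is a generic Procrustes/Umeyama-style fit that I am assuming for the example; the notes above do not say which registration method the surveyed works use, and the landmark coordinates are made up.

```python
import numpy as np

def estimate_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    sc, dc = src - mu_s, dst - mu_d
    cov = dc.T @ sc / len(src)              # cross-covariance of the point sets
    U, S, Vt = np.linalg.svd(cov)
    d = np.ones(src.shape[1])
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0                        # avoid returning a reflection
    R = U @ np.diag(d) @ Vt
    scale = (S * d).sum() / (sc ** 2).sum() * len(src)
    t = mu_d - scale * R @ mu_s
    return scale, R, t

def apply_similarity(points, scale, R, t):
    """Map an (N, 2) array of points with the fitted transform."""
    return scale * np.asarray(points, dtype=float) @ R.T + t

# Hypothetical landmarks: eyes, nose tip, mouth corners (x, y in pixels).
detected = np.array([[30., 40.], [70., 40.], [50., 60.], [35., 80.], [65., 80.]])
reference = 1.2 * detected + np.array([5., -3.])   # toy "reference face"
s, R, t = estimate_similarity(detected, reference)
print(np.round(apply_similarity(detected, s, R, t), 3))
```

In a real pipeline the same transform would then be used to warp the whole image, so that every frame shows the face at the same position and size.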

2.3.2 Motion magnification

motion magnification techniques are introduced to increase distinguishing powers between different motions.
The purpose of motion magnification is to make micro-expressions easier to tell apart.
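The idea behind Eulerian-style magnification can be sketched in a few lines: band-pass each pixel's intensity over time and add the amplified band back. This is a heavily simplified version that works on raw pixel intensities and skips the spatial pyramid that full Eulerian video magnification uses; the frequency band, the amplification factor `alpha`, and the function name are all assumptions for illustration.

```python
import numpy as np

def magnify_motion(frames, fps, low_hz, high_hz, alpha):
    """Amplify temporal intensity changes in [low_hz, high_hz] by a factor alpha.

    frames: array of shape (T, H, W) holding grayscale frames.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)       # frequency of each FFT bin
    spectrum = np.fft.rfft(frames, axis=0)        # per-pixel temporal spectrum
    band = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~band] = 0                           # ideal band-pass filter
    filtered = np.fft.irfft(spectrum, n=n, axis=0)
    return frames + alpha * filtered              # add amplified band back

# Toy clip: one pixel flickering at 5 Hz with a tiny amplitude.
fps, n = 100, 100
time = np.arange(n) / fps
clip = 10.0 + 0.1 * np.sin(2 * np.pi * 5 * time)[:, None, None]
out = magnify_motion(clip, fps, low_hz=2, high_hz=8, alpha=9)
print(round(float(np.ptp(clip)), 3), round(float(np.ptp(out)), 3))
```

The static background (the constant 10.0) is untouched because it sits outside the pass band, while the small oscillation inside the band is amplified.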

2.3.3 Temporal normalization

Temporal interpolation model (TIM) method is commonly used to normalize video lengths.
Temporal normalization brings all videos to a common length; the temporal interpolation model (TIM) is the usual choice.
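To make "normalizing video length" concrete, here is a minimal sketch that resamples a clip to a fixed number of frames. Note that it uses plain linear interpolation between neighbouring frames as a stand-in; it is not the TIM method named above, and `resample_clip` is a hypothetical helper.

```python
import numpy as np

def resample_clip(frames, target_len):
    """Resample a clip of shape (T, ...) to target_len frames by linear interpolation."""
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    pos = np.linspace(0, n - 1, target_len)       # fractional source positions
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    # Reshape the blend weights so they broadcast over the image dimensions.
    w = (pos - lo).reshape((-1,) + (1,) * (frames.ndim - 1))
    return (1 - w) * frames[lo] + w * frames[hi]

clip = np.arange(5, dtype=float).reshape(5, 1, 1)   # 5 toy frames of size 1x1
print(resample_clip(clip, 9).ravel())
```

The same function shortens or lengthens a clip, so sequences recorded with different durations end up with the same number of frames before feature extraction.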

2.4 Classification

Classification normally refers to the categorization of emotions based on selected features input.
Classification means assigning a micro-expression to a category based on the input features.

3 Features for ME representation

3.1 Low-level representation

Low-level features are normally represented in the form of descriptors containing a bunch of visual data cue without explicit semantic meaning/knowledge.
Low-level features merely describe a collection of visual data, without semantic information or prior knowledge.
In this paper, we briefly describe features in the following family: local binary pattern (LBP), optical flow, gradient based, and their respective variants.
The low-level features covered by the paper are LBP, optical flow, gradient-based features, and their variants.
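As a reminder of what an LBP descriptor looks like, here is a minimal sketch of the basic 8-neighbour LBP on a single grayscale image, followed by the usual 256-bin histogram. It is the plain spatial operator only, not any of the variants the paper discusses, and the bit ordering of the neighbours is an arbitrary choice of mine.

```python
import numpy as np

# Neighbour offsets (dy, dx), clockwise from the top-left pixel; bit i <-> offset i.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(img):
    """Basic 3x3 LBP code for every interior pixel of a 2-D grayscale image."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(_OFFSETS):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbour is at least as bright as the centre.
        codes |= np.where(neigh >= center, 1 << bit, 0).astype(np.uint8)
    return codes

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes, usable as a feature vector."""
    hist = np.bincount(lbp_codes(img).ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(16, 16))   # random stand-in for a face patch
print(lbp_histogram(patch).shape)
```

In practice the face is usually divided into blocks and the per-block histograms are concatenated, so that the descriptor keeps some spatial layout.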

3.2 Mid-level representation

low-level features remain inadequate in representing subtle motions due to short duration, low intensity, noise and head-pose changes.
The weakness of low-level features is that they cannot adequately represent subtle motion, because of its short duration and low intensity and because of noise and head-pose changes.
Mid-level feature is a technique to transform local features into image representations for classification purpose where weightage is added to bring explicit meanings and knowledge to local features.
Mid-level features build on low-level (local) features and add weights that bring in explicit meaning and prior knowledge.
The most common mid-level technique is the bag-of-words (BoW) representation that is commonly used in affect recognition.
The most common mid-level technique is the bag-of-words (BoW) model.
To conclude, there are limited mid-level representations proposed to handle ME recognition.
Mid-level features have not been used much in micro-expression recognition.
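The BoW encoding step can be sketched as follows: every local descriptor is assigned to its nearest codeword, and the image (or clip) is represented by the normalized histogram of codeword counts. The codebook is assumed to have been learned beforehand (for example with k-means); the tiny 2-D descriptors and the codebook below are made up for illustration.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Encode (N, D) local descriptors as a normalized histogram over K codewords."""
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance from every descriptor to every codeword: (N, K).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                     # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])   # hypothetical 2-word codebook
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [9.0, 11.0]])
print(bow_histogram(descriptors, codebook))
```

The resulting fixed-length histogram is what gets passed to the classifier, regardless of how many local descriptors each sample produced.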

3.3 High-level representation

A high-level representation can be defined as a set of semantic data that are human interpretable, where the high-level features are a combination of several low-level features.
A high-level feature can be seen as a set of human-interpretable features, and it may be built by combining several low-level features.

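4 Micro-expression spotting (as I understand it, this means finding where a micro-expression starts and ends in a video sequence)

ME spotting is a stage where frames containing emotions are detected in time for a given video.
Micro-expression spotting means finding the time span in a video sequence that contains an expression.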
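A toy version of spotting, to make the input and output concrete: compute a per-frame change score, threshold it, and group consecutive frames above the threshold into (onset, offset) intervals. The score (mean absolute frame difference), the threshold, and the function names are my own assumptions, not a method taken from the paper.

```python
import numpy as np

def frame_change_scores(frames):
    """Mean absolute difference between consecutive frames; the first frame scores 0."""
    frames = np.asarray(frames, dtype=float)
    diffs = np.abs(np.diff(frames, axis=0))
    scores = diffs.reshape(len(diffs), -1).mean(axis=1)
    return np.concatenate([[0.0], scores])

def spot_intervals(scores, threshold, min_len=1):
    """Group consecutive frames with score > threshold into (onset, offset) pairs."""
    intervals, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i                              # interval opens
        elif s <= threshold and start is not None:
            if i - start >= min_len:
                intervals.append((start, i - 1))   # interval closes
            start = None
    if start is not None and len(scores) - start >= min_len:
        intervals.append((start, len(scores) - 1))
    return intervals

scores = [0, 0, 1, 2, 1, 0, 0, 3, 0]               # made-up change scores
print(spot_intervals(scores, threshold=0.5))
print(spot_intervals(scores, threshold=0.5, min_len=2))
```

The `min_len` parameter is one simple way to discard single-frame spikes, which are more likely to be noise than an expression.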

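4.1 Appearance-based approach

Appearance-based approach normally refers to feature representation constructed in pixel-wise level, especially by the intensity value.
Appearance-based approaches mainly build features at the pixel level, especially from (changes in) intensity values.
However, this method is indifferent to non-micro-expression movements, such as eye blinking.
Methods such as Gabor wavelets and LBP share a drawback: they cannot tell micro-expressions apart from other movements such as eye blinking.
Besides intensity-based feature, methods such as 3D gradient histogram descriptor or histogram of oriented gradients are also reported for ME spotting.
Besides intensity-based features, 3D gradient histogram descriptors and histograms of oriented gradients have also been used for micro-expression spotting.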

4.2 Dynamic approach

the feature is constructed based on non-rigid motion changes of subtle expression where motion changes are extracted for spotting purpose.
The dynamic approach extracts the non-rigid motion of the micro-expression and uses it as the feature for spotting.

4.3 Generic approach

Various other methods.

5 Recognition results and discussion

5.1 Result

Leave-one-video-out (LOVO) and leave-one-subject-out (LOSO) cross-validation are the most common methods used in ME recognition performance measurement.
LOVO and LOSO cross-validation are the most common ways of measuring performance in micro-expression recognition.
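For reference, LOSO can be sketched as a split generator: each subject is held out once, and all samples of that subject form the test set. The subject IDs below are made up, and `loso_splits` is a hypothetical helper written for these notes.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test

# One subject ID per video sample (toy example).
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
for subject, train, test in loso_splits(subject_ids):
    print(subject, "train:", train, "test:", test)
```

Because no subject appears in both the training and the test set, the score reflects how well a method generalizes to unseen people. LOVO follows the same pattern with one video held out per fold. If scikit-learn is available, its `LeaveOneGroupOut` splitter does the same job when the subject IDs are passed as `groups`.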
Instead of LOVO and LOSO, k-fold validation, repeated random sub-sampling validation, or basic hold-out methods are also used for performance evaluation with recall and precision graphs reported in other works.
Besides LOVO and LOSO, k-fold validation, repeated random sub-sampling validation, and the basic hold-out method are also used to evaluate performance.

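5.2 Discussion and future recommendation

To overcome head-pose variation works on face registration, faces in all frames can be aligned and normalized into the same position and size.
Head-pose variation is handled by registration: the face in every frame is aligned and normalized to a standard position and size.
As far as illumination change is concerned, all algorithms were heavily tested in controlled and even illumination.
For illumination change there is no good solution yet; all algorithms were studied under controlled, even illumination.
Hence, Euler magnification is introduced in the pre-processing stage to amplify low-intensity movements.
For the second challenge (low-amplitude, spontaneous facial movement), Euler magnification is typically applied in the pre-processing stage to amplify low-intensity motion.
In pre-processing, works on magnification are lacking compared to feature extraction and classification.
In pre-processing, there is much less work on motion magnification than on feature extraction and classification.
Studies in ME spotting are limited.
There is little work on micro-expression spotting.
Most of the proposed features are focused on low-level approach with only two existing works on mid-level features.
Among hand-crafted features, low-level ones are the most studied; mid-level features, with only two existing works, need more attention.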

6 Conclusion

Feature representations are evolving from low-level approach to mid-level and high-level approach.
Feature representations are moving from low-level towards mid-level and high-level approaches.
