Using CoreImage Together with GPUImage

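Outline

Introduction
Preview
Technologies
Design
- Overview
- Custom filter design
Challenges
- Coordinate conversion
- Video frame processing
Possible improvements
References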

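Introduction

This article shows how to combine CoreImage face detection with GPUImage filters to apply a filter, in real time, to the rectangular region of a detected face.

Git repository: DEMO

Preview

The demo produces the effect shown in the animation below: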

未命名.gif

Technologies

CoreImage
GPUImage
OpenGL

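Design

1. Overview

The general idea is to write a custom GPUImage filter. The visual effect itself can be whatever you need; the demo implements color inversion. The key point is that the custom filter exposes a mask parameter, a rectangle (x, y, width, height), that sets the region the filter acts on. Inside the filter, the fragment shader receives this mask value and tests each texel (which can be thought of as a pixel) against it: if the texel lies inside the region set by mask, the filter effect is applied; otherwise the texel is passed through unchanged.

The overall flow can be broken into the following steps:

Step 1: obtain a CIImage for the image or video frame.
Step 2: use CoreImage to detect the face in the image and get the face rectangle (x, y, width, height).
Step 3: assign that rectangle to the mask parameter of the custom filter.
Step 4: the custom filter passes the mask value to its fragment shader, which uses the rectangle to decide where the filter is applied.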
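
Step 4 can be pictured on the CPU with a minimal sketch in plain C (which also compiles as Objective-C). It is not the demo's code: MaskRect, mask_contains, and invert_in_mask are hypothetical names, and the image is assumed to be a small 8-bit grayscale buffer. Each pixel is tested at its center (col + 0.5, row + 0.5), which is how gl_FragCoord reports fragment positions by default.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU-side stand-in for the shader's vec4 mask: x, y, width, height. */
typedef struct { float x, y, width, height; } MaskRect;

/* Same strict comparisons as the if-condition in the fragment shader. */
static bool mask_contains(MaskRect m, float fx, float fy) {
    return fx > m.x && fx < m.x + m.width &&
           fy > m.y && fy < m.y + m.height;
}

/* Invert the 8-bit gray values whose pixel center lies inside the mask;
   pixels outside the mask are left unchanged. */
static void invert_in_mask(unsigned char *pixels, size_t width, size_t height, MaskRect m) {
    for (size_t row = 0; row < height; ++row) {
        for (size_t col = 0; col < width; ++col) {
            if (mask_contains(m, (float)col + 0.5f, (float)row + 0.5f)) {
                size_t i = row * width + col;
                pixels[i] = (unsigned char)(255 - pixels[i]);
            }
        }
    }
}
```

On the GPU the two loops disappear: the shader body runs once per fragment, and only the comparison and the color change remain.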

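2. Custom filter design

I defined a custom filter class, GPUImageCustomColorInvertFilter, whose main job is to apply a color-inversion filter to part of a video or still image. The code is shown below:

// GPUImageCustomColorInvertFilter.h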
#import "GPUImageFilter.h"

@interface GPUImageCustomColorInvertFilter : GPUImageFilter

@property (nonatomic, assign) CGRect mask; // region the filter acts on

@end

// GPUImageCustomColorInvertFilter.m

#import "GPUImageCustomColorInvertFilter.h"

NSString *const kGPUImageCustomInvertFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 
 uniform lowp vec4 mask;
 
 void main()
 {
     lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     
     // Use mask to decide whether the current pixel lies inside the given rectangle
     if(gl_FragCoord.x < (mask.x + mask.z) && gl_FragCoord.y < (mask.y + mask.w) && gl_FragCoord.x > mask.x && gl_FragCoord.y > mask.y) {
         gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w); // color inversion; replace with whatever effect you need
     }else {
         gl_FragColor = textureColor;
     }
 }
);

@interface GPUImageCustomColorInvertFilter() {
    GLint maskUniform;
}

@end

@implementation GPUImageCustomColorInvertFilter

- (id)init;
{
    if ((self = [super initWithFragmentShaderFromString:kGPUImageCustomInvertFragmentShaderString])) {
        maskUniform = [filterProgram uniformIndex:@"mask"];
    }
    
    return self;
}

- (void)setMask:(CGRect)mask {
    _mask = mask;
    
    NSLog(@"dkTest: %s mask %@", __func__, NSStringFromCGRect(mask));
    
    GPUVector4 maskVector4 = {mask.origin.x, mask.origin.y, mask.size.width, mask.size.height};
    [self setVec4:maskVector4 forUniform:maskUniform program:filterProgram];
}

@end

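The code is fairly simple. Two points deserve attention:

How the custom filter forwards the parameter we set to the fragment shader.
How the fragment shader works.

2.1 How the filter passes values to the fragment shader

For the first point, GPUImage provides the method setVec4:forUniform:program:, which we use to pass the mask value into the fragment shader. Other value types have corresponding methods; for example, GPUImageContrastFilter passes its contrast parameter to the fragment shader with setFloat:forUniform:program:.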

- (void)setMask:(CGRect)mask {
    _mask = mask;
    
    GPUVector4 maskVector4 = {mask.origin.x, mask.origin.y, mask.size.width, mask.size.height};
    [self setVec4:maskVector4 forUniform:maskUniform program:filterProgram];
}

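The call maskUniform = [filterProgram uniformIndex:@"mask"]; looks up, by name, the shader variable that will receive the value we pass in. With that in place the fragment shader can declare uniform lowp vec4 mask; and use mask directly. Here uniform marks a read-only variable that the application sets and that holds the same value for every fragment in a draw call, and lowp is a precision qualifier. Note that GLSL ES only guarantees a range of about -2 to 2 for lowp floats, so a higher precision such as highp is the safer choice for a value that holds pixel coordinates.

2.2 How the fragment shader works

A fragment shader can be understood, roughly, as a function that processes the RGBA value of each pixel. A GPUImage filter shader always receives two inputs, the 2D texture inputImageTexture and the texture coordinate textureCoordinate, and uses texture2D to fetch the texel, that is, one pixel of the texture image. It then processes that pixel and writes the result to gl_FragColor as its output.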

NSString *const kGPUImageCustomInvertFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 
 uniform lowp vec4 mask;
 
 void main()
 {
     lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     
     // Use mask to decide whether the current pixel lies inside the given rectangle
     if(gl_FragCoord.x < (mask.x + mask.z) && gl_FragCoord.y < (mask.y + mask.w) && gl_FragCoord.x > mask.x && gl_FragCoord.y > mask.y) {
         gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w);
     }else {
         gl_FragColor = textureColor;
     }
 }
);
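
The per-texel color operation in the shader can be written out as a small sketch in plain C. Color and invert_color are hypothetical stand-ins for the GLSL vec4 and the expression vec4(1.0 - textureColor.rgb, textureColor.w); channel values are assumed to be normalized to the range 0 to 1, as they are in the shader.

```c
/* Hypothetical stand-in for a GLSL vec4 holding a normalized RGBA color. */
typedef struct { float r, g, b, a; } Color;

/* Invert the three color channels and keep alpha unchanged,
   mirroring vec4((1.0 - textureColor.rgb), textureColor.w). */
static Color invert_color(Color c) {
    Color out = { 1.0f - c.r, 1.0f - c.g, 1.0f - c.b, c.a };
    return out;
}
```

Swapping in a different effect means changing only this function body; the region test around it stays the same.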

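Challenges

In my view there are two main difficulties. The first is converting coordinates among CoreImage, the familiar UIKit, and OpenGL. The second is, when processing video, obtaining a CIImage for every frame so that the face position in the current frame can be found.

1. Coordinate conversion

1.1 Converting between CoreImage and UIKit coordinates

After face detection, CoreImage returns coordinates in a coordinate system whose origin is the bottom-left corner, as shown below:

coreImageCoor.jpg

UIKit coordinates, by contrast, use a coordinate system whose origin is the top-left corner, as shown below:

UIkitCoor.png

So the face coordinates returned by CIFaceFeature must be converted to the top-left-origin coordinate system. Sample code: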

// Create the CIImage and the face detector
CIImage* image = [CIImage imageWithCGImage:imageView.image.CGImage];
CIDetector* detector = [CIDetector detectorOfType:CIDetectorTypeFace 
                                          context:... options:...];

// Build the transform used for the coordinate conversion (flip vertically)
CGAffineTransform transform = CGAffineTransformMakeScale(1, -1);
transform = CGAffineTransformTranslate(transform,
                                    0, -imageView.bounds.size.height);

// Detect faces
NSArray* features = [detector featuresInImage:image];
for(CIFaceFeature* faceFeature in features) {

  // Convert the face rectangle to top-left-origin coordinates
  const CGRect faceRect = CGRectApplyAffineTransform(faceFeature.bounds, transform);

  UIView* faceView = [[UIView alloc] initWithFrame:faceRect];
  ...
}
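
The effect of that transform on a rectangle can be checked with a small sketch in plain C. Rect and flip_vertically are hypothetical names standing in for CGRect and the affine transform; containerHeight plays the role of imageView.bounds.size.height. A point (x, y) maps to (x, H - y), so the rectangle's new origin is the flipped position of its old top edge.

```c
/* Hypothetical stand-in for CGRect. */
typedef struct { double x, y, width, height; } Rect;

/* Convert a rectangle between bottom-left-origin and top-left-origin
   coordinates inside a container of the given height.
   The size is unchanged; only the y origin moves. */
static Rect flip_vertically(Rect r, double containerHeight) {
    Rect out = { r.x, containerHeight - (r.y + r.height), r.width, r.height };
    return out;
}
```

Applying the function twice returns the original rectangle, which is a quick sanity check that the same code converts in both directions.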

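One thing to watch: the detected coordinates are in the image's own pixel coordinates. If you want to mark the detected face rectangle on your own view, like the red face box below:

redrectangle.png

keep in mind that the imageView displaying the picture may have scaled the image, so the detected coordinates must be adjusted by the scale factor. Sample code: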

self.widthScale = imageSize.width / imageViewSize.width;
self.heigthScale = imageSize.height / imageViewSize.height;

CGRect rect = CGRectMake(feature.bounds.origin.x / self.widthScale, feature.bounds.origin.y / self.heigthScale, feature.bounds.size.width / self.widthScale, feature.bounds.size.height / self.heigthScale);
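
The scaling can be sketched as a standalone function in plain C. Rect and image_rect_to_view are hypothetical names, and the sketch assumes the image is stretched to fill the view exactly (no aspect-fit letterboxing), so each axis has a single scale factor.

```c
/* Hypothetical stand-in for CGRect. */
typedef struct { double x, y, width, height; } Rect;

/* Map a rectangle from image pixel coordinates to view coordinates,
   assuming the image is stretched to fill the whole view. */
static Rect image_rect_to_view(Rect r, double imageWidth, double imageHeight,
                               double viewWidth, double viewHeight) {
    double widthScale = imageWidth / viewWidth;    /* image pixels per view point, x axis */
    double heightScale = imageHeight / viewHeight; /* image pixels per view point, y axis */
    Rect out = { r.x / widthScale, r.y / heightScale,
                 r.width / widthScale, r.height / heightScale };
    return out;
}
```

For example, a 1000 x 2000 image shown in a 250 x 500 view has a scale of 4 on both axes, so a face at (100, 200, 400, 800) in the image lands at (25, 50, 100, 200) in the view.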

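1.2 OpenGL coordinates

The OpenGL pipeline involves many coordinate spaces, such as world, view, clip, and screen coordinates; the one we need is screen coordinates. Only when we know a pixel's screen coordinates can we compare it against the detected face region and decide whether to process that pixel. After some research I found that OpenGL provides gl_FragCoord, which gives the x and y coordinates of the current pixel in the render target (window coordinates), which is exactly what we need.

So the coordinate handling breaks down into these steps:
* Step 1: get the face rectangle from the CIFeature.
* Step 2: treat the rectangle's coordinates as coordinates in a top-left-origin system.
* Step 3: in the fragment shader, test whether the current point lies inside the rectangle. Importantly, the rectangle's coordinates do not need the scale-factor adjustment described above.

The comparison code in the shader is: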

 void main()
 {
     lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     // Coordinate comparison
     if(gl_FragCoord.x < (mask.x + mask.z) && gl_FragCoord.y < (mask.y + mask.w) && gl_FragCoord.x > mask.x && gl_FragCoord.y > mask.y) {
         gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w);
     }else {
         gl_FragColor = textureColor;
     }
 }

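2. Video frame processing

For a still image, adding a local filter comes down to the simple steps below:

localProcess.png

For video, however, we have to work out how to get a CIImage for every frame, detect the face rectangle in real time, and pass it to the filter. The following sections describe how.

2.1 Getting each video frame

Reading the GPUImage source, I found that GPUImageVideoCamera declares a delegate method: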

@protocol GPUImageVideoCameraDelegate 

@optional
- (void)willOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer;
@end

Next, look at where this method is called:

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
  .....
        runAsynchronouslyOnVideoProcessingQueue(^{
             ....
            if (self.delegate)
            {
                [self.delegate willOutputSampleBuffer:sampleBuffer];
            }
            ...
        });
   .....
}

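The method above is the AVCaptureVideoDataOutputSampleBufferDelegate callback of AVCaptureVideoDataOutput. Once the camera is running, it is called for every captured frame. We can therefore use willOutputSampleBuffer: to get each video frame as a sampleBuffer, which solves the problem of obtaining the frames.

2.2 Converting a video frame to a CIImage

From the sampleBuffer we can build the corresponding CIImage: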

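 CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
 CFDictionaryRef attachments = CMCopyDictionaryOfAttachments(kCFAllocatorDefault, sampleBuffer, kCMAttachmentMode_ShouldPropagate);
    
    // The image taken from the frame is rotated 90 degrees to the left relative to what the camera shows, so take care with the later coordinate conversion.
 CIImage *convertedImage = [[CIImage alloc] initWithCVPixelBuffer:pixelBuffer options:(__bridge NSDictionary *)attachments];
 // CMCopyDictionaryOfAttachments returns an owned dictionary, so release it once the CIImage has been created.
 if (attachments) {
     CFRelease(attachments);
 }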

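Note in particular that, compared with what we see on the phone, the resulting CIImage is rotated 90 degrees to the left, as shown below:

videoCIImage.png

So the face detection and coordinate conversion that follow need special care.

2.3 Face detection

As noted above, the image is rotated 90 degrees to the left, so we must set CIDetectorImageOrientation correctly when detecting faces; otherwise detection fails. The code: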

    NSDictionary *imageOptions = nil;
    UIDeviceOrientation curDeviceOrientation = [[UIDevice currentDevice] orientation];
    int exifOrientation;
    enum {
        PHOTOS_EXIF_0ROW_TOP_0COL_LEFT          = 1, //   1  =  0th row is at the top, and 0th column is on the left (THE DEFAULT).
        PHOTOS_EXIF_0ROW_TOP_0COL_RIGHT         = 2, //   2  =  0th row is at the top, and 0th column is on the right.
        PHOTOS_EXIF_0ROW_BOTTOM_0COL_RIGHT      = 3, //   3  =  0th row is at the bottom, and 0th column is on the right.
        PHOTOS_EXIF_0ROW_BOTTOM_0COL_LEFT       = 4, //   4  =  0th row is at the bottom, and 0th column is on the left.
        PHOTOS_EXIF_0ROW_LEFT_0COL_TOP          = 5, //   5  =  0th row is on the left, and 0th column is the top.
        PHOTOS_EXIF_0ROW_RIGHT_0COL_TOP         = 6, //   6  =  0th row is on the right, and 0th column is the top.
        PHOTOS_EXIF_0ROW_RIGHT_0COL_BOTTOM      = 7, //   7  =  0th row is on the right, and 0th column is the bottom.
        PHOTOS_EXIF_0ROW_LEFT_0COL_BOTTOM       = 8  //   8  =  0th row is on the left, and 0th column is the bottom.
    };
    
    BOOL isUsingFrontFacingCamera = NO;
    
    AVCaptureDevicePosition currentCameraPosition = [self.camera cameraPosition];
    
    if (currentCameraPosition != AVCaptureDevicePositionBack) {
        isUsingFrontFacingCamera = YES;
    }
    
    switch (curDeviceOrientation) {
        case UIDeviceOrientationPortraitUpsideDown:
            exifOrientation = PHOTOS_EXIF_0ROW_LEFT_0COL_BOTTOM;
            break;
        case UIDeviceOrientationLandscapeLeft:
            if (isUsingFrontFacingCamera) {
                exifOrientation = PHOTOS_EXIF_0ROW_BOTTOM_0COL_RIGHT;
            }else {
                exifOrientation = PHOTOS_EXIF_0ROW_TOP_0COL_LEFT;
            }
            break;
        case UIDeviceOrientationLandscapeRight:
            if (isUsingFrontFacingCamera) {
                exifOrientation = PHOTOS_EXIF_0ROW_TOP_0COL_LEFT;
            }else {
                exifOrientation = PHOTOS_EXIF_0ROW_BOTTOM_0COL_RIGHT;
            }
            
            break;
        default:
            exifOrientation = PHOTOS_EXIF_0ROW_RIGHT_0COL_TOP; // Value 6: origin at the top right, with y along the horizontal axis and x along the vertical axis
            break;
    }
    
    // exifOrientation tells the detector the orientation of the image
    imageOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:exifOrientation] forKey:CIDetectorImageOrientation];
    
    NSArray *features = [self.faceDetector featuresInImage:convertedImage options:imageOptions];
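
The orientation selection above reduces to a small lookup, sketched here in plain C so the mapping can be checked in isolation. DeviceOrientation and exif_orientation_for are hypothetical names; the enum is a stand-in for UIDeviceOrientation, and the returned integers are the EXIF values from the enum in the snippet above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the UIDeviceOrientation cases used above. */
typedef enum {
    ORIENT_PORTRAIT,
    ORIENT_PORTRAIT_UPSIDE_DOWN,
    ORIENT_LANDSCAPE_LEFT,
    ORIENT_LANDSCAPE_RIGHT
} DeviceOrientation;

/* Return the EXIF orientation value (1..8) to pass as CIDetectorImageOrientation,
   following the same switch as the snippet above. */
static int exif_orientation_for(DeviceOrientation orientation, bool usingFrontCamera) {
    switch (orientation) {
        case ORIENT_PORTRAIT_UPSIDE_DOWN:
            return 8; /* 0th row on the left, 0th column at the bottom */
        case ORIENT_LANDSCAPE_LEFT:
            return usingFrontCamera ? 3 : 1;
        case ORIENT_LANDSCAPE_RIGHT:
            return usingFrontCamera ? 1 : 3;
        default:
            return 6; /* portrait and everything else: 0th row on the right, 0th column at the top */
    }
}
```

In portrait, the common case for the demo, the function returns 6 for both cameras, which matches the 90-degree rotation of the captured frame described earlier.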

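2.4 Converting face coordinates

The detected face coordinates are still in a bottom-left-origin coordinate system, as shown below:

hen.png

On the CIImage taken from a video frame, however, those coordinates refer to an image rotated 90 degrees to the left. What we actually display is that image rotated 90 degrees to the right, back to upright, as shown below:

videFrameRotate.png

Comparing the two, the CoreImage origin now coincides with the origin of the familiar UIKit coordinate system; only the x and y axes have traded places. So swapping the x and y coordinates, and swapping width and height, completes the conversion. Sample code: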

  for (CIFeature *feature in featureArray) {
      CGRect faceRect = feature.bounds;
      CGFloat temp = faceRect.size.width;
      faceRect.size.width = faceRect.size.height; // swap width and height
      faceRect.size.height = temp;

      temp = faceRect.origin.x;
      faceRect.origin.x = faceRect.origin.y;
      faceRect.origin.y = temp;
      ....
   }
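
The swap in the loop above amounts to transposing the rectangle. As a minimal sketch in plain C, with Rect and transpose_rect as hypothetical names standing in for CGRect and the loop body:

```c
/* Hypothetical stand-in for CGRect. */
typedef struct { double x, y, width, height; } Rect;

/* Exchange the x and y origin and the width and height, which converts a
   rectangle detected on the rotated frame to the upright display orientation. */
static Rect transpose_rect(Rect r) {
    Rect out = { r.y, r.x, r.height, r.width };
    return out;
}
```

A face detected at (10, 20) with size 30 x 40 on the rotated frame therefore ends up at (20, 10) with size 40 x 30.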

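Once we have the coordinates, we pass them to the custom filter; the comparison code in the filter was shown above.

Possible improvements

Detection of multiple faces is not implemented.
CPU usage is too high while the filter is applied to faces in video and needs optimizing.

References

Shaders

CoreImage and UIKit coordinates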