Encoding H.264 with VideoToolbox

Using VideoToolbox hardware encoding to produce H.264

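A brief introduction to H.264: H.264 is a video compression standard that adopts a number of techniques to improve image quality and increase the compression ratio. It can be used for SDTV, HDTV, DVD and similar applications.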

H.264 structure diagram (Figure 1)

This article shows how to create a VideoToolbox compression session and use it to encode an H.264 stream.

In this article we use the AVFoundation framework to capture raw video frames from the camera, then hand them to VideoToolbox, which compresses the raw frames into H.264.

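Here we create a SystemCaptureManager class that wraps AVCaptureSession. It uses AVCaptureDeviceInput to access the iPhone's front and back cameras and the audio device, and receives the raw video stream through the delegate method of AVCaptureVideoDataOutput (audio samples normally arrive the same way through AVCaptureAudioDataOutput).
The captured audio and video data are then delivered through the following delegate callback: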

@protocol SystemCaptureManagerDelegate <NSObject>
@optional

- (void)captureSampleBuffer:(CMSampleBufferRef)sampleBuffer type:(SystemCaptureType)type;

@end

The overall flow is shown in the diagram below.

Raw stream to H.264 (Figure 2)

1. Create and initialize the VideoToolbox session

      OSStatus status = VTCompressionSessionCreate(kCFAllocatorDefault, (int32_t)_videoConfig.width, (int32_t)_videoConfig.height, kCMVideoCodecType_H264, NULL, NULL, NULL, VideoEncodeCallback, (__bridge void * _Nullable)(self), &_vtSession);
      if (status != noErr) {
          NSLog(@"VTCompressionSession create failed. status=%d", (int)status);
          return self;
      }
      // Configure the encoder properties
      // Request real-time encoding
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
      NSLog(@"VTSessionSetProperty: set RealTime return: %d", (int)status);
      
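      // Set the profile and level of the encoded bitstream. Live streaming usually uses Baseline,
      // which has no B-frames and so avoids the latency that B-frame reordering introduces.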
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);
      NSLog(@"VTSessionSetProperty: set profile return: %d", (int)status);
      
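      // Set the average bit rate in bits per second. This is a target, not a hard limit, so the
      // bit rate may peak above it. The default of 0 lets the encoder decide the size of the
      // compressed data. Bit-rate settings only take effect when source frames carry timing information.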
      CFNumberRef bit = (__bridge CFNumberRef)@(_videoConfig.bitRate);
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_AverageBitRate, bit);
      NSLog(@"VTSessionSetProperty: set AverageBitRate return: %d", (int)status);
      
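      // Hard data-rate limit. The array holds (bytes, seconds) pairs, not a min/max bit rate.
      // Here: at most 1.5x the average bit rate, converted to bytes, in any 1-second window
      // (the 1.5x headroom is an example value; tune it for your use case).
      CFArrayRef limits = (__bridge CFArrayRef)@[@(_videoConfig.bitRate * 3 / 2 / 8), @(1)];
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_DataRateLimits, limits);
      NSLog(@"VTSessionSetProperty: set DataRateLimits return: %d", (int)status);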
      
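      // Set the maximum keyframe interval (GOP size) in frames. Using the fps value gives roughly
      // one keyframe per second; too large a GOP can degrade picture quality.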
      CFNumberRef maxKeyFrameInterval = (__bridge CFNumberRef)@(_videoConfig.fps);
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, maxKeyFrameInterval);
      NSLog(@"VTSessionSetProperty: set MaxKeyFrameInterval return: %d", (int)status);
      
      // Set the expected frame rate (a hint to the encoder, not a guarantee)
      CFNumberRef expectedFrameRate = (__bridge CFNumberRef)@(_videoConfig.fps);
      status = VTSessionSetProperty(_vtSession, kVTCompressionPropertyKey_ExpectedFrameRate, expectedFrameRate);
      NSLog(@"VTSessionSetProperty: set ExpectedFrameRate return: %d", (int)status);
      
      // Prepare the session for encoding
      status = VTCompressionSessionPrepareToEncodeFrames(_vtSession);
      NSLog(@"VTCompressionSessionPrepareToEncodeFrames return: %d", (int)status);
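
The two bit-rate properties above use different units: kVTCompressionPropertyKey_AverageBitRate is expressed in bits per second, whereas kVTCompressionPropertyKey_DataRateLimits expects pairs of (byte count, duration in seconds). The plain C sketch below shows only the unit conversion; the helper name and the idea of a percentage headroom are illustrative assumptions, not part of VideoToolbox or of the demo.

```c
#include <stdint.h>

/* Hypothetical helper: the largest number of bytes allowed in any window of
 * `window_seconds`, given an average bit rate in bits per second and a
 * headroom in percent (150 means peaks of up to 1.5x the average). */
static int64_t data_rate_limit_bytes(int64_t avg_bits_per_second,
                                     int64_t window_seconds,
                                     int64_t headroom_percent)
{
    int64_t bits = avg_bits_per_second * window_seconds;
    /* Apply the headroom, then convert bits to bytes (integer division). */
    return bits * headroom_percent / 100 / 8;
}
```

For a 1 Mbit/s average with 1.5x headroom over one second, the pair passed to the session would be the value returned by this helper and the number 1.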

2. Encoding

For encoding, the demo offers two ways to feed the encoder: the first takes a CMSampleBufferRef, the second takes YUV data as NSData.
Both ultimately call VTCompressionSessionEncodeFrame() to send a CVImageBufferRef video frame to the encoder.

2.1 Video frames as CMSampleBufferRef

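Method 1 is straightforward: take the raw frame (a CVImageBufferRef) out of the CMSampleBufferRef and pass it to the encoder together with its presentation timestamp (PTS) and duration.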

- (void)encodeVideoSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    CFRetain(sampleBuffer);
    dispatch_async(_encodeQueue, ^{
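        // The frame's pixel data
        CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
        // Frame counter
        self.frameID++;
        // Presentation timestamp: reuse the capture timestamp so frames are spaced in real time.
        // (CMTimeMake(self.frameID, 1000) would place frames only 1 ms apart and mislead rate control.)
        CMTime timeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        // Duration (unknown)
        CMTime duration = kCMTimeInvalid;
        // Encode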
        VTEncodeInfoFlags flags;
        OSStatus status = VTCompressionSessionEncodeFrame(_vtSession, imageBuffer, timeStamp, duration, NULL, NULL, &flags);
        if (status != noErr) {
            NSLog(@"VTCompression: encode failed: status=%d",(int)status);
        }
        CFRelease(sampleBuffer);
    });
    
}
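
The encoder relies on the presentation timestamps for rate control, so consecutive frames should be about 1/fps apart. When frames carry no capture timestamp, one can be derived from a frame counter. The sketch below is plain C and assumes a constant frame rate; the function name is hypothetical and does not appear in the demo.

```c
#include <stdint.h>

/* Presentation time of frame number `frame_index` in milliseconds, assuming a
 * constant frame rate. Returns -1 for a non-positive frame rate. */
static int64_t pts_ms_for_frame(int64_t frame_index, int32_t fps)
{
    if (fps <= 0) {
        return -1;
    }
    /* Multiply before dividing so the rounding error does not accumulate. */
    return frame_index * 1000 / fps;
}
```

The result can be wrapped as CMTimeMake(ptsMs, 1000), that is, a value on a millisecond timescale.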

2.2 YUV data as NSData

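Method 2 suits cases where frames arrive as raw bytes, for example from the network or from another API.
Here we simulate NSData that contains YUV video data obtained from such an interface.
Because we know this data is bi-planar YUV 4:2:0, that is NV12 (the article calls it YUV420f; the code below creates the video-range variant, which has the same layout),
we also know how the data is laid out in memory (see the figure).
Readers unfamiliar with these formats can refer to the article "Common image pixel formats: NV12, NV21, I420, YV12, YUYV".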

[Figure: NV12 memory layout]

The code is explained in its comments; experienced readers can skip it.

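- (void)encodeYUVData:(NSData *)YUVData {
    
    // Rebuild a CVPixelBuffer from the raw NV12 bytes and hand it to VideoToolbox.
    size_t pixelWidth = self.videoConfig.width;   // frame width in pixels
    size_t pixelHeight = self.videoConfig.height; // frame height in pixels
    size_t y_size = pixelWidth * pixelHeight;     // Y plane: one byte per pixel
    size_t uv_size = y_size / 4;                  // samples per chroma channel (Cb or Cr)
    if (YUVData.length < y_size + uv_size * 2) {
        NSLog(@"encode video: YUV data is shorter than one frame");
        return;
    }
    
    CVPixelBufferRef pixelBuf = NULL;
    CVReturn ret = CVPixelBufferCreate(NULL, pixelWidth, pixelHeight, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, NULL, &pixelBuf);
    if (ret != kCVReturnSuccess || pixelBuf == NULL) {
        NSLog(@"encode video create pixel buffer failed");
        return;
    }
    if (CVPixelBufferLockBaseAddress(pixelBuf, 0) != kCVReturnSuccess) {
        NSLog(@"encode video lock base address failed");
        CFRelease(pixelBuf); // do not leak the buffer on the error path
        return;
    }
    const uint8_t *yuv_frame = (const uint8_t *)YUVData.bytes;
    
    // Copy the Y plane row by row: the buffer's rows may be padded beyond the frame width.
    uint8_t *y_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuf, 0);
    size_t y_stride = CVPixelBufferGetBytesPerRowOfPlane(pixelBuf, 0);
    for (size_t row = 0; row < pixelHeight; row++) {
        memcpy(y_frame + row * y_stride, yuv_frame + row * pixelWidth, pixelWidth);
    }
    
    // Copy the interleaved CbCr plane: height / 2 rows of `width` bytes each.
    uint8_t *uv_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuf, 1);
    size_t uv_stride = CVPixelBufferGetBytesPerRowOfPlane(pixelBuf, 1);
    for (size_t row = 0; row < pixelHeight / 2; row++) {
        memcpy(uv_frame + row * uv_stride, yuv_frame + y_size + row * pixelWidth, pixelWidth);
    }
    CVPixelBufferUnlockBaseAddress(pixelBuf, 0);
    
    // Timestamp in milliseconds. It must increase from frame to frame; self.timestamp is
    // assumed to be advanced elsewhere (the commented-out formula derives it from the frame count).
    uint32_t ptsMs = self.timestamp + 1; //self.vFrameCount++ * 1000.f / self.videoConfig.fps;
    
    // Hardware-encode the pixel buffer
    CMTime pts = CMTimeMake(ptsMs, 1000);
    OSStatus status = VTCompressionSessionEncodeFrame(_vtSession, pixelBuf, pts, kCMTimeInvalid, NULL, pixelBuf, NULL);
    if (status != noErr) {
        NSLog(@"VTCompression: encode YUV frame failed: status=%d", (int)status);
    }
    CFRelease(pixelBuf);
}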
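
The rows of a CVPixelBuffer plane can be wider than the frame (the bytes-per-row value may include padding), so a tightly packed NV12 frame is safest to copy row by row. The plain C sketch below illustrates that copy without any Apple framework. Both function names are hypothetical, and width and height are assumed to be even.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size in bytes of a tightly packed NV12 frame: a full-resolution Y plane
 * followed by one interleaved CbCr plane at half resolution in each axis. */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height + 2 * ((width / 2) * (height / 2));
}

/* Copy a tightly packed NV12 frame into two destination planes whose rows
 * are y_stride and uv_stride bytes apart (each stride must be >= width). */
static void copy_nv12_to_planes(const uint8_t *src, size_t width, size_t height,
                                uint8_t *dst_y, size_t y_stride,
                                uint8_t *dst_uv, size_t uv_stride)
{
    const uint8_t *src_uv = src + width * height; /* CbCr follows the Y plane */

    for (size_t row = 0; row < height; row++) {
        memcpy(dst_y + row * y_stride, src + row * width, width);
    }
    /* The CbCr plane has height / 2 rows; each row holds width / 2 Cb/Cr
     * pairs, which is again `width` bytes. */
    for (size_t row = 0; row < height / 2; row++) {
        memcpy(dst_uv + row * uv_stride, src_uv + row * width, width);
    }
}
```

With a CVPixelBuffer, the two strides would come from CVPixelBufferGetBytesPerRowOfPlane for plane 0 and plane 1.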

3. Obtaining the SPS and PPS

The first NALU in an H.264 stream is the SPS (Sequence Parameter Set).
The second NALU in an H.264 stream is the PPS (Picture Parameter Set).
The third NALU in an H.264 stream is an IDR (Instantaneous Decoder Refresh) frame.
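
Each NALU begins with a one-byte header whose low five bits hold the NAL unit type: 7 for an SPS, 8 for a PPS and 5 for an IDR slice. A minimal C sketch for classifying a NALU by that byte; the enum and function names are chosen here for illustration only.

```c
#include <stdint.h>

/* NAL unit type values defined by the H.264 specification. */
enum {
    NAL_TYPE_NON_IDR_SLICE = 1,
    NAL_TYPE_IDR_SLICE     = 5,
    NAL_TYPE_SPS           = 7,
    NAL_TYPE_PPS           = 8
};

/* The NAL unit type is stored in the low 5 bits of the header byte. */
static int nal_unit_type(uint8_t header_byte)
{
    return header_byte & 0x1F;
}

/* nal_ref_idc (bits 5 and 6) is non-zero when other pictures may reference this NALU. */
static int nal_ref_idc(uint8_t header_byte)
{
    return (header_byte >> 5) & 0x03;
}
```

The header bytes 0x67, 0x68 and 0x65 that are commonly seen at the start of a stream decode to SPS, PPS and IDR slice respectively.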

H.264 structure diagram (Figure 1)

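As the figure shows, the SPS and PPS are essential: a decoder cannot decode any frame without them. After encoding, we therefore store the SPS and PPS as properties.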

Reference code for retrieving the SPS and PPS:

        size_t spsSize, spsCount;
        size_t ppsSize, ppsCount;
        const uint8_t *spsData, *ppsData;
        // Get the format description of the encoded sample (it carries the parameter sets)
        CMFormatDescriptionRef formatDesc = CMSampleBufferGetFormatDescription(sampleBuffer);
        OSStatus status1 = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDesc, 0, &spsData, &spsSize, &spsCount, 0);
        OSStatus status2 = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDesc, 1, &ppsData, &ppsSize, &ppsCount, 0);

4. Extracting the NALUs

// Get the NALU data
    size_t lengthAtOffset, totalLength;
    char *dataPoint;
    
    // Get a pointer to the encoded bytes (no copy is made)
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    OSStatus error = CMBlockBufferGetDataPointer(blockBuffer, 0, &lengthAtOffset, &totalLength, &dataPoint);
    if (error != kCMBlockBufferNoErr) {
        NSLog(@"VideoEncodeCallback: get datapoint failed, status = %d", (int)error);
        return;
    }
    
    // Walk the buffer one NALU at a time
    size_t offset = 0;
    // Each returned NALU is prefixed not by the 00 00 00 01 start code but by its
    // length, stored as a 4-byte big-endian integer (AVCC format)
    const size_t lengthInfoSize = 4;
    
    while (offset + lengthInfoSize < totalLength) {
        uint32_t naluLength = 0;
        // Read the NALU length
        memcpy(&naluLength, dataPoint + offset, lengthInfoSize);
        // Convert from big-endian to host byte order
        naluLength = CFSwapInt32BigToHost(naluLength);
        // Stop if the length field points past the end of the buffer
        if (naluLength > totalLength - offset - lengthInfoSize) {
            break;
        }
        // Build the output: the start code (startCode holds 00 00 00 01 and is defined
        // elsewhere in the encoder) followed by the NALU payload
        NSMutableData *data = [NSMutableData dataWithCapacity:4 + naluLength];
        [data appendBytes:startCode length:4];
        [data appendBytes:dataPoint + offset + lengthInfoSize length:naluLength];
        
        // Deliver the NALU to the delegate
        dispatch_async(encoder.callbackQueue, ^{
            [encoder.delegate videoEncodeCallback:data];
        });
        
        // Advance to the next NALU
        offset += lengthInfoSize + naluLength;
    }
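
Because the 4-byte length prefix and the 4-byte start code have the same size, a length-prefixed (AVCC) buffer can also be converted to Annex B in place, without allocating a new buffer per NALU. The following plain C sketch illustrates that alternative; it is not the demo's approach and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replace every 4-byte big-endian length prefix in `buf` with the Annex B
 * start code 00 00 00 01. Returns the number of NALUs found, or -1 if the
 * buffer is malformed (in which case earlier NALUs may already be rewritten). */
static int avcc_to_annexb_in_place(uint8_t *buf, size_t size)
{
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};
    size_t offset = 0;
    int count = 0;

    while (offset + 4 <= size) {
        /* Assemble the big-endian length byte by byte (independent of host endianness). */
        uint32_t nalu_len = ((uint32_t)buf[offset] << 24) |
                            ((uint32_t)buf[offset + 1] << 16) |
                            ((uint32_t)buf[offset + 2] << 8) |
                            (uint32_t)buf[offset + 3];
        if (nalu_len > size - offset - 4) {
            return -1; /* the length field points past the end of the buffer */
        }
        memcpy(buf + offset, start_code, sizeof start_code);
        offset += 4 + (size_t)nalu_len;
        count++;
    }
    /* Leftover bytes that do not form a complete prefix also mean bad input. */
    return offset == size ? count : -1;
}
```

In-place conversion only works when the length field is 4 bytes wide, which is what the code above assumes for VideoToolbox output.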

5. Writing to a file

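From the code above we have the SPS and PPS data as well as the NALU payloads.
When writing the file, prefix the SPS, the PPS and every NALU with the start code 0x00 0x00 0x00 0x01, then write the data to a file in the app sandbox.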
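
The write order described above can be sketched in plain C as follows. This is an illustrative sketch built on standard C file I/O, not the demo's FileManager code; both function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Write one NALU preceded by the 4-byte start code.
 * Returns 0 on success, -1 on an I/O error. */
static int write_annexb_nalu(FILE *fp, const uint8_t *nalu, size_t len)
{
    static const uint8_t start_code[4] = {0x00, 0x00, 0x00, 0x01};

    if (fwrite(start_code, 1, sizeof start_code, fp) != sizeof start_code) {
        return -1;
    }
    if (len > 0 && fwrite(nalu, 1, len, fp) != len) {
        return -1;
    }
    return 0;
}

/* Write a keyframe: parameter sets first (SPS, then PPS), then the IDR NALU. */
static int write_keyframe(FILE *fp,
                          const uint8_t *sps, size_t sps_len,
                          const uint8_t *pps, size_t pps_len,
                          const uint8_t *idr, size_t idr_len)
{
    if (write_annexb_nalu(fp, sps, sps_len) != 0) return -1;
    if (write_annexb_nalu(fp, pps, pps_len) != 0) return -1;
    return write_annexb_nalu(fp, idr, idr_len);
}
```

Writing the parameter sets before every keyframe, rather than once at the start of the file, is a common choice because it lets a player start decoding from any keyframe.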

Here we define a FileManager class that creates an xxx.h264 file in the sandbox; the H.264 data produced by VideoToolbox is saved into this file.


-(FILE *)h264_file {
    if (!_h264_file) {
        // File manager (used to write the H.264 file)
        self.fileManager = [[FileManager alloc]init];
        NSString *fileName =  [self.fileManager createRandomMediaTypeName:MEDIA_TYPE_H264];
        NSString *filePath = [self.fileManager createFileWithFileName:fileName];
        const char *h264File = filePath.UTF8String;
        FILE *h264_file = fopen(h264File, "wb");
        _h264_file = h264_file;
        NSLog(@"h264 file path %@",filePath);
    }
    return _h264_file;
}

Finally, the file is saved successfully. Be sure to write the SPS and PPS before the other NALUs; otherwise the file will not play correctly.

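Then copy the H.264 file out of the sandbox and play it with ffplay.
ffplay is installed as part of ffmpeg, and Homebrew must be installed beforehand.
The install command is: brew install ffmpeg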


Source code: https://github.com/hunter858/OpenGL_Study/AVFoundation/VideoToolBox-encoder
