Understanding the VideoToolBox Encoder

VTCompressionSession

To encode the frames captured from the camera, use VideoToolBox's encoder, the VTCompressionSession.

The encoder's job is to turn uncompressed CVPixelBuffer data into compressed CMSampleBuffer data.

More precisely, the uncompressed data is a CVPixelBuffer and the compressed data is a CMBlockBuffer; both are carried inside a CMSampleBuffer.

Figure: the difference between compressed and uncompressed data.

Figure: the role of the encoder.

The encoder workflow is as follows: create the session, set its properties, encode frames (handling the output in a callback), complete encoding, and finally invalidate the session.

1. Create the encoder

VTCompressionSessionCreate creates the session; its parameters are:

// OSStatus: the status of the creation call
OSStatus VTCompressionSessionCreate(
    // 1. Allocator; pass NULL for the default
    CFAllocatorRef allocator, 
    // 2. Frame width in pixels
    int32_t width, 
    // 3. Frame height in pixels
    int32_t height, 
    // 4. Codec type; H.264 is common: kCMVideoCodecType_H264
    CMVideoCodecType codecType, 
    // 5. Encoder specification; NULL lets VideoToolBox choose the encoder
    CFDictionaryRef encoderSpecification, 
    // 6. Attributes for the source pixel buffer pool; NULL lets VideoToolBox create it
    CFDictionaryRef sourceImageBufferAttributes, 
    // 7. Allocator for the compressed data; NULL for the default
    CFAllocatorRef compressedDataAllocator, 
    // 8. Output callback, invoked with frames encoded via VTCompressionSessionEncodeFrame
    VTCompressionOutputCallback outputCallback, 
    // 9. Reference value passed through to the callback
    void *outputCallbackRefCon,
    // 10. On output, the created VTCompressionSession
    VTCompressionSessionRef  _Nullable *compressionSessionOut
    );

From the CMSampleBuffer captured by the camera, get the uncompressed CVPixelBuffer (i.e. CVImageBufferRef) it contains, read its pixel width and height, and create a session that will encode it into an H.264 (kCMVideoCodecType_H264) elementary stream:

CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);

VTCompressionSessionRef session;
OSStatus ret = VTCompressionSessionCreate(NULL, (int32_t)width, (int32_t)height, kCMVideoCodecType_H264, NULL, NULL, NULL, OutputCallback, NULL, &session);
if (ret != noErr) {
    NSLog(@"VTCompressionSessionCreate failed: %d", (int)ret);
}

2. Set the encoder's properties

VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
VTSessionSetProperty(session, key: kVTCompressionPropertyKey_ProfileLevel, value: kVTProfileLevel_H264_High_AutoLevel)
VTSessionSetProperty(session, key: kVTCompressionPropertyKey_AllowFrameReordering, value: kCFBooleanFalse) // no B-frames

3. Start encoding

Call VTCompressionSessionEncodeFrame to submit frames for encoding; the encoded output is delivered to the callback.
/*!
    @function   VTCompressionSessionEncodeFrame
    @abstract
        Call this function to present frames to the compression session.
        Encoded frames may or may not be output before the function returns.
    @discussion
        The client should not modify the pixel data after making this call.
        The session and/or encoder will retain the image buffer as long as necessary. 
    @param  session
        The compression session.
    @param  imageBuffer
        A CVImageBuffer containing a video frame to be compressed.  
        Must have a nonzero reference count.
    @param  presentationTimeStamp
        The presentation timestamp for this frame, to be attached to the sample buffer.
        Each presentation timestamp passed to a session must be greater than the previous one.
    @param  duration
        The presentation duration for this frame, to be attached to the sample buffer.  
        If you do not have duration information, pass kCMTimeInvalid.
    @param  frameProperties
        Contains key/value pairs specifying additional properties for encoding this frame.
        Note that some session properties may also be changed between frames.
        Such changes have effect on subsequently encoded frames.
    @param  sourceFrameRefcon
        Your reference value for the frame, which will be passed to the output callback function.
    @param  infoFlagsOut
        Points to a VTEncodeInfoFlags to receive information about the encode operation.
        The kVTEncodeInfo_Asynchronous bit may be set if the encode is (or was) running
        asynchronously.
        The kVTEncodeInfo_FrameDropped bit may be set if the frame was dropped (synchronously).
        Pass NULL if you do not want to receive this information.
*/
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
    CM_NONNULL VTCompressionSessionRef  session,
    CM_NONNULL CVImageBufferRef         imageBuffer,
    CMTime                              presentationTimeStamp,
    CMTime                              duration, // may be kCMTimeInvalid
    CM_NULLABLE CFDictionaryRef         frameProperties,
    void * CM_NULLABLE                  sourceFrameRefcon,
    VTEncodeInfoFlags * CM_NULLABLE     infoFlagsOut )

PTS (presentationTimeStamp): each timestamp passed to the session must be greater than the previous one; after encoding it is attached to the output CMSampleBuffer. The duration, by contrast, may be passed as kCMTimeInvalid if unknown.

let currentTimeMills = Int64(CFAbsoluteTimeGetCurrent() * 1000)
if self.encodingTimeMills == -1 {
    self.encodingTimeMills = currentTimeMills
}
let encodingDuration = currentTimeMills - self.encodingTimeMills
        
let presentationTimestamp = CMTimeMake(value: encodingDuration, timescale: 1000) // PTS of the frame being encoded, in milliseconds
        
VTCompressionSessionEncodeFrame(session, imageBuffer: imageBuffer, presentationTimeStamp: presentationTimestamp, duration: CMTime.invalid, frameProperties: nil, sourceFrameRefcon: nil, infoFlagsOut: nil)
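The timestamp bookkeeping can be reduced to a tiny C sketch (the `PtsClock` type and `next_pts_ms` function are illustrative, not framework APIs): remember the wall-clock time of the first frame in milliseconds, and express each later frame's PTS as milliseconds elapsed since then. Paired with a timescale of 1000, this yields the strictly increasing timestamps the session requires, as long as the clock advances between frames.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the Swift snippet above: the first frame
 * gets PTS 0, later frames get the milliseconds elapsed since the first
 * one (intended for a CMTime with timescale 1000). */
typedef struct {
    int64_t firstFrameMills;    /* -1 until the first frame is seen */
} PtsClock;

static int64_t next_pts_ms(PtsClock *clock, int64_t nowMills)
{
    if (clock->firstFrameMills == -1)
        clock->firstFrameMills = nowMills;
    return nowMills - clock->firstFrameMills;
}
```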

Handling the encoder output callback

After encoding, the result is a compressed CMSampleBuffer, delivered to the output callback:

void OutputCallback(void *outputCallbackRefCon,
                   void *sourceFrameRefCon,
                   OSStatus status,
                   VTEncodeInfoFlags infoFlags,
                   CMSampleBufferRef sampleBuffer)

The callback then has to process this compressed CMSampleBuffer.

The compressed CMSampleBuffer is in MPEG-4 (AVCC) form and must be converted to elementary-stream (Annex B) form before it can be used in either of the two places below.

There are generally two kinds of consumers: a network stream, or a file on disk. Both accept elementary-stream data, so the callback has to perform the conversion.

Take writing to a .mov file as the example:

A CMSampleBuffer holds a single frame, but one frame may contain several NAL units, so the NALUs are copied into NSData objects in a loop. A NALU consists of a start code, a type byte, and the payload data.
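That layout can be decoded with plain bit masks. A minimal C sketch (the function name `nal_unit_type` is illustrative, not a framework API), using the H.264 rule that the low 5 bits of the byte after the start code are the NALU type (5 = IDR key-frame slice, 7 = SPS, 8 = PPS):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An Annex B NALU: 4-byte start code 00 00 00 01, then a 1-byte NAL header.
 * The low 5 bits of that header are nal_unit_type:
 * 5 = IDR (key frame) slice, 7 = SPS, 8 = PPS. */
static int nal_unit_type(const uint8_t *nalu, size_t len)
{
    if (len < 5 || nalu[0] != 0x00 || nalu[1] != 0x00 ||
        nalu[2] != 0x00 || nalu[3] != 0x01)
        return -1;             /* not a 4-byte start code */
    return nalu[4] & 0x1F;     /* low 5 bits of the header byte */
}
```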

When extracting the NALUs, note that the data inside the CMSampleBuffer is in MPEG-4 (AVCC) form: each NALU is preceded by a four-byte header holding the NALU's length in big-endian byte order, which must be byte-swapped and replaced with a start code.
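That length-prefix handling can be isolated into a small self-contained C sketch. It is a hand-rolled stand-in for the callback's `memcpy` + `CFSwapInt32BigToHost` logic below (the function names `read_be32` and `avcc_to_annexb` are illustrative, not framework APIs):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 4-byte big-endian NALU length prefix; a portable equivalent of
 * memcpy into a uint32_t followed by CFSwapInt32BigToHost. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Rewrite an AVCC (length-prefixed) buffer as Annex B (start-code-prefixed).
 * Both prefixes are 4 bytes, so `out` needs the same capacity as inLen.
 * Returns the number of bytes written, or 0 on malformed input. */
static size_t avcc_to_annexb(const uint8_t *in, size_t inLen, uint8_t *out)
{
    static const uint8_t startCode[4] = {0x00, 0x00, 0x00, 0x01};
    size_t pos = 0;
    while (pos + 4 <= inLen) {
        uint32_t nalLen = read_be32(in + pos);
        if (nalLen == 0 || nalLen > inLen - pos - 4)
            return 0;                        /* empty or truncated NALU */
        memcpy(out + pos, startCode, 4);     /* length prefix -> start code */
        memcpy(out + pos + 4, in + pos + 4, nalLen);
        pos += 4 + nalLen;
    }
    return (pos == inLen) ? pos : 0;         /* trailing bytes -> error */
}
```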

In addition, for a key frame (IDR), the SPS and PPS must be extracted from the format description and emitted as NSData as well:
void encodeOutputDataCallback(void * CM_NULLABLE outputCallbackRefCon, void * CM_NULLABLE sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CM_NULLABLE CMSampleBufferRef sampleBuffer)
{
    if (noErr != status || nil == sampleBuffer)
    {
        NSLog(@"VEVideoEncoder::encodeOutputCallback Error : %d!", (int)status);
        return;
    }
    
    if (nil == outputCallbackRefCon)
    {
        return;
    }
    
    if (!CMSampleBufferDataIsReady(sampleBuffer))
    {
        return;
    }
    
    if (infoFlags & kVTEncodeInfo_FrameDropped)
    {
        NSLog(@"VEVideoEncoder::H264 encode dropped frame.");
        return;
    }
    
    VEVideoEncoder *encoder = (__bridge VEVideoEncoder *)outputCallbackRefCon;
    const char header[] = "\x00\x00\x00\x01";
    size_t headerLen = (sizeof header) - 1;
    NSData *headerData = [NSData dataWithBytes:header length:headerLen];
    
    // Check whether this is a key frame: key frames lack the
    // kCMSampleAttachmentKey_NotSync attachment
    bool isKeyFrame = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), (const void *)kCMSampleAttachmentKey_NotSync);
    
    if (isKeyFrame)
    {
        NSLog(@"VEVideoEncoder::encoded a key frame");
        CMFormatDescriptionRef formatDescriptionRef = CMSampleBufferGetFormatDescription(sampleBuffer);
        
        // A key frame must be preceded by its SPS and PPS
        size_t sParameterSetSize, sParameterSetCount;
        const uint8_t *sParameterSet;
        OSStatus spsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDescriptionRef, 0, &sParameterSet, &sParameterSetSize, &sParameterSetCount, 0);
        
        size_t pParameterSetSize, pParameterSetCount;
        const uint8_t *pParameterSet;
        OSStatus ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDescriptionRef, 1, &pParameterSet, &pParameterSetSize, &pParameterSetCount, 0);
        
        if (noErr == spsStatus && noErr == ppsStatus)
        {
            NSData *sps = [NSData dataWithBytes:sParameterSet length:sParameterSetSize];
            NSData *pps = [NSData dataWithBytes:pParameterSet length:pParameterSetSize];
            NSMutableData *spsData = [NSMutableData data];
            [spsData appendData:headerData];
            [spsData appendData:sps];
            if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)])
            {
                [encoder.delegate videoEncodeOutputDataCallback:spsData isKeyFrame:isKeyFrame];
            }
            
            NSMutableData *ppsData = [NSMutableData data];
            [ppsData appendData:headerData];
            [ppsData appendData:pps];
            
            if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)])
            {
                [encoder.delegate videoEncodeOutputDataCallback:ppsData isKeyFrame:isKeyFrame];
            }
        }
    }
    
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    status = CMBlockBufferGetDataPointer(blockBuffer, 0, &length, &totalLength, &dataPointer);
    if (noErr != status)
    {
        NSLog(@"VEVideoEncoder::CMBlockBufferGetDataPointer Error : %d!", (int)status);
        return;
    }
    
    size_t bufferOffset = 0;
    static const int avcHeaderLength = 4;
    // (the original condition `bufferOffset < totalLength - avcHeaderLength`
    // underflows when totalLength < 4, since totalLength is unsigned)
    while (bufferOffset + avcHeaderLength < totalLength)
    {
        // Read the NAL unit length from the 4-byte AVCC prefix
        uint32_t nalUnitLength = 0;
        memcpy(&nalUnitLength, dataPointer + bufferOffset, avcHeaderLength);
        
        // The length prefix is big-endian (network byte order);
        // convert it to host byte order
        nalUnitLength = CFSwapInt32BigToHost(nalUnitLength);
        
        NSData *frameData = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + avcHeaderLength) length:nalUnitLength];
        
        NSMutableData *outputFrameData = [NSMutableData data];
        [outputFrameData appendData:headerData];
        [outputFrameData appendData:frameData];
        
        bufferOffset += avcHeaderLength + nalUnitLength;
        
        if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)])
        {
            [encoder.delegate videoEncodeOutputDataCallback:outputFrameData isKeyFrame:isKeyFrame];
        }
    }
    
}

4. Complete encoding

VTCompressionSessionCompleteFrames forces the encoder to emit all pending frames with a PTS up to and including the one given (pass kCMTimeInvalid to complete all pending frames):

VTCompressionSessionCompleteFrames(session, completeUntilPresentationTimeStamp);

5. Destroy the encoder

if (session) {
    VTCompressionSessionInvalidate(session);
    CFRelease(session);
}
