Pitfalls of hardware-decoding H.264 video on iOS

Over the last couple of days I have been writing an iOS demo that uses VideoToolbox to hardware-decode raw H.264 NALU data (an elementary stream) received in real time over the network.

The raw network data looks like this:


[Figures 1 and 2: hex dumps of the raw NALU packets received from the network]

The subsequent raw data frames follow the same pattern.

(1) The first, seemingly natural approach:
Fetch raw data in a loop -----> split each packet into individual NALUs and feed them to the hardware decoder one at a time (the leading four bytes of each NALU must be rewritten:
the 00 00 00 01 start code becomes the big-endian length of the NALU payload, excluding those four header bytes)
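The four-byte rewrite described above can be sketched in plain C (`annexb_to_avcc_header` is a hypothetical helper name, written here only for illustration):

```c
#include <stdint.h>

/* Overwrite a 4-byte Annex B start code (00 00 00 01) with the big-endian
 * length of the NALU payload that follows it -- the AVCC-style framing that
 * VideoToolbox expects. `buf` points at the start code; `nalu_len` is the
 * payload length, excluding these four header bytes. */
static void annexb_to_avcc_header(uint8_t *buf, uint32_t nalu_len)
{
    buf[0] = (uint8_t)(nalu_len >> 24);
    buf[1] = (uint8_t)(nalu_len >> 16);
    buf[2] = (uint8_t)(nalu_len >> 8);
    buf[3] = (uint8_t)(nalu_len);
}
```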

This produced an odd result: in the simulator, playback was essentially correct, with only an occasional ripple artifact that did not affect viewing. On a real device (iPhone 6, iOS 9.3), however, the picture was either entirely green, or half the frame displayed correctly while the other half was green.

The code was roughly as follows:

/** Find the beginning and end of a NAL (Network Abstraction Layer) unit in 
a byte buffer containing H264 bitstream data. 
@param[in]   buf        the buffer 
@param[in]   size       the size of the buffer 
@param[out]  nal_start  the beginning offset of the nal 
@param[out]  nal_end    the end offset of the nal 
@return                 the length of the nal, or 0 if no nal start was found; if the data ends before the next start code, the nal is assumed to end at the end of the buffer */
static int findNalUnit(uint8_t* buf, int size, int* nal_start, int* nal_end)
{
    int i;
    // find start
    *nal_start = 0;
    *nal_end = 0;
    i = 0;
    while (   //( next_bits( 24 ) != 0x000001 && next_bits( 32 ) != 0x00000001 ) 
          (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0x01) &&
          (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0 || buf[i+3] != 0x01)
           )
    { 
        i++; // skip leading zero
        if (i+4 >= size)
        {
            return 0;
        } // did not find nal start
    }
    if  (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0x01) // ( next_bits( 24 ) != 0x000001 )
    {
        i++;
    }
    if  (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0x01)
    {
        /* error, should never happen */
        return 0;
    }
    i+= 3;
    *nal_start = i;
    while (   //( next_bits( 24 ) != 0x000000 && next_bits( 24 ) != 0x000001 )
           (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0) &&
           (buf[i] != 0 || buf[i+1] != 0 || buf[i+2] != 0x01)
           )
    {
        i++;
        // FIXME the next line fails when reading a nal that ends exactly at the end of the data
        if (i+3 >= size) 
        {
            *nal_end = size;
            return (*nal_end - *nal_start);//return -1;
        } // did not find nal end, stream ended first
    }
    *nal_end = i;
    return (*nal_end - *nal_start);
}

- (BOOL)decodeNalu:(uint8_t *)frame withSize:(uint32_t)frameSize {
    // LOGD(@">>>>>>>>>> start decoding");

    if (frame == NULL || frameSize == 0)
        return NO;

    uint8_t* p = frame, *pf;
    size_t sz = frameSize;
    int nal_start, nal_end;

    while (![[NSThread currentThread] isCancelled] && findNalUnit(p, sz, &nal_start, &nal_end) > 0) {
        CVPixelBufferRef pixelBuffer = NULL;
        int nalu_type = p[nal_start] & 0x1f;
        int nal_len = nal_end - nal_start;
        uint8_t *pnal_size = (uint8_t*)(&nal_len);
        //{(uint8_t)(nal_len >> 24), (uint8_t)(nal_len >> 16), (uint8_t)(nal_len >> 8), (uint8_t)nal_len};
        if (nal_start == 3) { // big-endian; NOTE: p[-1] writes one byte BEFORE the buffer when the very first NALU has a 3-byte start code
            p[-1] = *(pnal_size + 3);
            p[0]  = *(pnal_size + 2);
            p[1]  = *(pnal_size + 1);
            p[2]  = *(pnal_size);
            pf = p - 1;
        }
        else if (nal_start == 4) {
            p[0] = *(pnal_size + 3);
            p[1] = *(pnal_size + 2);
            p[2] = *(pnal_size + 1);
            p[3] = *(pnal_size);
            pf = p;
        }
        switch (nalu_type)
        {
            case 0x05:
                LOGD(@"nalu_type:%d Nal type is IDR frame", nalu_type);
                if ([self initH264Decoder]) {
                    pixelBuffer = [self decode:pf withSize:(nal_len + 4)];
                }
                break;
            case 0x07:
                LOGD(@"nalu_type:%d Nal type is SPS", nalu_type);
                if (_sps == NULL) {
                    _spsSize = nal_len;
                    _sps = (uint8_t*)malloc(_spsSize);
                    memcpy(_sps, &pf[4], _spsSize);
                }
                break;
            case 0x08:
                LOGD(@"nalu_type:%d Nal type is PPS", nalu_type);
                if (_pps == NULL) {
                    _ppsSize = nal_len;
                    _pps = (uint8_t*)malloc(_ppsSize);
                    memcpy(_pps, &pf[4], _ppsSize);
                }
                break;
            default:
                LOGD(@"nalu_type:%d Nal type is B/P frame", nalu_type);
                if ([self initH264Decoder]) {
                    pixelBuffer = [self decode:pf withSize:(nal_len + 4)];
                }
                break;
       }
        p += nal_start;
        p += nal_len;
        sz -= nal_end;
    }

    return YES;
}

(2) The alternative approach:
Fetch raw data in a loop -----> repack the NALUs inside each packet into one buffer (replacing each NALU's start code with that NALU's length)

With this approach, both the simulator and the real device display correctly, with no artifacts and no green screen. The lesson: a single frame of received data may be split across multiple NALUs, and it must not be broken apart and decoded one NALU at a time; if it is, the hardware decoder treats each lone NALU as an incomplete frame, which corrupts the picture.
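The repacking step can be sketched in plain C. This is a simplified, hypothetical `annexb_to_avcc` helper written for illustration only; the actual code below uses FFmpeg's start-code scanner and additionally pulls SPS/PPS out of the stream:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Repack one Annex B access unit (possibly several NALUs, each preceded by a
 * 00 00 01 or 00 00 00 01 start code) into AVCC framing: every NALU prefixed
 * by its 4-byte big-endian length. Returns the number of bytes written to
 * `out`, which must be large enough (input size plus one extra byte per
 * 3-byte start code is sufficient). No SPS/PPS special-casing here. */
static size_t annexb_to_avcc(const uint8_t *in, size_t in_len, uint8_t *out)
{
    size_t i = 0, out_len = 0;
    while (i + 3 < in_len) {
        /* detect a 3- or 4-byte start code at position i */
        size_t sc_len = 0;
        if (in[i] == 0 && in[i+1] == 0 && in[i+2] == 1)
            sc_len = 3;
        else if (in[i] == 0 && in[i+1] == 0 && in[i+2] == 0 && in[i+3] == 1)
            sc_len = 4;
        if (!sc_len) { i++; continue; }

        size_t nal_start = i + sc_len, j = nal_start;
        /* the NALU ends at the next start code, or at the end of the buffer */
        while (j + 2 < in_len &&
               !(in[j] == 0 && in[j+1] == 0 && (in[j+2] == 1 ||
                 (j + 3 < in_len && in[j+2] == 0 && in[j+3] == 1))))
            j++;
        size_t nal_len = (j + 2 < in_len) ? j - nal_start : in_len - nal_start;

        out[out_len++] = (uint8_t)(nal_len >> 24);
        out[out_len++] = (uint8_t)(nal_len >> 16);
        out[out_len++] = (uint8_t)(nal_len >> 8);
        out[out_len++] = (uint8_t)(nal_len);
        memcpy(out + out_len, in + nal_start, nal_len);
        out_len += nal_len;
        i = nal_start + nal_len;
    }
    return out_len;
}
```

Feeding the whole repacked buffer to the decoder as one sample keeps all slices of a frame together, which is exactly what eliminated the green screen.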

The code is roughly as follows:

//
//  WBH264Play.m
//  wenba_rtc
//
//  Created by zhouweiwei on 16/11/20.
//  Copyright © 2016 zhouweiwei. All rights reserved.
//

#import <VideoToolbox/VideoToolbox.h>
#import "WBH264Play.h"

#define kH264outputWidth  160
#define kH264outputHeight 120

static const uint8_t *avc_find_startcode_internal(const uint8_t *p, const uint8_t *end)
{
    const uint8_t *a = p + 4 - ((intptr_t)p & 3);
    
    for (end -= 3; p < a && p < end; p++) {
        if (p[0] == 0 && p[1] == 0 && p[2] == 1)
            return p;
    }
    
    for (end -= 3; p < end; p += 4) {
        uint32_t x = *(const uint32_t*)p;
        //      if ((x - 0x01000100) & (~x) & 0x80008000) // little endian
        //      if ((x - 0x00010001) & (~x) & 0x00800080) // big endian
        if ((x - 0x01010101) & (~x) & 0x80808080) { // generic
            if (p[1] == 0) {
                if (p[0] == 0 && p[2] == 1)
                    return p;
                if (p[2] == 0 && p[3] == 1)
                    return p+1;
            }
            if (p[3] == 0) {
                if (p[2] == 0 && p[4] == 1)
                    return p+2;
                if (p[4] == 0 && p[5] == 1)
                    return p+3;
            }
        }
    }
    
    for (end += 3; p < end; p++) {
        if (p[0] == 0 && p[1] == 0 && p[2] == 1)
            return p;
    }
    
    return end + 3;
}

const uint8_t *avc_find_startcode(const uint8_t *p, const uint8_t *end)
{
    const uint8_t *out = avc_find_startcode_internal(p, end);
    if (p < out && out < end && !out[-1])
        out--;
    return out;
}

// NOTE: the opening of this method was mangled in the original post (the HTML
// swallowed everything between '<' and '>'); its signature was roughly:
- (BOOL)open:(uint32_t)uid width:(int)width height:(int)height displayDelegate:(id)displayDelegate {
    
    [self close];

    if (width == 0 || height == 0) {
        _out_width = kH264outputWidth;
        _out_height = kH264outputHeight;
    }
    else {
        _out_width = width;
        _out_height = height;
    }
    _vsize = _out_width * _out_height * 3;
    _vdata = (uint8_t*)malloc(_vsize * sizeof(uint8_t));

    _buf_out = (uint8_t*)malloc(_vsize * sizeof(uint8_t)); // must hold a whole repacked access unit; width*height alone may be too small

    self.delegate = displayDelegate;

    _thread = [[NSThread alloc] initWithTarget:self selector:@selector(run) object:nil];
    //_thread.name = @"Thread";
    [_thread start];

    return YES;
}

- (void)setH264DecoderInterface:(NSObject*)displayDelegate {
    self.delegate = displayDelegate;
}

- (void)run {
    size_t out_size = 0;

    while (![[NSThread currentThread] isCancelled]) {
        /* fetch video data from the network in a loop */
        if (api_video_get(_uid, _vdata, &out_size) == 0 && out_size > 0) {
            if ([self decodeNalu:_vdata withSize:out_size]) {
            }
        }

        [NSThread sleepForTimeInterval:0.005];
    }
}

- (void)stop {
    LOGD(@"uid:%u decoder stop", _uid);

    if (_thread != nil) {
        if (!_thread.isCancelled) {
            [_thread cancel];
            LOGD(@"uid:%u thread cancel", _uid);
        }
    }
    
    LOGD(@"uid:%u decoder stopped", _uid);

    if (_decoderFormatDescription != nil) {
        CFRelease(_decoderFormatDescription);
        _decoderFormatDescription = nil;
    }

    if (_deocderSession != nil) {
        VTDecompressionSessionWaitForAsynchronousFrames(_deocderSession);
        VTDecompressionSessionInvalidate(_deocderSession);
        CFRelease(_deocderSession);
        _deocderSession = nil;
    }

    _uid = 0;

    _out_width = kH264outputWidth;
    _out_height = kH264outputHeight;

    if (_vdata != NULL) {
        free(_vdata);
        _vdata = NULL;
        _vsize = 0;
    }

    if (_sps != NULL) {
        free(_sps);
        _sps = NULL;
        _spsSize = 0;
    }

    if (_pps != NULL) {
        free(_pps);
        _pps = NULL;
        _ppsSize = 0;
    }

    if (_buf_out != NULL) {
        free(_buf_out);
        _buf_out = NULL;
    }

    self.delegate = nil;
}

- (void)close {
    [self stop];
    _thread = nil;

    LOGD(@"uid:%u decoder close", _uid);
}

-(BOOL)initH264Decoder {
    if (_deocderSession) {
        return YES;
    }

    if (!_sps || !_pps || _spsSize == 0 || _ppsSize == 0) {
        return NO;
    }

    const uint8_t* const parameterSetPointers[2] = { _sps, _pps };
    const size_t parameterSetSizes[2] = { _spsSize, _ppsSize };
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                                          2, //param count
                                                                          parameterSetPointers,
                                                                          parameterSetSizes,
                                                                          4, //nal start code size
                                                                          &_decoderFormatDescription);
    if (status == noErr) {
        NSDictionary* destinationPixelBufferAttributes = @{
                                                           (id)kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange]
                                                           // hardware decoding must use kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange or kCVPixelFormatType_420YpCbCr8Planar,
                                                           // because iOS outputs NV12 (other platforms typically use NV21)
                                                           , (id)kCVPixelBufferWidthKey  : [NSNumber numberWithInt:kH264outputWidth]
                                                           , (id)kCVPixelBufferHeightKey : [NSNumber numberWithInt:kH264outputHeight]
                                                           //, (id)kCVPixelBufferBytesPerRowAlignmentKey : [NSNumber numberWithInt:kH264outputWidth*2]
                                                           , (id)kCVPixelBufferOpenGLCompatibilityKey : [NSNumber numberWithBool:NO]
                                                           , (id)kCVPixelBufferOpenGLESCompatibilityKey : [NSNumber numberWithBool:YES]
                                                           };

        VTDecompressionOutputCallbackRecord callBackRecord;
        callBackRecord.decompressionOutputCallback = didDecompress;
        callBackRecord.decompressionOutputRefCon = (__bridge void *)self;

        status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                              _decoderFormatDescription,
                                              NULL,
                                              (__bridge CFDictionaryRef)destinationPixelBufferAttributes,
                                              &callBackRecord,
                                              &_deocderSession);
        VTSessionSetProperty(_deocderSession, kVTDecompressionPropertyKey_ThreadCount, (__bridge CFTypeRef)[NSNumber numberWithInt:1]);
        VTSessionSetProperty(_deocderSession, kVTDecompressionPropertyKey_RealTime, kCFBooleanTrue);
    } else {
        LOGE(@"create video format description failed status=%d", (int)status);
        return NO;
    }
    
    return YES;
}

- (BOOL)resetH264Decoder {
    if(_deocderSession) {
        VTDecompressionSessionWaitForAsynchronousFrames(_deocderSession);
        VTDecompressionSessionInvalidate(_deocderSession);
        CFRelease(_deocderSession);
        _deocderSession = NULL;
    }
    return [self initH264Decoder];
}

- (CVPixelBufferRef)decode:(uint8_t *)frame withSize:(uint32_t)frameSize {
    if (frame == NULL || _deocderSession == nil)
        return NULL;

    CVPixelBufferRef outputPixelBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;

    OSStatus status  = CMBlockBufferCreateWithMemoryBlock(NULL,
                                                          (void *)frame,
                                                          frameSize,
                                                          kCFAllocatorNull,
                                                          NULL,
                                                          0,
                                                          frameSize,
                                                          FALSE,
                                                          &blockBuffer);
    if(status == kCMBlockBufferNoErr) {
        CMSampleBufferRef sampleBuffer = NULL;
        const size_t sampleSizeArray[] = {frameSize};
//        status = CMSampleBufferCreateReady(kCFAllocatorDefault,
//                                           blockBuffer,
//                                           _decoderFormatDescription ,
//                                           1, 0, NULL, 1, sampleSizeArray,
//                                           &sampleBuffer);
        status = CMSampleBufferCreate(NULL, blockBuffer, TRUE, 0, 0, _decoderFormatDescription, 1, 0, NULL, 0, NULL, &sampleBuffer);

        if (status == kCMBlockBufferNoErr && sampleBuffer) {
            VTDecodeFrameFlags flags = 0;
            VTDecodeInfoFlags flagOut = 0;
            status = VTDecompressionSessionDecodeFrame(_deocderSession,
                                                       sampleBuffer,
                                                       flags,
                                                       &outputPixelBuffer,
                                                       &flagOut);

            if (status == kVTInvalidSessionErr) {
                LOGE(@"Invalid session, reset decoder session");
                [self resetH264Decoder];
            } else if(status == kVTVideoDecoderBadDataErr) {
                LOGE(@"decode failed status=%d(Bad data)", status);
            } else if(status != noErr) {
                LOGE(@"decode failed status=%d", status);
            }
        }

        if (sampleBuffer != NULL)
            CFRelease(sampleBuffer);
    }
    if (blockBuffer != NULL)
        CFRelease(blockBuffer);
    
    return outputPixelBuffer;
}

- (BOOL)decodeNalu:(uint8_t *)frame withSize:(uint32_t)frameSize {
    // LOGD(@">>>>>>>>>> start decoding");

    if (frame == NULL || frameSize == 0)
        return NO;

    int size = frameSize;
    const uint8_t *p = frame;
    const uint8_t *end = p + size;
    const uint8_t *nal_start, *nal_end;
    int nal_len, nalu_type;

    size = 0;
    nal_start = avc_find_startcode(p, end);
    while (![[NSThread currentThread] isCancelled]) {
        while (![[NSThread currentThread] isCancelled] && nal_start < end && !*(nal_start++));
        if (nal_start == end)
            break;

        nal_end = avc_find_startcode(nal_start, end);
        nal_len = nal_end - nal_start;
        
        nalu_type = nal_start[0] & 0x1f;
        if (nalu_type == 0x07) {
            if (_sps == NULL) {
                _spsSize = nal_len;
                _sps = (uint8_t*)malloc(_spsSize);
                memcpy(_sps, nal_start, _spsSize);
            }
        }
        else if (nalu_type == 0x08) {
            if (_pps == NULL) {
                _ppsSize = nal_len;
                _pps = (uint8_t*)malloc(_ppsSize);
                memcpy(_pps, nal_start, _ppsSize);
            }
        }
        else {
            _buf_out[size + 0] = (uint8_t)(nal_len >> 24);
            _buf_out[size + 1] = (uint8_t)(nal_len >> 16);
            _buf_out[size + 2] = (uint8_t)(nal_len >> 8 );
            _buf_out[size + 3] = (uint8_t)(nal_len);

            memcpy(_buf_out + 4 + size, nal_start, nal_len);
            size += 4 + nal_len;
        }

        nal_start = nal_end;
    }

    if (size > 0 && [self initH264Decoder]) {
        // decoded frames are delivered through the decompression output callback
        CVPixelBufferRef pixelBuffer = [self decode:_buf_out withSize:size];
        (void)pixelBuffer;
    }

    return size > 0 ? YES : NO;
}

@end
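
An aside on the pixel-format comment in initH264Decoder: NV12 (kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) lays out a full-resolution Y plane followed by one interleaved CbCr plane with half the samples. A quick size calculation, ignoring any CVPixelBuffer row padding (hypothetical helper, for illustration):

```c
#include <stddef.h>

/* Byte size of an NV12 (4:2:0 bi-planar) frame, ignoring row padding:
 * a width*height luma (Y) plane plus an interleaved CbCr plane with half
 * as many bytes (width*height/2), i.e. 1.5 bytes per pixel overall. */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height + (width * height) / 2;
}
```

For the 160x120 output used here, that is 28,800 bytes per frame.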
Note: the video stack on iOS is full of pitfalls; you have to get hands-on experience to really master it, and testing against Apple's official sample code pays off considerably.
The next post is planned to cover the pitfalls of setting the frame rate when hardware-encoding H.264 on iOS. There are essentially no correct answers for this online; for example, when capturing the camera
at 30 fps and setting it to 12 fps, the setting generally has no effect, and there is a trap in there.
Too many people have been messaging me for the demo code; to serve you better, I am open to collaboration,
and I provide integrated streaming solutions covering publishing/pulling, encoding/decoding, playback, live streaming, and VOD.
