Encoding AAC with AudioToolbox

What is PCM? What is AAC?
This article shows how to encode PCM into AAC with AudioToolbox and save the result to a file in the app sandbox.

What is PCM?

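PCM stands for Pulse-Code Modulation.
The name is historical, so there is no need to read much into it.
In audio and video work, PCM is a method of representing a sampled analog signal digitally.
Converting an analog audio signal into a digital representation takes three steps: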

Sampling
Quantization
Coding
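
The three steps above can be sketched in plain C. This is only an illustration under stated assumptions: the function names (quantize_s16, encode_s16le, triangle_1hz, pcm_from_signal) are hypothetical and not part of the article's project, the "analog" input is a stand-in triangle wave, and the output is signed 16-bit little-endian PCM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Quantization: map an amplitude in [-1.0, 1.0] to a signed 16-bit level. */
int16_t quantize_s16(double amplitude) {
    if (amplitude > 1.0) amplitude = 1.0;    /* clip out-of-range input */
    if (amplitude < -1.0) amplitude = -1.0;
    double scaled = amplitude * 32767.0;
    /* Round to the nearest level (halves round away from zero). */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Coding: store one level as two bytes, least significant byte first. */
void encode_s16le(int16_t level, uint8_t out[2]) {
    uint16_t bits = (uint16_t)level;
    out[0] = (uint8_t)(bits & 0xFF);
    out[1] = (uint8_t)(bits >> 8);
}

/* Stand-in "analog" signal (an assumption): a 1 Hz triangle wave, t >= 0. */
double triangle_1hz(double t) {
    double phase = t - (double)(long)t;      /* fractional part of t */
    return phase < 0.5 ? 4.0 * phase - 1.0 : 3.0 - 4.0 * phase;
}

/* Sampling: read the signal every 1/sample_rate seconds, then quantize and
 * code each reading. Returns the number of bytes written to out, which must
 * hold at least 2 * sample_count bytes. */
size_t pcm_from_signal(double (*signal)(double), double sample_rate,
                       size_t sample_count, uint8_t *out) {
    assert(signal != NULL && out != NULL && sample_rate > 0.0);
    for (size_t n = 0; n < sample_count; n++) {
        double t = (double)n / sample_rate;
        encode_s16le(quantize_s16(signal(t)), out + 2 * n);
    }
    return 2 * sample_count;
}
```

Calling pcm_from_signal(triangle_1hz, 4.0, 4, buffer) samples the wave four times per second and writes 8 bytes of PCM. A real capture pipeline does the same thing in hardware at rates such as 44100 Hz.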

A continuous analog signal is usually drawn as a curve on a coordinate plane, as shown in the figure below.

image.png

For a more detailed explanation, see the article "PCM数据格式介绍" (an introduction to the PCM data format).

What is AAC?

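AAC, short for Advanced Audio Coding, is a lossy compression format designed for audio data. Unlike MP3, it uses a newer coding algorithm that is more efficient, so it gives better quality for the same size. With AAC, files can be made smaller without a perceptible drop in sound quality.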

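For an introduction to AAC, see the article "音频格式之AAC(高级音频编码技术)".

For a closer look at the format itself, see the article "音频编码之aac编码原理".

The structure is shown in the figure below: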


image.png

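1. Capturing the audio stream

For capture we again use the AVFoundation framework, wrapping AVCaptureSession: add an audio input, set up an output, capture from the audio device, and receive the PCM audio stream through a delegate method.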

@protocol SystemCaptureManagerDelegate <NSObject>
@optional
- (void)captureSampleBuffer:(CMSampleBufferRef)sampleBuffer type:(SystemCaptureType)type;
@end

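2. Creating the AudioToolbox encoder

Create an AudioEncoder class that wraps AudioToolbox and uses an AudioConfig object to control the format of the output audio. The main settings are sample rate, bit depth, channel count, and bitrate.

The AudioToolbox setup looks like this: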

- (void)start {
    // Describe the input: packed, signed 16-bit linear PCM
    AudioStreamBasicDescription inputAudioDes = {0};
    inputAudioDes.mSampleRate = self.audioConfig.sampleRate;
    inputAudioDes.mFormatID = kAudioFormatLinearPCM;
    inputAudioDes.mFormatFlags = kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    inputAudioDes.mChannelsPerFrame = (uint32_t)self.audioConfig.channelCount;
    inputAudioDes.mFramesPerPacket = 1; // uncompressed PCM: one frame per packet
    inputAudioDes.mBitsPerChannel = 16; // 16-bit samples
    inputAudioDes.mBytesPerFrame = inputAudioDes.mBitsPerChannel / 8 * inputAudioDes.mChannelsPerFrame;
    inputAudioDes.mBytesPerPacket = inputAudioDes.mBytesPerFrame * inputAudioDes.mFramesPerPacket;

    // Describe the output: AAC-LC
    AudioStreamBasicDescription outputAudioDes = {0};
    outputAudioDes.mFormatID = kAudioFormatMPEG4AAC;
    outputAudioDes.mFormatFlags = kMPEG4Object_AAC_LC;
    outputAudioDes.mSampleRate = self.audioConfig.sampleRate;
    outputAudioDes.mChannelsPerFrame = (uint32_t)self.audioConfig.channelCount; // channel count
    outputAudioDes.mFramesPerPacket = 1024; // frames per packet; fixed at 1024 for AAC-LC
    outputAudioDes.mBytesPerFrame = 0; // bytes per frame; 0 for compressed formats
    outputAudioDes.mReserved = 0; // padding for 8-byte alignment; must be 0

    // Let AudioToolbox fill in the remaining fields of the output description
    uint32_t outDesSize = sizeof(outputAudioDes);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &outDesSize, &outputAudioDes);
    OSStatus status = AudioConverterNew(&inputAudioDes, &outputAudioDes, &_audioConverter);
    if (status != noErr) {
        NSLog(@"Failed to create the AAC encoder: %d", (int)status);
        return;
    }

    // Set the bitrate
    uint32_t aBitrate = (uint32_t)self.audioConfig.bitRate;
    uint32_t aBitrateSize = sizeof(aBitrate);
    status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, aBitrateSize, &aBitrate);
    if (status != noErr) {
        NSLog(@"Failed to set the AAC bitrate: %d", (int)status);
    }
    pcmBufferSize = 0;
}

2. Encoding PCM data

We encode in two ways here: encoding CMSampleBufferRef data directly, and encoding PCM data held in an NSData (mainly to simulate PCM obtained through an API).

2.1 Encoding a CMSampleBuffer
- (void)encodeAudioSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    __weak typeof(self) weakSelf = self;

    // Get a pointer to the PCM bytes held by the sample buffer
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t inputPCMLength = 0;
    char *inputPcmBuffer = NULL;
    OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &inputPCMLength, &inputPcmBuffer);
    if (status != kCMBlockBufferNoErr) {
        NSLog(@"Failed to get PCM data from blockBuffer: %d", (int)status);
        return;
    }

    // Append the PCM data to the pending buffer and record its new size
    memcpy(pcmBuffer + pcmBufferSize, inputPcmBuffer, inputPCMLength);
    pcmBufferSize += inputPCMLength;

    // Number of PCM bytes needed to produce one AAC packet
    size_t maxBufferSize = BytesPerPacket * AACFramePerPacket * self.audioConfig.channelCount;
    NSMutableData *rawAAC = [NSMutableData new];

    if (pcmBufferSize >= maxBufferSize) {
        NSUInteger count = pcmBufferSize / maxBufferSize;
        for (NSUInteger index = 0; index < count; index++) {
            // Zeroed output buffer for one encoded AAC packet
            UInt8 *aacBuffer = malloc(maxBufferSize);
            memset(aacBuffer, 0, maxBufferSize);

            // Input: one packet's worth of PCM
            AudioBufferList inputBufferlist;
            inputBufferlist.mNumberBuffers = 1;
            inputBufferlist.mBuffers[0].mNumberChannels = (UInt32)self.audioConfig.channelCount;
            inputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            inputBufferlist.mBuffers[0].mData = pcmBuffer;
            // Output: the encoded AAC
            AudioBufferList outputBufferlist;
            outputBufferlist.mNumberBuffers = 1;
            outputBufferlist.mBuffers[0].mNumberChannels = inputBufferlist.mBuffers[0].mNumberChannels;
            outputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            outputBufferlist.mBuffers[0].mData = aacBuffer;

            UInt32 outputNumPackets = 1;
            status = AudioConverterFillComplexBuffer(_audioConverter, aacEncodeInputDataProc, &inputBufferlist, &outputNumPackets, &outputBufferlist, NULL);
            if (status != noErr) {
                NSLog(@"audio converter fillComplexBuffer error %d", (int)status);
            } else {
                [rawAAC appendBytes:outputBufferlist.mBuffers[0].mData length:outputBufferlist.mBuffers[0].mDataByteSize];
            }
            free(aacBuffer); // the bytes were copied into rawAAC

            // Move the unconsumed PCM to the front; the regions overlap, so use memmove
            NSUInteger leftBufferSize = pcmBufferSize - maxBufferSize;
            if (leftBufferSize > 0) {
                memmove(pcmBuffer, pcmBuffer + maxBufferSize, leftBufferSize);
            }
            pcmBufferSize -= maxBufferSize;
        }
        dispatch_async(_callbackQueue, ^{
            [weakSelf delegateWithAACData:rawAAC];
        });
    }
}

2.2 Encoding PCM data held in an NSData
- (void)encodeAudioData:(NSData *)pcmData {
    __weak typeof(self) weakSelf = self;

    NSMutableData *rawAAC = [NSMutableData new];

    // Number of PCM bytes needed to produce one AAC packet
    size_t maxBufferSize = BytesPerPacket * AACFramePerPacket * self.audioConfig.channelCount;

    // Append the PCM data to the pending buffer and record its new size
    memcpy(pcmBuffer + pcmBufferSize, pcmData.bytes, pcmData.length);
    pcmBufferSize += pcmData.length;

    if (pcmBufferSize >= maxBufferSize) {
        NSUInteger count = pcmBufferSize / maxBufferSize;
        for (NSUInteger index = 0; index < count; index++) {
            // Zeroed output buffer for one encoded AAC packet
            UInt8 *aacBuffer = malloc(maxBufferSize);
            memset(aacBuffer, 0, maxBufferSize);

            // Input: one packet's worth of PCM
            AudioBufferList inputBufferlist;
            inputBufferlist.mNumberBuffers = 1;
            inputBufferlist.mBuffers[0].mNumberChannels = (UInt32)self.audioConfig.channelCount;
            inputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            inputBufferlist.mBuffers[0].mData = pcmBuffer;

            // Output: the encoded AAC
            AudioBufferList outputBufferlist;
            outputBufferlist.mNumberBuffers = 1;
            outputBufferlist.mBuffers[0].mNumberChannels = inputBufferlist.mBuffers[0].mNumberChannels;
            outputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            outputBufferlist.mBuffers[0].mData = aacBuffer;

            UInt32 outputNumPackets = 1;
            OSStatus status = AudioConverterFillComplexBuffer(_audioConverter, aacEncodeInputDataProc, &inputBufferlist, &outputNumPackets, &outputBufferlist, NULL);
            if (status != noErr) {
                NSLog(@"audio converter fillComplexBuffer error %d", (int)status);
            } else {
                [rawAAC appendBytes:outputBufferlist.mBuffers[0].mData length:outputBufferlist.mBuffers[0].mDataByteSize];
            }
            free(aacBuffer); // the bytes were copied into rawAAC

            // Move the unconsumed PCM to the front; the regions overlap, so use memmove
            NSUInteger leftBufferSize = pcmBufferSize - maxBufferSize;
            if (leftBufferSize > 0) {
                memmove(pcmBuffer, pcmBuffer + maxBufferSize, leftBufferSize);
            }
            pcmBufferSize -= maxBufferSize;
        }
        dispatch_async(_callbackQueue, ^{
            [weakSelf delegateWithAACData:rawAAC];
        });
    }
}
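
Both methods follow the same pattern: append incoming PCM to a pending buffer, hand the encoder fixed-size chunks, and slide whatever is left to the front. A minimal, framework-free sketch of that buffering in C is shown here; PcmFifo, its fixed capacity, and the function names are assumptions made for illustration and do not come from the article's project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCM_FIFO_CAPACITY 8192 /* illustrative capacity, not from the article */

typedef struct {
    uint8_t data[PCM_FIFO_CAPACITY];
    size_t size; /* number of valid bytes at the start of data */
} PcmFifo;

/* Append len bytes. Returns false, leaving the FIFO unchanged, if they do not fit. */
bool pcm_fifo_push(PcmFifo *fifo, const void *bytes, size_t len) {
    assert(fifo != NULL && bytes != NULL);
    if (len > PCM_FIFO_CAPACITY - fifo->size) {
        return false;
    }
    memcpy(fifo->data + fifo->size, bytes, len);
    fifo->size += len;
    return true;
}

/* Copy the oldest chunk_size bytes into out and remove them from the FIFO.
 * Returns false if fewer than chunk_size bytes are pending. */
bool pcm_fifo_pop_chunk(PcmFifo *fifo, uint8_t *out, size_t chunk_size) {
    assert(fifo != NULL && out != NULL);
    if (fifo->size < chunk_size) {
        return false;
    }
    memcpy(out, fifo->data, chunk_size);
    fifo->size -= chunk_size;
    /* Source and destination overlap, so memmove is required here. */
    memmove(fifo->data, fifo->data + chunk_size, fifo->size);
    return true;
}
```

With a chunk size of 2048 bytes (1024 frames of 16-bit mono), pushing 3000 bytes allows exactly one pop and leaves 952 bytes pending for the next call. The bounds check in pcm_fifo_push is the part most easily forgotten when the pending buffer is a raw pointer.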

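3. Building the ADTS header

Put simply, an AAC stream is a sequence of frames, each made up of an ADTS header followed by AAC ES (elementary stream) data.
The key question is how to build the ADTS header.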

image.png

For an explanation of the ADTS header, see the article "AAC 格式分析(notes 4)".

The code for building a 7-byte ADTS header is as follows:

- (NSData *)adtsHeaderWithLength:(int)data_length profile:(int)profile sampleRate:(int)sampleRate channles:(int)channles {

    const int adtsLength = 7;
    profile = 2;  // AAC LC (MPEG-4 Audio Object Type 2)
    int chanCfg = channles;  // MPEG-4 channel configuration (1 = mono, 2 = stereo)
    uint8_t adts_header[7];  // stack buffer; NSData copies the bytes below
    int fullLength = adtsLength + data_length;
    int freqIdx = [self fregWithSampleBuffer:sampleRate];  // sampling frequency index, e.g. 4 for 44100 Hz
    /*
    A 12 syncword 0xFFF, all bits must be 1
    byte 0 holds the first 8 bits: 11111111
    */
    adts_header[0] = 0xFF;
    /*
    B 1 MPEG Version: 0 for MPEG-4, 1 for MPEG-2
    C 2 Layer: always 0
    D 1 protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC
    0xF9 = 1111 1001: last 4 sync bits, B = 1, C = 00, D = 1
    */
    adts_header[1] = 0xF9;
    /*
    E 2 profile, the MPEG-4 Audio Object Type minus 1
    F 4 MPEG-4 Sampling Frequency Index (15 is forbidden)
    G 1 private bit, guaranteed never to be used by MPEG, set to 0 when encoding, ignore when decoding
    H 3 MPEG-4 Channel Configuration (0 means the channel configuration is sent in-band)
    byte 2 holds E, F, G and the top bit of H
    */
    adts_header[2] = (uint8_t)((profile - 1) << 6);
    adts_header[2] |= (uint8_t)(freqIdx << 2);
    adts_header[2] |= (uint8_t)(chanCfg >> 2);

    /*
     The top two bits of byte 3 hold the low two bits of H
     I 1 originality, set to 0 when encoding, ignore when decoding
     J 1 home, set to 0 when encoding, ignore when decoding
     K 1 copyrighted id bit, the next bit of a centrally registered copyright identifier, set to 0 when encoding, ignore when decoding
     L 1 copyright id start, signals that this frame's copyright id bit is the first bit of the copyright id, set to 0 when encoding, ignore when decoding

     xx0000xx
     */
    adts_header[3] = (uint8_t)((chanCfg & 3) << 6);  // low 2 bits of chanCfg

    /*
     M 13 frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame)
     0x1800 selects the top 2 of the 13 bits; 0x7FF = 11111111111 selects the low 11
     */
    adts_header[3] |= (uint8_t)((fullLength & 0x1800) >> 11);  // top 2 bits of the frame length
    adts_header[4] = (uint8_t)((fullLength & 0x7FF) >> 3);     // middle 8 bits

    // The top 3 bits of byte 5 are the low 3 bits of the frame length
    adts_header[5] = (uint8_t)((fullLength & 7) << 5);
    /*
     O 11 Buffer fullness (all ones here); byte 5 holds its top 5 bits
     */
    adts_header[5] |= 0x1F;
    /*
     Byte 6: low 6 bits of O, then
     P 2 number of AAC frames in this ADTS frame minus 1 (00 = one frame)
     Q 16 CRC would follow only if protection absent is 0, so it is not written here
     */
    adts_header[6] = 0xFC;

    NSData *data = [[NSData alloc] initWithBytes:adts_header length:adtsLength];
    return data;
}

- (int)fregWithSampleBuffer:(NSUInteger)sampelBuffer {
    char value = 0x0;
    if (sampelBuffer == 96000)     { value = 0x0; }
    else if (sampelBuffer == 88200){ value = 0x1; }
    else if (sampelBuffer == 64000){ value = 0x2; }
    else if (sampelBuffer == 48000){ value = 0x3; }
    else if (sampelBuffer == 44100){ value = 0x4; }
    else if (sampelBuffer == 32000){ value = 0x5; }
    else if (sampelBuffer == 24000){ value = 0x6; }
    else if (sampelBuffer == 22050){ value = 0x7; }
    else if (sampelBuffer == 16000){ value = 0x8; }
    else if (sampelBuffer == 12000){ value = 0x9; }
    else if (sampelBuffer == 11025){ value = 0xa; }
    else if (sampelBuffer == 8000) { value = 0xb; }
    return value;
}
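
To check the bit layout independently of the Objective-C class, the same 7-byte header can be written and read back in plain C. The following is a minimal sketch, not the article's implementation: the function names are hypothetical, it keeps the article's choices of 0xF9 for byte 1 and all-ones buffer fullness, and it assumes no CRC is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ADTS_HEADER_SIZE 7

/* Map a sample rate to its MPEG-4 sampling frequency index, or -1 if unknown. */
int adts_freq_index(int sample_rate) {
    static const int rates[] = {96000, 88200, 64000, 48000, 44100, 32000, 24000,
                                22050, 16000, 12000, 11025, 8000,  7350};
    for (int i = 0; i < (int)(sizeof rates / sizeof rates[0]); i++) {
        if (rates[i] == sample_rate) return i;
    }
    return -1;
}

/* Fill out with a 7-byte ADTS header for a payload of payload_len bytes.
 * object_type is the MPEG-4 Audio Object Type (2 = AAC LC).
 * Returns false if any field does not fit the header. */
bool adts_write_header(uint8_t out[ADTS_HEADER_SIZE], int object_type,
                       int sample_rate, int channel_config, size_t payload_len) {
    assert(out != NULL);
    int freq_idx = adts_freq_index(sample_rate);
    if (freq_idx < 0 || object_type < 1 || object_type > 4 ||
        channel_config < 0 || channel_config > 7 ||
        payload_len > 0x1FFF - ADTS_HEADER_SIZE) { /* frame length is 13 bits */
        return false;
    }
    size_t full_len = ADTS_HEADER_SIZE + payload_len;
    out[0] = 0xFF; /* syncword, high 8 bits */
    out[1] = 0xF9; /* syncword low 4 bits, ID = 1, layer = 00, no CRC */
    out[2] = (uint8_t)(((object_type - 1) << 6) | (freq_idx << 2) | (channel_config >> 2));
    out[3] = (uint8_t)(((channel_config & 3) << 6) | ((full_len >> 11) & 0x03));
    out[4] = (uint8_t)((full_len >> 3) & 0xFF);
    out[5] = (uint8_t)(((full_len & 0x07) << 5) | 0x1F); /* + buffer fullness, top 5 bits */
    out[6] = 0xFC; /* buffer fullness low 6 bits, one AAC frame per ADTS frame */
    return true;
}

/* Read the 13-bit frame length (header plus payload) back out of a header. */
size_t adts_frame_length(const uint8_t h[ADTS_HEADER_SIZE]) {
    return ((size_t)(h[3] & 0x03) << 11) | ((size_t)h[4] << 3) | (size_t)(h[5] >> 5);
}
```

For AAC LC at 44100 Hz mono with a 200-byte payload this produces FF F9 50 40 19 FF FC, and adts_frame_length returns 207. A payload larger than 2041 bytes exercises the top two length bits in byte 3, which is where a wrong mask goes unnoticed with small frames.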

4. Summary

Finally, locate the AAC file in the sandbox and play it with ffplay. ffprobe can be used to check that the file's parameters match the ones we configured:

image.png
Input #0, aac, from '/Users/pengchao/Desktop/222.xcappdata/AppData/Documents/aac/2022_06_24_17:29:07.aac':
  Duration: 00:00:09.63, bitrate: 105 kb/s
  Stream #0:0: Audio: aac (LC), 44100 Hz, mono, fltp, 105 kb/s

Source code: https://github.com/hunter858/OpenGL_Study/AVFoundation/AudioToolBox-encoder
