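What is PCM? What is AAC?
This article shows how to encode PCM into AAC with AudioToolBox and save the result as a file in the app sandbox.
What is PCM?
PCM stands for Pulse-Code Modulation.
The name is largely historical, so there is no need to dwell on its literal meaning.
In audio and video work, PCM is a method of representing a sampled analog signal digitally.
Converting an analog audio signal into a digital representation takes three steps:
Sampling
Quantization
Coding
A continuous analog signal can usually be drawn as a curve on a coordinate plane, as shown in the figure below.
For a more detailed explanation, see "PCM数据格式介绍" (an introduction to the PCM data format).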
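The three steps can be sketched in a few lines of C. This is a minimal illustration, not part of the original project: `pcm_sample_signal`, `AnalogSignal`, and `example_square_wave_1khz` are hypothetical names, and the "analog" source is simply a C function of time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an analog source: amplitude in [-1, 1] at time t. */
typedef double (*AnalogSignal)(double tSeconds);

/* Example source: a 1 kHz square wave with amplitude 0.5. */
static double example_square_wave_1khz(double tSeconds) {
    double cycles = tSeconds * 1000.0;
    double phase = cycles - (double)(long)cycles; /* fractional part, t >= 0 */
    return phase < 0.5 ? 0.5 : -0.5;
}

/* Samples `signal` at `sampleRate` and writes signed 16-bit PCM to `out`. */
static size_t pcm_sample_signal(AnalogSignal signal, double sampleRate,
                                double gain, int16_t *out, size_t frameCount) {
    for (size_t n = 0; n < frameCount; n++) {
        /* Sampling: read the signal at t = n / sampleRate. */
        double analog = gain * signal((double)n / sampleRate);
        /* Quantization: scale to the 16-bit range and round to the nearest level. */
        double scaled = analog * 32767.0;
        long level = (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
        if (level > 32767) level = 32767;   /* clip instead of wrapping around */
        if (level < -32768) level = -32768;
        /* Coding: store the level as a two's-complement 16-bit integer. */
        out[n] = (int16_t)level;
    }
    return frameCount;
}
```

Calling `pcm_sample_signal(example_square_wave_1khz, 8000.0, 1.0, out, 8)` fills `out` with one period of the square wave as 16-bit samples, which is the same kind of data the capture delegate delivers later in this article.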
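What is AAC?
AAC, short for Advanced Audio Coding, is a lossy compression format designed for audio data. It was designed as the successor to MP3 and codes audio more efficiently, so it generally delivers better quality at the same bitrate. In practice, an AAC file can be considerably smaller with no obvious loss in perceived sound quality.
For an introduction to AAC, see the article "音频格式之AAC(高级音频编码技术)" (AAC, Advanced Audio Coding).
For an analysis of the format itself, see "音频编码之aac编码原理" (principles of AAC encoding).
The structure is shown in the figure below: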
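1. Capturing the audio stream
For audio capture we again use the AVFoundation framework and wrap an AVCaptureSession: add an audio input source, configure an output, capture from the audio device, and finally receive the PCM audio data through a delegate method.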
@protocol SystemCaptureManagerDelegate
@optional
- (void)captureSampleBuffer:(CMSampleBufferRef)sampleBuffer type:(SystemCaptureType)type;
@end
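2. Creating the AudioToolBox encoder
Create an AudioEncoder class that wraps AudioToolBox and uses an AudioConfig object to control the format of the encoded audio. The main settings are the sample rate, sample size (bit depth), channel count, and bitrate.
The converter can be created as in the following code: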
- (void)start {
    pcmBufferSize = 0;
    // Input: the PCM parameters of the captured audio.
    AudioStreamBasicDescription inputAudioDes = {0};
    inputAudioDes.mSampleRate = self.audioConfig.sampleRate;
    inputAudioDes.mFormatID = kAudioFormatLinearPCM;
    inputAudioDes.mFormatFlags = kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    inputAudioDes.mChannelsPerFrame = (uint32_t)self.audioConfig.channelCount;
    inputAudioDes.mFramesPerPacket = 1; // uncompressed PCM: one frame per packet
    // Fixed at 16 bits per sample; audioConfig.sampleSize is expected to match.
    inputAudioDes.mBitsPerChannel = 16;
    inputAudioDes.mBytesPerFrame = inputAudioDes.mBitsPerChannel / 8 * inputAudioDes.mChannelsPerFrame;
    inputAudioDes.mBytesPerPacket = inputAudioDes.mBytesPerFrame * inputAudioDes.mFramesPerPacket;
    // Output: the AAC encoding parameters.
    AudioStreamBasicDescription outputAudioDes = {0};
    outputAudioDes.mFormatID = kAudioFormatMPEG4AAC;
    outputAudioDes.mFormatFlags = kMPEG4Object_AAC_LC;
    outputAudioDes.mSampleRate = self.audioConfig.sampleRate;
    outputAudioDes.mChannelsPerFrame = (uint32_t)self.audioConfig.channelCount; // channel count
    outputAudioDes.mFramesPerPacket = 1024; // frames per packet; fixed at 1024 for AAC LC
    outputAudioDes.mBytesPerFrame = 0; // frame size; 0 for compressed formats
    outputAudioDes.mReserved = 0; // padding for 8-byte alignment; must be 0
    // Let Core Audio fill in the remaining fields of the output description.
    uint32_t outDesSize = sizeof(outputAudioDes);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &outDesSize, &outputAudioDes);
    OSStatus status = AudioConverterNew(&inputAudioDes, &outputAudioDes, &_audioConverter);
    if (status != noErr) {
        NSLog(@"Failed to create the AAC encoder: %d", (int)status);
        return; // without a converter there is nothing left to configure
    }
    // Set the bitrate.
    uint32_t aBitrate = (uint32_t)self.audioConfig.bitRate;
    uint32_t aBitrateSize = sizeof(aBitrate);
    status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, aBitrateSize, &aBitrate);
    if (status != noErr) {
        NSLog(@"Failed to set the AAC bitrate: %d", (int)status);
    }
}
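The derived fields of the input description follow from simple arithmetic, which the sketch below spells out in plain C. `PcmFormat` and the four helper functions are hypothetical names introduced only for this illustration; the constant 1024 is the AAC packet size used in the code above.

```c
#include <stdint.h>

/* Hypothetical mirror of the AudioConfig fields that matter for PCM sizing. */
typedef struct {
    double sampleRate;       /* frames per second, e.g. 44100 */
    uint32_t channelCount;   /* 1 = mono, 2 = stereo */
    uint32_t bitsPerChannel; /* sample size, e.g. 16 */
} PcmFormat;

/* One frame holds one sample for every channel (interleaved, packed PCM). */
static uint32_t pcm_bytes_per_frame(PcmFormat f) {
    return f.bitsPerChannel / 8 * f.channelCount;
}

/* One AAC packet consumes 1024 PCM frames. */
static uint32_t pcm_bytes_per_aac_packet(PcmFormat f) {
    return pcm_bytes_per_frame(f) * 1024;
}

/* Raw PCM data rate in bits per second, for comparison with the AAC bitrate. */
static double pcm_bits_per_second(PcmFormat f) {
    return f.sampleRate * pcm_bytes_per_frame(f) * 8.0;
}

/* Playback time covered by one AAC packet, in milliseconds. */
static double aac_packet_duration_ms(PcmFormat f) {
    return 1024.0 / f.sampleRate * 1000.0;
}
```

For 44100 Hz, mono, 16-bit audio this gives 2 bytes per frame, 2048 bytes of PCM per AAC packet, a raw rate of 705.6 kb/s, and about 23.2 ms of audio per packet.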
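2. Encoding PCM data
The encoder supports two entry points: one encodes a CMSampleBufferRef directly, and the other encodes PCM data held in an NSData (mainly to simulate PCM obtained through some other interface).
2.1 Encoding a CMSampleBuffer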
- (void)encodeAudioSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    __weak typeof(self) weakSelf = self;
    // Get a pointer to the PCM bytes carried by the sample buffer.
    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t inputPCMLength = 0;
    char *inputPcmBuffer = NULL;
    OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &inputPCMLength, &inputPcmBuffer);
    if (status != kCMBlockBufferNoErr) {
        NSLog(@"Get PCM from blockBuffer error %d", (int)status);
        return;
    }
    // Append the PCM data to the pending buffer and record how many bytes it now holds.
    memcpy(pcmBuffer + pcmBufferSize, inputPcmBuffer, inputPCMLength);
    pcmBufferSize += inputPCMLength;
    // Bytes of PCM consumed by one AAC packet.
    size_t maxBufferSize = BytesPerPacket * AACFramePerPacket * self.audioConfig.channelCount;
    NSMutableData *rawAAC = [NSMutableData new];
    if (pcmBufferSize >= maxBufferSize) {
        NSUInteger count = pcmBufferSize / maxBufferSize;
        for (NSUInteger index = 0; index < count; index++) {
            UInt8 *aacBuffer = malloc(maxBufferSize);
            memset(aacBuffer, 0, maxBufferSize);
            // Input: one packet's worth of PCM.
            AudioBufferList inputBufferlist;
            inputBufferlist.mNumberBuffers = 1;
            inputBufferlist.mBuffers[0].mNumberChannels = (UInt32)self.audioConfig.channelCount;
            inputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            inputBufferlist.mBuffers[0].mData = pcmBuffer;
            // Output: the encoded AAC packet.
            AudioBufferList outputBufferlist;
            outputBufferlist.mNumberBuffers = 1;
            outputBufferlist.mBuffers[0].mNumberChannels = inputBufferlist.mBuffers[0].mNumberChannels;
            outputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            outputBufferlist.mBuffers[0].mData = aacBuffer;
            UInt32 outputNumPackets = 1;
            status = AudioConverterFillComplexBuffer(_audioConverter, aacEncodeInputDataProc, &inputBufferlist, &outputNumPackets, &outputBufferlist, NULL);
            if (status != noErr) {
                NSLog(@"audio converter fillComplexBuffer error %d", (int)status);
            } else {
                [rawAAC appendBytes:outputBufferlist.mBuffers[0].mData length:outputBufferlist.mBuffers[0].mDataByteSize];
            }
            free(aacBuffer); // rawAAC keeps its own copy of the bytes
            // Move the unconsumed PCM to the front; the regions can overlap, so use memmove.
            NSUInteger leftBufferSize = pcmBufferSize - maxBufferSize;
            if (leftBufferSize > 0) {
                memmove(pcmBuffer, pcmBuffer + maxBufferSize, leftBufferSize);
            }
            pcmBufferSize -= maxBufferSize;
        }
        dispatch_async(_callbackQueue, ^{
            [weakSelf delegateWithAACData:rawAAC];
        });
    }
}
2.2 Encoding PCM data held in an NSData
- (void)encodeAudioData:(NSData *)pcmData {
    __weak typeof(self) weakSelf = self;
    NSMutableData *rawAAC = [NSMutableData new];
    // Bytes of PCM consumed by one AAC packet.
    size_t maxBufferSize = BytesPerPacket * AACFramePerPacket * self.audioConfig.channelCount;
    // Append the PCM data to the pending buffer and record how many bytes it now holds.
    memcpy(pcmBuffer + pcmBufferSize, pcmData.bytes, pcmData.length);
    pcmBufferSize += pcmData.length;
    if (pcmBufferSize >= maxBufferSize) {
        NSUInteger count = pcmBufferSize / maxBufferSize;
        for (NSUInteger index = 0; index < count; index++) {
            UInt8 *aacBuffer = malloc(maxBufferSize);
            memset(aacBuffer, 0, maxBufferSize);
            // Input: one packet's worth of PCM.
            AudioBufferList inputBufferlist;
            inputBufferlist.mNumberBuffers = 1;
            inputBufferlist.mBuffers[0].mNumberChannels = (UInt32)self.audioConfig.channelCount;
            inputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            inputBufferlist.mBuffers[0].mData = pcmBuffer;
            // Output: the encoded AAC packet.
            AudioBufferList outputBufferlist;
            outputBufferlist.mNumberBuffers = 1;
            outputBufferlist.mBuffers[0].mNumberChannels = inputBufferlist.mBuffers[0].mNumberChannels;
            outputBufferlist.mBuffers[0].mDataByteSize = (UInt32)maxBufferSize;
            outputBufferlist.mBuffers[0].mData = aacBuffer;
            UInt32 outputNumPackets = 1;
            OSStatus status = AudioConverterFillComplexBuffer(_audioConverter, aacEncodeInputDataProc, &inputBufferlist, &outputNumPackets, &outputBufferlist, NULL);
            if (status != noErr) {
                NSLog(@"audio converter fillComplexBuffer error %d", (int)status);
            } else {
                [rawAAC appendBytes:outputBufferlist.mBuffers[0].mData length:outputBufferlist.mBuffers[0].mDataByteSize];
            }
            free(aacBuffer); // rawAAC keeps its own copy of the bytes
            // Move the unconsumed PCM to the front; the regions can overlap, so use memmove.
            NSUInteger leftBufferSize = pcmBufferSize - maxBufferSize;
            if (leftBufferSize > 0) {
                memmove(pcmBuffer, pcmBuffer + maxBufferSize, leftBufferSize);
            }
            pcmBufferSize -= maxBufferSize;
        }
        dispatch_async(_callbackQueue, ^{
            [weakSelf delegateWithAACData:rawAAC];
        });
    }
}
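Both methods share the same buffering idea: incoming PCM is appended to a pending buffer, and the encoder is fed only whole chunks of 1024 frames. The sketch below isolates that logic in plain C without any AudioToolBox calls. `PcmAccumulator`, `pcm_accumulator_push`, `count_bytes_handler`, and the capacity constant are hypothetical names and values chosen for illustration; unlike the listings above, it also rejects input that would overflow the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACC_CAPACITY 16384 /* assumed capacity of the pending PCM buffer */

typedef struct {
    uint8_t bytes[ACC_CAPACITY];
    size_t used; /* number of valid bytes at the start of `bytes` */
} PcmAccumulator;

/* Called once for every complete chunk; stands in for the encode call. */
typedef void (*ChunkHandler)(const uint8_t *chunk, size_t chunkSize, void *context);

/* Example handler: adds up the bytes it was given (context is a size_t *). */
static void count_bytes_handler(const uint8_t *chunk, size_t chunkSize, void *context) {
    (void)chunk;
    *(size_t *)context += chunkSize;
}

/* Appends `length` bytes, hands every complete chunk to `handler`, and keeps
   the remainder for the next call. Returns the number of chunks emitted, or
   -1 if the data does not fit or chunkSize is 0. */
static int pcm_accumulator_push(PcmAccumulator *acc, const uint8_t *data, size_t length,
                                size_t chunkSize, ChunkHandler handler, void *context) {
    if (chunkSize == 0 || length > ACC_CAPACITY - acc->used) {
        return -1;
    }
    memcpy(acc->bytes + acc->used, data, length);
    acc->used += length;

    int emitted = 0;
    size_t offset = 0;
    while (acc->used - offset >= chunkSize) {
        if (handler != NULL) {
            handler(acc->bytes + offset, chunkSize, context);
        }
        offset += chunkSize;
        emitted++;
    }
    /* Shift the leftover bytes to the front once; memmove tolerates overlap. */
    size_t left = acc->used - offset;
    if (offset > 0 && left > 0) {
        memmove(acc->bytes, acc->bytes + offset, left);
    }
    acc->used = left;
    return emitted;
}
```

With a chunk size of 2048 bytes (1024 mono 16-bit frames), pushing 3000 bytes emits one chunk and leaves 952 bytes pending. Advancing an offset and shifting once at the end avoids a copy per chunk, which is a design choice of this sketch rather than something the original code does.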
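3. Building the ADTS header
Put simply, an AAC stream in ADTS form is a sequence of frames, each made up of an ADTS header followed by the AAC ES (elementary stream) data.
The key question, then, is how to create the ADTS header.
For an explanation of the ADTS header, see the article "AAC 格式分析(notes 4)".
The following code creates a 7-byte ADTS header: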
- (NSData *)adtsHeaderWithLength:(int)data_length profile:(int)profile sampleRate:(int)sampleRate channles:(int)channles {
    int adtsLength = 7;
    profile = 2; // always AAC LC, matching the encoder configuration
    int chanCfg = channles; // MPEG-4 channel configuration (1 = mono, 2 = stereo)
    char adts_header[7] = {0}; // stack buffer, so there is nothing to free
    int fullLength = adtsLength + data_length;
    int freqIdx = [self fregWithSampleBuffer:sampleRate]; // sampling frequency index, e.g. 4 for 44100 Hz
    /*
     A 12 syncword 0xFFF, all bits must be 1
     byte 0 = 1111 1111
     */
    adts_header[0] = (char)0xFF;
    /*
     B 1 MPEG Version: 0 for MPEG-4, 1 for MPEG-2
     C 2 Layer: always 0
     D 1 protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC
     byte 1 = 1111 1001 (0xF9 sets B = 1; use 0xF1 to signal MPEG-4)
     */
    adts_header[1] = (char)0xF9;
    /*
     E 2 profile, the MPEG-4 Audio Object Type minus 1
     F 4 MPEG-4 Sampling Frequency Index (15 is forbidden)
     G 1 private bit, guaranteed never to be used by MPEG, set to 0 when encoding, ignore when decoding
     H 3 MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE)
     */
    adts_header[2] = (char)((profile - 1) << 6);
    adts_header[2] |= (char)(freqIdx << 2);
    adts_header[2] |= (char)(chanCfg >> 2); // highest bit of H
    /*
     The top two bits of byte 3 hold the low two bits of H.
     I 1 originality, set to 0 when encoding, ignore when decoding
     J 1 home, set to 0 when encoding, ignore when decoding
     K 1 copyrighted id bit, the next bit of a centrally registered copyright identifier, set to 0 when encoding, ignore when decoding
     L 1 copyright id start, signals that this frame's copyright id bit is the first bit of the copyright id, set to 0 when encoding, ignore when decoding
     byte 3 = HH0000MM
     */
    adts_header[3] = (char)((chanCfg & 3) << 6); // low 2 bits of chanCfg
    /*
     M 13 frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame)
     */
    adts_header[3] |= (char)((fullLength & 0x1800) >> 11); // top 2 of the 13 bits
    adts_header[4] = (char)((fullLength & 0x7FF) >> 3); // middle 8 bits
    adts_header[5] = (char)((fullLength & 7) << 5); // low 3 bits
    /*
     O 11 buffer fullness, all ones (0x7FF) here: 5 bits in byte 5, 6 bits in byte 6
     P 2 number of AAC frames in the ADTS frame minus 1, 0 here
     (Q, the 16-bit CRC, is present only when protection absent is 0)
     */
    adts_header[5] |= 0x1F;
    adts_header[6] = (char)0xFC;
    NSData *data = [[NSData alloc] initWithBytes:adts_header length:adtsLength];
    return data;
}
// Maps a sample rate in Hz to its MPEG-4 sampling frequency index.
// Rates that are not in the table fall through to 0x0 (96000 Hz).
- (int)fregWithSampleBuffer:(NSUInteger)sampelBuffer {
    char value = 0x0;
    if (sampelBuffer == 96000) { value = 0x0; }
    else if (sampelBuffer == 88200) { value = 0x1; }
    else if (sampelBuffer == 64000) { value = 0x2; }
    else if (sampelBuffer == 48000) { value = 0x3; }
    else if (sampelBuffer == 44100) { value = 0x4; }
    else if (sampelBuffer == 32000) { value = 0x5; }
    else if (sampelBuffer == 24000) { value = 0x6; }
    else if (sampelBuffer == 22050) { value = 0x7; }
    else if (sampelBuffer == 16000) { value = 0x8; }
    else if (sampelBuffer == 12000) { value = 0x9; }
    else if (sampelBuffer == 11025) { value = 0xa; }
    else if (sampelBuffer == 8000) { value = 0xb; }
    else if (sampelBuffer == 7350) { value = 0xc; }
    return value;
}
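A quick way to sanity-check a generated header is to decode it again. The sketch below is a hypothetical helper in plain C (`AdtsHeaderInfo` and `adts_parse_header` are names invented here, not part of the project) that reads back the fields written by the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields decoded from a 7-byte ADTS header (hypothetical helper type). */
typedef struct {
    int mpegVersionBit;         /* B: 0 = MPEG-4, 1 = MPEG-2 */
    int protectionAbsent;       /* D: 1 = no CRC */
    int audioObjectType;        /* E + 1, e.g. 2 for AAC LC */
    int samplingFrequencyIndex; /* F */
    int sampleRate;             /* Hz, 0 for reserved indices */
    int channelConfiguration;   /* H */
    int frameLength;            /* M: header plus payload, in bytes */
} AdtsHeaderInfo;

/* Returns 0 on success, -1 if the bytes do not start with a valid ADTS header. */
static int adts_parse_header(const uint8_t *h, size_t length, AdtsHeaderInfo *info) {
    static const int kSampleRates[16] = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050,
        16000, 12000, 11025, 8000, 7350, 0, 0, 0
    };
    if (h == NULL || info == NULL || length < 7) {
        return -1;
    }
    /* A: 12-bit syncword, all ones. */
    if (h[0] != 0xFF || (h[1] & 0xF0) != 0xF0) {
        return -1;
    }
    /* C: the layer bits must be 0. */
    if ((h[1] & 0x06) != 0) {
        return -1;
    }
    info->mpegVersionBit = (h[1] >> 3) & 0x1;
    info->protectionAbsent = h[1] & 0x1;
    info->audioObjectType = ((h[2] >> 6) & 0x3) + 1;
    info->samplingFrequencyIndex = (h[2] >> 2) & 0xF;
    info->sampleRate = kSampleRates[info->samplingFrequencyIndex];
    /* H: one bit in byte 2, two bits in byte 3. */
    info->channelConfiguration = ((h[2] & 0x1) << 2) | ((h[3] >> 6) & 0x3);
    /* M: 2 bits in byte 3, 8 bits in byte 4, 3 bits in byte 5. */
    info->frameLength = ((h[3] & 0x3) << 11) | (h[4] << 3) | ((h[5] >> 5) & 0x7);
    return 0;
}
```

For example, a mono 44100 Hz AAC LC frame with a 200-byte payload should carry the header bytes `FF F9 50 40 19 FF FC`, and parsing them should report object type 2, sample rate 44100, channel configuration 1, and frame length 207.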
4. Summary
Finally, locate the AAC file in the sandbox and play it with ffplay. ffprobe prints the file's parameters, so we can check that they match what we configured:
Input #0, aac, from '/Users/pengchao/Desktop/222.xcappdata/AppData/Documents/aac/2022_06_24_17:29:07.aac':
Duration: 00:00:09.63, bitrate: 105 kb/s
Stream #0:0: Audio: aac (LC), 44100 Hz, mono, fltp, 105 kb/s
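The reported duration can be cross-checked by hand: each ADTS frame written by this encoder carries 1024 samples, so 9.63 s at 44100 Hz corresponds to roughly 415 frames. The sketch below walks a buffer of ADTS frames and derives the duration that way. It is a plain C illustration under the assumption that every frame holds exactly one AAC packet; `adts_count_frames` and `adts_duration_seconds` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the ADTS frames in `data`. Returns -1 if a header is malformed or a
   frame runs past the end of the buffer. Fewer than 7 trailing bytes are ignored. */
static long adts_count_frames(const uint8_t *data, size_t length) {
    size_t offset = 0;
    long frames = 0;
    while (length - offset >= 7) {
        const uint8_t *h = data + offset;
        /* Every frame must start with the 12-bit syncword 0xFFF. */
        if (h[0] != 0xFF || (h[1] & 0xF0) != 0xF0) {
            return -1;
        }
        /* 13-bit frame length, header included. */
        size_t frameLength = ((size_t)(h[3] & 0x3) << 11)
                           | ((size_t)h[4] << 3)
                           | ((size_t)h[5] >> 5);
        if (frameLength < 7 || frameLength > length - offset) {
            return -1;
        }
        offset += frameLength;
        frames++;
    }
    return frames;
}

/* Duration assuming one 1024-sample AAC packet per ADTS frame. */
static double adts_duration_seconds(long frames, double sampleRate) {
    return (double)frames * 1024.0 / sampleRate;
}
```

If the frame count multiplied by 1024 and divided by the sample rate is far from the duration ffprobe reports, the frame length field of the header is a likely place to look.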
Source code: https://github.com/hunter858/OpenGL_Study/AVFoundation/AudioToolBox-encoder