Hello! AudioStreamBasicDescription


How Apple defines audio

In Core Audio, the following definitions apply:

  • An audio stream is a continuous series of data that represents a sound, such as a song.
  • A channel is a discrete track of monophonic audio. A monophonic stream has one channel; a stereo stream has two channels.
  • A sample is a single numerical value for a single audio channel in an audio stream.
  • A frame is a collection of time-coincident samples. For instance, a linear PCM stereo sound file has two samples per frame, one for the left channel and one for the right channel.
  • A packet is a collection of one or more contiguous frames. A packet defines the smallest meaningful set of frames for a given audio data format, and is the smallest data unit for which time can be measured. In linear PCM audio, a packet holds a single frame. In compressed formats, it typically holds more; in some formats, the number of frames per packet varies.
  • The sample rate for a stream is the number of frames per second of uncompressed (or, for compressed formats, the equivalent in decompressed) audio.
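
The relationships between samples, frames, packets, and sample rate can be sketched with a little arithmetic. This is only an illustration: the function names below are hypothetical helpers, not Core Audio APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Number of individual samples in `frames` frames:
   each frame holds one sample per channel. */
uint64_t samples_in_frames(uint64_t frames, uint32_t channels)
{
    return frames * channels;
}

/* Duration in seconds of one packet: frames per packet divided by
   the sample rate (frames per second). */
double packet_duration_seconds(uint32_t frames_per_packet, double sample_rate)
{
    assert(sample_rate > 0.0);
    return (double)frames_per_packet / sample_rate;
}
```

For example, one second of 44100 Hz stereo audio contains 88200 samples, and a 1024-frame packet at 44100 Hz lasts roughly 23.2 ms.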

The AudioStreamBasicDescription struct

struct AudioStreamBasicDescription
{
    Float64             mSampleRate;
    AudioFormatID       mFormatID;
    AudioFormatFlags    mFormatFlags;
    UInt32              mBytesPerPacket;
    UInt32              mFramesPerPacket;
    UInt32              mBytesPerFrame;
    UInt32              mChannelsPerFrame;
    UInt32              mBitsPerChannel;
    UInt32              mReserved;
};
typedef struct AudioStreamBasicDescription  AudioStreamBasicDescription;

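In PCM, the sampling frequency is called the sample rate.
Each sampling instant yields one sample per channel.
The samples taken at the same instant together form a frame.
One or more frames grouped together form a packet.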
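
As a rough sketch of how the fields fit together, the following fills in a description for 16-bit signed, interleaved linear PCM. To keep it compilable without Apple headers, `DemoASBD` and the `kDemo...` constants are stand-ins defined here (assumptions for illustration); in a real project you would use `AudioStreamBasicDescription` and the `kAudioFormat...` constants from Core Audio instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for AudioStreamBasicDescription so the sketch compiles without
   Apple headers; field names and order follow the struct above. */
typedef struct {
    double   mSampleRate;
    uint32_t mFormatID;
    uint32_t mFormatFlags;
    uint32_t mBytesPerPacket;
    uint32_t mFramesPerPacket;
    uint32_t mBytesPerFrame;
    uint32_t mChannelsPerFrame;
    uint32_t mBitsPerChannel;
    uint32_t mReserved;
} DemoASBD;

/* Stand-in constants; the values mirror the listings in this article. */
enum {
    kDemoFormatLinearPCM     = 0x6C70636D, /* 'lpcm' */
    kDemoFlagIsSignedInteger = 1u << 2,
    kDemoFlagIsPacked        = 1u << 3
};

/* Describe 16-bit signed, packed, interleaved, little-endian linear PCM. */
DemoASBD demo_make_pcm16(double sample_rate, uint32_t channels)
{
    assert(channels > 0);
    DemoASBD d = {0};                       /* also zeroes mReserved */
    d.mSampleRate       = sample_rate;
    d.mFormatID         = kDemoFormatLinearPCM;
    d.mFormatFlags      = kDemoFlagIsSignedInteger | kDemoFlagIsPacked;
    d.mBitsPerChannel   = 16;
    d.mChannelsPerFrame = channels;
    d.mBytesPerFrame    = (d.mBitsPerChannel / 8) * channels;
    d.mFramesPerPacket  = 1;                /* uncompressed audio */
    d.mBytesPerPacket   = d.mBytesPerFrame * d.mFramesPerPacket;
    return d;
}
```

Calling `demo_make_pcm16(44100.0, 2)` yields 4 bytes per frame and 4 bytes per packet, since each stereo frame carries two 2-byte samples.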

What each AudioStreamBasicDescription field means

mSampleRate

  • The sample rate: how many times per second the recording device samples the audio signal. Common values are 16000, 32000, and 44100 Hz.

mFormatID

The type of the audio data, for example PCM or AAC:

kAudioFormatLinearPCM               = 'lpcm',
kAudioFormatMPEG4AAC                = 'aac ',
kAudioFormatMPEGLayer3              = '.mp3',
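
These identifiers are four-character codes stored in a 32-bit integer, which is why they show up as large numbers in a debugger or log. A small sketch of converting between the two forms follows; `demo_fourcc` and `demo_fourcc_to_string` are hypothetical helper names, and the packing order (first character in the most significant byte) is what the values above correspond to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four ASCII characters into a 32-bit code, with the first
   character in the most significant byte. */
uint32_t demo_fourcc(const char code[4])
{
    assert(code != NULL);
    return ((uint32_t)(unsigned char)code[0] << 24) |
           ((uint32_t)(unsigned char)code[1] << 16) |
           ((uint32_t)(unsigned char)code[2] << 8)  |
            (uint32_t)(unsigned char)code[3];
}

/* Unpack a 32-bit code into a NUL-terminated 4-character string. */
void demo_fourcc_to_string(uint32_t value, char out[5])
{
    assert(out != NULL);
    out[0] = (char)((value >> 24) & 0xFF);
    out[1] = (char)((value >> 16) & 0xFF);
    out[2] = (char)((value >> 8) & 0xFF);
    out[3] = (char)(value & 0xFF);
    out[4] = '\0';
}
```

With this packing, `demo_fourcc("lpcm")` is 0x6C70636D, and passing that value to `demo_fourcc_to_string` gives back "lpcm".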

mFormatFlags

Flags that describe how the sample data in an AudioBufferList is formatted:

    kAudioFormatFlagIsFloat                     = (1U << 0),     // 0x1
    kAudioFormatFlagIsBigEndian                 = (1U << 1),     // 0x2
    kAudioFormatFlagIsSignedInteger             = (1U << 2),     // 0x4
    kAudioFormatFlagIsPacked                    = (1U << 3),     // 0x8
    kAudioFormatFlagIsAlignedHigh               = (1U << 4),     // 0x10
    kAudioFormatFlagIsNonInterleaved            = (1U << 5),     // 0x20
    kAudioFormatFlagIsNonMixable                = (1U << 6),     // 0x40

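kAudioFormatFlagIsFloat

Set when the samples are floating point. When clear, the samples are integers.

kAudioFormatFlagIsBigEndian

Set when the samples are big-endian. When clear, they are little-endian.

kAudioFormatFlagIsSignedInteger

Set when the samples are signed integers. When clear, they are unsigned integers.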
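
The three flags above can be read off a flags value with simple bit tests. The sketch below assumes the bit positions from the listing; `DemoSampleInfo`, `demo_decode_sample_flags`, and the `kDemo...` constants are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants with the same bit positions as the listing. */
enum {
    kDemoFlagIsFloat         = 1u << 0,
    kDemoFlagIsBigEndian     = 1u << 1,
    kDemoFlagIsSignedInteger = 1u << 2
};

typedef struct {
    bool is_float;
    bool is_signed;      /* only meaningful when is_float is false */
    bool is_big_endian;
} DemoSampleInfo;

/* Decode the sample-type bits of a format flags value. */
void demo_decode_sample_flags(uint32_t flags, DemoSampleInfo *out)
{
    assert(out != NULL);
    out->is_float      = (flags & kDemoFlagIsFloat) != 0;
    /* The signed-integer bit is ignored for floating-point samples. */
    out->is_signed     = !out->is_float &&
                         (flags & kDemoFlagIsSignedInteger) != 0;
    out->is_big_endian = (flags & kDemoFlagIsBigEndian) != 0;
}
```

A flags value of 0 therefore decodes as unsigned, little-endian integer samples, matching the defaults described above.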

kAudioFormatFlagIsPacked

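Set when the sample bits (mBitsPerChannel) fill all of the bits available for the channel. If they do not, the sample is either high-aligned or low-aligned within the channel.
Even when this flag is clear, it is treated as set if the fields satisfy ((mBitsPerChannel / 8) * mChannelsPerFrame) == mBytesPerFrame.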
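
The implied-packed rule can be sketched as a small predicate. This is an illustration for interleaved data, not a Core Audio function; `demo_is_packed` and `kDemoFlagIsPacked` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in constant with the same bit position as the listing. */
enum { kDemoFlagIsPacked = 1u << 3 };

/* True if the packed flag is set, or if the sizes imply packing:
   (bits per channel / 8) * channels per frame == bytes per frame. */
bool demo_is_packed(uint32_t flags, uint32_t bits_per_channel,
                    uint32_t channels_per_frame, uint32_t bytes_per_frame)
{
    assert(channels_per_frame > 0);
    if (flags & kDemoFlagIsPacked)
        return true;
    return (bits_per_channel / 8) * channels_per_frame == bytes_per_frame;
}
```

For instance, 24-bit stereo samples stored in 4-byte containers give 8 bytes per frame, while (24 / 8) * 2 is 6, so without the flag that description is not packed.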

kAudioFormatFlagIsNonInterleaved

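Set when the samples are non-interleaved (planar); clear when they are interleaved.

Audio data can be laid out in interleaved or planar form. Taking two-channel audio as an example, there are two possible layouts:

  1. Interleaved: LRLRLR...
  2. Planar:
  • Plane 1: LLLLLL...
  • Plane 2: RRRRRR...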
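
To make the two layouts concrete, here is a minimal sketch that converts interleaved stereo samples into two planes. It assumes 16-bit integer samples, and `demo_deinterleave_stereo` is a hypothetical helper rather than a Core Audio call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split interleaved stereo (L R L R ...) into a left plane and a right
   plane. `frames` is the number of L/R pairs in `interleaved`; `left`
   and `right` must each have room for `frames` samples. */
void demo_deinterleave_stereo(const int16_t *interleaved, size_t frames,
                              int16_t *left, int16_t *right)
{
    assert(interleaved != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < frames; ++i) {
        left[i]  = interleaved[2 * i];      /* even positions: left  */
        right[i] = interleaved[2 * i + 1];  /* odd positions:  right */
    }
}
```

The total amount of data is the same in both layouts; only the ordering in memory changes.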

mChannelsPerFrame

The number of channels in the audio: 1 for mono, 2 for stereo. This value must not be 0.

mBitsPerChannel

The number of bits in each audio sample (1 byte = 8 bits). Typical values are 8, 16, and 32.

mBytesPerFrame

The number of bytes in one audio frame.
It is computed as follows:

  • Interleaved: mBytesPerFrame = mBitsPerChannel / 8 * mChannelsPerFrame
  • Planar: mBytesPerFrame = mBitsPerChannel / 8
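
The per-frame byte count can be sketched as one function covering both layouts: an interleaved frame carries one sample for every channel, while in the planar case each buffer holds a single channel. The sketch assumes packed samples whose size is a whole number of bytes, and `demo_bytes_per_frame` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes in one frame for packed linear PCM.
   Interleaved: one sample per channel in every frame.
   Non-interleaved (planar): each buffer holds one channel only. */
uint32_t demo_bytes_per_frame(uint32_t bits_per_channel,
                              uint32_t channels_per_frame,
                              bool non_interleaved)
{
    assert(bits_per_channel % 8 == 0 && channels_per_frame > 0);
    uint32_t bytes_per_sample = bits_per_channel / 8;
    return non_interleaved ? bytes_per_sample
                           : bytes_per_sample * channels_per_frame;
}
```

For 16-bit stereo this gives 4 bytes per frame when interleaved and 2 bytes per frame when planar.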

mFramesPerPacket

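The number of frames in one packet. For uncompressed audio the value is 1. For variable bit-rate formats it is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet (such as Ogg Vorbis), set it to 0.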

mBytesPerPacket

The number of bytes in one packet: mBytesPerPacket = mBytesPerFrame * mFramesPerPacket. For formats with a variable packet size, set this field to 0.
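
For constant-size packets such as linear PCM, the packet size and the resulting data rate follow directly from the other fields. The sketch below illustrates that arithmetic; the function names are hypothetical and it does not cover variable-size packets.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one constant-size packet. */
uint32_t demo_bytes_per_packet(uint32_t bytes_per_frame,
                               uint32_t frames_per_packet)
{
    return bytes_per_frame * frames_per_packet;
}

/* Data rate of uncompressed PCM in bytes per second:
   frames per second times bytes per frame. */
double demo_pcm_bytes_per_second(double sample_rate, uint32_t bytes_per_frame)
{
    assert(sample_rate > 0.0);
    return sample_rate * (double)bytes_per_frame;
}
```

For 44100 Hz, 16-bit interleaved stereo (4 bytes per frame, 1 frame per packet), each packet is 4 bytes and the stream needs 176400 bytes per second.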

mReserved

Pads the structure out to force an even 8-byte alignment. Must be set to 0.
