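The encoder has a two-branch parallel structure: the upper branch is the subband coding system and the lower branch is the perceptual audio coding system. The lower branch exists to decide how many quantization bits each subband receives. The two parts are described in turn below:
Meaning: the original signal is split into several frequency subbands, each subband is encoded separately, and the subbands are then recombined into a full-band signal.
Why: many sources show different characteristics in different frequency ranges at the same instant, so the signal is decomposed and each branch gets its own number of quantization bits and its own coding method.
The PCM samples are transformed into frequency-domain signals in 32 subbands.
The filter bank is obtained by modulating a prototype low-pass filter to the center frequency of each subband: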
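As a sketch of that modulation, the analysis filter of subband $k$ in the MPEG-1 audio polyphase filter bank is commonly written as below. The symbols are assumptions introduced here for illustration: $h(n)$ is the 512-tap prototype low-pass filter, $k$ is the subband index, and $n$ is the tap index.

$$h_k(n) = h(n)\cos\left[\frac{(2k+1)(n-16)\pi}{64}\right],\qquad k = 0,\dots,31,\quad n = 0,\dots,511$$

The cosine term shifts the low-pass response to the center of subband $k$, so all 32 filters share one prototype.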
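Drawbacks:
For every 12 samples in each subband, find the largest absolute value among the 12 samples, then look it up in the scale factor table to obtain the scale factor for that group.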
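The lookup can be sketched as follows. This is a minimal illustration, not the encoder's own routine: the function name `scalefactor_index` is hypothetical, and the table is assumed to hold 63 entries of the form 2 * 2^(-i/3), which is my understanding of the Layer I/II scale factor table. The sketch returns the index of the smallest table entry that is still at least as large as the peak.

```c
#include <assert.h>

#define NUM_SCALEFACTORS 63            /* assumed table size (indices 0..62) */
#define STEP 0.7937005259840998        /* 2^(-1/3): ratio between neighbouring entries */

/* Hypothetical helper: scale factor index for one group of n subband samples. */
int scalefactor_index(const double samples[], int n)
{
    assert(n > 0);

    /* 1. Peak absolute value of the group. */
    double peak = 0.0;
    for (int i = 0; i < n; i++) {
        double mag = samples[i] < 0.0 ? -samples[i] : samples[i];
        if (mag > peak)
            peak = mag;
    }

    /* 2. Walk the table from the largest entry (2.0) downwards and keep the
     *    last index whose value still covers the peak. */
    int index = 0;
    double value = 2.0;
    for (int i = 0; i < NUM_SCALEFACTORS; i++) {
        if (value < peak)
            break;
        index = i;
        value *= STEP;
    }
    return index;
}
```

A larger index means a smaller scale factor, so quiet groups get high indices and loud groups get low ones.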
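In Layer II a frame holds 36 samples per subband, three times as many as in Layer I, so in principle three scale factors are needed per subband.
To reduce the bitrate spent on scale factors, the three scale factors of each subband in a frame are classified into a small set of transmission patterns; one, two, or three scale factors are then sent together with the scale factor selection information (2 bits per subband).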
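To make the idea concrete, here is a deliberately simplified sketch of how many scale factors a subband needs. The function name `scalefactors_to_send` is hypothetical, and the rule is an assumption made for illustration: it merges only exactly equal neighbours, whereas the real encoder classifies the differences between neighbouring scale factors and also merges values that are merely close.

```c
#include <assert.h>

/* Simplified, hypothetical rule: count the scale factors that must be sent
 * for one subband, given the three scale factor indices of a frame. */
int scalefactors_to_send(int sf0, int sf1, int sf2)
{
    assert(sf0 >= 0 && sf1 >= 0 && sf2 >= 0);

    if (sf0 == sf1 && sf1 == sf2)
        return 1;               /* all three groups share one scale factor */
    if (sf0 == sf1 || sf1 == sf2)
        return 2;               /* two neighbouring groups share one */
    return 3;                   /* all three differ */
}
```

For example, the triple `11 11 11` needs one scale factor, `21 18 18` needs two, and `14 32 63` needs three.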
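Bitstream structure design:
It favors low-complexity, low-delay decoders: the encoded audio contains many short entry points at constant intervals.
The coded data allows entry points to be placed in the bitstream at integer multiples of that interval, which makes it easy to record, play back, and edit short audio sequences and to specify edit points precisely.
Idea: analyze the signal and remove the parts that the human ear cannot perceive, that is, remove perceptually irrelevant information.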
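- Time-frequency trade-off:
In the subband coding system the subbands should be coarse, so that time resolution stays high; this keeps the coded sound quality high for short transient (impulse-like) signals.
In the perceptual audio coding system the FFT should provide finer frequency resolution, because the masking threshold is derived from the power spectral density.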
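The hearing threshold level is adaptive: it changes with the sounds at other frequencies that are heard at the same time.
In other words, the total masking at a frequency line = the masking produced there by tonal components at other frequencies + the masking produced there by noise components at other frequencies.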
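Written out, the sum is taken in the power domain rather than directly in dB. The form below follows psychoacoustic model 1 as I understand it, and it also includes the threshold in quiet, which the sentence above leaves implicit. The symbols are assumptions introduced for illustration: $LT_q(i)$ is the threshold in quiet at line $i$, $LT_{tm}(j,i)$ and $LT_{nm}(j,i)$ are the masking thresholds that tonal masker $j$ and noise masker $j$ produce at line $i$, all in dB.

$$LT_g(i) = 10\log_{10}\left(10^{LT_q(i)/10} + \sum_{j=1}^{m} 10^{LT_{tm}(j,i)/10} + \sum_{j=1}^{n} 10^{LT_{nm}(j,i)/10}\right)$$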
Masking threshold of each subband = the smallest masking threshold among the frequency lines inside that subband.
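A minimal sketch of that reduction is shown below. The function name and the constant `LINES_PER_SUBBAND` are assumptions for illustration: with a 1024-point FFT there are 512 spectral lines below the Nyquist frequency, which gives 16 lines for each of the 32 subbands if the lines are split evenly.

```c
#include <assert.h>
#include <stddef.h>

#define LINES_PER_SUBBAND 16   /* assumption: 512 spectral lines / 32 subbands */

/* Hypothetical helper: reduce per-line masking thresholds (dB) to one
 * threshold per subband by taking the minimum inside each subband. */
void subband_min_thresholds(const double line_threshold_db[],
                            size_t num_subbands,
                            double subband_threshold_db[])
{
    assert(num_subbands > 0);

    for (size_t sb = 0; sb < num_subbands; sb++) {
        const double *lines = line_threshold_db + sb * LINES_PER_SUBBAND;
        double min = lines[0];
        for (size_t k = 1; k < LINES_PER_SUBBAND; k++) {
            if (lines[k] < min)
                min = lines[k];
        }
        subband_threshold_db[sb] = min;
    }
}
```

Taking the minimum is the conservative choice: quantization noise spreads over the whole subband, so it has to stay below the most sensitive line in it.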
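When a pure tone is masked by continuous noise of a certain bandwidth centered on the tone's frequency, and the power of the tone at the point where it is just audible equals the noise power within that band, that bandwidth is called the critical bandwidth.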
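This module has two inputs:
SMR: the signal-to-mask ratio of each subband, SMR = signal energy / masking threshold (a difference when both are expressed in dB)
AB: the bitrate, which fixes the number of bits available for each frame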
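The link between bitrate and per-frame bit budget can be sketched with a little arithmetic. The function name `layer2_frame_bits` is hypothetical, and the sketch assumes a Layer II frame of 1152 samples with no padding slot: the frame length in bytes is 144 * bitrate / sampling rate, rounded down.

```c
#include <assert.h>

/* Hypothetical helper: bits in one Layer II frame without a padding slot.
 * 1152 samples per frame / 8 bits per byte = 144. */
int layer2_frame_bits(int bitrate_bps, int sample_rate_hz)
{
    assert(bitrate_bps > 0 && sample_rate_hz > 0);

    int frame_bytes = 144 * bitrate_bps / sample_rate_hz;  /* integer division rounds down */
    return frame_bytes * 8;
}
```

At 192 kbps and 44.1 kHz this gives 626 bytes, that is 5008 bits, which agrees with the value written to the trace file for the first frame in this experiment.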
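Bit allocation procedure:
For each subband, compute the noise-to-mask ratio NMR, which is the signal-to-mask ratio SMR minus the signal-to-noise ratio SNR (all in dB):
$$NMR = SMR - SNR$$
The goal is to minimize the NMR of the whole frame and of every subband. This is an iterative process: each pass assigns bits to the subband with the highest NMR, that is, it raises by one step the quantization level of the subband that benefits most, while the total number of bits used must not exceed what one frame can provide.
Repeat: recompute the NMR of the subband that just received more bits and allocate again, until no bits are left.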
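The loop described above can be sketched as a greedy allocator. This is an illustration under stated assumptions, not the encoder's `main_bit_allocation()`: the function name is hypothetical, the SNR is approximated as 6.02 dB per bit instead of being read from the quantizer tables, each extra bit is assumed to cost 36 bits (one per subband sample in a Layer II frame), and the cost of headers, bit allocation fields, and scale factors is ignored.

```c
#include <assert.h>

#define MAX_SUBBANDS 32
#define MAX_BITS_PER_SAMPLE 15     /* assumed upper limit per subband */
#define SAMPLES_PER_SUBBAND 36     /* Layer II: 3 groups of 12 samples */
#define SNR_DB_PER_BIT 6.02        /* assumption: about 6 dB of SNR per bit */

/* Hypothetical greedy allocator. Fills bits[] and returns the unused bits. */
int greedy_bit_allocation(const double smr_db[], int bits[], int n,
                          int available_bits)
{
    assert(n > 0 && n <= MAX_SUBBANDS);
    int done[MAX_SUBBANDS] = {0};

    for (int i = 0; i < n; i++)
        bits[i] = 0;

    for (;;) {
        /* Find the open subband with the highest NMR = SMR - SNR. */
        int worst = -1;
        double worst_nmr = 0.0;
        for (int i = 0; i < n; i++) {
            if (done[i])
                continue;
            double nmr = smr_db[i] - SNR_DB_PER_BIT * bits[i];
            if (worst < 0 || nmr > worst_nmr) {
                worst = i;
                worst_nmr = nmr;
            }
        }
        if (worst < 0)
            break;                         /* every subband is closed */

        /* Close the subband if it is saturated or the budget is exhausted. */
        if (bits[worst] >= MAX_BITS_PER_SAMPLE ||
            available_bits < SAMPLES_PER_SUBBAND) {
            done[worst] = 1;
            continue;
        }

        bits[worst] += 1;                  /* one more quantization step */
        available_bits -= SAMPLES_PER_SUBBAND;
    }
    return available_bits;
}
```

With SMR values of 20, 10, 0, and -5 dB and a budget of 180 bits, the first subband ends up with 3 bits and the second with 2, while the two subbands whose signal is already at or below the masking threshold get none.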
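One overlapping audio frame, with up to 2 channels, is processed at a time in the following order
Pick one data frame and output the number of bits allocated to that frame, its scale factors, and its bit allocation result
Modify the main() function as follows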
time_t start_time, end_time;
int total_time;
//===============================================
FILE* Trace=NULL;
fopen_s(&Trace, "tace.txt", "w");
//=================================================
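The tracing idea can also be isolated in a small helper that tolerates a failed `fopen`. This is a sketch only: `write_frame_trace` is a hypothetical name that does not exist in the encoder, and the output format simply mirrors the `fprintf` calls used in this experiment.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the bit budget and the per-subband allocation
 * of one frame. Returns 0 on success and -1 if nothing could be written. */
int write_frame_trace(FILE *trace, int frame_num, int adb,
                      const unsigned int bit_alloc[], int sblimit)
{
    assert(sblimit >= 0);

    if (trace == NULL)
        return -1;      /* the trace file could not be opened: skip tracing */

    fprintf(trace, "bits allocated in the frame No.%d is %d\n", frame_num, adb);
    fprintf(trace, "bit allocation:\n");
    for (int i = 0; i < sblimit; i++)
        fprintf(trace, "subband[%d]:%u bits\n", i, bit_alloc[i]);

    return ferror(trace) ? -1 : 0;
}
```

Checking the stream before writing avoids a crash when the trace file cannot be created, and the caller should still `fclose` the file once encoding is finished.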
Output the number of bits allocated to the first frame
while (get_audio (musicin, buffer, num_samples, nch, &header) > 0) {
    if (glopts.verbosity > 1)
        if (++frameNum % 10 == 0)
            fprintf (stderr, "[%4u]\r", frameNum);
    fflush (stderr);
    win_buf[0] = &buffer[0][0];
    win_buf[1] = &buffer[1][0];
    adb = available_bits (&header, &glopts);  // number of bits available for this frame
    //=================================================================================
    if (frameNum == 1)
        fprintf(Trace, "bits allocated in the frame No.%d is %d\n", frameNum, adb);
    //==================================================================================
Output the bit allocation result and the scale factors of the first frame (the sample test.wav is mono)
#ifdef NEWENCODE
    sf_transmission_pattern (scalar, scfsi, &frame);
    main_bit_allocation_new (smr, scfsi, bit_alloc, &adb, &frame, &glopts);
    //main_bit_allocation (smr, scfsi, bit_alloc, &adb, &frame, &glopts);
    if (error_protection)
        CRC_calc (&frame, bit_alloc, scfsi, &crc);
    write_header (&frame, &bs);
    //encode_info (&frame, &bs);
    if (error_protection)
        putbits (&bs, crc, 16);
    write_bit_alloc (bit_alloc, &frame, &bs);
    //encode_bit_alloc (bit_alloc, &frame, &bs);
    write_scalefactors(bit_alloc, scfsi, scalar, &frame, &bs);
    //encode_scale (bit_alloc, scfsi, scalar, &frame, &bs);
    subband_quantization_new (scalar, *sb_sample, j_scale, *j_sample, bit_alloc,
                              *subband, &frame);
    //subband_quantization (scalar, *sb_sample, j_scale, *j_sample, bit_alloc,
    //                      *subband, &frame);
    write_samples_new(*subband, bit_alloc, &frame, &bs);
    //sample_encoding (*subband, bit_alloc, &frame, &bs);
#else
    transmission_pattern (scalar, scfsi, &frame);
    main_bit_allocation (smr, scfsi, bit_alloc, &adb, &frame, &glopts);
    //========================================================================================
    if (frameNum == 1)
    {
        fprintf(Trace, "bit allocation:\n");
        for (int i = 0; i < frame.sblimit; i++)
            fprintf(Trace, "subband[%d]:%d bits\n", i, bit_alloc[0][i]);
    }
    //==========================================================================================
    if (error_protection)
        CRC_calc (&frame, bit_alloc, scfsi, &crc);
    encode_info (&frame, &bs);
    if (error_protection)
        encode_CRC (crc, &bs);
    encode_bit_alloc (bit_alloc, &frame, &bs);
    encode_scale (bit_alloc, scfsi, scalar, &frame, &bs);
    subband_quantization (scalar, *sb_sample, j_scale, *j_sample, bit_alloc,
                          *subband, &frame);
    sample_encoding (*subband, bit_alloc, &frame, &bs);
    //=============================================================================================================================
    if (frameNum == 1)
    {
        fprintf(Trace, "scalefactors:\n");
        for (int i = 0; i < frame.sblimit; i++)
            fprintf(Trace, "subband[%d] scalefactors:%d %d %d\n", i, scalar[0][0][i], scalar[0][1][i], scalar[0][2][i]);
    }
    //==============================================================================================================================
Output the scale factors of the first frame
Modify the function print_config() to output the file name, sampling rate, and target bitrate
void print_config (frame_info * frame, int *psy, char *inPath,
                   char *outPath, FILE *Trace)
{
    frame_header *header = frame->header;
    if (glopts.verbosity == 0)
        return;
    fprintf (stderr, "--------------------------------------------\n");
    fprintf (stderr, "Input File : '%s' %.1f kHz\n",
             (strcmp (inPath, "-") ? inPath : "stdin"),
             s_freq[header->version][header->sampling_frequency]);  // audio sampling rate
    //=======================================================================================
    fprintf(Trace, "文件名: '%s\n", (strcmp(inPath, "-") ? inPath : "stdin"));
    fprintf(Trace,"采样率: %.1f kHz\n", s_freq[header->version][header->sampling_frequency]);
    //======================================================================================
    fprintf (stderr, "Output File: '%s'\n",
             (strcmp (outPath, "-") ? outPath : "stdout"));
    fprintf (stderr, "%d kbps ", bitrate[header->version][header->bitrate_index]);  // target bitrate
    //======================================================================================
    fprintf(Trace, "目标码率: %d kbps\n", bitrate[header->version][header->bitrate_index]);
    //========================================================================================
    fprintf (stderr, "%s ", version_names[header->version]);
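Record street noise and the original music with street noise added, then convert both recordings to .wav format (using an online audio format converter)
The output is shown below. In the trace, the labels 文件名, 采样率, and 目标码率 are the file name, sampling rate, and target bitrate written by print_config(). A comparison shows that the original music signal is fairly stationary, so most of its subbands transmit only one or two scale factors, whereas most subbands of the noise and of the noisy music transmit three scale factors:
Noise: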
文件名: '噪声.wav
采样率: 44.1 kHz
目标码率: 192 kbps
bits allocated in the frame No.1 is 5008
bit allocation:
subband[0]:4 bits
subband[1]:5 bits
subband[2]:3 bits
subband[3]:6 bits
subband[4]:6 bits
subband[5]:5 bits
subband[6]:5 bits
subband[7]:5 bits
subband[8]:5 bits
subband[9]:5 bits
subband[10]:4 bits
subband[11]:4 bits
subband[12]:4 bits
subband[13]:3 bits
subband[14]:4 bits
subband[15]:5 bits
subband[16]:4 bits
subband[17]:5 bits
subband[18]:4 bits
subband[19]:3 bits
subband[20]:3 bits
subband[21]:3 bits
subband[22]:1 bits
subband[23]:1 bits
subband[24]:0 bits
subband[25]:0 bits
subband[26]:0 bits
subband[27]:0 bits
subband[28]:0 bits
subband[29]:0 bits
scalefactors:
subband[0] scalefactors:14 32 63
subband[1] scalefactors:12 37 63
subband[2] scalefactors:15 33 63
subband[3] scalefactors:14 34 63
subband[4] scalefactors:15 40 63
subband[5] scalefactors:19 37 63
subband[6] scalefactors:19 41 63
subband[7] scalefactors:23 46 63
subband[8] scalefactors:22 40 63
subband[9] scalefactors:19 42 63
subband[10] scalefactors:19 42 63
subband[11] scalefactors:21 39 63
subband[12] scalefactors:22 42 63
subband[13] scalefactors:26 42 63
subband[14] scalefactors:23 41 63
subband[15] scalefactors:17 42 63
subband[16] scalefactors:18 36 63
subband[17] scalefactors:17 36 63
subband[18] scalefactors:15 40 63
subband[19] scalefactors:17 35 63
subband[20] scalefactors:16 36 63
subband[21] scalefactors:16 41 63
subband[22] scalefactors:20 38 63
subband[23] scalefactors:21 40 63
subband[24] scalefactors:25 42 63
subband[25] scalefactors:22 39 63
subband[26] scalefactors:19 41 63
subband[27] scalefactors:17 39 63
subband[28] scalefactors:18 35 63
subband[29] scalefactors:17 36 63
Original file:
文件名: 'test.wav
采样率: 44.1 kHz
目标码率: 192 kbps
bits allocated in the frame No.1 is 5008
bit allocation:
subband[0]:8 bits
subband[1]:8 bits
subband[2]:6 bits
subband[3]:8 bits
subband[4]:7 bits
subband[5]:8 bits
subband[6]:8 bits
subband[7]:6 bits
subband[8]:5 bits
subband[9]:6 bits
subband[10]:6 bits
subband[11]:7 bits
subband[12]:6 bits
subband[13]:6 bits
subband[14]:6 bits
subband[15]:5 bits
subband[16]:5 bits
subband[17]:5 bits
subband[18]:4 bits
subband[19]:6 bits
subband[20]:3 bits
subband[21]:3 bits
subband[22]:0 bits
subband[23]:0 bits
subband[24]:0 bits
subband[25]:0 bits
subband[26]:0 bits
subband[27]:0 bits
subband[28]:0 bits
subband[29]:0 bits
scalefactors:
subband[0] scalefactors:11 11 11
subband[1] scalefactors:12 12 12
subband[2] scalefactors:21 18 18
subband[3] scalefactors:25 25 25
subband[4] scalefactors:29 29 29
subband[5] scalefactors:28 23 26
subband[6] scalefactors:22 22 22
subband[7] scalefactors:21 21 21
subband[8] scalefactors:32 28 28
subband[9] scalefactors:34 30 30
subband[10] scalefactors:31 31 31
subband[11] scalefactors:30 30 26
subband[12] scalefactors:27 24 24
subband[13] scalefactors:23 23 23
subband[14] scalefactors:26 22 25
subband[15] scalefactors:30 25 25
subband[16] scalefactors:26 26 26
subband[17] scalefactors:29 29 29
subband[18] scalefactors:31 31 30
subband[19] scalefactors:26 26 26
subband[20] scalefactors:34 34 31
subband[21] scalefactors:34 31 31
subband[22] scalefactors:38 38 38
subband[23] scalefactors:39 50 50
subband[24] scalefactors:43 51 57
subband[25] scalefactors:41 54 54
subband[26] scalefactors:45 52 52
subband[27] scalefactors:42 54 54
subband[28] scalefactors:44 52 52
subband[29] scalefactors:43 52 52
Original file + noise:
文件名: '加噪.wav
采样率: 44.1 kHz
目标码率: 192 kbps
bits allocated in the frame No.1 is 5008
bit allocation:
subband[0]:4 bits
subband[1]:4 bits
subband[2]:3 bits
subband[3]:6 bits
subband[4]:5 bits
subband[5]:4 bits
subband[6]:5 bits
subband[7]:4 bits
subband[8]:5 bits
subband[9]:5 bits
subband[10]:4 bits
subband[11]:5 bits
subband[12]:4 bits
subband[13]:4 bits
subband[14]:4 bits
subband[15]:5 bits
subband[16]:3 bits
subband[17]:4 bits
subband[18]:4 bits
subband[19]:4 bits
subband[20]:3 bits
subband[21]:3 bits
subband[22]:3 bits
subband[23]:1 bits
subband[24]:0 bits
subband[25]:1 bits
subband[26]:0 bits
subband[27]:0 bits
subband[28]:0 bits
subband[29]:0 bits
scalefactors:
subband[0] scalefactors:14 32 63
subband[1] scalefactors:12 37 63
subband[2] scalefactors:15 33 63
subband[3] scalefactors:14 34 63
subband[4] scalefactors:15 40 63
subband[5] scalefactors:19 37 63
subband[6] scalefactors:19 41 63
subband[7] scalefactors:23 46 63
subband[8] scalefactors:22 40 63
subband[9] scalefactors:19 42 63
subband[10] scalefactors:19 42 63
subband[11] scalefactors:21 39 63
subband[12] scalefactors:22 42 63
subband[13] scalefactors:26 42 63
subband[14] scalefactors:23 41 63
subband[15] scalefactors:17 42 63
subband[16] scalefactors:18 36 63
subband[17] scalefactors:17 36 63
subband[18] scalefactors:15 40 63
subband[19] scalefactors:17 35 63
subband[20] scalefactors:16 36 63
subband[21] scalefactors:16 41 63
subband[22] scalefactors:20 38 63
subband[23] scalefactors:21 40 63
subband[24] scalefactors:25 42 63
subband[25] scalefactors:22 39 63
subband[26] scalefactors:19 41 63
subband[27] scalefactors:17 39 63
subband[28] scalefactors:18 35 63
subband[29] scalefactors:17 36 63