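First, a note: what follows is not about algorithms for processing audio data. It is an explanation of the header information in a WAV file, plus some simple handling of it.
If you already know this thoroughly, you can skip it.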
---------------------------------------------------------------------
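Baidu Baike:
WAV is a sound file format developed by Microsoft. It is one of the waveform file formats used in multimedia and is based on the RIFF (Resource Interchange File Format) standard. The first four bytes of every WAV file are "RIFF". A WAV file consists of two main parts, a file header and a data body. The header is in turn divided into the RIFF/WAV file identification section and the audio data format description section, which holds the encoding parameters of the audio stream.
WAV imposes no hard requirement on how the audio stream is encoded. Besides PCM (Pulse Code Modulation), nearly any codec that supports the ACM specification can encode the audio stream of a WAV file; MP3, for example, can also be carried inside WAV. As long as the corresponding decoder is installed, these WAV files can be played.
On the Windows platform, PCM-encoded WAV is the best supported audio format, and practically all audio software can handle it. Because it can meet fairly high audio quality requirements, WAV is also the preferred format for music editing and production and is well suited to storing music material. For the same reason, PCM-encoded WAV serves as an intermediate format and is often used when converting between other encodings, for example from MP3 to WMA.
A WAV file can store data in many formats; the audio encoding most commonly used is pulse code modulation (PCM). Because the WAV format originated in the Windows/Intel environment, it stores values in little-endian byte order (least significant byte first).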
After reading the encyclopedia entry, what a WAV file is should be reasonably clear.
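To make the little-endian point concrete, here is a minimal sketch of how multi-byte header fields can be assembled from raw bytes regardless of the host CPU's byte order. The helper names `read_le16` and `read_le32` are made up for this illustration and do not come from any library.

```c
#include <stdint.h>

/* Assemble a little-endian 16-bit value: lowest byte comes first. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assemble a little-endian 32-bit value: lowest byte comes first. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, a sample rate of 44100 (0x0000AC44) is stored in the file as the four bytes 44 AC 00 00.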
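Now to the main topic.
1. Converting the WAV encoding format.
If you use XP or another version of Windows, the system most likely ships with the built-in Sound Recorder. With it you can easily convert the encoding of a WAV file, and there are plenty of options: sample rate, 8-bit or 16-bit, and so on.
This only helps if a manual conversion is enough, or if you are on a Windows PC, where the conversion can generally also be done in code. On other platforms, such as phones or cameras, you have to implement the conversion yourself.
If you are diligent, you should be able to find standard conversion code through Google or Baidu.
Here I'll take the simplest example: converting a u-law encoded WAV file into a PCM encoded WAV file. The relevant codec standard is G.711, and a Google code search turns up G.711 conversion source code.
Below is the G.711 source code. Source for other codecs, such as G.72x, can be found as well.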
/*
* This source code is a product of Sun Microsystems, Inc. and is provided
* for unrestricted use. Users may copy or modify this source code without
* charge.
*
* SUN SOURCE CODE IS PROVIDED AS IS WITH NO WARRANTIES OF ANY KIND INCLUDING
* THE WARRANTIES OF DESIGN, MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
* PURPOSE, OR ARISING FROM A COURSE OF DEALING, USAGE OR TRADE PRACTICE.
*
* Sun source code is provided with no support and without any obligation on
* the part of Sun Microsystems, Inc. to assist in its use, correction,
* modification or enhancement.
*
* SUN MICROSYSTEMS, INC. SHALL HAVE NO LIABILITY WITH RESPECT TO THE
* INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY THIS SOFTWARE
* OR ANY PART THEREOF.
*
* In no event will Sun Microsystems, Inc. be liable for any lost revenue
* or profits or other special, indirect and consequential damages, even if
* Sun has been advised of the possibility of such damages.
*
* Sun Microsystems, Inc.
* 2550 Garcia Avenue
* Mountain View, California 94043
*/
/*
* g711.c
*
* u-law, A-law and linear PCM conversions.
*/
#define SIGN_BIT (0x80) /* Sign bit for a A-law byte. */
#define QUANT_MASK (0xf) /* Quantization field mask. */
#define NSEGS (8) /* Number of A-law segments. */
#define SEG_SHIFT (4) /* Left shift for segment number. */
#define SEG_MASK (0x70) /* Segment field mask. */
static short seg_end[8] = {0xFF, 0x1FF, 0x3FF, 0x7FF,
0xFFF, 0x1FFF, 0x3FFF, 0x7FFF};
/* copy from CCITT G.711 specifications */
unsigned char _u2a[128] = { /* u- to A-law conversions */
1, 1, 2, 2, 3, 3, 4, 4,
5, 5, 6, 6, 7, 7, 8, 8,
9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24,
25, 27, 29, 31, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44,
46, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62,
64, 65, 66, 67, 68, 69, 70, 71,
72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99, 100, 101, 102, 103, 104,
105, 106, 107, 108, 109, 110, 111, 112,
113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128};
unsigned char _a2u[128] = { /* A- to u-law conversions */
1, 3, 5, 7, 9, 11, 13, 15,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31,
32, 32, 33, 33, 34, 34, 35, 35,
36, 37, 38, 39, 40, 41, 42, 43,
44, 45, 46, 47, 48, 48, 49, 49,
50, 51, 52, 53, 54, 55, 56, 57,
58, 59, 60, 61, 62, 63, 64, 64,
65, 66, 67, 68, 69, 70, 71, 72,
73, 74, 75, 76, 77, 78, 79, 79,
80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127};
static int
search(
    int val,
    short *table,
    int size)
{
    int i;

    for (i = 0; i < size; i++) {
        if (val <= *table++)
            return (i);
    }
    return (size);
}
/*
* linear2alaw() - Convert a 16-bit linear PCM value to 8-bit A-law
*
* linear2alaw() accepts an 16-bit integer and encodes it as A-law data.
*
* Linear Input Code Compressed Code
* ------------------------ ---------------
* 0000000wxyza 000wxyz
* 0000001wxyza 001wxyz
* 000001wxyzab 010wxyz
* 00001wxyzabc 011wxyz
* 0001wxyzabcd 100wxyz
* 001wxyzabcde 101wxyz
* 01wxyzabcdef 110wxyz
* 1wxyzabcdefg 111wxyz
*
* For further information see John C. Bellamy's Digital Telephony, 1982,
* John Wiley & Sons, pps 98-111 and 472-476.
*/
unsigned char
linear2alaw(
    int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char aval;

    if (pcm_val >= 0) {
        mask = 0xD5; /* sign (7th) bit = 1 */
    } else {
        mask = 0x55; /* sign bit = 0 */
        pcm_val = -pcm_val - 8;
        if (pcm_val < 0) /* inputs -1..-7 would otherwise stay negative */
            pcm_val = 0;
    }

    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);

    /* Combine the sign, segment, and quantization bits. */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        aval = seg << SEG_SHIFT;
        if (seg < 2)
            aval |= (pcm_val >> 4) & QUANT_MASK;
        else
            aval |= (pcm_val >> (seg + 3)) & QUANT_MASK;
        return (aval ^ mask);
    }
}
/*
* alaw2linear() - Convert an A-law value to 16-bit linear PCM
*
*/
int
alaw2linear(
    unsigned char a_val)
{
    int t;
    int seg;

    a_val ^= 0x55;
    t = (a_val & QUANT_MASK) << 4;
    seg = ((unsigned)a_val & SEG_MASK) >> SEG_SHIFT;
    switch (seg) {
    case 0:
        t += 8;
        break;
    case 1:
        t += 0x108;
        break;
    default:
        t += 0x108;
        t <<= seg - 1;
    }
    return ((a_val & SIGN_BIT) ? t : -t);
}
#define BIAS (0x84) /* Bias for linear code. */
/*
* linear2ulaw() - Convert a linear PCM value to u-law
*
* In order to simplify the encoding process, the original linear magnitude
* is biased by adding 33 which shifts the encoding range from (0 - 8158) to
* (33 - 8191). The result can be seen in the following encoding table:
*
* Biased Linear Input Code Compressed Code
* ------------------------ ---------------
* 00000001wxyza 000wxyz
* 0000001wxyzab 001wxyz
* 000001wxyzabc 010wxyz
* 00001wxyzabcd 011wxyz
* 0001wxyzabcde 100wxyz
* 001wxyzabcdef 101wxyz
* 01wxyzabcdefg 110wxyz
* 1wxyzabcdefgh 111wxyz
*
* Each biased linear code has a leading 1 which identifies the segment
* number. The value of the segment number is equal to 7 minus the number
* of leading 0's. The quantization interval is directly available as the
* four bits wxyz. The trailing bits (a - h) are ignored.
*
* Ordinarily the complement of the resulting code word is used for
* transmission, and so the code word is complemented before it is returned.
*
* For further information see John C. Bellamy's Digital Telephony, 1982,
* John Wiley & Sons, pps 98-111 and 472-476.
*/
unsigned char
linear2ulaw(
    int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char uval;

    /* Get the sign and the magnitude of the value. */
    if (pcm_val < 0) {
        pcm_val = BIAS - pcm_val;
        mask = 0x7F;
    } else {
        pcm_val += BIAS;
        mask = 0xFF;
    }

    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);

    /*
     * Combine the sign, segment, quantization bits;
     * and complement the code word.
     */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        uval = (seg << 4) | ((pcm_val >> (seg + 3)) & 0xF);
        return (uval ^ mask);
    }
}
/*
* ulaw2linear() - Convert a u-law value to 16-bit linear PCM
*
* First, a biased linear code is derived from the code word. An unbiased
* output can then be obtained by subtracting 33 from the biased code.
*
* Note that this function expects to be passed the complement of the
* original code word. This is in keeping with ISDN conventions.
*/
int
ulaw2linear(
    unsigned char u_val)
{
    int t;

    /* Complement to obtain normal u-law value. */
    u_val = ~u_val;

    /*
     * Extract and bias the quantization bits. Then
     * shift up by the segment number and subtract out the bias.
     */
    t = ((u_val & QUANT_MASK) << 3) + BIAS;
    t <<= ((unsigned)u_val & SEG_MASK) >> SEG_SHIFT;

    return ((u_val & SIGN_BIT) ? (BIAS - t) : (t - BIAS));
}
/* A-law to u-law conversion */
unsigned char
alaw2ulaw(
    unsigned char aval)
{
    aval &= 0xff;
    return ((aval & 0x80) ? (0xFF ^ _a2u[aval ^ 0xD5]) :
        (0x7F ^ _a2u[aval ^ 0x55]));
}
/* u-law to A-law conversion */
unsigned char
ulaw2alaw(
    unsigned char uval)
{
    uval &= 0xff;
    return ((uval & 0x80) ? (0xD5 ^ (_u2a[0xFF ^ uval] - 1)) :
        (0x55 ^ (_u2a[0x7F ^ uval] - 1)));
}
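With this file, the data side of the conversion is taken care of. The next problem is that raw sample data alone is not enough: no player can tell what encoding or sample rate the data uses, so it cannot be played.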
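As a quick illustration of the data side, the sketch below decodes a buffer of u-law bytes into 16-bit PCM samples. `ulaw2linear()` repeats the logic of the function of the same name in g711.c above so that the snippet compiles on its own; `ulaw_buffer_to_pcm16` is a hypothetical helper name invented for this example.

```c
#include <stddef.h>

#define BIAS 0x84 /* same bias as in g711.c */

/* Same logic as ulaw2linear() in g711.c, repeated to keep the sketch standalone. */
static int ulaw2linear(unsigned char u_val)
{
    int t;

    u_val = (unsigned char)~u_val;      /* undo the transmission complement */
    t = ((u_val & 0x0F) << 3) + BIAS;   /* quantization bits plus bias */
    t <<= (u_val & 0x70) >> 4;          /* shift up by the segment number */
    return (u_val & 0x80) ? (BIAS - t) : (t - BIAS);
}

/* Hypothetical helper: decode n u-law bytes into n 16-bit PCM samples. */
static void ulaw_buffer_to_pcm16(const unsigned char *in, short *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (short)ulaw2linear(in[i]);
}
```

Each input byte produces one 16-bit sample, so the decoded data occupies twice as many bytes as the u-law data. That size change is what the header conversion has to account for.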
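Descriptions of the WAV header can also be found online; one fairly detailed write-up is available on Baidu Wenku.
Comparing the various code samples and header documents online, I found them to be all over the place, each one different from the next.
If you are interested, search for the WAV header format and have a look.
Officially, WAV files follow the RIFF file format standard:
The figure below is a diagram I drew according to my own understanding.
Referring to the figure above,
the WAV header layout can be written out as follows: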
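RootID: 4 bytes, "RIFF"
Size: 4 bytes, size of the Data part that follows (the file size minus 8 bytes)
Data: contains Format and all of the chunks below.
Format: 4 bytes, "WAVE"
ChunkID: 4 bytes, "fmt " (note the trailing space)
ChunkSize: 4 bytes, size of the Data part of this chunk
ChunkData:
audioFormat: 2 bytes, form of compression (0x01 is PCM, 0x07 is u-law)
numChannels: 2 bytes, number of channels
sampleRate: 4 bytes, sample rate: 8000, 44100, etc.
byteRate: 4 bytes, value = SampleRate * NumChannels * BitsPerSample / 8
blockAlign: 2 bytes, value = NumChannels * BitsPerSample / 8
bitsPerSample: 2 bytes. In u-law files it is typically followed by a 2-byte extension size field, so this part takes 4 bytes and the fmt chunk data is 18 bytes instead of 16.
ChunkID: 4 bytes, "data"
ChunkSize: 4 bytes, size of the Data part of this chunk
ChunkData: the audio data.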
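This is the simplest form of the WAV header. Some files carry additional chunks between fmt and data, for example the "fact" chunk in u-law files, which records the number of samples (for 8-bit mono u-law this equals the size of the data chunk in bytes).
PCM files can carry extra chunks in the same way, with slight differences.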
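Because extra chunks such as "fact" may sit between fmt and data, a reader should walk the chunk list rather than assume fixed offsets. The following is a minimal sketch that works on a file already loaded into memory; `find_chunk` and `le32` are hypothetical helper names, and the pad-byte handling follows the RIFF rule that chunks are aligned to even sizes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the offset of the payload of the first chunk
 * whose 4-character ID equals `id`, and store its size in *size_out.
 * Returns 0 if the buffer is not RIFF/WAVE or the chunk is not found.
 */
static size_t find_chunk(const unsigned char *buf, size_t len,
                         const char *id, uint32_t *size_out)
{
    size_t pos = 12; /* skip "RIFF", the size field and "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return 0;

    while (pos + 8 <= len) {
        uint32_t size = le32(buf + pos + 4);

        if (memcmp(buf + pos, id, 4) == 0) {
            *size_out = size;
            return pos + 8;
        }
        if (size > len - pos - 8)
            return 0; /* truncated or corrupt chunk */
        /* Skip header, payload, and the pad byte after odd-sized payloads. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return 0;
}
```

Calling `find_chunk(buf, len, "data", &size)` then yields the start of the audio data whether or not a "fact" chunk precedes it.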
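So when converting between encodings, pay attention to converting the header information too. For example, an 8-bit, 44.1 kHz u-law WAV file becomes a 16-bit, 44.1 kHz PCM WAV file after conversion.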
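To show which header fields change, here is a minimal sketch that builds the plain 44-byte header of a 16-bit PCM file from the parameters of the u-law source. It is not the author's method: it deliberately omits optional chunks such as "fact", and `make_pcm16_header`, `put_le16` and `put_le32` are names invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value in little-endian order. */
static void put_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

/* Store a 32-bit value in little-endian order. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/*
 * Hypothetical helper: fill h with a minimal RIFF/WAVE header for the
 * 16-bit PCM data obtained by decoding ulaw_bytes bytes of 8-bit u-law.
 */
static void make_pcm16_header(unsigned char h[44], uint16_t channels,
                              uint32_t sample_rate, uint32_t ulaw_bytes)
{
    const uint16_t bits = 16;
    const uint16_t block_align = (uint16_t)(channels * bits / 8);
    const uint32_t byte_rate = sample_rate * block_align;
    const uint32_t data_bytes = ulaw_bytes * 2; /* 1 u-law byte -> 2 PCM bytes */

    memcpy(h, "RIFF", 4);
    put_le32(h + 4, 36 + data_bytes);   /* everything after this field */
    memcpy(h + 8, "WAVE", 4);
    memcpy(h + 12, "fmt ", 4);
    put_le32(h + 16, 16);               /* PCM fmt chunk data is 16 bytes */
    put_le16(h + 20, 1);                /* audioFormat: 0x01 = PCM */
    put_le16(h + 22, channels);
    put_le32(h + 24, sample_rate);      /* the sample rate does not change */
    put_le32(h + 28, byte_rate);
    put_le16(h + 32, block_align);
    put_le16(h + 34, bits);
    memcpy(h + 36, "data", 4);
    put_le32(h + 40, data_bytes);
}
```

For a mono 44.1 kHz source, byteRate goes from 44100 to 88200 and blockAlign from 1 to 2, while the data chunk size doubles.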
Below is a reference implementation.
static int ConvertFromULAW2PCM(FILE *input, FILE *output)
{
    int Ret = Ret_OK;
    int i = 0, j = 0;
    WavHeader_ULAW sULAWHead;
    WavHeader_PCM sPCMHead;
    unsigned char tULaw;
    short tPcm;

    // Extend wav format
    unsigned char *pBuffer = NULL;
    WavHead_ChunkHead *pSChunkHead = NULL;
    int iBufferSize = 0;
    int iChunkCount = 0;

    // malloc the memory to save the extend info
    pSChunkHead = (WavHead_ChunkHead *) malloc(sizeof(WavHead_ChunkHead) * ExtendChunkSize);
    pBuffer = (unsigned char *) malloc(sizeof(unsigned char) * ExtendChunkDataSize);

    // malloc check
    if (NULL != pBuffer && NULL != pSChunkHead) {
        // start to read the first data
        fread(&sULAWHead, sizeof(WavHeader_ULAW), 1, input);
#if (Debug_Show <= Debug_Level5)
        printf("-----------Read File's Stand info : \n");
        showWavHeadRiff(&sULAWHead.mWavRiff);
        showWavHeadFmt((WavHead_PCM_Fmt *)&sULAWHead.mWavFmt);
        printf("-----------Read File's Extend info : \n");
#endif
        for (i = 0; i < ExtendChunkSize && sULAWHead.mWavData.subchunk2ID != WAV_DATA; i++) {
            // copy the chunk info
            pSChunkHead[i].chunkID = sULAWHead.mWavData.subchunk2ID;
            pSChunkHead[i].chunkSize = sULAWHead.mWavData.subchunk2Size;
            // copy the chunk data
            if ((iBufferSize + pSChunkHead[i].chunkSize) <= ExtendChunkDataSize) {
                fread(pBuffer + iBufferSize, sizeof(char) * pSChunkHead[i].chunkSize, 1, input);
            } else {
                printf(" the extend info out of memory --->fail!\n");
                Ret = Ret_Fail;
                break;
            }
#if (Debug_Show <= Debug_Level5)
            showWavHeadChunk(&pSChunkHead[i], &pBuffer[iBufferSize]);
#endif
            iBufferSize += pSChunkHead[i].chunkSize;
            iChunkCount++;
            // read next chunk head
            fread(&sULAWHead.mWavData, sizeof(WavHead_Data), 1, input);
        }
#if (Debug_Show <= Debug_Level1)
        showWavHeadChunkData(pBuffer, iBufferSize);
#endif
        //---------------file head convert-------------------
        if (0 == Ret) {
            sPCMHead.mWavRiff.chunkID = sULAWHead.mWavRiff.chunkID;
            sPCMHead.mWavRiff.format = sULAWHead.mWavRiff.format;
            sPCMHead.mWavRiff.chunkSize = sULAWHead.mWavRiff.chunkSize +
                sULAWHead.mWavData.subchunk2Size - 2;
            // copy the common Fmt info
            copyWavHeadFmt(&sPCMHead.mWavFmt.commFmt, &sULAWHead.mWavFmt.commFmt);
            sPCMHead.mWavFmt.commFmt.subchunk1Size = 0x10; // -2BYTE
            sPCMHead.mWavFmt.commFmt.audioFormat = 0x01;   // pcm code
            sPCMHead.mWavFmt.bitsPerSample = 16;           // 16 bit pcm
            sPCMHead.mWavFmt.commFmt.byteRate = sPCMHead.mWavFmt.commFmt.sampleRate *
                sPCMHead.mWavFmt.bitsPerSample * sPCMHead.mWavFmt.commFmt.numChannels / 8;
            sPCMHead.mWavFmt.commFmt.blockAlign = sPCMHead.mWavFmt.bitsPerSample *
                sPCMHead.mWavFmt.commFmt.numChannels / 8;
            sPCMHead.mWavData.subchunk2ID = sULAWHead.mWavData.subchunk2ID;
            sPCMHead.mWavData.subchunk2Size = sULAWHead.mWavData.subchunk2Size +
                sULAWHead.mWavData.subchunk2Size;
#if (Debug_Show <= Debug_Level1)
            printf("-----------Write File's Stand info : \n");
            showWavHeadRiff(&sPCMHead.mWavRiff);
            showWavHeadFmt(&sPCMHead.mWavFmt);
#endif
            //---------------file head write-------------------
            fwrite(&sPCMHead.mWavRiff, sizeof(WavHead_Riff), 1, output);
            fwrite(&sPCMHead.mWavFmt, sizeof(WavHead_PCM_Fmt), 1, output);
            for (i = 0, j = 0; i < iChunkCount && j < iBufferSize; i++) {
                fwrite(&pSChunkHead[i], sizeof(WavHead_ChunkHead), 1, output);
                fwrite(pBuffer + j, sizeof(unsigned char) * pSChunkHead[i].chunkSize, 1, output);
                j += pSChunkHead[i].chunkSize;
            }
            fwrite(&sPCMHead.mWavData, sizeof(WavHead_Data), 1, output);
            printf(" write extend info --->OK!\n");
            //---------------convert the data------------------
            while (0 != fread(&tULaw, sizeof(unsigned char), 1, input)) {
                // Decode a buffer of u-Law values into 16 bit uniform PCM values
                tPcm = ulaw2linear(tULaw);
                fwrite(&tPcm, sizeof(short), 1, output);
            }
        } else {
            // to return the value.
        }
    } else {
        Ret = Ret_Fail;
    }
    //---------------free the memory------------------
    free(pSChunkHead);
    free(pBuffer);
    return Ret;
}
Finally, one point to watch: when converting the encoded data, be sure to read and write strictly byte by byte.
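One way to follow that advice when writing the output is to serialize each 16-bit sample into two bytes explicitly instead of dumping a `short` or a whole struct, which would depend on the host's byte order and struct padding. This is only a sketch; `pcm16_to_le_bytes` is a hypothetical helper name.

```c
#include <stddef.h>

/*
 * Hypothetical helper: turn n 16-bit samples into 2*n bytes in the
 * little-endian order that WAV expects. Returns the number of bytes written.
 */
static size_t pcm16_to_le_bytes(const short *samples, size_t n,
                                unsigned char *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Keep the low 16 bits of the two's complement representation. */
        unsigned int v = (unsigned int)samples[i] & 0xFFFFu;

        out[2 * i]     = (unsigned char)(v & 0xFF);        /* low byte first */
        out[2 * i + 1] = (unsigned char)((v >> 8) & 0xFF); /* then high byte */
    }
    return 2 * n;
}
```

The resulting byte buffer can then be passed to `fwrite` with an element size of 1, which produces the same file on little-endian and big-endian hosts.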
Here is also a list of encoding format codes:
#define wave_format_g723_adpcm 0x0014 /* antex electronics corporation */
#define wave_format_antex_adpcme 0x0033 /* antex electronics corporation */
#define wave_format_g721_adpcm 0x0040 /* antex electronics corporation */
#define wave_format_aptx 0x0025 /* audio processing technology */
#define wave_format_audiofile_af36 0x0024 /* audiofile, inc. */
#define wave_format_audiofile_af10 0x0026 /* audiofile, inc. */
#define wave_format_control_res_vqlpc 0x0034 /* control resources limited */
#define wave_format_control_res_cr10 0x0037 /* control resources limited */
#define wave_format_creative_adpcm 0x0200 /* creative labs, inc */
#define wave_format_dolby_ac2 0x0030 /* dolby laboratories */
#define wave_format_dspgroup_truespeech 0x0022 /* dsp group, inc */
#define wave_format_digistd 0x0015 /* dsp solutions, inc. */
#define wave_format_digifix 0x0016 /* dsp solutions, inc. */
#define wave_format_digireal 0x0035 /* dsp solutions, inc. */
#define wave_format_digiadpcm 0x0036 /* dsp solutions, inc. */
#define wave_format_echosc1 0x0023 /* echo speech corporation */
#define wave_format_fm_towns_snd 0x0300 /* fujitsu corp. */
#define wave_format_ibm_cvsd 0x0005 /* ibm corporation */
#define wave_format_oligsm 0x1000 /* ing c. olivetti & c., s.p.a. */
#define wave_format_oliadpcm 0x1001 /* ing c. olivetti & c., s.p.a. */
#define wave_format_olicelp 0x1002 /* ing c. olivetti & c., s.p.a. */
#define wave_format_olisbc 0x1003 /* ing c. olivetti & c., s.p.a. */
#define wave_format_oliopr 0x1004 /* ing c. olivetti & c., s.p.a. */
#define wave_format_ima_adpcm (wave_format_dvi_adpcm) /* intel corporation */
#define wave_format_dvi_adpcm 0x0011 /* intel corporation */
#define wave_format_unknown 0x0000 /* microsoft corporation */
#define wave_format_pcm 0x0001 /* microsoft corporation */
#define wave_format_adpcm 0x0002 /* microsoft corporation */
#define wave_format_alaw 0x0006 /* microsoft corporation */
#define wave_format_mulaw 0x0007 /* microsoft corporation */
#define wave_format_gsm610 0x0031 /* microsoft corporation */
#define wave_format_mpeg 0x0050 /* microsoft corporation */
#define wave_format_nms_vbxadpcm 0x0038 /* natural microsystems */
#define wave_format_oki_adpcm 0x0010 /* oki */
#define wave_format_sierra_adpcm 0x0013 /* sierra semiconductor corp */
#define wave_format_sonarc 0x0021 /* speech compression */
#define wave_format_mediaspace_adpcm 0x0012 /* videologic */
#define wave_format_yamaha_adpcm 0x0020 /* yamaha corporation of america */
Below is test code I wrote for converting WAV files between u-law and PCM encoding in both directions.
http://download.csdn.net/detail/gqjjqg/3685230