Converting SPH audio to other formats, then using sox to produce suitable WAV audio (brief notes)

SPHERE conversion tool: a program for converting sph audio files to other formats


  • Tool overview:
    sph2pipe can:

  • run on UNIX systems (should also work on MacOS X, via its unix
    shell/command-line interface, using the “Terminal” utility)

  • provide SPHERE-formatted output as well as RIFF, AU, AIFF and raw

  • handle raw sample data as input, using a SPHERE header stored in a
    separate file.

  • trim off the beginning and/or end of the input data, to output just
    a user-specified segment based on either time or sample offsets
    (sph_convert always outputs the entire file)

  • write the output data to stdout, for redirection to any named file,
    or to a pipeline process (sph_convert always writes the data to a
    new file, with a name derived automatically from the input file)

  • support input and output of A-law speech data

  1. Installation:

Download the source archive and unpack it. It contains 3 *.c files and 3 *.h
files; compile them to produce the sph2pipe executable.

 -- then:

     cd sph2pipe_v2.4

     gcc -o sph2pipe *.c -lm     ## on unix
     gcc -o sph2pipe.exe *.c -lm ## on wintel, using the djgpp compiler

That's it -- no configuration scripts, makefiles or special libraries
are needed (the source code consists of just 3 *.c files, and 3 *.h
files; the standard math library is needed for compilation).  Put the
resulting "sph2pipe" executable in your path and start using it.  If you
don't have gcc, try whatever C compiler you do have; you might need to
change a few details in sph_convert.h, but we hope the code is generic
enough (POSIX compliant) to work anywhere.
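Once the executable is in place, a quick check like the one below confirms whether the shell can actually find it. This is only a minimal sketch; the helper name `have_cmd` is made up for illustration and is not part of sph2pipe.

```shell
# have_cmd NAME: succeed if NAME resolves to something the shell can run.
# (Hypothetical helper for illustration, not part of sph2pipe.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd sph2pipe; then
    echo "sph2pipe found at: $(command -v sph2pipe)"
else
    echo "sph2pipe is not on PATH yet"
fi
```

If the second message appears, the directory holding the executable still has to be added to PATH.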

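  2. Add the directory containing the generated sph2pipe executable to the current user's PATH so that the command can be run from any location:
vim ~/.bashrc
export PATH=/home/zql/soft/pythonsoft/sph2pipe_v2.5:$PATH
Then run source so that the change takes effect immediately:
source ~/.bashrc
  3. Running sph2pipe with no arguments prints the usage: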
Usage: sph2pipe [-h hdr] [-t|-s b:e] [-c 1|2] [-p|-u|-a] [-f typ] infile [outfile]

   default conditions (for 'sph2pipe infile'):
       * input file contains sphere header
       * output full duration of input file
       * output all channels from input file
       * output same sample coding as input file
       * output format is WAV on Wintel machines, SPH elsewhere
       * output is written to stdout

   optional controls (items bracketed separately above can be combined):
       -h hdr -- treat infile as headerless, get sphere info from file 'hdr'
       -t b:e -- output portion between b and e sec (floating point)
       -s b:e -- output portion between b and e samples (integer)
       -c 1   -- only output first channel
       -c 2   -- only output second channel
       -p     -- force conversion to 16-bit linear pcm
       -u     -- force conversion to 8-bit ulaw
       -a     -- force conversion to 8-bit alaw
       -f typ -- select alternate output header format 'typ'
                 five types: sph, raw, au, rif(wav), aif(mac)
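The -t and -s options describe the same kind of segment in two units (seconds versus samples). As a rough sketch, assuming an 8000 Hz input, a time offset can be turned into a sample offset like this; `sec_to_samples` is a hypothetical helper, and the exact boundary handling inside sph2pipe is not checked here.

```shell
# sec_to_samples SECONDS RATE: convert a time offset to a sample offset,
# rounding to the nearest sample. (Hypothetical helper for illustration.)
sec_to_samples() {
    awk -v t="$1" -v r="$2" 'BEGIN { printf "%d\n", t * r + 0.5 }'
}

rate=8000                        # assumed sample rate of the input file
b=$(sec_to_samples 1.5 "$rate")  # start offset in samples
e=$(sec_to_samples 3 "$rate")    # end offset in samples

# These two command lines should request roughly the same segment:
echo "sph2pipe -t 1.5:3 -p -f rif test.sph seg.wav"
echo "sph2pipe -s $b:$e -p -f rif test.sph seg.wav"
```

The commands are only printed, not executed, so the sketch runs even where sph2pipe is not installed.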


  • Direct conversion fails: even though the output file name ends in .wav, the output is still SPH by default, for the following reason:
  • output format is WAV on Wintel machines, SPH elsewhere
sph2pipe test.sph 1.wav
Inspecting the header of 1.wav with soxi then shows that the file is still SPH, and that sox does not support SPH with this A-law coding:
soxi 1.wav 
soxi FAIL formats: can't open input file `1.wav': sph: unsupported coding `alaw'
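The file extension says nothing about the container; the first bytes do. SPHERE files begin with the ASCII tag NIST_1A and RIFF/WAV files begin with RIFF, so a small check like the following can tell them apart. This is a sketch only, and `header_kind` is a hypothetical helper name.

```shell
# header_kind FILE: report the container type from the first four bytes.
# SPHERE headers start with "NIST_1A"; RIFF/WAV files start with "RIFF".
# (Hypothetical helper for illustration.)
header_kind() {
    magic=$(head -c 4 "$1" 2>/dev/null)
    case "$magic" in
        NIST) echo sphere ;;
        RIFF) echo riff ;;
        *)    echo unknown ;;
    esac
}

# Usage: header_kind 1.wav
```

For the 1.wav produced without -f rif, this should report sphere, which matches the soxi error above.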
  • -f typ – select alternate output header format ‘typ’
    five types: sph, raw, au, rif(wav), aif(mac)
sph2pipe  -f rif test.sph 2.wav

soxi 2.wav 

Input File     : '2.wav'
Channels       : 2
Sample Rate    : 8000
Precision      : 13-bit
Duration       : 00:09:59.48 = 4795872 samples ~ 44961.3 CDDA sectors
File Size      : 9.59M
Bit Rate       : 128k
Sample Encoding: 8-bit A-law
  • -p – force conversion to 16-bit linear pcm
sph2pipe -p -f rif test.sph 3.wav
soxi 3.wav 

Input File     : '3.wav'
Channels       : 2
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:09:59.48 = 4795872 samples ~ 44961.3 CDDA sectors
File Size      : 19.2M
Bit Rate       : 256k
Sample Encoding: 16-bit Signed Integer PCM
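The bit rate and file size in the two soxi reports follow directly from the sample rate, channel count and sample width. A small worked check, using only the numbers printed above:

```shell
rate=8000          # Sample Rate from soxi
channels=2         # Channels from soxi
samples=4795872    # samples per channel, from the Duration line

for bits in 8 16; do
    bitrate=$((rate * channels * bits))        # bits per second
    bytes=$((samples * channels * bits / 8))   # audio payload, header excluded
    echo "$bits-bit: bit rate $bitrate b/s, data size $bytes bytes"
done
echo "duration: $((samples / rate)) s (whole seconds)"
```

The 8-bit A-law case gives 128000 b/s and about 9.59 MB, and the 16-bit PCM case gives 256000 b/s and about 19.2 MB, so -p doubles the size without changing the duration.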

  • -c 1 – only output first channel
  • -c 2 – only output second channel
When a channel is specified, the sph2pipe output contains essentially only the first channel's sound, so WAV files produced this way are not what I want:
sph2pipe  -c 1  -f rif test.sph c1.wav
sph2pipe  -c 2  -f rif test.sph c2.wav

sph2pipe  -c 1 -p  -f rif test.sph c1_16bit.wav
sph2pipe  -c 2 -p  -f rif test.sph c2_16bit.wav

The audio format I need is: mono, 16 kHz or 8 kHz sample rate, 16-bit or 8-bit samples.

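  1. So first use sph2pipe to convert to a 16-bit two-channel WAV. By default the output keeps the A-law compressed coding, whose precision does not meet the requirement, and the sample rate stays at the default 8 kHz;
  2. Then use sox to convert that file to a mono 8 kHz WAV or to a mono 16 kHz WAV. The three commands are: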
sph2pipe -p -f rif test.sph 3.wav
sox 3.wav -c 1 3-1.wav
sox  3.wav  -r 16000 -c 1 3-1_16k.wav
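Because sph2pipe writes to stdout when no output file is given, the intermediate 3.wav can probably be skipped by piping straight into sox. The sketch below assumes sox accepts a WAV stream on stdin via `-t wav -`; the function name `sph_to_mono` is hypothetical.

```shell
# sph_to_mono IN.sph OUT.wav [RATE]: decode to 16-bit PCM WAV on stdout and
# let sox reduce it to one channel, resampling only when RATE is given.
# (Hypothetical helper for illustration.)
sph_to_mono() {
    src=$1
    dst=$2
    rate=${3:-}
    if [ -n "$rate" ]; then
        sph2pipe -p -f rif "$src" | sox -t wav - -r "$rate" -c 1 "$dst"
    else
        sph2pipe -p -f rif "$src" | sox -t wav - -c 1 "$dst"
    fi
}

# Usage:
#   sph_to_mono test.sph 3-1.wav            # mono, original 8 kHz
#   sph_to_mono test.sph 3-1_16k.wav 16000  # mono, resampled to 16 kHz
```

Note that `-c 1` on the output mixes both channels into one; if only one side of the recording is wanted, the sox `remix 1` effect is the usual alternative.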


Step 1: use sph2pipe to convert the SPH audio to two-channel, 8 kHz, 16-bit PCM WAV

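# Run example: ./c1_16bit_16k.sh  /home/zql/project/data/myTestdata/test1 /home/zql/project/data/myTestdata/test2 
# $1 is the input directory of .sph files, $2 is the output directory for .wav files
sph() {
        echo "first arg is $1"
        echo "second arg is $2"
        for file in "$1"/*.sph; do
                echo "$file"
                # strip the directory and the .sph suffix
                filename=$(basename "$file" .sph)
                #echo ${filename}
                sph2pipe -p -f rif "$file" "$2/${filename}.wav"
        done
}
sph "$1" "$2"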

Step 2: convert two-channel 8 kHz 16-bit to mono 8 kHz 16-bit

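# cmd example: ./sox_channel_1.sh /home/zql/project/data/myTestdata/c2_16bit_8k /home/zql/project/data/myTestdata/c1_16bit_8k

# $1 is the input directory of two-channel .wav files, $2 is the output directory
sox_precision() {
        echo "first arg is $1"
        echo "second arg is $2"
        for file in "$1"/*.wav; do
                echo "$file"
                # keep the same file name in the output directory
                filename=$(basename "$file")
                echo "$filename"
                sox "$file" -c 1 "$2/$filename"
        done
}
sox_precision "$1" "$2"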

Step 3: convert two-channel 8 kHz 16-bit to mono 16 kHz 16-bit

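# cmd example: ./sox_channel_1_16k.sh /home/data/myTestdata/c2_16bit_8k /home/data/myTestdata/c1_16bit_16k

# $1 is the input directory of two-channel .wav files, $2 is the output directory
sox_precision() {
        echo "first arg is $1"
        echo "second arg is $2"
        for file in "$1"/*.wav; do
                echo "$file"
                # keep the same file name in the output directory
                filename=$(basename "$file")
                echo "$filename"
                sox "$file" -r 16000 -c 1 "$2/$filename"
        done
}
sox_precision "$1" "$2"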
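The separate steps can also be folded into a single pass per file. The following is a sketch under the same assumptions as the scripts above (sph2pipe and sox on PATH, one .sph file per recording); `convert_dir` and the `.stereo.wav` temporary name are hypothetical choices made for illustration.

```shell
# convert_dir SRC_DIR DST_DIR RATE: for every SRC_DIR/*.sph, write a mono
# 16-bit WAV at RATE Hz into DST_DIR. (Hypothetical helper for illustration.)
convert_dir() {
    src_dir=$1
    dst_dir=$2
    rate=$3
    mkdir -p "$dst_dir" || return 1
    count=0
    for f in "$src_dir"/*.sph; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=$(basename "$f" .sph)
        tmp="$dst_dir/$name.stereo.wav"    # intermediate two-channel file
        sph2pipe -p -f rif "$f" "$tmp" &&
            sox "$tmp" -r "$rate" -c 1 "$dst_dir/$name.wav" &&
            count=$((count + 1))
        rm -f "$tmp"                       # drop the intermediate either way
    done
    echo "converted $count file(s)"
}

# Usage: convert_dir /path/to/sph_dir /path/to/wav_dir 16000
```

Quoting the paths keeps the loop working when directory or file names contain spaces, and the `-e` test avoids calling sph2pipe on the literal pattern when the input directory has no .sph files.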

The sox tool

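I originally wanted to see whether sox alone could convert SPH to WAV, but on my audio the sox command fails as shown below: it does not support this A-law compressed coding. This batch of data is probably all A-law compressed, so sox by itself is not suitable for it.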

 sox test.sph  1.wav
sox FAIL formats: can't open input file `test.sph': sph: unsupported coding `alaw'
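Whether a file will hit this error can be checked in advance by reading the `sample_coding` field from the SPHERE header. The sketch below assumes the common layout of a 1024-byte ASCII header with one `name -type value` entry per line; `sphere_field` and `coding_is_alaw` are hypothetical helpers, and only the `alaw` failure observed in this note is covered.

```shell
# sphere_field FILE NAME: print the value of header field NAME.
# Assumes a 1024-byte ASCII header made of "name -type value" lines.
# (Hypothetical helper for illustration.)
sphere_field() {
    head -c 1024 "$1" | awk -v key="$2" '$1 == key { print $3; exit }'
}

# coding_is_alaw FILE: succeed if the header declares A-law sample coding.
coding_is_alaw() {
    case "$(sphere_field "$1" sample_coding)" in
        alaw*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage:
#   sphere_field test.sph sample_coding
#   coding_is_alaw test.sph && echo "decode with sph2pipe -p -f rif first"
```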
