Error when reading WAV files from the TIMIT dataset: ValueError: File format b'NIST'... not understood.

Cause of the error

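Although the original TIMIT audio files end in .WAV, they are not actually WAV (RIFF) files. They are stored in the NIST SPHERE format, whose header begins with the bytes NIST_1A, which is why the error message shows b'NIST'. The fix is to convert them to real WAV files with the Python sphfile library.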
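One way to confirm this diagnosis before converting anything is to look at the first few bytes of a file. The sketch below is only an illustration: the helper name `sniff_audio_format` is made up for this post, and it assumes that a SPHERE file starts with `NIST_1A` and a standard WAV file starts with `RIFF`.

```python
SPHERE_MAGIC = b"NIST_1A"  # leading bytes of a NIST SPHERE header
RIFF_MAGIC = b"RIFF"       # leading bytes of a standard WAV file


def sniff_audio_format(path):
    """Return 'sphere', 'riff', or 'unknown' based on the file's leading bytes.

    Hypothetical helper for illustration; it only inspects the header
    and does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        head = f.read(len(SPHERE_MAGIC))
    if head.startswith(SPHERE_MAGIC):
        return "sphere"
    if head.startswith(RIFF_MAGIC):
        return "riff"
    return "unknown"
```

Calling `sniff_audio_format` on an untouched TIMIT file should return `'sphere'` if the explanation above applies to your copy of the dataset. In that case the conversion script below rewrites the files as RIFF WAV.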

import glob

from sphfile import SPHFile


def convert_split(pattern, label):
    # Collect every SPHERE file that matches the glob pattern.
    sph_files = glob.glob(pattern)
    print(len(sph_files), label, "utterances")
    for sph_path in sph_files:
        sph = SPHFile(sph_path)
        # Write a RIFF WAV copy next to the original, with a lowercase extension.
        # Note: on a case-insensitive file system, ".wav" and ".WAV" name the same file.
        sph.write_wav(sph_path.replace(".WAV", ".wav"))


if __name__ == "__main__":
    # Replace /path/to/timit with the root directory of your TIMIT copy.
    convert_split(r'/path/to/timit/TRAIN/*/*/*.WAV', "train")
    convert_split(r'/path/to/timit/TEST/*/*/*.WAV', "test")
    print("Completed")
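After the conversion it can be worth checking that a converted file really opens as a normal WAV. The following is a minimal sketch using only the standard-library `wave` module; the function name `describe_wav` and the file name in the usage note are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic parameters of a RIFF WAV file.

    Hypothetical helper for illustration; wave.open raises wave.Error
    if the file is not a valid RIFF WAV file.
    """
    with wave.open(path, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
            "sample_rate_hz": w.getframerate(),
            "frames": w.getnframes(),
        }
```

For example, `describe_wav("some_converted_file.wav")` on one of the new files. TIMIT is distributed as 16 kHz, 16-bit, single-channel audio, so a successfully converted file would be expected to report 1 channel, a sample width of 2 bytes, and a sample rate of 16000.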

References:
Blog post
sphfile official site
