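Implementation of the rate-changing class RateTransposer
Let us return to the SoundTouch member function void SoundTouch::putSamples(const SAMPLETYPE *samples, uint nSamples). Once a SoundTouch object has been defined, simply calling this member function is enough to have the audio processed. Its call signature is straightforward: the first parameter, SAMPLETYPE *samples, points to a buffer of PCM-encoded wave data; the second, uint nSamples, is the number of samples that buffer contains. How to compute the sample count was discussed earlier, so it is not repeated here.
First, a look at its implementation: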
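As a quick reminder of the sample-count arithmetic, here is a minimal sketch. It assumes the convention that one sample covers all channels (a stereo left/right pair counts as one sample); sampleCount and its parameters are hypothetical names, not part of SoundTouch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not a SoundTouch function.
// Converts a PCM buffer size in bytes into a sample count, assuming one
// "sample" holds one value for every channel.
inline std::size_t sampleCount(std::size_t bufferBytes,
                               unsigned channels,
                               unsigned bytesPerValue)
{
    assert(channels > 0 && bytesPerValue > 0);
    return bufferBytes / (static_cast<std::size_t>(channels) * bytesPerValue);
}
```

Under that assumption, a 4096-byte buffer of 16-bit stereo data holds 4096 / (2 * 2) = 1024 samples, and that is the value to pass as nSamples.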
// Adds 'numSamples' pcs of samples from the 'samples' memory position into
// the input of the object.
void SoundTouch::putSamples(const SAMPLETYPE *samples, uint nSamples)
{
    if (bSrateSet == FALSE)
    {
        throw std::runtime_error("SoundTouch : Sample rate not defined");
    }
    else if (channels == 0)
    {
        throw std::runtime_error("SoundTouch : Number of channels not defined");
    }
#ifndef PREVENT_CLICK_AT_RATE_CROSSOVER
    else if (rate <= 1.0f)
    {
        // transpose the rate down, output the transposed sound to tempo changer buffer
        assert(output == pTDStretch);
        pRateTransposer->putSamples(samples, nSamples);
        pTDStretch->moveSamples(*pRateTransposer);
    }
    else
#endif
    {
        // evaluate the tempo changer, then transpose the rate up,
        assert(output == pRateTransposer);
        pTDStretch->putSamples(samples, nSamples);
        pRateTransposer->moveSamples(*pTDStretch);
    }
}
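The first part can roughly be read as a check that the SoundTouch object was initialized properly (sample rate and channel count set). The part to focus on is: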
#ifndef PREVENT_CLICK_AT_RATE_CROSSOVER
    else if (rate <= 1.0f)
    {
        // transpose the rate down, output the transposed sound to tempo changer buffer
        assert(output == pTDStretch);
        pRateTransposer->putSamples(samples, nSamples);
        pTDStretch->moveSamples(*pRateTransposer);
    }
    else
#endif
    {
        // evaluate the tempo changer, then transpose the rate up,
        assert(output == pRateTransposer);
        pTDStretch->putSamples(samples, nSamples);
        pRateTransposer->moveSamples(*pTDStretch);
    }
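There is a preprocessor check here, #ifndef PREVENT_CLICK_AT_RATE_CROSSOVER. I am not sure for now what exactly it is for. Since nothing in the library defines this macro, the #ifndef block is always compiled in and both branches are active; presumably the author intended to use the macro but has not finished that part of the code. rate is a ratio computed by the SoundTouch member function calcEffectiveRateAndTempo introduced earlier: a value below 1 means slower playback, a value above 1 means faster playback, and exactly 1 leaves the speed unchanged, as the comments also hint. In the rate <= 1.0f case, pRateTransposer first calls its own member function putSamples. Here is that implementation: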
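The branching just described comes down to choosing which stage sees the input first. The sketch below only restates that decision; stageOrder is a hypothetical function written for illustration and does not exist in SoundTouch.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration, not library code.
// Returns the order in which the two processing stages run for a given
// effective rate, mirroring the if/else in SoundTouch::putSamples.
inline std::string stageOrder(float rate)
{
    assert(rate > 0.0f);
    if (rate <= 1.0f)
    {
        // Rate is transposed down first, then the result is moved to the tempo changer.
        return "RateTransposer -> TDStretch";
    }
    // The tempo changer runs first, then the rate is transposed up.
    return "TDStretch -> RateTransposer";
}
```

For example, stageOrder(0.8f) gives "RateTransposer -> TDStretch", while stageOrder(1.5f) gives "TDStretch -> RateTransposer".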
// Adds 'nSamples' pcs of samples from the 'samples' memory position into
// the input of the object.
void RateTransposer::putSamples(const SAMPLETYPE *samples, uint nSamples)
{
    processSamples(samples, nSamples);
}
It simply delegates to the member function processSamples. Next, the implementation of processSamples:
// Transposes sample rate by applying anti-alias filter to prevent folding.
// Returns amount of samples returned in the "dest" buffer.
// The maximum amount of samples that can be returned at a time is set by
// the 'set_returnBuffer_size' function.
void RateTransposer::processSamples(const SAMPLETYPE *src, uint nSamples)
{
    uint count;
    uint sizeReq;
    if (nSamples == 0) return;
    assert(pAAFilter);
    // If anti-alias filter is turned off, simply transpose without applying
    // the filter
    if (bUseAAFilter == FALSE)
    {
        sizeReq = (uint)((float)nSamples / fRate + 1.0f);
        count = transpose(outputBuffer.ptrEnd(sizeReq), src, nSamples);
        outputBuffer.putSamples(count);
        return;
    }
    // Transpose with anti-alias filter
    if (fRate < 1.0f)
    {
        upsample(src, nSamples);
    }
    else
    {
        downsample(src, nSamples);
    }
}
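In the unfiltered branch above, the output space is reserved with the expression nSamples / fRate + 1. A minimal sketch of that arithmetic follows; requiredOutputSize is a hypothetical name used only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper that mirrors the sizeReq expression in processSamples.
// Dividing by the rate estimates how many output samples the input produces,
// and the extra 1.0f leaves room for one more sample.
inline unsigned requiredOutputSize(unsigned nSamples, float fRate)
{
    assert(fRate > 0.0f);
    return (unsigned)((float)nSamples / fRate + 1.0f);
}
```

With 1000 input samples, a rate of 2.0 reserves 501 output samples, a rate of 0.5 reserves 2001, and a rate of 1.0 reserves 1001.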
The bUseAAFilter flag decides whether the anti-alias filter is applied. If the filter is not used, transpose is called directly; its implementation is as follows:
// Transposes the sample rate of the given samples using linear interpolation.
// Returns the number of samples returned in the "dest" buffer
inline uint RateTransposer::transpose(SAMPLETYPE *dest, const SAMPLETYPE *src, uint nSamples)
{
    if (numChannels == 2)
    {
        return transposeStereo(dest, src, nSamples);
    }
    else
    {
        return transposeMono(dest, src, nSamples);
    }
}
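Audio streams with different channel counts are handled differently. This already shows a limitation of the SoundTouch library as seen in this code: it can only process stereo or mono audio data. Taking mono as the example, the RateTransposer member function transposeMono(dest, src, nSamples) is called, yet in the RateTransposer class declaration it is declared as a pure virtual function:
    virtual uint transposeMono(SAMPLETYPE *dest,
                               const SAMPLETYPE *src,
                               uint numSamples) = 0;
Recalling the construction of SoundTouch described in an earlier chapter, the function is implemented in a subclass, RateTransposerXXX. There is a fixed-point version and a floating-point version; the floating-point one is used here because it is easier to follow.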
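The pattern at work is ordinary C++ virtual dispatch: the base class fixes the interface and a concrete subclass supplies the arithmetic. The sketch below illustrates only that pattern. Every name in it (Transposer, TransposerInteger, TransposerFloat, makeTransposer) is hypothetical, and choosing the subclass with a runtime flag is an assumption made to keep the example short; it is not a description of how SoundTouch selects its subclass.

```cpp
#include <memory>
#include <string>

// Hypothetical base class: declares the interface only.
class Transposer
{
public:
    virtual ~Transposer() {}
    // Pure virtual: every concrete subclass must provide its own version.
    virtual std::string arithmetic() const = 0;
};

// Hypothetical fixed-point flavour.
class TransposerInteger : public Transposer
{
public:
    std::string arithmetic() const override { return "integer"; }
};

// Hypothetical floating-point flavour.
class TransposerFloat : public Transposer
{
public:
    std::string arithmetic() const override { return "float"; }
};

// Hypothetical factory: callers hold a base-class pointer and never need to
// know which subclass they received.
inline std::unique_ptr<Transposer> makeTransposer(bool useFloat)
{
    if (useFloat)
    {
        return std::unique_ptr<Transposer>(new TransposerFloat());
    }
    return std::unique_ptr<Transposer>(new TransposerInteger());
}
```

A call such as makeTransposer(true)->arithmetic() reaches the floating-point subclass through the base-class interface, in the same way that transpose reaches a concrete transposeMono.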
// Transposes the sample rate of the given samples using linear interpolation.
// 'Mono' version of the routine. Returns the number of samples returned in
// the "dest" buffer
uint RateTransposerFloat::transposeMono(SAMPLETYPE *dest, const SAMPLETYPE *src, uint nSamples)
{
    unsigned int i, used;
    used = 0;
    i = 0;
    // Process the last sample saved from the previous call first...
    while (fSlopeCount <= 1.0f)
    {
        dest[i] = (SAMPLETYPE)((1.0f - fSlopeCount) * sPrevSampleL + fSlopeCount * src[0]);
        i++;
        fSlopeCount += fRate;
    }
    fSlopeCount -= 1.0f;
    if (nSamples > 1)
    {
        while (1)
        {
            while (fSlopeCount > 1.0f)
            {
                fSlopeCount -= 1.0f;
                used ++;
                if (used >= nSamples - 1) goto end;
            }
            dest[i] = (SAMPLETYPE)((1.0f - fSlopeCount) * src[used] + fSlopeCount * src[used + 1]);
            i++;
            fSlopeCount += fRate;
        }
    }
end:
    // Store the last sample for the next round
    sPrevSampleL = src[nSamples - 1];
    return i;
}
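The two member variables fSlopeCount and sPrevSampleL are set to 0 at initialization time through a call to the member function resetRegisters(). With that in mind, note the following code: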
    // Process the last sample saved from the previous call first...
    while (fSlopeCount <= 1.0f)
    {
        dest[i] = (SAMPLETYPE)((1.0f - fSlopeCount) * sPrevSampleL + fSlopeCount * src[0]);
        i++;
        fSlopeCount += fRate;
    }
    fSlopeCount -= 1.0f;
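Look at the form of dest[i] = (SAMPLETYPE)((1.0f - fSlopeCount) * sPrevSampleL + fSlopeCount * src[0]); it closely matches the well-known linear interpolation formula:
k = (y - y0) / (y1 - y0)
=> y = (1 - k) * y0 + k * y1
Here k = fSlopeCount, and fRate is the scaling factor, that is, the amount by which fSlopeCount advances for each output sample.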
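As a small worked example of the formula above, the sketch below evaluates y = (1 - k) * y0 + k * y1 directly. interpolate is a hypothetical helper written for illustration, not a SoundTouch function.

```cpp
#include <cassert>

// Hypothetical helper: linear interpolation between y0 and y1.
// k = 0 returns y0, k = 1 returns y1, values in between blend the two.
inline float interpolate(float y0, float y1, float k)
{
    assert(k >= 0.0f && k <= 1.0f);
    return (1.0f - k) * y0 + k * y1;
}
```

With y0 = 10, y1 = 20 and k = 0.25 the result is 0.75 * 10 + 0.25 * 20 = 12.5. In transposeMono the role of k is played by fSlopeCount, and y0 and y1 are two neighbouring input samples.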
    while (1)
    {
        while (fSlopeCount > 1.0f)
        {
            fSlopeCount -= 1.0f;
            used ++;
            if (used >= nSamples - 1) goto end;
        }
        dest[i] = (SAMPLETYPE)((1.0f - fSlopeCount) * src[used] + fSlopeCount * src[used + 1]);
        i++;
        fSlopeCount += fRate;
    }
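fSlopeCount is advanced by fRate after every output sample. Whenever it exceeds 1, 1 is subtracted and the input index used moves forward; while it stays at or below 1, the loop keeps interpolating between the same two input samples. In other words, the input is resampled in steps of fRate. This may sound abstract, so here is an example. If fRate equals 1 and fSlopeCount starts at 0, then once the first loop has run, fSlopeCount is exactly 1 at every interpolation step, so dest[i] always equals src[used + 1]. That is a plain copy, after which fSlopeCount rises to 2 and is brought back to 1 as used advances. If fRate is greater than 1, fSlopeCount += fRate always pushes the value above 1, so used++ runs one or more times per output sample and input samples are skipped, which amounts to fast-forwarding. If fRate is less than 1, several extra values between src[used] and src[used + 1] are computed. This is the core of how the RateTransposer class in the SoundTouch library resamples: linear interpolation.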
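To make the three cases concrete, here is a stand-alone sketch that follows the same loop structure as transposeMono but uses std::vector instead of raw buffers and a flag instead of goto. MonoLinearResampler and its members are hypothetical names; this is an illustration under those assumptions, not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical re-creation of the transposeMono loop for experimentation.
struct MonoLinearResampler
{
    float rate;               // plays the role of fRate
    float slopeCount = 0.0f;  // plays the role of fSlopeCount
    float prevSample = 0.0f;  // plays the role of sPrevSampleL

    explicit MonoLinearResampler(float r) : rate(r) { assert(r > 0.0f); }

    std::vector<float> process(const std::vector<float> &src)
    {
        std::vector<float> dest;
        if (src.empty()) return dest;

        // Interpolate between the sample saved from the previous call and src[0].
        while (slopeCount <= 1.0f)
        {
            dest.push_back((1.0f - slopeCount) * prevSample + slopeCount * src[0]);
            slopeCount += rate;
        }
        slopeCount -= 1.0f;

        std::size_t used = 0;
        const std::size_t last = src.size() - 1;
        bool done = (src.size() < 2);
        while (!done)
        {
            // Advance the input index once for every whole step accumulated.
            while (slopeCount > 1.0f)
            {
                slopeCount -= 1.0f;
                ++used;
                if (used >= last) { done = true; break; }
            }
            if (done) break;
            dest.push_back((1.0f - slopeCount) * src[used] + slopeCount * src[used + 1]);
            slopeCount += rate;
        }

        // Keep the last input sample for the next call.
        prevSample = src.back();
        return dest;
    }
};
```

On a fresh object, a rate of 1.0 turns the input 1, 2, 3, 4 into 0, 1, 2, 3, 4; a rate of 2.0 turns 1 through 8 into 0, 2, 4, 6, 8; and a rate of 0.5 turns 2, 4 into 0, 1, 2, 3, 4. The leading 0 in each case is the saved previous sample, which starts at zero.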