Many noise-reduction algorithms, such as the Wiener filter and MMSE estimators, depend on the a priori SNR.
The a priori SNR and the a posteriori SNR are defined as follows:
$$
\xi_{k}(n)=\frac{E\left\{A_{k}^{2}(n)\right\}}{\lambda_{d}(k, n)}\tag1
$$
$$
\gamma_k(n)=\frac{Y_{k}^{2}(n)}{\lambda_{d}(k, n)}\tag2
$$
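Here $A_{k}(n)$ is the magnitude spectrum of the target signal, $Y_{k}(n)$ is the magnitude spectrum of the noisy observation, and $\lambda_d(k,n)$ is the noise power spectrum, with $k$ the frequency bin and $n$ the frame index. For now, assume the noise is stationary and can be estimated during speech-free segments.
Assuming the noise is additive and uncorrelated with the speech, so that $E\{Y_k^2(n)\}=E\{A_k^2(n)\}+\lambda_d(k,n)$, $\xi_k$ can be written as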
$$
\begin{aligned}\xi_{k}(n)&=\frac{E\left\{A_{k}^{2}(n)\right\}}{\lambda_{d}(k, n)}\\&=\frac{E\left\{Y_{k}^{2}(n)-\lambda_d(k,n)\right\}}{\lambda_{d}(k, n)}\\&=\frac{E\left\{Y_{k}^{2}(n)\right\}}{\lambda_{d}(k, n)}-1\\&=E\left\{\gamma_{k}(n)-1\right\}\end{aligned}\tag3
$$
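When people say "SNR" without qualification, they usually mean the a priori SNR. The Wiener filter is simply a function of the a priori SNR: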
$$
W=\frac{\xi}{1+\xi}\tag4
$$
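As a minimal sketch of equation (4), the Wiener gain can be computed per frequency bin and applied to the noisy magnitude spectrum. The function names `wiener_gain` and `apply_wiener_gain` are made up for this illustration, and clamping negative SNR values to zero is an assumption of the sketch rather than part of the formula.

```c
#include <stddef.h>

/* Wiener gain W = xi / (1 + xi) for a single frequency bin.
 * Assumption of this sketch: a negative xi is clamped to 0. */
static float wiener_gain(float xi) {
    if (xi < 0.0f) {
        xi = 0.0f;
    }
    return xi / (1.0f + xi);
}

/* Apply the gain bin by bin to a noisy magnitude spectrum.
 * xi[k] is the a priori SNR of bin k. */
static void apply_wiener_gain(const float *noisy_mag, const float *xi,
                              float *clean_mag, size_t num_bins) {
    for (size_t k = 0; k < num_bins; k++) {
        clean_mag[k] = wiener_gain(xi[k]) * noisy_mag[k];
    }
}
```

The gain tends to 1 when the a priori SNR is high and to 0 when it is low, so bins dominated by noise are attenuated the most.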
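The a posteriori SNR is also called the instantaneous SNR. Spectral subtraction, for example, is written in terms of the a posteriori SNR: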
$$
\begin{aligned}A^2&=|Y|^2-\lambda_d\\&=\lambda_d(\gamma-1)\end{aligned}\tag5
$$
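Because spectral subtraction uses the instantaneous SNR directly, it is more prone to musical noise. Many improved variants exist to reduce this effect.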
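A minimal sketch of power spectral subtraction for one frame is shown below, assuming the noise power estimate is already available. The function name `spectral_subtract`, the small `eps` guard, and clamping negative results to zero (half-wave rectification) are choices made for this illustration, not taken from any particular implementation.

```c
#include <stddef.h>

/* Power spectral subtraction for one frame.
 *
 * noisy_pow[k] : |Y|^2, noisy power spectrum
 * noise_pow[k] : lambda_d, noise power estimate
 * clean_pow[k] : out, A^2 = |Y|^2 - lambda_d, clamped at 0
 * gamma[k]     : out, a posteriori SNR |Y|^2 / lambda_d
 */
static void spectral_subtract(const float *noisy_pow, const float *noise_pow,
                              float *clean_pow, float *gamma,
                              size_t num_bins) {
    const float eps = 1e-10f; /* guards against division by zero */
    for (size_t k = 0; k < num_bins; k++) {
        gamma[k] = noisy_pow[k] / (noise_pow[k] + eps);

        /* The instantaneous difference can go negative when a noisy
         * frame happens to fall below the noise estimate; clamp it. */
        float diff = noisy_pow[k] - noise_pow[k];
        clean_pow[k] = (diff > 0.0f) ? diff : 0.0f;
    }
}
```

Since each bin is decided from a single frame, isolated bins can survive the subtraction at random from frame to frame, which is what is heard as musical noise.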
Averaging expressions (1) and (3) with equal weights gives the following form:
$$
\xi_k(n)=E\left\{\frac{1}{2}\frac{A_{k}^{2}(n)}{\lambda_{d}(k, n)}+\frac{1}{2}\left[\gamma_k(n)-1\right]\right\}\tag6
$$
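There is still a pesky expectation operator $E$ here. The usual trick applies: replace the time average with first-order recursive smoothing, generalize the two equal weights to $\alpha$ and $1-\alpha$, and use the previous frame's estimate in place of the unknown $A_k(n)$. The resulting estimate of $\xi_k(n)$ is: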
$$
\hat{\xi}_k(n)=\alpha\frac{\hat{A}_k^2(n-1)}{\lambda_d(k,n-1)}+(1-\alpha)\max\left(\gamma_k(n)-1,0\right)\tag7
$$
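Here $\hat{A}_k^2(n-1)$ is the squared amplitude estimate from the previous frame.
Equation (7) is the well-known decision-directed (DD) formula, and it shows up in just about every corner of noise-suppression code.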
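Equation (7) can be sketched as a per-frame update that carries one state value per bin between frames. The version below is only an illustration under stated assumptions: it works on power spectra, uses the Wiener gain of equation (4) as the suppression rule to obtain the amplitude estimate, and the names `dd_wiener_frame`, `prev_snr_term`, and `eps` are hypothetical.

```c
#include <stddef.h>

/* One frame of decision-directed a priori SNR estimation, eq. (7),
 * followed by a Wiener gain, eq. (4).
 *
 * noisy_pow[k]     : |Y_k(n)|^2, power spectrum of the noisy frame
 * noise_pow[k]     : lambda_d(k, n), noise power estimate
 * prev_snr_term[k] : in:  A_hat_k^2(n-1) / lambda_d(k, n-1)
 *                         (all zeros before the first frame)
 *                    out: A_hat_k^2(n) / lambda_d(k, n), for the next call
 * alpha            : smoothing factor in [0, 1)
 * gain[k]          : out, Wiener gain for the current frame
 */
static void dd_wiener_frame(const float *noisy_pow, const float *noise_pow,
                            float *prev_snr_term, float alpha,
                            float *gain, size_t num_bins) {
    const float eps = 1e-10f; /* guards against division by zero */
    for (size_t k = 0; k < num_bins; k++) {
        /* max(gamma - 1, 0): rectified instantaneous SNR. */
        float inst = noisy_pow[k] / (noise_pow[k] + eps) - 1.0f;
        if (inst < 0.0f) {
            inst = 0.0f;
        }

        /* Eq. (7): blend the previous-frame term with the current one. */
        float xi_hat = alpha * prev_snr_term[k] + (1.0f - alpha) * inst;

        /* Eq. (4): Wiener gain from the estimated a priori SNR. */
        gain[k] = xi_hat / (1.0f + xi_hat);

        /* A_hat^2(n) = gain^2 * |Y|^2, stored normalized by the noise
         * power so that it is ready for the next frame. */
        float clean_pow = gain[k] * gain[k] * noisy_pow[k];
        prev_snr_term[k] = clean_pow / (noise_pow[k] + eps);
    }
}
```

With $\alpha$ close to 1 (values around 0.98 are common), the estimate follows the previous frame's decision much more than the noisy instantaneous term, which smooths the gain over time and helps suppress musical noise.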
For example, the source code of WebRTC's single-channel noise suppression contains the following routine:
// Compute prior and post SNR based on quantile noise estimation.
// Compute DD estimate of prior SNR.
// Inputs:
// * |magn| is the signal magnitude spectrum estimate.
// * |noise| is the magnitude noise spectrum estimate.
// Outputs:
// * |snrLocPrior| is the computed prior SNR.
// * |snrLocPost| is the computed post SNR.
static void ComputeSnr(const NoiseSuppressionC *self,
                       const float *magn,
                       const float *noise,
                       float *snrLocPrior, float *logSnrLocPrior,
                       float *snrLocPost) {
  size_t i;
  for (i = 0; i < self->magnLen; i++) {
    // Previous post SNR.
    // Previous estimate: based on previous frame with gain filter.
    float previousEstimateStsa = self->magnPrevAnalyze[i] /
        (self->noisePrev[i] + 0.0001f) * self->smooth[i];
    // Post SNR.
    snrLocPost[i] = 0.f;
    if (magn[i] > noise[i]) {
      snrLocPost[i] = magn[i] / (noise[i] + 0.0001f) - 1.f;
    }
    // DD estimate is sum of two terms: current estimate and previous estimate.
    // Directed decision update of snrPrior.
    snrLocPrior[i] = 2.f * (
        DD_PR_SNR * previousEstimateStsa + (1.f - DD_PR_SNR) * snrLocPost[i]);
    logSnrLocPrior[i] = logf(snrLocPrior[i] + 1.0f);
  }  // End of loop over frequencies.
}