Convolutions and Fast Fourier Transform

The Problem

Let $a = (a_0, a_1, \ldots, a_{n-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$ be two vectors of length $n$. Their convolution, written $a * b$, is a vector of length $2n-1$ whose coordinate $k$ is equal to
$$\sum_{(i, j):\, i+j=k \text{ and } i, j < n} a_i b_j$$
In other words,
$$a * b = (a_0b_0,\ a_0b_1+a_1b_0,\ a_0b_2+a_1b_1+a_2b_0,\ \ldots,\ a_{n-2}b_{n-1}+a_{n-1}b_{n-2},\ a_{n-1}b_{n-1})$$
How should we read this seemingly complicated expression? Picture an $n \times n$ table whose $(i, j)$ entry is the product $a_ib_j$; coordinate $k$ of the convolution is the sum of the entries along the diagonal where $i + j = k$.
[Figure 1: image not available]
Unlike the dot product and the vector sum, which require both vectors to have the same number of coordinates, convolution applies to two vectors of any lengths. If $a = (a_0, a_1, \ldots, a_{m-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$, then $a * b$ is a vector with $m+n-1$ coordinates, where coordinate $k$ equals
$$\sum_{(i, j):\, i+j=k,\ i<m,\ j<n} a_i b_j$$
We can picture this using the table of products $a_ib_j$ as before; the table is now rectangular, but we still compute coordinates by summing along the diagonals.
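The definition translates almost line for line into code. The following is a minimal sketch, not an optimized routine; the function name `convolve` is chosen here for illustration. Each product $a_ib_j$ from the table is added to the coordinate indexed by its diagonal $i + j$.

```python
def convolve(a, b):
    """Convolution of two sequences, computed directly from the definition."""
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        return []
    c = [0] * (m + n - 1)          # the result has m + n - 1 coordinates
    for i in range(m):
        for j in range(n):
            # Entry (i, j) of the product table lies on diagonal i + j.
            c[i + j] += a[i] * b[j]
    return c


print(convolve([1, 2, 3], [4, 5]))  # [4, 13, 22, 15]
```

The two nested loops visit every cell of the $m \times n$ table exactly once, so the running time is proportional to $mn$.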

Example of Convolution

Application 1: polynomial multiplication
Given two polynomials $A(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_{m-1}x^{m-1}$ and $B(x) = b_0 + b_1x + b_2 x^2 + \ldots + b_{n-1}x^{n-1}$, consider the polynomial $C(x) = A(x)B(x)$. In this polynomial $C(x)$, the coefficient on the $x^k$ term is equal to
$$c_k = \sum_{(i, j):\, i+j=k} a_i b_j$$
In other words, the coefficient vector $c$ of $C(x)$ is the convolution of the coefficient vectors of $A(x)$ and $B(x)$.
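As a small worked check, take $A(x) = 1 + 2x$ and $B(x) = 3 + 4x + 5x^2$ (both polynomials are chosen here purely for illustration). Collecting terms by power of $x$ gives

$$
\begin{aligned}
C(x) &= (1 + 2x)(3 + 4x + 5x^2) \\
     &= (1 \cdot 3) + (1 \cdot 4 + 2 \cdot 3)\,x + (1 \cdot 5 + 2 \cdot 4)\,x^2 + (2 \cdot 5)\,x^3 \\
     &= 3 + 10x + 13x^2 + 10x^3
\end{aligned}
$$

so the coefficient vector of $C$ is $(3, 10, 13, 10)$, which is the convolution of $(1, 2)$ and $(3, 4, 5)$ and has $2 + 3 - 1 = 4$ coordinates.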

Application 2: signal processing (convolution and Fourier analysis)

Application 3: combining histograms
Suppose we have the following two histograms:

  • the annual income of all the men in the population
  • the annual income of all the women.

We’d now like to produce a new histogram, showing for each $k$ the number of pairs $(M, W)$ for which man $M$ and woman $W$ have a combined income of $k$.

  • $a = (a_0, a_1, \ldots, a_{m-1})$ indicates that there are $a_i$ men with annual income equal to $i$.
  • $b = (b_0, b_1, \ldots, b_{n-1})$ indicates that there are $b_j$ women with annual income equal to $j$.

The number of pairs with combined income $k$ is then $\sum_{(i,j):\, i+j=k} a_ib_j$, so the desired histogram is exactly the convolution $a * b$.
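To make the counting concrete, here is a small sketch with made-up data (the income values, the counts, and the function name `combine_histograms` are all hypothetical). It builds the combined histogram from the two input histograms and cross-checks it against an explicit enumeration of every (man, woman) pair.

```python
from itertools import product


def combine_histograms(men, women):
    """pairs[k] = number of (man, woman) pairs whose incomes sum to k.

    men[i] is the number of men with income i; women[j] likewise.
    Both lists are assumed to be non-empty.
    """
    pairs = [0] * (len(men) + len(women) - 1)
    for i, count_m in enumerate(men):
        for j, count_w in enumerate(women):
            # Every man with income i pairs with every woman with income j.
            pairs[i + j] += count_m * count_w
    return pairs


# Hypothetical data: two men earn 0, one man earns 2;
# one woman earns 0, three women earn 1.
men = [2, 0, 1]
women = [1, 3]
combined = combine_histograms(men, women)

# Cross-check by listing each person individually and counting pairs.
men_incomes = [i for i, c in enumerate(men) for _ in range(c)]
women_incomes = [j for j, c in enumerate(women) for _ in range(c)]
brute = [0] * len(combined)
for m_inc, w_inc in product(men_incomes, women_incomes):
    brute[m_inc + w_inc] += 1

print(combined)  # [2, 6, 1, 3]
print(brute)     # [2, 6, 1, 3]
```

The counts sum to 12, the total number of pairs formed by 3 men and 4 women.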

Computing the Convolution

Now suppose $m = n$. Computing every coordinate $k$ directly from the definition, as the sum
$$\sum_{(i,j):\, i+j=k} a_ib_j$$
takes $O(n^2)$ arithmetic operations in total.

Could one design an algorithm that bypasses the quadratic-size definition of convolution and computes it in some smarter way?

  • Fast Fourier Transform (FFT): computes the convolution of two vectors using $O(n \log n)$ arithmetic operations.

Designing and Analyzing the Algorithm

Suppose we are given $a = (a_0, a_1, \ldots, a_{n-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$. Define:

  • $A(x) = a_0 + a_1x + a_2 x^2 + \ldots + a_{n-1}x^{n-1}$
  • $B(x) = b_0 + b_1x + b_2 x^2 + \ldots + b_{n-1}x^{n-1}$
  • $C(x) = A(x)B(x)$
  • $c = (c_0, c_1, \ldots, c_{2n-2})$, the coefficient vector of $C(x)$, which equals $a * b$

Rather than multiplying $A$ and $B$ symbolically, we can treat them as functions of the variable $x$ and multiply them as follows:

  1. Choose $2n$ values $x_1, x_2, \ldots, x_{2n}$ and evaluate $A(x_j)$ and $B(x_j)$ for each of $j = 1, 2, \ldots, 2n$.
  2. Compute $C(x_j)$ for each $j$ very easily: $C(x_j) = A(x_j) \cdot B(x_j) = \alpha_j\beta_j$, where $\alpha_j = A(x_j)$ and $\beta_j = B(x_j)$.
  3. Recover $C$ from its values on $x_1, x_2, \ldots, x_{2n}$. Here we take advantage of a fundamental fact about polynomials: any polynomial of degree $d$ can be reconstructed from its values on any set of $d + 1$ or more points. This is known as polynomial interpolation. Since $A$ and $B$ each have degree at most $n-1$, their product $C$ has degree at most $2n-2$, and so it can be reconstructed from the values $C(x_1), C(x_2), \ldots, C(x_{2n})$ that we computed in step (2).
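The three steps can be sketched in code even before choosing clever evaluation points. The version below is an illustration under stated assumptions, not the fast algorithm: it picks the integers $0, 1, \ldots, 2n-1$ as the $2n$ points (an arbitrary choice), uses exact `Fraction` arithmetic, and recovers the coefficients with plain Lagrange interpolation. All function names are hypothetical, and integer input coefficients are assumed.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


def times_linear(p, r):
    """Coefficients of p(x) * (x - r)."""
    out = [Fraction(0)] * (len(p) + 1)
    for i, c in enumerate(p):
        out[i + 1] += c      # contribution of c * x^i * x
        out[i] -= r * c      # contribution of c * x^i * (-r)
    return out


def interpolate(xs, ys):
    """Lagrange interpolation: coefficients of the unique polynomial of
    degree < len(xs) passing through every point (xs[k], ys[k])."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for k in range(n):
        basis, denom = [Fraction(1)], Fraction(1)
        for t in range(n):
            if t != k:
                basis = times_linear(basis, xs[t])
                denom *= xs[k] - xs[t]
        for i in range(n):
            coeffs[i] += ys[k] * basis[i] / denom
    return coeffs


def multiply_by_evaluation(a, b):
    """Multiply two integer-coefficient polynomials via evaluate/multiply/interpolate."""
    n = max(len(a), len(b))
    xs = [Fraction(x) for x in range(2 * n)]       # step 1: choose 2n points
    alpha = [evaluate(a, x) for x in xs]           # step 1: A(x_j)
    beta = [evaluate(b, x) for x in xs]            # step 1: B(x_j)
    gamma = [p * q for p, q in zip(alpha, beta)]   # step 2: C(x_j)
    c = interpolate(xs, gamma)                     # step 3: recover C
    return [int(v) for v in c[: len(a) + len(b) - 1]]


print(multiply_by_evaluation([1, 2], [3, 4, 5]))  # [3, 10, 13, 10]
```

With these arbitrary points, steps 1 and 3 are both quadratic or worse, so the sketch only demonstrates that the approach is correct, not that it is fast.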

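Step 2 clearly takes $O(n)$ time, but steps 1 and 3 do not look like $O(n)$. Done naively, step 1 takes $O(n^2)$ time: evaluating a polynomial of degree $n-1$ at a single point costs $O(n)$, and there are $2n$ points.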

Key idea: find a set of $2n$ values $x_1, x_2, \ldots, x_{2n}$ that are intimately related in some way, such that the work in evaluating $A$ and $B$ on all of them can be shared across different evaluations. A set with this property is the complex roots of unity.

The Complex Roots of Unity

Definition of the $k^{th}$ roots of unity:
For a positive integer $k$, the equation $x^k = 1$ has $k$ distinct complex roots, namely
$$\omega_{j, k} = e^{2\pi j i / k} \quad \text{for } j = 0, 1, 2, \ldots, k-1$$

For $k = 8$, the roots lie evenly spaced on the unit circle in the complex plane:
[Figure 2: the eight $8^{th}$ roots of unity in the complex plane; image not available]
For $x_1, x_2, \ldots, x_{2n}$ we choose the $(2n)^{th}$ roots of unity.
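The roots of unity are easy to generate numerically with Python's standard `cmath` module. The sketch below (the helper name `roots_of_unity` is chosen for illustration) checks two properties in floating point: each root satisfies $x^k = 1$, and squaring the $8^{th}$ roots yields the $4^{th}$ roots, each one appearing twice.

```python
import cmath


def roots_of_unity(k):
    """The k complex k-th roots of unity, e^(2*pi*i*j/k) for j = 0..k-1."""
    return [cmath.exp(2j * cmath.pi * j / k) for j in range(k)]


eighth = roots_of_unity(8)
fourth = roots_of_unity(4)

# Every 8th root w satisfies w^8 = 1, up to floating-point rounding.
max_error = max(abs(w ** 8 - 1) for w in eighth)

# Squaring the j-th 8th root gives the (j mod 4)-th 4th root.
squares_match = all(
    abs(eighth[j] ** 2 - fourth[j % 4]) < 1e-9 for j in range(8)
)

print(max_error < 1e-9, squares_match)  # True True
```

The squaring property is what lets the work of many evaluations be shared: a problem on $2n$ points collapses to a problem on only $n$ distinct squared points.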

The representation of a degree-$d$ polynomial $P$ by its values on the $(d+1)^{st}$ roots of unity is sometimes referred to as the discrete Fourier transform of $P$.
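Computed straight from this description, the transform is just "evaluate the polynomial at every root of unity". The following sketch does exactly that in quadratic time; the name `dft_naive` is hypothetical, and it is meant as a reference point rather than a practical routine.

```python
import cmath


def dft_naive(coeffs):
    """Values of the polynomial with the given coefficients (lowest degree
    first) at the len(coeffs)-th roots of unity, in O(n^2) operations."""
    n = len(coeffs)
    values = []
    for j in range(n):
        w = cmath.exp(2j * cmath.pi * j / n)      # the j-th n-th root of unity
        values.append(sum(c * w ** p for p, c in enumerate(coeffs)))
    return values


# P(x) = 1 + 2x + 3x^2 + 4x^3, evaluated at 1, i, -1, -i.
values = dft_naive([1, 2, 3, 4])
expected = [10, -2 - 2j, -2, -2 + 2j]
print(all(abs(v - e) < 1e-9 for v, e in zip(values, expected)))  # True
```

Note that the exponent here has a positive sign, following the definition of $\omega_{j,k}$ above; some libraries (for example `numpy.fft`) use the opposite sign in the forward transform, so results may differ by complex conjugation for real inputs.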

A Recursive Procedure for Polynomial Evaluation

$$\alpha_j = A(x_j) = A(\omega_{j, 2n})$$
$$\beta_j = B(x_j) = B(\omega_{j, 2n})$$

How can we break the evaluation of a polynomial into two equal-sized subproblems? Split it into its even-indexed and odd-indexed terms (assume $n$ is even; for the recursion to continue all the way down, $n$ should be a power of 2):
$$A_{even}(x) = a_0 + a_2x + a_4x^2 + \ldots + a_{n-2}x^{(n-2)/2}$$
$$A_{odd}(x) = a_1 + a_3x + a_5x^2 + \ldots + a_{n-1}x^{(n-2)/2}$$
Then
$$A(x) = A_{even}(x^2) + xA_{odd}(x^2)$$
Note:

  1. $A_{even}$ and $A_{odd}$ each have degree at most $(n-2)/2$, which is roughly half the degree of $A$.
  2. If $A_{even}(x^2)$ and $A_{odd}(x^2)$ are already known, computing $A(x)$ takes constant time.
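The even/odd identity is easy to confirm numerically. In this sketch (function names and the sample polynomial are chosen for illustration), `evaluate_split` computes $A(x)$ from the two half-size polynomials and is compared against ordinary evaluation.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def evaluate_split(coeffs, x):
    """Evaluate A(x) as A_even(x^2) + x * A_odd(x^2)."""
    even = coeffs[0::2]   # a_0, a_2, a_4, ...
    odd = coeffs[1::2]    # a_1, a_3, a_5, ...
    x_squared = x * x
    return evaluate(even, x_squared) + x * evaluate(odd, x_squared)


a = [5, 1, 4, 2, 3, 7]    # hypothetical coefficients, n = 6
checks = [evaluate(a, x) == evaluate_split(a, x) for x in (-2, 0, 1, 3)]
print(checks)  # [True, True, True, True]
```

Once the two half-size values are available, the final combination costs one multiplication and one addition, which is the constant-time step noted above.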

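Squaring a $(2n)^{th}$ root of unity gives an $n^{th}$ root of unity, so evaluating $A$ on the $(2n)^{th}$ roots of unity reduces to evaluating $A_{even}$ and $A_{odd}$ on the $n^{th}$ roots of unity, a problem of half the size. Thus, we can perform these evaluations in time $T(n/2)$ for each of $A_{even}$ and $A_{odd}$, for a total time of $2T(n/2)$. Combining the results takes $O(n)$ further operations, which gives the recurrence $T(n) \le 2T(n/2) + O(n)$ and hence $T(n) = O(n \log n)$.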
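The recursive procedure can be sketched as follows. This is a minimal illustration, assuming the number of coefficients is a power of 2; the function name `evaluate_at_roots` is hypothetical. It evaluates a polynomial at the roots of unity whose order equals the length of the coefficient list, so to evaluate a polynomial with $n$ coefficients at the $(2n)^{th}$ roots of unity, pad the list with $n$ zeros first.

```python
import cmath


def evaluate_at_roots(coeffs):
    """Evaluate the polynomial at the len(coeffs)-th roots of unity.

    Returns values[j] = A(e^(2*pi*i*j/n)) where n = len(coeffs).
    n is assumed to be a power of 2.
    """
    n = len(coeffs)
    if n == 1:
        return [complex(coeffs[0])]       # a constant polynomial
    if n % 2 != 0:
        raise ValueError("length must be a power of 2")
    # Two half-size subproblems, each on the (n/2)-th roots of unity.
    even = evaluate_at_roots(coeffs[0::2])
    odd = evaluate_at_roots(coeffs[1::2])
    half = n // 2
    values = [0j] * n
    for j in range(n):
        w = cmath.exp(2j * cmath.pi * j / n)
        # w squared is the (j mod n/2)-th (n/2)-th root of unity, so
        # A(w) = A_even(w^2) + w * A_odd(w^2) reuses the recursive results.
        values[j] = even[j % half] + w * odd[j % half]
    return values


# A(x) = 1 + 2x + 3x^2 + 4x^3 at the 4th roots of unity 1, i, -1, -i.
values = evaluate_at_roots([1, 2, 3, 4])
expected = [10, -2 - 2j, -2, -2 + 2j]
print(all(abs(v - e) < 1e-9 for v, e in zip(values, expected)))  # True

# The same polynomial at the 8th roots of unity: pad with zeros.
padded_values = evaluate_at_roots([1, 2, 3, 4, 0, 0, 0, 0])
```

Each level of the recursion does work proportional to the number of points, and the problem size halves at every level, which matches the recurrence $T(n) \le 2T(n/2) + O(n)$.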
