Given two vectors $a = (a_0, a_1, \ldots, a_{n-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$, the convolution of two vectors of length $n$ is a vector of length $2n-1$. Define the convolution $a * b$ as the vector whose coordinate $k$ is equal to
$$\sum_{(i, j):\, i+j=k \text{ and } i, j < n} a_i b_j$$
In other words,
$$a * b = (a_0 b_0,\; a_0 b_1 + a_1 b_0,\; a_0 b_2 + a_1 b_1 + a_2 b_0,\; \ldots,\; a_{n-2} b_{n-1} + a_{n-1} b_{n-2},\; a_{n-1} b_{n-1})$$
How should we make sense of this seemingly complicated expression?
Unlike the dot product and the vector sum, which require the two vectors to have the same number of coordinates, convolution can be applied to two vectors of any lengths. Suppose $a = (a_0, a_1, \ldots, a_{m-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$. Then $a * b$ is a vector with $m+n-1$ coordinates, where coordinate $k$ is equal to
$$\sum_{(i, j):\, i+j=k,\; i < m,\; j < n} a_i b_j$$
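The definition can be turned into code almost literally. The following is a minimal sketch, assuming the vectors are plain Python lists; the function name `convolve` is chosen here for illustration only.

```python
def convolve(a, b):
    """Direct convolution from the definition: c[k] sums a[i] * b[j] over i + j = k."""
    m, n = len(a), len(b)
    c = [0] * (m + n - 1)          # the result has m + n - 1 coordinates
    for i in range(m):
        for j in range(n):
            c[i + j] += a[i] * b[j]
    return c


print(convolve([1, 2, 3], [4, 5]))  # [4, 13, 22, 15]
```

The two nested loops perform one multiplication per pair $(i, j)$, so this sketch uses $m \cdot n$ multiplications.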
Application 1: polynomial multiplication
Given two polynomials $A(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{m-1} x^{m-1}$ and $B(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_{n-1} x^{n-1}$, consider the polynomial $C(x) = A(x)B(x)$. In this polynomial $C(x)$, the coefficient on the $x^k$ term is equal to
$$c_k = \sum_{(i, j):\, i+j=k} a_i b_j$$
In other words, the coefficient vector $c$ of $C(x)$ is the convolution of the coefficient vectors of $A(x)$ and $B(x)$.
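As a small worked example (the polynomials are chosen here only for illustration), take $A(x) = 1 + 2x$ and $B(x) = 3 + 4x + 5x^2$, so that $a = (1, 2)$ and $b = (3, 4, 5)$:

$$A(x)B(x) = (1 \cdot 3) + (1 \cdot 4 + 2 \cdot 3)\,x + (1 \cdot 5 + 2 \cdot 4)\,x^2 + (2 \cdot 5)\,x^3 = 3 + 10x + 13x^2 + 10x^3$$

The coefficients $(3, 10, 13, 10)$ are exactly the coordinates of $a * b$, a vector of length $m + n - 1 = 4$.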
Application 2: signal processing (convolution and Fourier analysis)
Application 3: combining histograms
Suppose we have the following two histograms:
We'd now like to produce a new histogram, showing for each $k$ the number of pairs $(M, W)$ for which man $M$ and woman $W$ have a combined income of $k$.
Now suppose $m = n$. For each $k$, we compute the sum
$$\sum_{(i, j):\, i+j=k} a_i b_j$$
Computing all $2n-1$ coordinates this way takes $O(n^2)$ arithmetic operations.
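To make the histogram application concrete, here is a minimal sketch with made-up data. It assumes `men[i]` is the number of men with income `i` and `women[j]` is the number of women with income `j` (in some fixed unit); both lists and the name `combine_histograms` are hypothetical.

```python
def combine_histograms(a, b):
    """Count, for each total k, the pairs whose incomes i and j satisfy i + j = k."""
    combined = [0] * (len(a) + len(b) - 1)
    for i, count_a in enumerate(a):
        for j, count_b in enumerate(b):
            # count_a * count_b pairs have a combined income of i + j
            combined[i + j] += count_a * count_b
    return combined


men = [2, 0, 1]    # hypothetical: 2 men with income 0, none with 1, 1 with 2
women = [1, 3]     # hypothetical: 1 woman with income 0, 3 with income 1

pairs = combine_histograms(men, women)
print(pairs)       # [2, 6, 1, 3]
```

The entries sum to 12, the total number of (man, woman) pairs for 3 men and 4 women, which is a quick sanity check on the result.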
Could one design an algorithm that bypasses the quadratic-size definition of convolution and computes it in some smarter way?
Suppose we are given $a = (a_0, a_1, \ldots, a_{n-1})$ and $b = (b_0, b_1, \ldots, b_{n-1})$.
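View $a$ and $b$ as the coefficient vectors of polynomials $A(x)$ and $B(x)$. Rather than multiplying $A$ and $B$ symbolically, we can treat them as functions of the variable $x$ and multiply them as follows:
1. Choose $2n$ values $x_1, x_2, \ldots, x_{2n}$ and evaluate $A(x_j)$ and $B(x_j)$ for each $j$.
2. Compute $C(x_j) = A(x_j)B(x_j)$ for each $j$.
3. Recover the coefficients of $C$ from its values at $x_1, x_2, \ldots, x_{2n}$.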
Step 2 clearly takes $O(n)$ time, but steps 1 and 3 do not look like $O(n)$; done naively, step 1 takes $O(n^2)$ time.
Key idea: find a set of $2n$ values $x_1, x_2, \ldots, x_{2n}$ that are intimately related in some way, such that the work in evaluating $A$ and $B$ on all of them can be shared across different evaluations. A set that satisfies this: the complex roots of unity.
Definition of the $k^{\text{th}}$ roots of unity:
For a positive integer $k$, the equation $x^k = 1$ has $k$ distinct complex roots, namely
$$\omega_{j, k} = e^{2\pi j i / k} \quad \text{for } j = 0, 1, 2, \ldots, k-1$$
For $k = 8$, the roots are eight equally spaced points on the unit circle in the complex plane.
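The roots for $k = 8$ can also be listed numerically. The snippet below is a small sketch using Python's standard `cmath` module; the helper name `roots_of_unity` is an assumption made for this example.

```python
import cmath
import math


def roots_of_unity(k):
    """Return the k complex roots of x**k = 1, i.e. e^(2*pi*j*i/k) for j = 0..k-1."""
    return [cmath.exp(complex(0, 2 * math.pi * j / k)) for j in range(k)]


for j, w in enumerate(roots_of_unity(8)):
    # Each root has absolute value 1 and satisfies w**8 == 1 up to rounding error.
    print(j, f"{w.real:+.3f}{w.imag:+.3f}i", abs(w ** 8 - 1) < 1e-9)
```

Consecutive roots differ by a rotation of $2\pi/8$ radians, which is why they are evenly spaced on the unit circle.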
We choose $x_1, x_2, \ldots, x_{2n}$ to be the $(2n)^{\text{th}}$ roots of unity.
The representation of a degree-$d$ polynomial $P$ by its values on the $(d+1)^{\text{st}}$ roots of unity is sometimes referred to as the discrete Fourier transform of $P$.
$$\alpha_j = A(x_j) = A(\omega_{j, 2n})$$
$$\beta_j = B(x_j) = B(\omega_{j, 2n})$$
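Before any speed-up, the values $\alpha_j$ and $\beta_j$ can be computed one root at a time. The sketch below does this naively (quadratic time overall) and forms the pointwise products $\alpha_j \beta_j$, which are the values of $C$ at the same roots. The coefficient lists and the helper names `evaluate` and `values_at_roots` are hypothetical.

```python
import cmath
import math


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def values_at_roots(coeffs, count):
    """Evaluate the polynomial at every count-th root of unity, one root at a time."""
    return [evaluate(coeffs, cmath.exp(complex(0, 2 * math.pi * j / count)))
            for j in range(count)]


a = [1, 2, 3, 4]   # hypothetical coefficients of A, so n = 4
b = [5, 6, 7, 8]   # hypothetical coefficients of B
n = len(a)

alpha = values_at_roots(a, 2 * n)               # alpha_j = A(omega_{j,2n})
beta = values_at_roots(b, 2 * n)                # beta_j  = B(omega_{j,2n})
gamma = [x * y for x, y in zip(alpha, beta)]    # C(omega_{j,2n}) = alpha_j * beta_j
```

Each call to `evaluate` costs $O(n)$ and there are $2n$ roots, so this version still takes $O(n^2)$ time; the point of choosing roots of unity is that the evaluations can share work.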
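How can we break the evaluation of a polynomial into two equal-sized subproblems? Split it into its even-indexed and odd-indexed terms (assuming $n$ is even; for the split to be applied recursively, $n$ should be a power of 2).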
$$A_{even}(x) = a_0 + a_2 x + a_4 x^2 + \cdots + a_{n-2} x^{(n-2)/2}$$
$$A_{odd}(x) = a_1 + a_3 x + a_5 x^2 + \cdots + a_{n-1} x^{(n-2)/2}$$
Then,
$$A(x) = A_{even}(x^2) + x A_{odd}(x^2)$$
Note: if $x$ is a $(2n)^{\text{th}}$ root of unity, then $x^2$ is an $n^{\text{th}}$ root of unity, since $(x^2)^n = x^{2n} = 1$. So $A_{even}$ and $A_{odd}$, each with $n/2$ coefficients, only need to be evaluated at the $n^{\text{th}}$ roots of unity.
Thus, if $T(n)$ denotes the time needed to evaluate a polynomial with $n$ coefficients at all the $(2n)^{\text{th}}$ roots of unity, we can perform these evaluations in time $T(n/2)$ for each of $A_{even}$ and $A_{odd}$, for a total time of $2T(n/2)$.
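The even/odd split translates into a recursive procedure. The following is a minimal sketch, assuming the number of coefficients is a power of 2 and the number of roots `m` is a power of 2 that is at least as large; the function name `evaluate_at_roots` is made up for illustration.

```python
import cmath
import math


def evaluate_at_roots(a, m):
    """Evaluate A(x) = sum(a[i] * x**i) at all m-th roots of unity.

    Assumes len(a) and m are powers of 2 with m >= len(a).
    Returns a list whose entry j is A(e^(2*pi*j*i/m)).
    """
    if len(a) == 1:
        return [complex(a[0])] * m        # a constant polynomial has the same value everywhere
    half = m // 2
    even = evaluate_at_roots(a[0::2], half)   # A_even at the (m/2)-th roots of unity
    odd = evaluate_at_roots(a[1::2], half)    # A_odd at the (m/2)-th roots of unity
    values = []
    for j in range(m):
        w = cmath.exp(complex(0, 2 * math.pi * j / m))
        # w**2 is the (j mod half)-th of the (m/2)-th roots of unity, so
        # A(w) = A_even(w**2) + w * A_odd(w**2) reuses the recursive results.
        values.append(even[j % half] + w * odd[j % half])
    return values


coeffs = [1, 2, 3, 4]                          # hypothetical coefficients, n = 4
values = evaluate_at_roots(coeffs, 2 * len(coeffs))
```

Beyond the two recursive calls, combining the results takes a constant amount of work for each of the `m` roots.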