Singular Value Decomposition (SVD)

The topic of this article, the singular value decomposition, is one that should be a part of the standard mathematics undergraduate curriculum but all too often slips through the cracks. Besides being rather intuitive, these decompositions are incredibly useful. For instance, Netflix, the online movie rental company, is currently offering a $1 million prize to anyone who can improve the accuracy of its movie recommendation system by 10%. Surprisingly, this seemingly modest problem turns out to be quite challenging, and the groups involved are now using rather sophisticated techniques. At the heart of all of them is the singular value decomposition.

A singular value decomposition provides a convenient way of breaking a matrix, which perhaps contains some data we are interested in, into simpler, meaningful pieces. In this article, we will offer a geometric explanation of singular value decompositions and look at some of their applications.

The geometry of linear transformations
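Let us begin by looking at some simple matrices, namely those with two rows and two columns. Our first example is the diagonal matrix

$$ M = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}. $$

Geometrically, we may think of a matrix like this as taking a point (x, y) in the plane and transforming it into another point using matrix multiplication:

$$ \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x \\ y \end{pmatrix}. $$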

The effect of this transformation is shown below: the plane is horizontally stretched by a factor of 3, while there is no vertical change.


[Figures 1 and 2]
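
To see this transformation numerically, here is a minimal NumPy sketch. It assumes the diagonal matrix with diagonal entries 3 and 1, which matches the stretch just described. The sample points and the names `M`, `points`, and `images` are illustrative choices, not part of the original article.

```python
import numpy as np

# Assumed example: the diagonal matrix that stretches horizontally by 3
# and leaves the vertical direction unchanged.
M = np.array([[3.0, 0.0],
              [0.0, 1.0]])

# Each column is one point (x, y) in the plane (arbitrary sample points).
points = np.array([[1.0, 0.0, 1.0, -2.0],
                   [0.0, 1.0, 1.0,  0.5]])

# Matrix multiplication transforms every column at once.
images = M @ points

for (x, y), (x_new, y_new) in zip(points.T, images.T):
    print(f"({x}, {y}) -> ({x_new}, {y_new})")
```

Every x-coordinate is multiplied by 3 while every y-coordinate is left alone, which is the horizontal stretch described in the text.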

Now let's look at

$$ M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, $$

which produces this effect

[Figures 3 and 4]

It is not so clear how to describe simply the geometric effect of the transformation. However, let's rotate our grid through a 45 degree angle and see what happens.


[Figures 5 and 6]

Ah ha. We see now that this new grid is transformed in the same way that the original grid was transformed by the diagonal matrix: the grid is stretched by a factor of 3 in one direction.




This is a very special situation that results from the fact that the matrix M is symmetric; that is, the transpose of M, the matrix obtained by flipping the entries about the diagonal, is equal to M. If we have a symmetric 2 × 2 matrix, it turns out that we may always rotate the grid in the domain so that the matrix acts by stretching and perhaps reflecting in the two directions. In other words, symmetric matrices behave like diagonal matrices.

Said with more mathematical precision, given a symmetric matrix M, we may find a set of orthogonal vectors vi so that Mvi is a scalar multiple of vi; that is

$$ M v_i = \lambda_i v_i $$

where λi is a scalar. Geometrically, this means that the vectors vi are simply stretched and/or reflected when multiplied by M. Because of this property, we call the vectors vi eigenvectors of M; the scalars λi are called eigenvalues. An important fact, which is easily verified, is that eigenvectors of a symmetric matrix corresponding to different eigenvalues are orthogonal.
If we use the eigenvectors of a symmetric matrix to align the grid, the matrix stretches and reflects the grid in the same way that it does the eigenvectors.
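
The eigenvector relation can be checked numerically. The sketch below assumes the symmetric matrix with rows (2, 1) and (1, 2), whose behaviour matches the 45 degree example above, and uses NumPy's `numpy.linalg.eigh`, the eigen-solver intended for symmetric matrices. The variable names are illustrative.

```python
import numpy as np

# Assumed example: a symmetric 2 x 2 matrix.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns the eigenvalues in ascending order and the matching
# unit eigenvectors as the columns of the second return value.
eigenvalues, eigenvectors = np.linalg.eigh(M)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Check the defining relation M v = lambda v for each pair.
    print("lambda =", lam, "v =", v, "M v == lambda v:", np.allclose(M @ v, lam * v))

# Eigenvectors belonging to different eigenvalues should be orthogonal.
v1, v2 = eigenvectors.T
print("v1 . v2 =", v1 @ v2)
```

For this matrix the eigenvalues are 1 and 3, and the eigenvectors lie along the two diagonals of the plane, which is why rotating the grid by 45 degrees made the action look like a simple stretch.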


The geometric description we gave for this linear transformation is a simple one: the grid is simply stretched in one direction. For more general matrices, we will ask if we can find an orthogonal grid that is transformed into another orthogonal grid. Let's consider a final example using a matrix that is not symmetric:

$$ M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. $$

This matrix produces the geometric effect known as a shear.


[Figures 7 and 8]

It's easy to find one family of eigenvectors along the horizontal axis. However, our figure above shows that these eigenvectors cannot be used to create an orthogonal grid that is transformed into another orthogonal grid. Nonetheless, let's see what happens when we rotate the grid first by 30 degrees,


[Figures 9 and 10]

Notice that the angle at the origin formed by the red parallelogram on the right has increased. Let's next rotate the grid by 60 degrees.



[Figures 11 and 12]

Hmm. It appears that the grid on the right is now almost orthogonal. In fact, when we rotate the grid in the domain by an angle of roughly 58.28 degrees, both grids are orthogonal.

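Note: in other words, the original coordinate axes are first rotated by 58.28 degrees, and the rotated grid is then multiplied on the left by the matrix.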


[Figure 13]
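
The rotation experiment can be repeated numerically. The following sketch assumes the shear matrix with rows (1, 1) and (0, 1), rotates the standard grid directions by a given angle, and measures the angle between their images. The helper `image_angle` and the scan over a grid of angles are illustrative choices, not the article's method.

```python
import numpy as np

# Assumed shear matrix.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

def image_angle(theta_deg):
    """Angle in degrees between M v1 and M v2, where (v1, v2) is the
    standard basis rotated counterclockwise by theta_deg."""
    t = np.radians(theta_deg)
    v1 = np.array([np.cos(t), np.sin(t)])
    v2 = np.array([-np.sin(t), np.cos(t)])
    a, b = M @ v1, M @ v2
    cos_angle = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(cos_angle))

# The three rotations tried in the text.
for theta in (0.0, 30.0, 60.0):
    print(f"rotation {theta:5.1f} degrees -> image angle {image_angle(theta):.2f} degrees")

# Scan rotations between 0 and 90 degrees (step 0.01) for the one whose
# image grid is closest to a right angle.
thetas = np.linspace(0.0, 90.0, 9001)
best = thetas[np.argmin([abs(image_angle(t) - 90.0) for t in thetas])]
print("rotation giving an orthogonal image grid:", best)
```

The scan is a brute-force search chosen for readability; it lands on the angle of roughly 58.28 degrees quoted above.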


The singular value decomposition

This is the geometric essence of the singular value decomposition for 2 × 2 matrices: for any 2 × 2 matrix, we may find an orthogonal grid that is transformed into another orthogonal grid.

We will express this fact using vectors: with an appropriate choice of orthogonal unit vectors v1 and v2, the vectors Mv1 and Mv2 are orthogonal.



[Figures 14 and 15]

We will use u1 and u2 to denote unit vectors in the direction of Mv1 and Mv2. The lengths of Mv1 and Mv2--denoted by σ1 and σ2--describe the amount that the grid is stretched in those particular directions. These numbers are called the singular values of M. (In this case, the singular values are the golden ratio and its reciprocal, but that is not so important here.)


[Figure 16]
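
The remark about the golden ratio is easy to confirm. This sketch assumes the shear matrix with rows (1, 1) and (0, 1) and asks NumPy's `numpy.linalg.svd` for the singular values only.

```python
import numpy as np

# Assumed shear matrix.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# With compute_uv=False only the singular values are returned,
# sorted from largest to smallest.
sigma = np.linalg.svd(M, compute_uv=False)

golden_ratio = (1.0 + np.sqrt(5.0)) / 2.0
print("singular values:           ", sigma)
print("golden ratio and reciprocal:", golden_ratio, 1.0 / golden_ratio)
```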

We therefore have

$$ M v_1 = \sigma_1 u_1 $$

$$ M v_2 = \sigma_2 u_2 $$

We may now give a simple description for how the matrix M treats a general vector x. Since the vectors v1 and v2 are orthogonal unit vectors, we have

$$ x = (v_1 \cdot x)\, v_1 + (v_2 \cdot x)\, v_2. $$

For example, if v1 · x = 2 and v2 · x = 3, then x = 2 v1 + 3 v2, which is the same as x = (v1 · x) v1 + (v2 · x) v2.

This means that

$$ M x = (v_1 \cdot x)\, M v_1 + (v_2 \cdot x)\, M v_2 $$

$$ M x = (v_1 \cdot x)\, \sigma_1 u_1 + (v_2 \cdot x)\, \sigma_2 u_2. $$

Remember that the dot product may be computed using the vector transpose

$$ v \cdot x = v^T x, $$

which leads to

$$ M x = u_1 \sigma_1 v_1^T x + u_2 \sigma_2 v_2^T x $$

$$ M = u_1 \sigma_1 v_1^T + u_2 \sigma_2 v_2^T. $$

This is usually expressed by writing

$$ M = U \Sigma V^T $$

where U is a matrix whose columns are the vectors u1 and u2, Σ is a diagonal matrix whose entries are σ1 and σ2, and V is a matrix whose columns are v1 and v2. The superscript T on the matrix V denotes the matrix transpose of V.

This shows how to decompose the matrix M into the product of three matrices: V describes an orthonormal basis in the domain, U describes an orthonormal basis in the co-domain, and Σ describes how much the vectors in V are stretched to give the vectors in U.
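
The three ways of writing the decomposition used in this section can be compared side by side. The sketch below is a numerical check under the assumption that M is the shear matrix with rows (1, 1) and (0, 1). It relies on `numpy.linalg.svd`, which returns U, the singular values, and V already transposed. The test vector `x` is an arbitrary choice.

```python
import numpy as np

# Assumed shear matrix.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# NumPy returns U, the singular values as a 1-D array, and V transposed.
U, sigma, Vt = np.linalg.svd(M)
Sigma = np.diag(sigma)

# Product form: M = U Sigma V^T.
product = U @ Sigma @ Vt

# Sum form: M = u1 sigma1 v1^T + u2 sigma2 v2^T.
# Column i of U is u_i; row i of Vt is v_i.
pieces = sum(sigma[i] * np.outer(U[:, i], Vt[i, :]) for i in range(2))

# Action on a general vector: M x = sum of (v_i . x) sigma_i u_i.
x = np.array([2.0, -1.0])
Mx = sum((Vt[i] @ x) * sigma[i] * U[:, i] for i in range(2))

print("U Sigma V^T equals M:       ", np.allclose(product, M))
print("sum of rank-one pieces is M:", np.allclose(pieces, M))
print("expansion of M x matches:   ", np.allclose(Mx, M @ x))
print("U, V have orthonormal columns:",
      np.allclose(U.T @ U, np.eye(2)), np.allclose(Vt @ Vt.T, np.eye(2)))
```

The signs of the vectors returned by a library routine may differ from those in a hand computation; flipping the sign of a pair ui, vi leaves the product unchanged.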

How do we find the singular value decomposition?

The power of the singular value decomposition lies in the fact that we may find it for any matrix. How do we do it? Let's look at our earlier example and add the unit circle in the domain. Its image will be an ellipse whose major and minor axes define the orthogonal grid in the co-domain.


[Figures 17 and 18]

Notice that the major and minor axes are defined by Mv1 and Mv2. These vectors therefore are the longest and shortest vectors among all the images of vectors on the unit circle.


[Figures 19 and 20]
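
The picture of the unit circle and its image can be explored numerically. This sketch assumes the shear matrix with rows (1, 1) and (0, 1), samples the unit circle on a grid of angles, and records the length of each image vector. The sampling resolution is an arbitrary choice.

```python
import numpy as np

# Assumed shear matrix.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Sample half of the unit circle; the other half gives the same
# lengths because |M(-x)| = |Mx|.
angles = np.linspace(0.0, 180.0, 18001)          # degrees, step 0.01
radians = np.radians(angles)
circle = np.stack([np.cos(radians), np.sin(radians)])  # one point per column

# Length of the image of each sampled point.
lengths = np.linalg.norm(M @ circle, axis=0)

i_max, i_min = np.argmax(lengths), np.argmin(lengths)
print("longest image: angle", angles[i_max], "length", lengths[i_max])
print("shortest image: angle", angles[i_min], "length", lengths[i_min])
```

The longest image occurs near 58.28 degrees and the shortest about 90 degrees further around the circle, so the two extreme directions are orthogonal, as the figures suggest.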

In other words, the function |Mx| on the unit circle has a maximum at v1 and a minimum at v2. This reduces the problem to a rather standard calculus problem in which we wish to optimize a function over the unit circle. It turns out that the critical points of this function occur at the eigenvectors of the matrix MᵀM (maximizing |Mx|² = xᵀMᵀMx subject to |x| = 1 leads, via a Lagrange multiplier, to MᵀMx = λx). Since this matrix is symmetric, eigenvectors corresponding to different eigenvalues will be orthogonal. This gives the family of vectors vi.

The singular values are then given by σi = |Mvi|, and the vectors ui are obtained as unit vectors in the direction of Mvi. But why are the vectors ui orthogonal?

To explain this, we will assume that σi and σj are distinct singular values. We have

$$ M v_i = \sigma_i u_i $$

$$ M v_j = \sigma_j u_j. $$

Let's begin by looking at the expression Mvi · Mvj and assuming, for convenience, that the singular values are non-zero. On one hand, this expression is zero since the vectors vi, which are eigenvectors of the symmetric matrix MᵀM, are orthogonal to one another:

$$ M v_i \cdot M v_j = v_i^T M^T M v_j = v_i \cdot M^T M v_j = \lambda_j \, v_i \cdot v_j = 0. $$

On the other hand, we have

$$ M v_i \cdot M v_j = \sigma_i \sigma_j \, u_i \cdot u_j = 0. $$

Since the singular values are non-zero, this forces ui · uj = 0. Therefore, ui and uj are orthogonal, so we have found an orthogonal set of vectors vi that is transformed into another orthogonal set ui. The singular values describe the amount of stretching in the different directions.
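
The construction described in this section can be followed step by step for a small example. The sketch below is an illustration only, assuming the shear matrix with rows (1, 1) and (0, 1) and assuming all singular values are non-zero. It takes the vectors vi from the eigenvectors of MᵀM using `numpy.linalg.eigh`, then builds σi and ui from Mvi.

```python
import numpy as np

# Assumed shear matrix.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Eigen-decomposition of the symmetric matrix M^T M.
# The columns of V are the candidate vectors v_i.
eigenvalues, V = np.linalg.eigh(M.T @ M)

# eigh sorts in ascending order; reorder so the direction of
# maximum stretch comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, V = eigenvalues[order], V[:, order]

# sigma_i = |M v_i|, and u_i is the unit vector along M v_i.
MV = M @ V
sigma = np.linalg.norm(MV, axis=0)
U = MV / sigma          # divides column i by sigma_i; needs sigma_i != 0

print("singular values:", sigma)
print("u1 . u2 =", U[:, 0] @ U[:, 1])
print("U Sigma V^T reproduces M:", np.allclose(U @ np.diag(sigma) @ V.T, M))
```

The dot product of the two columns of U comes out as zero up to rounding, which is the orthogonality argued for above.
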
In practice, this is not the procedure used to find the singular value decomposition of a matrix since it is not particularly efficient or well-behaved numerically.
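
One commonly cited reason for the numerical difficulty, added here as an illustration and not spelled out in the article, is that forming MᵀM squares the condition number of M, so small singular values can be lost to rounding before any eigenvalues are computed. The sketch below uses a hypothetical 3 × 2 matrix chosen to expose this and compares the eigenvalue route with `numpy.linalg.svd`, which works on M directly.

```python
import numpy as np

eps = 1e-9

# Hypothetical ill-conditioned example (not from the article).
# Its exact singular values are sqrt(2 + eps**2) and eps.
M = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

# Route 1: square roots of the eigenvalues of M^T M.
# eps**2 = 1e-18 is below double-precision resolution next to 1,
# so M^T M is stored as a matrix of ones and eps is lost.
eigenvalues = np.linalg.eigvalsh(M.T @ M)
via_eig = np.sqrt(np.clip(eigenvalues, 0.0, None))[::-1]  # clip rounding negatives

# Route 2: a library SVD routine applied to M itself.
via_svd = np.linalg.svd(M, compute_uv=False)

print("exact small singular value:", eps)
print("from eigenvalues of M^T M: ", via_eig)
print("from np.linalg.svd:        ", via_svd)
```

The library routine recovers the small singular value accurately, while the eigenvalue route can only return whatever rounding noise remains once MᵀM has been formed.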



