We now take up a second form of matrix analysis that leads to a low-dimensional representation of a high-dimensional matrix. This approach, called singular value decomposition (SVD), allows an exact representation of any matrix, and also makes it easy to eliminate the less important parts of that representation to produce an approximate representation with any desired number of dimensions. Of course the fewer the dimensions we choose, the less accurate will be the approximation.
For the original data matrix A, the matrix V can be computed from the eigenpairs of A^T A: its columns are the eigenvectors, and the singular values in Σ are the square roots of the corresponding eigenvalues.
The matrix U is computed the same way as V, but from AA^T instead of A^T A.
A small detail needs to be explained concerning U and V. Each of these matrices has r columns, while A^T A is an n×n matrix and AA^T is an m×m matrix. Both n and m are at least as large as r. Thus, A^T A and AA^T should have an additional n−r and m−r eigenpairs, respectively, and these pairs do not show up in U, V, and Σ. Since the rank of A is r, all other eigenvalues will be 0, and these are not useful.
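To make this concrete, here is a minimal NumPy sketch of the computation just described; the small matrix A below is invented purely for illustration. V and the singular values come from the eigenpairs of A^T A, and U is recovered from them (equivalently, U could be read off the eigenpairs of AA^T):

```python
import numpy as np

# A small made-up data matrix A (m=4, n=3) with rank r=2.
A = np.array([[1.0, 1.0, 1.0],
              [3.0, 3.0, 3.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 4.0]])

# Eigendecomposition of A^T A: eigenvectors give V, square roots of
# the eigenvalues give the singular values.
eigvals, V = np.linalg.eigh(A.T @ A)        # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # sort eigenpairs descending
eigvals, V = eigvals[order], V[:, order]
r = int(np.sum(eigvals > 1e-10))            # numerical rank: drop the zero eigenpairs
sigma = np.sqrt(eigvals[:r])
V = V[:, :r]

# U could come from AA^T the same way; U = A V Sigma^{-1} is equivalent
# and avoids the per-column sign ambiguity of a second eigendecomposition.
U = A @ V / sigma

# Compare with the library SVD (singular vectors match up to sign).
U2, s2, Vt2 = np.linalg.svd(A, full_matrices=False)
print(np.allclose(sigma, s2[:r]))                    # True
print(np.allclose(np.abs(U), np.abs(U2[:, :r])))     # True
print(np.allclose(np.abs(V), np.abs(Vt2[:r].T)))     # True
```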
Instance #1
There are two “concepts” underlying the movies: science fiction and romance. All the boys rate only science fiction, and all the girls rate only romance.
The SVD decomposition is as follows:
The key to understanding what SVD offers is in viewing the r columns of U, Σ, and V as representing concepts that are hidden in the original matrix M. In Example 11.8, these concepts are clear; one is “science fiction” and the other is “romance.” Let us think of the rows of M as people and the columns of M as movies. Then matrix U connects people to concepts. For example, the person Joe, who corresponds to row 1 of M in Fig. 11.6, likes only the concept science fiction. The value 0.14 in the first row and first column of U is smaller than some of the other entries in that column, because while Joe watches only science fiction, he doesn’t rate those movies highly. The second column of the first row of U is 0, because Joe doesn’t rate romance movies at all.
The matrix V relates movies to concepts. The 0.58 in each of the first three columns of the first row of V^T indicates that the first three movies – Matrix, Alien, and Star Wars – each are of the science-fiction genre, while the 0’s in the last two columns of the first row say that these movies do not partake of the concept romance at all. Likewise, the second row of V^T tells us that the movies Casablanca and Titanic are exclusively romances.
Finally, the matrix Σ gives the strength of each of the concepts. In our example, the strength of the science-fiction concept is 12.4, while the strength of the romance concept is 9.5. Intuitively, the science-fiction concept is stronger because the data provides more movies of that genre and more people who like them.
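As a check on the quoted numbers, here is a minimal NumPy sketch using the rating matrix reconstructed from the MMDS example (rows: Joe, Jim, John, Jack, Jill, Jenny, Jane; columns: Matrix, Alien, Star Wars, Casablanca, Titanic). Singular vectors are determined only up to sign, so the comparisons below use absolute values:

```python
import numpy as np

# Rating matrix reconstructed from the MMDS example: four boys rate
# only the three science-fiction movies, three girls rate only the
# two romances.
M = np.array([[1, 1, 1, 0, 0],
              [3, 3, 3, 0, 0],
              [4, 4, 4, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 4, 4],
              [0, 0, 0, 5, 5],
              [0, 0, 0, 2, 2]], dtype=float)

U, s, Vt = np.linalg.svd(M, full_matrices=False)

print(np.round(s, 1))         # [12.4  9.5  0.   0.   0. ]: two concepts
print(np.round(U[:, 0], 2))   # |first entry| = 0.14: Joe's weak sci-fi ratings
print(np.round(Vt[0], 2))     # |entries| = 0.58 on the three sci-fi movies
```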
SVD-Interpretation #2
Dimensionality Reduction Using SVD
Performing the SVD decomposition gives the following:
Suppose we want to represent a very large matrix M by its SVD components U, Σ, and V, but these matrices are also too large to store conveniently. The best way to reduce the dimensionality of the three matrices is to set the smallest of the singular values to zero. If we set the s smallest singular values to 0, then we can also eliminate the corresponding s columns of U and V.
Suppose we want to reduce the number of dimensions to two. Then we set the smallest of the singular values, which is 1.3, to zero. The effect on the expression in Fig. 11.9 is that the third column of U and the third row of V^T are multiplied only by 0’s when we perform the multiplication, so this row and this column may as well not be there. That is, the approximation to M′ obtained by using only the two largest singular values is that shown in Fig. 11.10.
The resulting matrix is quite close to the matrix M′ of Fig. 11.8. Ideally, the entire difference is the result of making the last singular value be 0. However, in this simple example, much of the difference is due to rounding error caused by the fact that the decomposition of M′ was only correct to two significant digits.
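The same truncation in NumPy, as a sketch: it assumes M′ of Fig. 11.8 is the rank-2 rating matrix above plus two small extra ratings of Alien, a reconstruction that matches the 1.3 third singular value quoted above:

```python
import numpy as np

# M' reconstructed from Fig. 11.8: the rank-2 matrix above plus two
# small extra ratings of Alien, which introduce a weak third concept.
M2 = np.array([[1, 1, 1, 0, 0],
               [3, 3, 3, 0, 0],
               [4, 4, 4, 0, 0],
               [5, 5, 5, 0, 0],
               [0, 2, 0, 4, 4],
               [0, 0, 0, 5, 5],
               [0, 1, 0, 2, 2]], dtype=float)

U, s, Vt = np.linalg.svd(M2, full_matrices=False)
print(np.round(s, 1))   # ~[12.5  9.5  1.3  0.   0. ]; the book rounds to 12.4, 9.5, 1.3

# Keep only the two largest singular values: drop the third column of U
# and the third row of V^T, exactly as described above.
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# The Frobenius-norm error equals the dropped singular value, about 1.3.
print(np.linalg.norm(M2 - approx))
```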
For why it is the smallest singular values that are set to 0, how to choose which singular values to zero out so as to get a good low-dimensional approximation, and why SVD used in this way is so effective, see Mining of Massive Datasets.
In large-data applications, it is normal for the matrix M being decomposed to be very sparse; that is, most entries are 0. For example, a matrix representing many documents (as rows) and the words they contain (as columns) will be sparse, because most words are not present in most documents. Similarly, a matrix of customers and products will be sparse because most people do not buy most products.
We cannot deal with dense matrices that have millions or billions of rows and/or columns. However, with SVD, even if M is sparse, U and V will be dense. Σ, being diagonal, will be sparse, but Σ is usually much smaller than U and V, so its sparseness does not help.
The drawbacks of SVD are as follows:
It is common for the matrix M that we wish to decompose to be very sparse.
But U and V from a UV or SVD decomposition will not be sparse even so.
CUR decomposition solves this problem by using only (randomly chosen) rows and columns of M.
The pseudocode for the random-selection algorithm is as follows:
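As a Python sketch of that idea (not the book’s exact pseudocode): rows and columns are sampled with probability proportional to their squared norms. MMDS samples with replacement, rescales each pick, and builds the middle matrix from the pseudoinverse of the intersection matrix W; the variant below instead keeps the picked rows and columns unscaled and uses the optimal middle matrix U = C⁺MR⁺ from the Wikipedia reference listed at the end. The function names here are our own.

```python
import numpy as np

def cur_decompose(M, k, seed=0):
    """Sketch of CUR: sample k columns and k rows of M with probability
    proportional to their squared norms, then set U = C^+ M R^+,
    which minimizes ||M - CUR||_F for the chosen C and R."""
    rng = np.random.default_rng(seed)
    sq = M ** 2
    p_col = sq.sum(axis=0) / sq.sum()          # column importance
    p_row = sq.sum(axis=1) / sq.sum()          # row importance
    cols = rng.choice(M.shape[1], size=k, replace=False, p=p_col)
    rows = rng.choice(M.shape[0], size=k, replace=False, p=p_row)
    C = M[:, cols]                             # actual (sparse) columns of M
    R = M[rows, :]                             # actual (sparse) rows of M
    U = np.linalg.pinv(C) @ M @ np.linalg.pinv(R)
    return C, U, R

# Usage: a random rank-3 matrix is recovered almost exactly from
# 10 sampled rows and columns, since they span M's row/column spaces.
rng = np.random.default_rng(1)
M = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 80))
C, U, R = cur_decompose(M, k=10)
print(np.linalg.norm(M - C @ U @ R) / np.linalg.norm(M))  # ~1e-15
```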
The definition of the Moore-Penrose inverse is as follows:
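A standard statement of the definition, given here for real matrices: A⁺ is the unique matrix satisfying the four Penrose conditions, and it can be computed directly from the SVD.

```latex
% The four Penrose conditions (for complex A, transpose becomes
% conjugate transpose):
A A^{+} A = A, \qquad
A^{+} A A^{+} = A^{+}, \qquad
(A A^{+})^{T} = A A^{+}, \qquad
(A^{+} A)^{T} = A^{+} A.

% Given the SVD A = U \Sigma V^{T}, the pseudoinverse is
A^{+} = V \Sigma^{+} U^{T},
% where \Sigma^{+} inverts the nonzero diagonal entries of \Sigma.
```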
CUR decomposition can be viewed as the following optimization problem:
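Following the Wikipedia reference below: once C and R have been sampled, the middle matrix is chosen to minimize the Frobenius-norm reconstruction error, and the minimizer is expressed with the Moore-Penrose inverses just defined.

```latex
% With the sampled columns C and rows R held fixed, choose U to solve
\min_{U} \; \lVert M - C U R \rVert_F^2,

% whose minimizer, per the Wikipedia reference, is
U^{*} = C^{+} M R^{+}.
```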
Ref:
https://en.wikipedia.org/wiki/CUR_matrix_approximation
http://web.stanford.edu/class/cs246/handouts.html