Coursera Machine Learning: Week 8 Quiz


1. Consider the following 2D dataset:

Which of the following figures correspond to possible values that PCA may return for  u(1) (the first eigenvector / first principal component)? Check all that apply (you may have to check more than one figure).

[Figures omitted: the 2D dataset and the candidate directions for u(1) from the original quiz are not reproduced here.]

Answer: figures 1 and 4. (See the sketch below for why more than one direction can be valid.)
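Why can more than one arrow be a valid u(1)? The first principal component is the direction of maximal variance, and it is only defined up to sign, so PCA may return either of the two opposite unit vectors along that direction. Below is a minimal NumPy sketch (my own illustration, not from the quiz; the toy dataset is made up) that computes u(1) for 2D data:

```python
import numpy as np

# Toy 2D dataset whose points lie roughly along the (1, 1) direction (illustrative only).
rng = np.random.default_rng(0)
t = rng.standard_normal(200)
X = np.column_stack([t + 0.1 * rng.standard_normal(200),
                     t + 0.1 * rng.standard_normal(200)])

X = X - X.mean(axis=0)              # mean normalization
Sigma = (X.T @ X) / X.shape[0]      # 2 x 2 covariance matrix
U, S, _ = np.linalg.svd(Sigma)      # columns of U are the principal components
u1 = U[:, 0]                        # first principal component (unit length)

# Both u1 and -u1 span the same direction of maximal variance,
# so either orientation is an acceptable u(1).
print(u1, -u1)
```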

2. Which of the following is a reasonable way to select the number of principal components k?

(Recall that  n  is the dimensionality of the input data and  m  is the number of input examples.)

Choose  k  to be the smallest value so that at least 99% of the variance is retained.

Choose k to be the largest value so that at least 99% of the variance is retained.

Choose  k  to be 99% of  m  (i.e.,  k=0.99m , rounded to the nearest integer).

Use the elbow method.

Answer: option 1. A minimal sketch of this rule follows.
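For reference, a minimal NumPy sketch of the rule in option 1 (my own illustration, not from the course materials): pick the smallest k whose retained-variance fraction reaches the threshold. It assumes X is an already mean-normalized m × n data matrix.

```python
import numpy as np

def choose_k(X, variance_to_retain=0.99):
    """Smallest k such that at least the given fraction of variance is retained."""
    m, n = X.shape
    Sigma = (X.T @ X) / m                 # n x n covariance matrix
    _, S, _ = np.linalg.svd(Sigma)        # singular values in decreasing order
    retained = np.cumsum(S) / np.sum(S)   # fraction of variance retained for k = 1, ..., n
    k = int(np.argmax(retained >= variance_to_retain)) + 1
    return k, retained[k - 1]
```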

3. Suppose someone tells you that they ran PCA in such a way that "95% of the variance was retained." What is an equivalent statement to this? Answer: option 3 (a numerical check follows the options).

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)} - x^{(i)}_{\mathrm{approx}}\right\|^2}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^2} \ge 0.05$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^2}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)} - x^{(i)}_{\mathrm{approx}}\right\|^2} \ge 0.95$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)} - x^{(i)}_{\mathrm{approx}}\right\|^2}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^2} \le 0.05$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)} - x^{(i)}_{\mathrm{approx}}\right\|^2}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^2} \le 0.95$
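In words, the correct option says that the average squared projection error divided by the total variation in the data is at most 0.05. A small NumPy sketch of that check (my own illustration; X is assumed to be mean-normalized):

```python
import numpy as np

def projection_error_ratio(X, k):
    """Average squared projection error divided by total variation.

    "95% of the variance retained" is equivalent to this ratio being <= 0.05.
    """
    m = X.shape[0]
    Sigma = (X.T @ X) / m
    U, _, _ = np.linalg.svd(Sigma)
    U_reduce = U[:, :k]               # first k principal components
    Z = X @ U_reduce                  # project onto k dimensions
    X_approx = Z @ U_reduce.T         # reconstruct back to n dimensions
    error = np.mean(np.sum((X - X_approx) ** 2, axis=1))   # avg squared projection error
    total = np.mean(np.sum(X ** 2, axis=1))                # total variation in the data
    return error / total
```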

4. Which of the following statements are true? Check all that apply. Answer: options 1 and 3 (a short sketch illustrating them follows the options).

Given input data x ∈ R^n, it makes sense to run PCA only with values of k that satisfy k ≤ n. (In particular, running it with k=n is possible but not helpful, and k>n does not make sense.)

PCA is susceptible to local optima; trying multiple random initializations may help.

Even if all the input features are on very similar scales, we should still perform mean normalization (so that each feature has zero mean) before running PCA.

Given only  z(i)  and  Ureduce , there is no way to reconstruct any reasonable approximation to  x(i) .
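To see why statements 1 and 3 are true and statement 4 is false, here is a short NumPy sketch (my own, with made-up toy data): PCA is run after mean normalization with some k ≤ n, and a reasonable approximation to x(i) is then reconstructed from z(i) and Ureduce alone.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data with (approximately) 3-dimensional structure embedded in n = 5 features.
B = rng.standard_normal((3, 5))
X = rng.standard_normal((100, 3)) @ B + 0.01 * rng.standard_normal((100, 5))

# Statement 3: mean-normalize even when the features are on similar scales.
mu = X.mean(axis=0)
X_norm = X - mu

# PCA via SVD of the covariance matrix, using some k <= n (statement 1).
Sigma = (X_norm.T @ X_norm) / X_norm.shape[0]
U, S, _ = np.linalg.svd(Sigma)
k = 3
U_reduce = U[:, :k]

Z = X_norm @ U_reduce                 # z(i): the compressed representation
X_approx = Z @ U_reduce.T             # statement 4 is false: a reasonable approximation
                                      # to the (mean-normalized) x(i) is recovered
print(np.mean((X_norm - X_approx) ** 2))   # near-zero average reconstruction error
```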

5. Which of the following are recommended applications of PCA? Select all that apply. Answer: options 1 and 4, the two data-compression uses (a usage sketch follows the options).

Data compression: Reduce the dimension of your data, so that it takes up less memory / disk space.

As a replacement for (or alternative to) linear regression: For most learning applications, PCA and linear regression give substantially similar results.

Data visualization: To take 2D data, and find a different way of plotting it in 2D (using k=2).

Data compression: Reduce the dimension of your input data  x(i) , which will be used in a supervised learning algorithm (i.e., use PCA so that your supervised learning algorithm runs faster).
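As a usage sketch of the last option (PCA as a preprocessing step so a supervised learner runs faster), one possible scikit-learn version is below; the dataset and the choice of classifier are placeholders, and the key point is that the PCA mapping is fit on the training set only and then reused for new inputs.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Placeholder data standing in for a real supervised-learning dataset.
rng = np.random.default_rng(2)
X_train, y_train = rng.standard_normal((500, 50)), rng.integers(0, 2, 500)
X_test = rng.standard_normal((100, 50))

# Fit PCA on the training set only; a float n_components keeps enough
# components to retain that fraction of the variance.
pca = PCA(n_components=0.99)
Z_train = pca.fit_transform(X_train)   # compressed training inputs
Z_test = pca.transform(X_test)         # same mapping applied to new data

clf = LogisticRegression(max_iter=1000).fit(Z_train, y_train)  # learner sees fewer features
predictions = clf.predict(Z_test)
```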

