PCA
Principal Components Analysis (PCA) is a dimensionality reduction algorithm that can be used to significantly speed up your unsupervised feature learning algorithm. More importantly, understanding PCA will enable us to later implement whitening, which is an important pre-processing step for many algorithms.
Suppose you are training your algorithm on images. Then the input will be somewhat redundant, because the values of adjacent pixels in an image are highly correlated. Concretely, suppose we are training on 16x16 grayscale image patches. Then the inputs $x \in \mathbb{R}^{256}$ are 256-dimensional vectors, with one feature $x_j$ corresponding to the intensity of each pixel. Because of the correlation between adjacent pixels, PCA will allow us to approximate the input with a much lower-dimensional one, while incurring very little error.
For our running example, we will use a dataset $\{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$ with 2-dimensional inputs, so that $x^{(i)} \in \mathbb{R}^2$. Suppose we want to reduce the data from 2 dimensions to 1. (In practice, we might want to reduce data from 256 to 50 dimensions, say; but using lower-dimensional data in our example allows us to visualize the algorithms better.) Here is our dataset:
This data has already been pre-processed so that each of the features $x_1$ and $x_2$ have about the same mean (zero) and variance.
For the purpose of illustration, we have also colored each of the points one of three colors, depending on their $x_1$ value; these colors are not used by the algorithm, and are for illustration only.
PCA will find a lower-dimensional subspace onto which to project our data. From visually examining the data, it appears that $u_1$ is the principal direction of variation of the data, and $u_2$ the secondary direction of variation:
I.e., the data varies much more in the direction $u_1$ than $u_2$. To more formally find the directions $u_1$ and $u_2$, we first compute the matrix $\Sigma$ as follows:

$\Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)}) (x^{(i)})^T$
If $x$ has zero mean, then $\Sigma$ is exactly the covariance matrix of $x$. (The symbol "$\Sigma$", pronounced "Sigma", is the standard notation for denoting the covariance matrix. Unfortunately it looks just like the summation symbol, as in $\sum_{i=1}^n i$; but these are two different things.)
It can then be shown that $u_1$---the principal direction of variation of the data---is the top (principal) eigenvector of $\Sigma$, and $u_2$ is the second eigenvector.
Note: If you are interested in seeing a more formal mathematical derivation/justification of this result, see the CS229 (Machine Learning) lecture notes on PCA (link at bottom of this page). You won't need to do so to follow along this course, however.
You can use standard numerical linear algebra software to find these eigenvectors (see Implementation Notes). Concretely, let us compute the eigenvectors of $\Sigma$, and stack the eigenvectors in columns to form the matrix $U$:

$U = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_n \\ | & | & & | \end{bmatrix}$
Here, $u_1$ is the principal eigenvector (corresponding to the largest eigenvalue), $u_2$ is the second eigenvector, and so on. Also, let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the corresponding eigenvalues.
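To make this concrete, here is a minimal Octave/MATLAB sketch (assuming, as in the Implementation Notes later in these notes, that x is a matrix holding one zero-mean training example per column):

sigma = x * x' / size(x, 2);   % the covariance matrix Sigma
[U, S, V] = svd(sigma);        % columns of U are the eigenvectors u_1, u_2, ...
lambda = diag(S);              % the corresponding eigenvalues, sorted in decreasing order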
The vectors $u_1$ and $u_2$ in our example form a new basis in which we can represent the data. Concretely, let $x$ be some training example. Then $u_1^T x$ is the length (magnitude) of the projection of $x$ onto the vector $u_1$.
Similarly, $u_2^T x$ is the magnitude of $x$ projected onto the vector $u_2$.
Thus, we can represent $x$ in the $(u_1, u_2)$-basis by computing

$x_{\rm rot} = U^T x = \begin{bmatrix} u_1^T x \\ u_2^T x \end{bmatrix}$
(The subscript "rot" comes from the observation that this corresponds to a rotation (and possibly reflection) of the original data.) Lets take the entire training set, and compute for every . Plotting this transformed data , we get:
This is the training set rotated into the $u_1$, $u_2$ basis. In the general case, $U^T x$ will be the training set rotated into the basis $u_1$, $u_2$, ..., $u_n$.
One of the properties of $U$ is that it is an "orthogonal" matrix, which means that it satisfies $U^T U = U U^T = I$. So if you ever need to go from the rotated vectors $x_{\rm rot}$ back to the original data $x$, you can compute

$x = U x_{\rm rot}$
because $U x_{\rm rot} = U U^T x = x$.
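As a quick sketch of both directions (again with one training example per column of x, and U as computed above):

xRot = U' * x;    % rotate every training example into the u_1, ..., u_n basis
xOrig = U * xRot; % rotate back; xOrig equals x up to floating-point round-off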
We see that the principal direction of variation of the data is the first dimension $x_{{\rm rot},1}$ of this rotated data. Thus, if we want to reduce this data to one dimension, we can set

$\tilde{x}^{(i)} = x_{{\rm rot},1}^{(i)} = u_1^T x^{(i)} \in \mathbb{R}$
More generally, if $x \in \mathbb{R}^n$ and we want to reduce it to a $k$-dimensional representation $\tilde{x} \in \mathbb{R}^k$ (where $k < n$), we would take the first $k$ components of $x_{\rm rot}$, which correspond to the top $k$ directions of variation.
Another way of explaining PCA is that $x_{\rm rot}$ is an $n$-dimensional vector, where the first few components are likely to be large (e.g., in our example, we saw that $x_{{\rm rot},1}^{(i)} = u_1^T x^{(i)}$ takes reasonably large values for most examples $i$), and the later components are likely to be small (e.g., in our example, $x_{{\rm rot},2}^{(i)} = u_2^T x^{(i)}$ was more likely to be small). What PCA does is drop the later (smaller) components of $x_{\rm rot}$, and just approximate them with 0's. Concretely, our definition of $\tilde{x}$ can also be arrived at by using an approximation to $x_{\rm rot}$ where all but the first $k$ components are zeros. In other words, we have:

$\tilde{x} = \begin{bmatrix} x_{{\rm rot},1} \\ \vdots \\ x_{{\rm rot},k} \\ 0 \\ \vdots \\ 0 \end{bmatrix} \approx \begin{bmatrix} x_{{\rm rot},1} \\ \vdots \\ x_{{\rm rot},k} \\ x_{{\rm rot},k+1} \\ \vdots \\ x_{{\rm rot},n} \end{bmatrix} = x_{\rm rot}$
In our example, this gives us the following plot of $\tilde{x}$ (using $k = 1$):
However, since the final $n - k$ components of $\tilde{x}$ as defined above would always be zero, there is no need to keep these zeros around, and so we define $\tilde{x}$ as a $k$-dimensional vector with just the first $k$ (non-zero) components.
This also explains why we wanted to express our data in the $u_1, u_2, \ldots, u_n$ basis: Deciding which components to keep becomes just keeping the top $k$ components. When we do this, we also say that we are "retaining the top $k$ PCA (or principal) components."
Now, $\tilde{x} \in \mathbb{R}^k$ is a lower-dimensional, "compressed" representation of the original $x \in \mathbb{R}^n$. Given $\tilde{x}$, how can we recover an approximation $\hat{x}$ to the original value of $x$? From an earlier section, we know that $x = U x_{\rm rot}$. Further, we can think of $\tilde{x}$ as an approximation to $x_{\rm rot}$, where we have set the last $n - k$ components to zeros. Thus, given $\tilde{x} \in \mathbb{R}^k$, we can pad it out with $n - k$ zeros to get our approximation to $x_{\rm rot} \in \mathbb{R}^n$. Finally, we pre-multiply by $U$ to get our approximation to $x$. Concretely, we get

$\hat{x} = U \begin{bmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_k \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \sum_{j=1}^k u_j \tilde{x}_j$
The final equality above comes from the definition of $U$ given earlier. (In a practical implementation, we wouldn't actually zero pad $\tilde{x}$ and then multiply by $U$, since that would mean multiplying a lot of things by zeros; instead, we'd just multiply $\tilde{x}$ with the first $k$ columns of $U$ as in the final expression above.) Applying this to our dataset, we get the following plot for $\hat{x}$:
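A minimal sketch of the compression and reconstruction steps (assuming x and U as before, and some choice of k):

k = 1;                       % number of components to retain
xTilde = U(:, 1:k)' * x;     % k-dimensional compressed representation of each example
xHat = U(:, 1:k) * xTilde;   % approximation to x, back in the original n-dimensional space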
We are thus using a 1 dimensional approximation to the original dataset.
If you are training an autoencoder or other unsupervised feature learning algorithm, the running time of your algorithm will depend on the dimension of the input. If you feed $\tilde{x} \in \mathbb{R}^k$ into your learning algorithm instead of $x$, then you'll be training on a lower-dimensional input, and thus your algorithm might run significantly faster. For many datasets, the lower-dimensional $\tilde{x}$ representation can be an extremely good approximation to the original, and using PCA this way can significantly speed up your algorithm while introducing very little approximation error.
How do we set $k$; i.e., how many PCA components should we retain? In our simple 2-dimensional example, it seemed natural to retain 1 out of the 2 components, but for higher-dimensional data, this decision is less trivial. If $k$ is too large, then we won't be compressing the data much; in the limit of $k = n$, then we're just using the original data (but rotated into a different basis). Conversely, if $k$ is too small, then we might be using a very bad approximation to the data.
To decide how to set $k$, we will usually look at the percentage of variance retained for different values of $k$. Concretely, if $k = n$, then we have an exact approximation to the data, and we say that 100% of the variance is retained. I.e., all of the variation of the original data is retained. Conversely, if $k = 0$, then we are approximating all the data with the zero vector, and thus 0% of the variance is retained.
More generally, let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $\Sigma$ (sorted in decreasing order), so that $\lambda_j$ is the eigenvalue corresponding to the eigenvector $u_j$. Then if we retain $k$ principal components, the percentage of variance retained is given by:

$\frac{\sum_{j=1}^k \lambda_j}{\sum_{j=1}^n \lambda_j}$
In our simple 2D example above, by keeping only $k = 1$ principal component we retained $\lambda_1 / (\lambda_1 + \lambda_2)$, or 91.3% of the variance.
A more formal definition of percentage of variance retained is beyond the scope of these notes. However, it is possible to show that $\lambda_j = \frac{1}{m} \sum_{i=1}^m \left( x_{{\rm rot},j}^{(i)} \right)^2$. Thus, if $\lambda_j \approx 0$, that shows that $x_{{\rm rot},j}$ is usually near 0 anyway, and we lose relatively little by approximating it with a constant 0. This also explains why we retain the top principal components (corresponding to the larger values of $\lambda_j$) instead of the bottom ones. The top principal components are the ones that are more variable and that take on larger values, and for which we would incur a greater approximation error if we were to set them to zero.
In the case of images, one common heuristic is to choose $k$ so as to retain 99% of the variance. In other words, we pick the smallest value of $k$ that satisfies

$\frac{\sum_{j=1}^k \lambda_j}{\sum_{j=1}^n \lambda_j} \geq 0.99$
Depending on the application, if you are willing to incur some additional error, values in the 90-98% range are also sometimes used. When you describe to others how you applied PCA, saying that you chose to retain 95% of the variance will also be a much more easily interpretable description than saying that you retained 120 (or whatever other number of) components.
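For concreteness, here is one way to pick $k$ in Octave/MATLAB (a sketch, assuming the eigenvalues are on the diagonal of S as returned by svd above):

lambda = diag(S);                         % eigenvalues of Sigma, sorted in decreasing order
retained = cumsum(lambda) / sum(lambda);  % fraction of variance retained for each possible k
k = find(retained >= 0.99, 1);            % smallest k that retains at least 99% of the variance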
For PCA to work, usually we want each of the features $x_1, x_2, \ldots, x_n$ to have a similar range of values to the others (and to have a mean close to zero). If you've used PCA on other applications before, you may therefore have separately pre-processed each feature to have zero mean and unit variance, by separately estimating the mean and variance of each feature $x_j$. However, this isn't the pre-processing that we will apply to most types of images. Specifically, suppose we are training our algorithm on natural images, so that $x_j$ is the value of pixel $j$. By "natural images," we informally mean the type of image that a typical animal or person might see over their lifetime.
Note: Usually we use images of outdoor scenes with grass, trees, etc., and cut out small (say 16x16) image patches randomly from these to train the algorithm. But in practice most feature learning algorithms are extremely robust to the exact type of image they are trained on, so most images taken with a normal camera, so long as they aren't excessively blurry or have strange artifacts, should work.
When training on natural images, it makes little sense to estimate a separate mean and variance for each pixel, because the statistics in one part of the image should (theoretically) be the same as any other. This property of images is called stationarity.
In detail, in order for PCA to work well, informally we require that (i) The features have approximately zero mean, and (ii) The different features have similar variances to each other. With natural images, (ii) is already satisfied even without variance normalization, and so we won't perform any variance normalization. (If you are training on audio data---say, on spectrograms---or on text data---say, bag-of-word vectors---we will usually not perform variance normalization either.) In fact, PCA is invariant to the scaling of the data, and will return the same eigenvectors regardless of the scaling of the input. More formally, if you multiply each feature vector $x$ by some positive number $c$ (thus scaling every feature in every training example by the same number), PCA's output eigenvectors will not change.
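If you want to convince yourself of this numerically, here is a small sanity check you could run (a sketch, not part of the original derivation; it reuses the x, sigma, U and S computed earlier):

c = 5;                                       % any positive scaling constant
sigmaScaled = (c*x) * (c*x)' / size(x, 2);   % covariance of the scaled data; equals c^2 * sigma
[Uscaled, Sscaled, Vscaled] = svd(sigmaScaled);
% The columns of Uscaled match those of U up to the sign of each column,
% while diag(Sscaled) equals c^2 * diag(S); the eigenvectors themselves are unchanged.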
So, we won't use variance normalization. The only normalization we need to perform then is mean normalization, to ensure that the features have a mean around zero. Depending on the application, very often we are not interested in how bright the overall input image is. For example, in object recognition tasks, the overall brightness of the image doesn't affect what objects there are in the image. More formally, we are not interested in the mean intensity value of an image patch; thus, we can subtract out this value, as a form of mean normalization.
Concretely, if $x^{(i)} \in \mathbb{R}^n$ are the (grayscale) intensity values of a 16x16 image patch ($n = 256$), we might normalize the intensity of each image $x^{(i)}$ as follows:

$\mu^{(i)} := \frac{1}{n} \sum_{j=1}^n x^{(i)}_j$

$x^{(i)}_j := x^{(i)}_j - \mu^{(i)}$, for all $j$
Note that the two steps above are done separately for each image $x^{(i)}$, and that $\mu^{(i)}$ here is the mean intensity of the image $x^{(i)}$. In particular, this is not the same thing as estimating a mean value separately for each pixel $x_j$.
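In Octave/MATLAB, with one image patch per column of x, this per-image mean normalization might look like the following (a sketch; the Implementation Notes below do the same thing with repmat):

avg = mean(x, 1);              % mean intensity of each patch (one value per column)
x = bsxfun(@minus, x, avg);    % subtract each patch's own mean from all of its pixels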
If you are training your algorithm on images other than natural images (for example, images of handwritten characters, or images of single isolated objects centered against a white background), other types of normalization might be worth considering, and the best choice may be application dependent. But when training on natural images, using the per-image mean normalization method as given in the equations above would be a reasonable default.
http://cs229.stanford.edu
Whitening
We have used PCA to reduce the dimension of the data. There is a closely related preprocessing step called whitening (or, in some other literature, sphering) which is needed for some algorithms. If we are training on images, the raw input is redundant, since adjacent pixel values are highly correlated. The goal of whitening is to make the input less redundant; more formally, our desiderata are that our learning algorithm sees a training input where (i) the features are less correlated with each other, and (ii) the features all have the same variance.
We will first describe whitening using our previous 2D example. We will then describe how this can be combined with smoothing, and finally how to combine this with PCA.
How can we make our input features uncorrelated with each other? We had already done this when computing $x_{\rm rot}^{(i)} = U^T x^{(i)}$. Repeating our previous figure, our plot for $x_{\rm rot}$ was:
The covariance matrix of this data is given by:

$\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$
(Note: Technically, many of the statements in this section about the "covariance" will be true only if the data has zero mean. In the rest of this section, we will take this assumption as implicit in our statements. However, even if the data's mean isn't exactly zero, the intuitions we're presenting here still hold true, and so this isn't something that you should worry about.)
It is no accident that the diagonal values are $\lambda_1$ and $\lambda_2$. Further, the off-diagonal entries are zero; thus, $x_{{\rm rot},1}$ and $x_{{\rm rot},2}$ are uncorrelated, satisfying one of our desiderata for whitened data (that the features be less correlated).
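You can check this numerically with a short sketch (assuming xRot computed as before, with one example per column):

covar = xRot * xRot' / size(xRot, 2);   % covariance of the rotated data
% covar should be (approximately) diagonal, with the eigenvalues diag(S) along its diagonal.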
To make each of our input features have unit variance, we can simply rescale each feature $x_{{\rm rot},i}$ by $1/\sqrt{\lambda_i}$. Concretely, we define our whitened data $x_{\rm PCAwhite} \in \mathbb{R}^n$ as follows:

$x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i}}$
Plotting $x_{\rm PCAwhite}$, we get:
This data now has covariance equal to the identity matrix $I$. We say that $x_{\rm PCAwhite}$ is our PCA whitened version of the data: The different components of $x_{\rm PCAwhite}$ are uncorrelated and have unit variance.
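In code, this rescaling is just a diagonal matrix applied to the rotated data (a sketch without the regularization term; the regularized version used in practice appears in the Implementation Notes below):

xPCAwhite = diag(1 ./ sqrt(diag(S))) * xRot;   % divide each rotated component by sqrt(lambda_i)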
Whitening combined with dimensionality reduction. If you want to have data that is whitened and which is lower dimensional than the original input, you can also optionally keep only the top $k$ components of $x_{\rm PCAwhite}$. When we combine PCA whitening with regularization (described later), the last few components of $x_{\rm PCAwhite}$ will be nearly zero anyway, and thus can safely be dropped.
Finally, it turns out that this way of getting the data to have covariance identity $I$ isn't unique. Concretely, if $R$ is any orthogonal matrix, so that it satisfies $RR^T = R^TR = I$ (less formally, if $R$ is a rotation/reflection matrix), then $R \, x_{\rm PCAwhite}$ will also have identity covariance. In ZCA whitening, we choose $R = U$. We define

$x_{\rm ZCAwhite} = U x_{\rm PCAwhite}$
Plotting $x_{\rm ZCAwhite}$, we get:
It can be shown that out of all possible choices for $R$, this choice of rotation causes $x_{\rm ZCAwhite}$ to be as close as possible to the original input data $x$.
When using ZCA whitening (unlike PCA whitening), we usually keep all dimensions of the data, and do not try to reduce its dimension.
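Putting the pieces together, a sketch of ZCA whitening (mirroring the Implementation Notes below):

xZCAwhite = U * xPCAwhite;   % rotate the PCA-whitened data back toward the original coordinates
% equivalently: xZCAwhite = U * diag(1 ./ sqrt(diag(S))) * U' * x;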
When implementing PCA whitening or ZCA whitening in practice, sometimes some of the eigenvalues $\lambda_i$ will be numerically close to 0, and thus the scaling step where we divide by $\sqrt{\lambda_i}$ would involve dividing by a value close to zero; this may cause the data to blow up (take on large values) or otherwise be numerically unstable. In practice, we therefore implement this scaling step using a small amount of regularization, and add a small constant $\epsilon$ to the eigenvalues before taking their square root and inverse:

$x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i + \epsilon}}$
When $x$ takes values around $[-1, 1]$, a value of $\epsilon \approx 10^{-5}$ might be typical.
For the case of images, adding $\epsilon$ here also has the effect of slightly smoothing (or low-pass filtering) the input image. This also has a desirable effect of removing aliasing artifacts caused by the way pixels are laid out in an image, and can improve the features learned (details are beyond the scope of these notes).
ZCA whitening is a form of pre-processing of the data that maps it from $x$ to $x_{\rm ZCAwhite}$. It turns out that this is also a rough model of how the biological eye (the retina) processes images. Specifically, as your eye perceives images, most adjacent "pixels" in your eye will perceive very similar values, since adjacent parts of an image tend to be highly correlated in intensity. It is thus wasteful for your eye to have to transmit every pixel separately (via your optic nerve) to your brain. Instead, your retina performs a decorrelation operation (this is done via retinal neurons that compute a function called "on center, off surround/off center, on surround") which is similar to that performed by ZCA. This results in a less redundant representation of the input image, which is then transmitted to your brain.
Implementation Notes

In this section, we summarize the PCA, PCA whitening and ZCA whitening algorithms, and also describe how you can implement them using efficient linear algebra libraries.
First, we need to ensure that the data has (approximately) zero-mean. For natural images, we achieve this (approximately) by subtracting the mean value of each image patch.
Concretely, we compute the mean of each patch and subtract it from that patch. In Matlab, we can do this by using
avg = mean(x, 1); % Compute the mean pixel intensity value separately for each patch.
x = x - repmat(avg, size(x, 1), 1);
Next, we need to compute $\Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)}) (x^{(i)})^T$. If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
sigma = x * x' / size(x, 2);
(Check the math yourself for correctness.) Here, we assume that x is a data structure that contains one training example per column (so, x is an $n$-by-$m$ matrix).
Next, PCA computes the eigenvectors of Σ. One could do this using the Matlab eig function. However, because Σ is a symmetric positive semi-definite matrix, it is more numerically reliable to do this using the svd function. Concretely, if you implement
[U,S,V] = svd(sigma);
then the matrix U will contain the eigenvectors of Sigma (one eigenvector per column, sorted in order from top to bottom eigenvector), and the diagonal entries of the matrix S will contain the corresponding eigenvalues (also sorted in decreasing order). The matrix V will be equal to U, and can be safely ignored.
(Note: The svd function actually computes the singular vectors and singular values of a matrix, which for the special case of a symmetric positive semi-definite matrix---which is all that we're concerned with here---is equal to its eigenvectors and eigenvalues. A full discussion of singular vectors vs. eigenvectors is beyond the scope of these notes.)
Finally, you can compute $x_{\rm rot}$ and $\tilde{x}$ as follows:
xRot = U' * x;          % rotated version of the data.
xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,
                        % where k is the number of eigenvectors to keep
This gives your PCA representation of the data in terms of $\tilde{x} \in \mathbb{R}^k$. Incidentally, if x is an $n$-by-$m$ matrix containing all your training data, this is a vectorized implementation, and the expressions above work too for computing xRot and xTilde for your entire training set all in one go. The resulting xRot and xTilde will have one column corresponding to each training example.
To compute the PCA whitened data $x_{\rm PCAwhite}$, use
xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;
Since S's diagonal contains the eigenvalues $\lambda_i$, this turns out to be a compact way of computing $x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i + \epsilon}}$ simultaneously for all $i$.
Finally, you can also compute the ZCA whitened data $x_{\rm ZCAwhite}$ as:
xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
Exercise: PCA in 2D
In this exercise you will implement PCA, PCA whitening and ZCA whitening, as described in the earlier sections of this tutorial, and generate the images shown in the earlier sections yourself. You will build on the starter code that has been provided at pca_2d.zip. You need only write code at the places indicated by "YOUR CODE HERE" in the files. The only file you need to modify is pca_2d.m. Implementing this exercise will make the next exercise significantly easier to understand and complete.
The starter code contains code to load 45 2D data points. When plotted using the scatter function, the results should look like the following:
In this step, you will implement PCA to obtain xRot, the matrix in which the data is "rotated" to the basis made up of the principal components. As mentioned in the implementation notes, you should make use of MATLAB's svd function here.
Find $u_1$ and $u_2$, and draw two lines in your figure to show the resulting basis on top of the given data points. You may find it useful to use MATLAB's hold on and hold off functions. (After calling hold on, plotting functions such as plot will draw the new data on top of the previously existing figure rather than erasing and replacing it; and hold off turns this off.) You can use plot([x1,x2], [y1,y2], '-') to draw a line between (x1,y1) and (x2,y2). Your figure should look like this:
If you are doing this in Matlab, you will probably get a plot that's identical to ours. However, eigenvectors are defined only up to a sign. I.e., instead of returning $u_1$ as the first eigenvector, Matlab/Octave could just as easily have returned $-u_1$, and similarly instead of $u_2$ Matlab/Octave could have returned $-u_2$. So if you wound up with one or both of the eigenvectors pointing in a direction opposite (180 degrees difference) from what's shown above, that's okay too.
Compute xRot, and use the scatter function to check that xRot looks as it should, which should be something like the following:
Because Matlab/Octave could have returned $-u_1$ and/or $-u_2$ instead of $u_1$ and $u_2$, it's also possible that you might have gotten a figure which is "flipped" or "reflected" along the $x$- and/or $y$-axis; a flipped/reflected version of this figure is also a completely correct result.
In the next step, set k, the number of components to retain, to be 1 (we have already done this for you). Compute the resulting xHat and plot the results. You should get the following (this figure should not be flipped along the $x$- or $y$-axis):
Implement PCA whitening using the formula from the notes. Plot xPCAWhite, and verify that it looks like the following (a figure that is flipped/reflected on either/both axes is also correct):
Implement ZCA whitening and plot the results. The results should look like the following (this should not be flipped/reflected along the $x$- or $y$-axis):
close all
%%================================================================
We have provided the code to load data from pcaData.txt into x. x is a 2 * 45 matrix, where the kth column x(:,k) corresponds to the kth data point. You do not need to change the code below.
x = load('pcaData.txt','-ascii');
figure(1);
scatter(x(1, :), x(2, :));
title('Raw data');
%%================================================================
Implement PCA to obtain the rotation matrix U, which is the eigenbasis of sigma.
% -------------------- YOUR CODE HERE --------------------
u = zeros(size(x, 1)); % You need to compute this
[n m] = size(x);
x = x - repmat(mean(x,2), 1, m);   % preprocess: make each feature zero-mean
sigma = (1.0/m) * x * x';
[u s v] = svd(sigma);
% --------------------------------------------------------
hold on
plot([0 u(1,1)], [0 u(2,1)]);   % draw the first eigenvector direction
plot([0 u(1,2)], [0 u(2,2)]);   % draw the second eigenvector direction
scatter(x(1, :), x(2, :));
hold off
%%================================================================
Now, compute xRot by projecting the data onto the basis defined by U. Visualize the points by performing a scatter plot.
% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
xRot = u' * x;
% --------------------------------------------------------
figure(2);
scatter(xRot(1, :), xRot(2, :));
title('xRot');
%%================================================================
Compute xRot again (this time projecting to 1 dimension). Then, compute xHat by projecting xRot back onto the original axes to see the effect of dimension reduction.
% -------------------- YOUR CODE HERE --------------------
k = 1; % Use k = 1 and project the data onto the first eigenbasis
xHat = zeros(size(x)); % You need to compute this
xHat = u * ([u(:,1), zeros(n,1)]' * x);
% --------------------------------------------------------
figure(3);
scatter(xHat(1, :), xHat(2, :));
title('xHat');
%%================================================================
Compute xPCAWhite and plot the results.
epsilon = 1e-5;
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = zeros(size(x)); % You need to compute this
xPCAWhite = diag(1./sqrt(diag(s)+epsilon)) * u' * x;
% --------------------------------------------------------
figure(4);
scatter(xPCAWhite(1, :), xPCAWhite(2, :));
title('xPCAWhite');
%%================================================================
Compute xZCAWhite and plot the results.
% -------------------- YOUR CODE HERE --------------------
xZCAWhite = zeros(size(x)); % You need to compute this
xZCAWhite = u * diag(1./sqrt(diag(s)+epsilon)) * u' * x;
% --------------------------------------------------------
figure(5);
scatter(xZCAWhite(1, :), xZCAWhite(2, :));
title('xZCAWhite');
You can now move onto the next PCA exercise. :)
Exercise: PCA and Whitening
PCA and Whitening on natural images
In this exercise, you will implement PCA, PCA whitening and ZCA whitening, and apply them to image patches taken from natural images.
You will build on the MATLAB starter code which we have provided in pca_exercise.zip. You need only write code at the places indicated by "YOUR CODE HERE" in the files. The only file you need to modify is pca_gen.m.
Step 0: Prepare data
Step 0a: Load data
The starter code contains code to load a set of natural images and sample 12x12 patches from them. The raw patches will look something like this:
These patches are stored as column vectors in the matrix x.
Step 0b: Zero mean the data
First, for each image patch, compute the mean pixel value and subtract it from that image, thus centering the image around zero. You should compute a different mean value for each image patch.
Step 1: Implement PCA
Step 1a: Implement PCA
In this step, you will implement PCA to obtain xrot, the matrix in which the data is "rotated" to the basis comprising the principal components (i.e. the eigenvectors of Σ). Note that in this part of the exercise, you should not whiten the data.
Step 1b: Check covariance
To verify that your implementation of PCA is correct, you should check the covariance matrix for the rotated data xRot. PCA guarantees that the covariance matrix for the rotated data is a diagonal matrix (a matrix with non-zero entries only along the main diagonal). Implement code to compute the covariance matrix and verify this property. One way to do this is to compute the covariance matrix, and visualise it using the MATLAB command imagesc. The image should show a coloured diagonal line against a blue background. For this dataset, because of the range of the diagonal entries, the diagonal line may not be apparent, so you might get a figure like the one shown below, but this trick of visualizing using imagesc will come in handy later in this exercise.
Step 2: Find number of components to retain
Next, choose k, the number of principal components to retain. Pick k to be as small as possible, but so that at least 99% of the variance is retained. In the step after this, you will discard all but the top k principal components, reducing the dimension of the original data to k.
Step 3: PCA with dimension reduction
Now that you have found k, compute $\tilde{x}$, the reduced-dimension representation of the data. This gives you a representation of each image patch as a k dimensional vector instead of a 144 dimensional vector. If you are training a sparse autoencoder or other algorithm on this reduced-dimensional data, it will run faster than if you were training on the original 144 dimensional data.
To see the effect of dimension reduction, go back from $\tilde{x}$ to produce the matrix $\hat{x}$, the dimension-reduced data but expressed in the original 144 dimensional space of image patches. Visualise $\hat{x}$ and compare it to the raw data, x. You will observe that there is little loss due to throwing away the principal components that correspond to dimensions with low variation. For comparison, you may also wish to generate and visualise $\hat{x}$ for when only 90% of the variance is retained.
[Figure captions: Raw images | PCA dimension-reduced images (99% variance) | PCA dimension-reduced images (90% variance)]

Step 4: PCA with whitening and regularization
Step 4a: Implement PCA with whitening and regularization
Now implement PCA with whitening and regularization to produce the matrix xPCAWhite. Use the following parameter value:
epsilon = 0.1

Step 4b: Check covariance
Similar to using PCA alone, PCA with whitening also results in processed data that has a diagonal covariance matrix. However, unlike PCA alone, whitening additionally ensures that the diagonal entries are equal to 1, i.e. that the covariance matrix is the identity matrix.
That would be the case if you were doing whitening alone with no regularization. However, in this case you are whitening with regularization, to avoid numerical/etc. problems associated with small eigenvalues. As a result of this, some of the diagonal entries of the covariance of your xPCAwhite will be smaller than 1.
To verify that your implementation of PCA whitening with and without regularization is correct, you can check these properties. Implement code to compute the covariance matrix and verify this property. (To check the result of PCA without whitening, simply set epsilon to 0, or close to 0, say 1e-10). As earlier, you can visualise the covariance matrix with imagesc. When visualised as an image, for PCA whitening without regularization you should see a red line across the diagonal (corresponding to the one entries) against a blue background (corresponding to the zero entries); for PCA whitening with regularization you should see a red line that slowly turns blue across the diagonal (corresponding to the 1 entries slowly becoming smaller).
[Figure captions: Covariance for PCA whitening with regularization | Covariance for PCA whitening without regularization]

Step 5: ZCA whitening
Now implement ZCA whitening to produce the matrix xZCAWhite. Visualize xZCAWhite and compare it to the raw data, x. You should observe that whitening results in, among other things, enhanced edges. Try repeating this with epsilon set to 1, 0.1, and 0.01, and see what you obtain. The example shown below (left image) was obtained with epsilon = 0.1.
[Figure captions: ZCA whitened images | Raw images]
%%================================================================
Step 0a: Load data
Here we provide the code to load natural image data into x. x will be a 144 * 10000 matrix, where the kth column x(:, k) corresponds to the raw image data from the kth 12x12 image patch sampled. You do not need to change the code below.

x = sampleIMAGESRAW();
figure('name','Raw images');
randsel = randi(size(x,2), 204, 1);  % A random selection of samples for visualization
display_network(x(:,randsel));       % why can x be displayed even though it contains negative values?
%%================================================================

Step 0b: Zero-mean the data (by row)
You can make use of the mean and repmat/bsxfun functions.

% -------------------- YOUR CODE HERE --------------------
x = x - repmat(mean(x,1), size(x,1), 1);    % mean(x,1) gives the mean of each column, i.e. a separate mean per patch
% x = x - repmat(mean(x,2), 1, size(x,2));  % (this alternative would instead subtract a per-pixel mean)
%%================================================================

Step 1a: Implement PCA to obtain xRot
Implement PCA to obtain xRot, the matrix in which the data is expressed with respect to the eigenbasis of sigma, which is the matrix U.

% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
[n m] = size(x);
sigma = (1.0/m) * x * x';
[u s v] = svd(sigma);
xRot = u' * x;
%%================================================================

Step 1b: Check your implementation of PCA
The covariance matrix for the data expressed with respect to the basis U should be a diagonal matrix with non-zero entries only along the main diagonal. We will verify this here. Write code to compute the covariance matrix, covar. When visualised as an image, you should see a straight line across the diagonal (non-zero entries) against a blue background (zero entries).

% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(x, 1)); % You need to compute this
covar = (1./m) * xRot * xRot';
% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);
%%================================================================

Step 2: Find k, the number of components to retain
Write code to determine k, the number of components to retain in order to retain at least 99% of the variance.

% -------------------- YOUR CODE HERE --------------------
k = 0; % Set k accordingly
ss = diag(s);
% for k = 1:m
%     if sum(ss(1:k)) / sum(ss) >= 0.99
%         break;
%     end
% end
% cumsum(ss) is the cumulative-sum vector of ss (the running totals of its entries), and
% cumsum(ss)/sum(ss) < 0.99 is a logical vector whose entries are 1 wherever fewer than
% 99% of the variance would be retained.
k = length(ss(cumsum(ss)/sum(ss) < 0.99)) + 1;  % smallest k retaining at least 99% of the variance
%%================================================================

Step 3: Implement PCA with dimension reduction
Now that you have found k, you can reduce the dimension of the data by discarding the remaining dimensions. In this way, you can represent the data in k dimensions instead of the original 144, which will save you computational time when running learning algorithms on the reduced representation.

Following the dimension reduction, invert the PCA transformation to produce the matrix xHat, the dimension-reduced data with respect to the original basis. Visualise the data and compare it to the raw data. You will observe that there is little loss due to throwing away the principal components that correspond to dimensions with low variation.

% -------------------- YOUR CODE HERE --------------------
xHat = zeros(size(x)); % You need to compute this
xHat = u * [u(:,1:k)' * x; zeros(n-k, m)];
% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.
figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));
%%================================================================

Step 4a: Implement PCA with whitening and regularisation
Implement PCA with whitening and regularisation to produce the matrix xPCAWhite.

epsilon = 0.1;
xPCAWhite = zeros(size(x));
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = diag(1./sqrt(diag(s)+epsilon)) * u' * x;
figure('name','PCA whitened images');
display_network(xPCAWhite(:,randsel));
%%================================================================

Step 4b: Check your implementation of PCA whitening
Check your implementation of PCA whitening with and without regularisation. PCA whitening without regularisation results in a covariance matrix that is equal to the identity matrix. PCA whitening with regularisation results in a covariance matrix with diagonal entries starting close to 1 and gradually becoming smaller. We will verify these properties here. Write code to compute the covariance matrix, covar.

Without regularisation (set epsilon to 0 or close to 0), when visualised as an image, you should see a red line across the diagonal (one entries) against a blue background (zero entries). With regularisation, you should see a red line that slowly turns blue across the diagonal, corresponding to the one entries slowly becoming smaller.

% -------------------- YOUR CODE HERE --------------------
covar = (1./m) * xPCAWhite * xPCAWhite';
% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);
%%================================================================

Step 5: Implement ZCA whitening
Now implement ZCA whitening to produce the matrix xZCAWhite. Visualise the data and compare it to the raw data. You should observe that whitening results in, among other things, enhanced edges.

xZCAWhite = zeros(size(x));
% -------------------- YOUR CODE HERE --------------------
xZCAWhite = u * xPCAWhite;
% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));