garfielder007

Oxford Visual Geometry Group (VGG) Convolutional Neural Networks Practical

By Andrea Vedaldi and Andrew Zisserman

This is an Oxford Visual Geometry Group computer vision practical, authored by Andrea Vedaldi and Andrew Zisserman (Release 2015a).

Convolutional neural networks are an important class of learnable representations applicable, among others, to numerous computer vision problems. Deep CNNs, in particular, are composed of several layers of processing, each involving linear as well as non-linear operators, that are learned jointly, in an end-to-end manner, to solve a particular task. These methods are now the dominant approach for feature extraction from audiovisual and textual data.

This practical explores the basics of learning (deep) CNNs. The first part introduces typical CNN building blocks, such as ReLU units and linear filters, with a particular emphasis on understanding back-propagation. The second part looks at learning two basic CNNs. The first one is a simple non-linear filter capturing particular image structures, while the second one is a network that recognises typewritten characters (using a variety of different fonts). These examples illustrate the use of stochastic gradient descent with momentum, the definition of an objective function, the construction of mini-batches of data, and data jittering. The last part shows how powerful CNN models can be downloaded off-the-shelf and used directly in applications, bypassing the expensive training process.

VGG Convolutional Neural Networks Practical
- Getting started
- Part 1: CNN building blocks
  - Part 1.1: convolution
  - Part 1.2: non-linear gating
  - Part 1.3: pooling
  - Part 1.4: normalisation
- Part 2: back-propagation and derivatives
  - Part 2.1: the theory of back-propagation
  - Part 2.2: using back-propagation in practice
- Part 3: learning a tiny CNN
  - Part 3.1: training data and labels
  - Part 3.2: image preprocessing
  - Part 3.3: learning with gradient descent
  - Part 3.4: experimenting with the tiny CNN
- Part 4: learning a character CNN
  - Part 4.1: prepare the data
  - Part 4.2: initialize a CNN architecture
  - Part 4.3: train and evaluate the CNN
  - Part 4.4: visualise the learned filters
  - Part 4.5: apply the model
  - Part 4.6: training with jitter
  - Part 4.7: training using the GPU
- Part 5: using pretrained models
  - Part 5.1: load a pre-trained model
  - Part 5.2: use the model to classify an image
- Links and further work
- Acknowledgements
- History

Getting started

Read and understand the requirements and installation instructions. The download links for this practical are:

Code and data: practical-cnn-2015a.tar.gz
Code only: practical-cnn-2015a-code-only.tar.gz
Data only: practical-cnn-2015a-data-only.tar.gz
Git repository (for lab setters and developers)

After the installation is complete, open and edit the script exercise1.m in the MATLAB editor. The script contains commented code and a description for all steps of this exercise, for Part 1 of this document. You can cut and paste this code into the MATLAB window to run it, and will need to modify it as you go through the session. Other files exercise2.m, exercise3.m, and exercise4.m are given for Parts 2, 3, and 4.

Each part contains several Questions (that require pen and paper) and Tasks (that require experimentation or coding) to be answered/completed before proceeding further in the practical.

Part 1: CNN building blocks

Part 1.1: convolution

A feed-forward neural network can be thought of as the composition of a number of functions

$$ f(x) = f_L(\dots f_2(f_1(x; w_1); w_2) \dots ; w_L). $$

Each function $f_l$ takes as input a datum $x_l$ and a parameter vector $w_l$ and produces as output a datum $x_{l+1}$. While the type and sequence of functions is usually handcrafted, the parameters $w = (w_1, \dots, w_L)$ are learned from data in order to solve a target problem, for example classifying images or sounds.

In a convolutional neural network data and functions have additional structure. The data $x_1, \dots, x_n$ are images, sounds, or more generally maps from a lattice to one or more real numbers. In particular, since the rest of the practical will focus on computer vision applications, data will be 2D arrays of pixels. Formally, each $x_i$ will be an $M \times N \times K$ real array of $M \times N$ pixels and $K$ channels per pixel. Hence the first two dimensions of the array span space, while the last one spans channels. Note that only the input $x = x_1$ of the network is an actual image, while the remaining data are intermediate feature maps.

The second property of a CNN is that the functions $f_l$ have a convolutional structure. This means that $f_l$ applies to the input map $x_l$ an operator that is local and translation invariant. An example of a convolutional operator is applying a bank of linear filters to $x_l$.

In this part we will familiarise ourselves with a number of such convolutional and non-linear operators. The first one is the regular linear convolution by a filter bank. We will start by focusing our attention on a single function relation as follows:

$$ f : \mathbb{R}^{M \times N \times K} \to \mathbb{R}^{M' \times N' \times K'}, \qquad x \mapsto y. $$

Open the exercise1.m file, select the following part of the code, and execute it in MATLAB (right button > Evaluate selection or Shift+F7).

% Read an example image
x = imread('peppers.png') ;

% Convert to single format
x = im2single(x) ;

% Visualize the input x
figure(1) ; clf ; imagesc(x)

This should display an image of bell peppers in Figure 1:

Use the MATLAB size command to obtain the size of the array x. Note that the array x is converted to single precision format. This is because the underlying MatConvNet assumes that data is in single precision.

Question. The third dimension of x is 3. Why?

Now we will create a bank of 10 filters, each of size 5×5×3.

% Create a bank of linear filters
w = randn(5,5,3,10,'single') ;

The filters are in single precision as well. Note that w has four dimensions, packing 10 filters. Note also that each filter is not flat, but rather a volume with three layers. The next step is applying the filter to the image. This uses the vl_nnconv function from MatConvNet:

% Apply the convolution operator
y = vl_nnconv(x, w, []) ;

Remark: You might have noticed that the third argument to the vl_nnconv function is the empty matrix []. It can be otherwise used to pass a vector of bias terms to add to the output of each filter.

The variable y contains the output of the convolution. Note that each filter is three-dimensional, in the sense that it operates on a map $x$ with $K$ channels. Furthermore, there are $K'$ such filters, generating a $K'$-dimensional map $y$ as follows

$$ y_{i'j'k'} = \sum_{ijk} w_{ijkk'} \, x_{i+i',\, j+j',\, k} $$

Questions: Study this expression carefully and answer the following:

Given that the input map $x$ has $M \times N \times K$ dimensions and that each of the $K'$ filters has dimension $M_f \times N_f \times K$, what is the dimension of $y$?

Note that x is indexed by i+i′ and j+j′ , but that there is no plus sign between k and k′ . Why?

Task: check that the size of the variable y matches your calculations.
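To make the indexing in the formula concrete, here is a minimal sketch in plain MATLAB that evaluates it with explicit loops on a small random input, independently of MatConvNet. The array sizes are arbitrary choices made for this illustration, and the indexes are shifted by one because MATLAB arrays start at 1. One output channel is cross-checked against the built-in conv2, which flips the kernel and therefore needs a 180-degree rotation of the filter.

```matlab
% Naive evaluation of the filter-bank formula (sizes are illustrative only)
M = 6 ; N = 7 ; K = 3 ;        % input height, width, channels
Mf = 3 ; Nf = 2 ; Kp = 4 ;     % filter height, width, number of filters
x = randn(M, N, K) ;
w = randn(Mf, Nf, K, Kp) ;

Mo = M - Mf + 1 ;              % number of valid vertical positions
No = N - Nf + 1 ;              % number of valid horizontal positions
y = zeros(Mo, No, Kp) ;
for kp = 1:Kp
  wk = w(:,:,:,kp) ;           % the kp-th filter, an Mf x Nf x K volume
  for ip = 1:Mo
    for jp = 1:No
      patch = x(ip:ip+Mf-1, jp:jp+Nf-1, :) ;  % window under the filter
      y(ip,jp,kp) = sum(patch(:) .* wk(:)) ;  % sum over i, j and k
    end
  end
end

% Cross-check the first output channel with conv2 (kernel rotated by 180
% degrees because conv2 computes a true convolution, not a correlation)
yref = zeros(Mo, No) ;
for k = 1:K
  yref = yref + conv2(x(:,:,k), rot90(w(:,:,k,1), 2), 'valid') ;
end
err = max(max(abs(y(:,:,1) - yref))) ;
```

Note how the sum runs over all K input channels for each output channel, so each filter collapses the channel dimension of its window into a single number.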

We can now visualise the output y of the convolution. In order to do this, use the vl_imarraysc function to display an image for each feature channel in y:

% Visualize the output y
figure(2) ; clf ; vl_imarraysc(y) ; colormap gray ;

Question: Study the feature channels obtained. Most will likely contain a strong response at the edges in the input image x. Recall that w was obtained by drawing random numbers from a Gaussian distribution. Can you explain this phenomenon?

So far filters preserve the resolution of the input feature map. However, it is often useful to downsample the output. This can be obtained by using the stride option in vl_nnconv:

% Try again, downsampling the output
y_ds = vl_nnconv(x, w, [], 'stride', 16) ;
figure(3) ; clf ; vl_imarraysc(y_ds) ; colormap gray ;

As you should have noticed in a question above, applying a filter to an image or feature map interacts with the boundaries, making the output map smaller by an amount proportional to the size of the filters. If this is undesirable, then the input array can be padded with zeros by using the pad option:

% Try padding
y_pad = vl_nnconv(x, w, [], 'pad', 4) ;
figure(4) ; clf ; vl_imarraysc(y_pad) ; colormap gray ;

Task: Convince yourself that the previous code’s output has different boundaries compared to the code that does not use padding. Can you explain the result?
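The combined effect of filter size, padding and stride on the output size can be sketched with a one-line helper. This is plain MATLAB and not part of MatConvNet; the name outSize is hypothetical, the padding is assumed to be the same on both sides, and the 384×512 input size is an assumption about the example image that should be checked with size(x).

```matlab
% Output size along one dimension for an input of size M, a filter of
% size Mf, symmetric zero padding `pad` and stride `stride`
% (hypothetical helper, not a MatConvNet function)
outSize = @(M, Mf, pad, stride) floor((M + 2*pad - Mf) / stride) + 1 ;

inputSize = [384 512] ;                         % assumed image size
plainSize   = outSize(inputSize, 5, 0, 1) ;     % no padding, stride 1
stridedSize = outSize(inputSize, 5, 0, 16) ;    % stride 16
paddedSize  = outSize(inputSize, 5, 4, 1) ;     % padding of 4 pixels
```

Comparing these numbers with size(y), size(y_ds) and size(y_pad) is a quick way to confirm that the options behave as expected.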

In order to consolidate what has been learned so far, we will now design a filter by hand:

w = [0  1 0 ;
     1 -4 1 ;
     0  1 0 ] ;
w = single(repmat(w, [1, 1, 3])) ;
y_lap = vl_nnconv(x, w, []) ;

figure(5) ; clf ; colormap gray ;
subplot(1,2,1) ; 
imagesc(y_lap) ; title('filter output') ;
subplot(1,2,2) ;
imagesc(-abs(y_lap)) ; title('- abs(filter output)') ;

Questions:

What filter have we implemented?

How are the RGB colour channels processed by this filter?

What image structures are detected?

Part 1.2: non-linear gating

As we stated in the introduction, CNNs are obtained by composing several different functions. In addition to the linear filters shown in the previous part, there are several non-linear operators as well.

Question: Some of the functions in a CNN must be non-linear. Why?

The simplest non-linearity is obtained by following a linear filter with a non-linear gating function, applied identically to each component (i.e. point-wise) of a feature map. The simplest such function is the Rectified Linear Unit (ReLU)

$$ y_{ijk} = \max\{0, x_{ijk}\}. $$

This function is implemented by vl_nnrelu; let’s try this out:

w = single(repmat([1 0 -1], [1, 1, 3])) ;
w = cat(4, w, -w) ;
y = vl_nnconv(x, w, []) ;
z = vl_nnrelu(y) ;

figure(6) ; clf ; colormap gray ;
subplot(1,2,1) ; vl_imarraysc(y) ;
subplot(1,2,2) ; vl_imarraysc(z) ;

Tasks:

Run the code above and understand what the filter w is doing.

Explain the final result z .
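As a point of reference, the gating itself takes one line of plain MATLAB. The sketch below applies it to a small made-up array of signed responses and also checks a simple identity: a signed response can be recovered from the rectified responses to a filter and to its negation.

```matlab
% ReLU applied point-wise to made-up signed filter responses
y = [-2 0.5 ; 3 -1] ;
z = max(0, y) ;                  % negative responses are set to zero

% Rectifying y and -y separately keeps all the information in y
recovered = max(0, y) - max(0, -y) ;
```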

Part 1.3: pooling

There are several other important operators in a CNN. One of them is pooling. A pooling operator operates on individual feature channels, coalescing nearby feature values into one by the application of a suitable operator. Common choices include max-pooling (using the max operator) or sum-pooling (using summation). For example, max-pooling is defined as:

$$ y_{ijk} = \max\{ x_{i'j'k} : i \leq i' < i + p,\ j \leq j' < j + p \} $$

Max pooling is implemented by the vl_nnpool function. Try this now:

y = vl_nnpool(x, 15) ;
figure(6) ; clf ; imagesc(y) ;

Question: look at the resulting image. Can you interpret the result?

The function vl_nnpool supports subsampling and padding just like vl_nnconv. However, for max-pooling feature maps are padded with the value −∞ instead of 0. Why?
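The definition of max-pooling can also be written out with explicit loops. The following is a minimal sketch in plain MATLAB for a single channel, with stride 1 and no padding; the 4×4 input and the window size p = 2 are arbitrary choices for illustration.

```matlab
% Naive max-pooling of one channel with a p x p window
x = magic(4) ;                   % small example input
p = 2 ;
[M, N] = size(x) ;
y = zeros(M - p + 1, N - p + 1) ;
for i = 1:size(y, 1)
  for j = 1:size(y, 2)
    window = x(i:i+p-1, j:j+p-1) ;   % p x p neighbourhood
    y(i,j) = max(window(:)) ;        % keep only the largest value
  end
end
```

For this input the result is [16 11 13 ; 11 11 12 ; 14 15 15]: each output value is the largest entry of the corresponding 2×2 window.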

Part 1.4: normalisation

Another important CNN building block is channel-wise normalisation. This operator normalises the vector of feature channels at each spatial location in the input map x . The form of the normalisation operator is actually rather curious:

$$ y_{ijk'} = \frac{x_{ijk'}}{\left( \kappa + \alpha \sum_{k \in G(k')} x_{ijk}^2 \right)^{\beta}} $$

where $G(k) = \left[ k - \lfloor \rho/2 \rfloor,\ k + \lceil \rho/2 \rceil \right] \cap \{1, 2, \dots, K\}$ is a group of $\rho$ consecutive feature channels in the input map.

Task: Understand what this operator is doing. How would you set κ , α and β to achieve simple L2 normalisation?

Now let’s try this out:

rho = 5 ;
kappa = 0 ;
alpha = 1 ;
beta = 0.5 ;
y_nrm = vl_nnnormalize(x, [rho kappa alpha beta]) ;
figure(6) ; clf ; imagesc(y_nrm) ;

Tasks:

Inspect the figure just obtained. Can you interpret it?

Compute the L2 norm of the feature channels in the output map y_nrm. What do you notice?

Explain this result in relation to the particular choice of the parameters ρ , κ , α and β .

Part 2: back-propagation and derivatives

The parameters of a CNN $w = (w_1, \dots, w_L)$ should be learned in such a manner that the overall CNN function $z = f(x; w)$ achieves a desired goal. In some cases, the goal is to model the distribution of the data, which leads to a generative objective. Here, however, we will use $f$ as a regressor and obtain it by minimising a discriminative objective. In simple terms, we are given:

examples of the desired input-output relations $(x_1, z_1), \dots, (x_n, z_n)$, where $x_i$ are input data and $z_i$ the corresponding output values;
and a loss $\ell(z, \hat z)$ that expresses the penalty for predicting $\hat z$ instead of $z$.

We use those to write the empirical loss of the CNN f by averaging over the examples:

$$ L(w) = \frac{1}{n} \sum_{i=1}^{n} \ell(z_i, f(x_i; w)) $$

Note that the composition of the function $f$ with the loss $\ell$ can be thought of as a CNN with one more layer (called a loss layer). Hence, with a slight abuse of notation, in the rest of this part we incorporate the loss in the function $f$ (which therefore is a map $\mathcal{X} \to \mathbb{R}$) and do not talk about it explicitly anymore.

The simplest algorithm to minimise $L$, and in fact one that is used in practice, is gradient descent. The idea is simple: compute the gradient of the objective $L$ at a current solution $w_t$ and then update the latter along the direction of fastest descent of $L$:

$$ w_{t+1} = w_t - \eta_t \frac{\partial f}{\partial w}(w_t) $$

where $\eta_t \in \mathbb{R}_+$ is the learning rate.
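As a minimal illustration of this update rule, the sketch below runs gradient descent in plain MATLAB on a one-dimensional toy objective. The objective, the starting point and the constant learning rate are assumptions chosen for illustration and have nothing to do with a CNN.

```matlab
% Gradient descent on the toy objective f(w) = 0.5 * (w - 3)^2,
% whose gradient is (w - 3) and whose minimiser is w = 3
w = 0 ;                          % starting point
eta = 0.1 ;                      % constant learning rate
for t = 1:200
  grad = w - 3 ;                 % gradient at the current solution
  w = w - eta * grad ;           % step along the negative gradient
end
```

Each step shrinks the distance to the minimiser by the factor 1 - eta, so w approaches 3 geometrically.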

Part 2.1: the theory of back-propagation

The basic computational problem to solve is the calculation of the gradient of the function with respect to the parameter w . Since f is the composition of several functions, the key ingredient is the chain rule:

$$ \frac{\partial f}{\partial w_l} = \frac{\partial}{\partial w_l} f_L(\dots f_2(f_1(x; w_1); w_2) \dots ; w_L) = \frac{\partial \operatorname{vec} f_L}{\partial \operatorname{vec} x_L^\top} \frac{\partial \operatorname{vec} f_{L-1}}{\partial \operatorname{vec} x_{L-1}^\top} \dots \frac{\partial \operatorname{vec} f_{l+1}}{\partial \operatorname{vec} x_{l+1}^\top} \frac{\partial \operatorname{vec} f_l}{\partial w_l^\top} $$

The notation requires some explanation. Recall that each function $f_l$ is a map from an $M \times N \times K$ array to an $M' \times N' \times K'$ array. The operator $\operatorname{vec}$ vectorises such arrays by stacking their elements in a column vector (the stacking order is arbitrary but conventionally column-major). The symbol $\partial \operatorname{vec} f_l / \partial \operatorname{vec} x_l^\top$ then denotes the derivative of a column vector of output variables by a row vector of input variables. Note that $w_l$ is already assumed to be a column vector so it does not require explicit vectorisation.

Questions: Make sure you understand the structure of this formula and answer the following:

$\partial \operatorname{vec} f_l / \partial \operatorname{vec} x_l^\top$ is a matrix. What are its dimensions?

The formula can be rewritten with a slightly different notation by replacing the symbols $f_l$ with the symbols $x_{l+1}$. If you do so, do you notice any formal cancellation?

The formula only includes the derivative symbols. However, these derivatives must be computed at a well defined point. What is this point?

To apply the chain rule we must be able to compute, for each function $f_l$, its derivative with respect to the parameters $w_l$ as well as its input $x_l$. While this could be done naively, a problem is the very high dimensionality of the matrices involved in this calculation, as these are $M'N'K' \times MNK$ arrays. We will now introduce a “trick” that allows this to be reduced to working with $MNK$ numbers only and which will yield the back-propagation algorithm.

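The key observation is that we are not after $\partial \operatorname{vec} f_l / \partial w_l^\top$ but after $\partial f / \partial w_l^\top$:

$$ \frac{\partial f}{\partial w_l^\top} = \frac{\partial g_{l+1}}{\partial \operatorname{vec} x_{l+1}^\top} \frac{\partial \operatorname{vec} f_l}{\partial w_l^\top} $$

where $g_{l+1} = f_L \circ \dots \circ f_{l+1}$ is the “tail” of the CNN.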

Question: Explain why the dimensions of the vectors $\partial g_{l+1} / \partial \operatorname{vec} x_{l+1}$ and $\partial f / \partial w_l$ equal the number of elements in $x_{l+1}$ and $w_l$ respectively. Hence, in particular, the symbol $\partial g_{l+1} / \partial x_{l+1}$ (without vectorisation) denotes an array with the same size as $x_{l+1}$.

Hint: recall that the last layer is the loss.

Hence the algorithm can focus on computing the derivatives of $g_l$ instead of $f_l$, which are far lower-dimensional. To see how this can be done iteratively, decompose $g_l$ as:

$$ x_l \longrightarrow \underset{\displaystyle\underset{\displaystyle w_l}{\uparrow}}{f_l} \longrightarrow x_{l+1} \longrightarrow g_{l+1} \longrightarrow x_L $$

Then the key of the iteration is obtaining the derivatives for layer $l$ given the ones for layer $l+1$:

Input:
- the derivative $\partial g_{l+1} / \partial x_{l+1}$.
Output:
- the derivative $\partial g_l / \partial x_l$
- the derivative $\partial g_l / \partial w_l$

Question: Suppose that $f_l$ is the function $x_{l+1} = A x_l$ where $x_l$ and $x_{l+1}$ are column vectors. Suppose that $B = \partial g_{l+1} / \partial x_{l+1}$ is given. Derive an expression for $C = \partial g_l / \partial x_l$ and an expression for $D = \partial g_l / \partial w_l$.

Part 2.2: using back-propagation in practice

A key feature of MatConvNet and similar neural network packages is the ability to support back-propagation. To see how this works, let’s focus on a single computational block $f$, followed by a function $g$:

$$ x \longrightarrow \underset{\displaystyle\underset{\displaystyle w}{\uparrow}}{f} \longrightarrow y \longrightarrow g \longrightarrow z $$

where $z$ is assumed to be a scalar. Then each computation block (for example vl_nnconv or vl_nnpool) can compute $\partial z / \partial x$ and $\partial z / \partial w$ given as input $x$ and $\partial z / \partial y$. Let’s put this into practice:

% Read an example image
x = im2single(imread('peppers.png')) ;

% Create a bank of linear filters and apply them to the image
w = randn(5,5,3,10,'single') ;
y = vl_nnconv(x, w, []) ;

% Create the derivative dz/dy
dzdy = randn(size(y), 'single') ;

% Back-propagation
[dzdx, dzdw] = vl_nnconv(x, w, [], dzdy) ;

Task: Run the code above and check the dimensions of dzdx and dzdw. Do they match your expectations?

An advantage of this modular view is that new building blocks can be coded and added to the architecture in a simple manner. However, it is easy to make mistakes in the calculation of complex derivatives. Hence, it is a good idea to verify results numerically. Consider the following piece of code:

% Check the derivative numerically
ex = randn(size(x), 'single') ;
eta = 0.0001 ;
xp = x + eta * ex  ;
yp = vl_nnconv(xp, w, []) ;

dzdx_empirical = sum(dzdy(:) .* (yp(:) - y(:)) / eta) ;
dzdx_computed = sum(dzdx(:) .* ex(:)) ;

fprintf(...
  'der: empirical: %f, computed: %f, error: %.2f %%\n', ...
  dzdx_empirical, dzdx_computed, ...
  abs(1 - dzdx_empirical/dzdx_computed)*100) ;

Questions:

What is the meaning of ex in the code above?

What are the derivatives dzdx_empirical and dzdx_computed?

Tasks:

Run the code and convince yourself that the vl_nnconv derivatives are (probably) correct.

Create a new version of this code to test the derivative calculation with respect to w .
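The same projection test can be tried on a block simple enough to differentiate by hand. The sketch below is plain MATLAB and independent of MatConvNet; the element-wise square block, the use of double precision and the step eta = 1e-6 are assumptions chosen for illustration.

```matlab
% Block y = x.^2 (element-wise), followed by a scalar z with given dz/dy
x = randn(4, 3) ;
y = x.^2 ;
dzdy = randn(size(y)) ;          % derivative of z with respect to y
dzdx = 2 * x .* dzdy ;           % analytic back-propagation through the block

% Perturb x along a random direction and compare the two projections
ex = randn(size(x)) ;
eta = 1e-6 ;
yp = (x + eta * ex).^2 ;
dzdx_empirical = sum(dzdy(:) .* (yp(:) - y(:)) / eta) ;
dzdx_computed  = sum(dzdx(:) .* ex(:)) ;
```

The two numbers should agree up to an error that shrinks with eta, until rounding errors start to dominate.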

We are now ready to build our first elementary CNN, composed of just two layers, and to compute its derivatives:

% Parameters of the CNN
w1 = randn(5,5,3,10,'single') ;
rho2 = 10 ;

% Run the CNN forward
x1 = im2single(imread('peppers.png')) ;
x2 = vl_nnconv(x1, w1, []) ;
x3 = vl_nnpool(x2, rho2) ;

% Create the derivative dz/dx3
dzdx3 = randn(size(x3), 'single') ;

% Run the CNN backward
dzdx2 = vl_nnpool(x2, rho2, dzdx3) ;
[dzdx1, dzdw1] = vl_nnconv(x1, w1, [], dzdx2) ;

Question: Note that the last derivative in the CNN is dzdx3. Here, for the sake of the example, this derivative is initialised randomly. In a practical application, what would this derivative represent?

We can now use the same technique as before to check that the derivatives computed through back-propagation are correct.

% Check the derivative numerically
ew1 = randn(size(w1), 'single') ;
eta = 0.0001 ;
w1p = w1 + eta * ew1  ;

x1p = x1 ;
x2p = vl_nnconv(x1p, w1p, []) ;
x3p = vl_nnpool(x2p, rho2) ;

dzdw1_empirical = sum(dzdx3(:) .* (x3p(:) - x3(:)) / eta) ;
dzdw1_computed = sum(dzdw1(:) .* ew1(:)) ;

fprintf(...
  'der: empirical: %f, computed: %f, error: %.2f %%\n', ...
  dzdw1_empirical, dzdw1_computed, ...
  abs(1 - dzdw1_empirical/dzdw1_computed)*100) ;

Part 3: learning a tiny CNN

In this part we will learn a very simple CNN. The CNN is composed of exactly two layers: a convolutional layer and a max-pooling layer:

$$ x_2 = W * x_1 + b, \qquad x_3 = \operatorname{maxpool}_{\rho} x_2. $$

$W$ contains a single $3 \times 3$ square filter, so that $b$ is a scalar, and the input image $x = x_1$ has a single channel.

Task

Open the file tinycnn.m and inspect the code. Convince yourself that the code computes the CNN just described.

Look at the paddings used in the code. If the input image $x_1$ has dimensions $M \times N$, what is the dimension of the output feature map $x_3$?

In the rest of the section we will learn the CNN parameters in order to extract blob-like structures from images, such as the ones in the following image:

Part 3.1: training data and labels

The first step is to load the image data/dots.jpg and to use the supplied extractBlackBlobs function to extract all the black dots in the image.

% Load an image
im = rgb2gray(im2single(imread('data/dots.jpg'))) ;

% Compute the location of black blobs in the image
[pos,neg] = extractBlackBlobs(im) ;

The arrays pos and neg now contain pixel labels and will be used as annotations for the supervised training of the CNN. These annotations can be visualised as follows:

figure(1) ; clf ; 
subplot(1,3,1) ; imagesc(im) ; axis equal ; title('image') ;
subplot(1,3,2) ; imagesc(pos) ; axis equal ; title('positive points (blob centres)') ;
subplot(1,3,3) ; imagesc(neg) ; axis equal ; title('negative points (not a blob)') ;
colormap gray ;

Task: Inspect pos and neg and convince yourself that:

pos contains a single true value at each blob centre;

neg contains a true value for each pixel sufficiently far away from a blob.

Are there pixels for which both pos and neg evaluate to false?

Part 3.2: image preprocessing

Before we attempt to train the CNN, the image is pre-processed: it is first smoothed by applying a Gaussian kernel of standard deviation 3 pixels, and its median value is then subtracted:

% Pre-smooth the image
im = vl_imsmooth(im,3) ;

% Subtract median value
im = im - median(im(:)) ;

We will come back to these preprocessing steps later.

Part 3.3: learning with gradient descent

We will now set up a learning problem to learn $W$ and $b$ to detect black blobs in images. Recall that the CNN computes for each image pixel $(u,v)$ a score $f(x; w, b)_{(u,v)}$. We would like this score to be:

at least as large as 1 for any pixel that is marked as a blob centre (pos or (u,v)∈P ) and
at most zero for any pixel that is marked as being far away from a blob (neg or (u,v)∈N ).

We do so by defining and then optimising the following objective function:

$$ E(w, b) = \frac{\lambda}{2} \| w \|^2 + \frac{1}{|P|} \sum_{(u,v) \in P} \max\{0, 1 - f(x; w, b)_{(u,v)}\} + \frac{1}{|N|} \sum_{(u,v) \in N} \max\{0, f(x; w, b)_{(u,v)}\}. $$

Questions:

What can you say about the score of each pixel if λ=0 and E(w,b)=0 ?

Note that the objective enforces a margin between the scores of the positive and negative pixels. How much is this margin?
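To see how the three terms of the objective are evaluated, here is a minimal sketch in plain MATLAB on made-up data. The score map, the label masks, the filter and the value of lambda are all hypothetical and serve only to show the arithmetic.

```matlab
% Evaluate the objective on made-up scores for a 2 x 3 "image"
scores = [ 1.5 -0.2  0.3 ;
          -1.0  0.4  2.0 ] ;
pos = logical([1 0 0 ; 0 0 1]) ;     % pixels marked as blob centres
neg = logical([0 1 0 ; 1 1 0]) ;     % pixels marked as far from any blob
w = [0 1 0 ; 1 -4 1 ; 0 1 0] ;       % hypothetical 3 x 3 filter
lambda = 0.0001 ;                    % hypothetical regularisation weight

Ereg = 0.5 * lambda * sum(w(:).^2) ;         % regulariser
Epos = mean(max(0, 1 - scores(pos))) ;       % hinge loss on positives
Eneg = mean(max(0, scores(neg))) ;           % hinge loss on negatives
E = Ereg + Epos + Eneg ;
```

Here both positive pixels score above 1, so Epos is zero, while the single negative pixel with a positive score (0.4) is the only one contributing to Eneg. The pixel that is in neither mask does not enter the objective at all.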

We can now train the CNN by minimising the objective function with respect to $w$ and $b$. We do so by using an algorithm called gradient descent with momentum. Given the current solution $(w_t, b_t)$, this is updated to $(w_{t+1}, b_{t+1})$ by following the direction of fastest descent as given by the negative gradient $-\nabla E(w_t, b_t)$ of the objective. However, gradient updates are smoothed by considering a momentum term $(\bar w_t, \bar \mu_t)$, yielding the update equations

$$ \bar w_{t+1} \leftarrow \mu \bar w_t + \eta \frac{\partial E}{\partial w_t}, \qquad w_{t+1} \leftarrow w_t - \bar w_{t+1}, $$

and similarly for the bias term. Here $\mu$ is the momentum rate and $\eta$ the learning rate.

Questions:

Explain why the momentum rate must be smaller than 1. What is the effect of having a momentum rate close to 1?

The learning rate establishes how fast the algorithm will try to minimise the objective function. Can you see any problem with a large learning rate?
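The momentum update can be tried in isolation on a toy problem. The sketch below is plain MATLAB; the quadratic objective, the starting point and the values of mu and eta are assumptions chosen for illustration, and the weight step uses the freshly updated momentum term, which is also an assumption of this sketch.

```matlab
% Gradient descent with momentum on the toy objective
% E(w) = 0.5 * w' * A * w, whose gradient is A * w
A = diag([1 10]) ;               % hypothetical, mildly ill-conditioned problem
w = [1 ; 1] ;                    % starting point
wbar = zeros(2, 1) ;             % momentum term
mu = 0.9 ;                       % momentum rate
eta = 0.01 ;                     % learning rate
for t = 1:500
  grad = A * w ;
  wbar = mu * wbar + eta * grad ;    % smooth the gradient over time
  w = w - wbar ;                     % move by the smoothed step
end
```

After 500 iterations w is very close to the minimiser at the origin. Changing mu and eta in this sketch is a cheap way to build intuition before experimenting with the CNN.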

The parameters of the algorithm are set as follows:

numIterations = 500 ;
rate = 5 ;
momentum = 0.9 ;
shrinkRate = 0.0001 ;
plotPeriod = 10 ;

Tasks:

Inspect the code in the file exercise3.m. Convince yourself that the code is implementing the algorithm described above. Pay particular attention to the forward and backward passes as well as to how the objective function and its derivatives are computed.

Run the algorithm and observe the results. Then answer the following questions:

The learned filter should resemble the discretisation of a well-known differential operator. Which one?

What is the average of the filter values compared to the average of the absolute values?

Run the algorithm again and observe the evolution of the histograms of the score of the positive and negative pixels in relation to the values 0 and 1. Answer the following:

Is the objective function minimised monotonically?

As the histograms evolve, can you identify at least two “phases” in the optimisation?

Once converged, do the scores distribute in the manner that you would expect?

Hint: the plotPeriod option can be changed to plot the diagnostic figure with a higher or lower frequency; this can significantly affect the speed of the algorithm.

Part 3.4: experimenting with the tiny CNN

In this part we will experiment with several variants of the network just learned. First, we study the effect of the image smoothing:

Task: Train again the tiny CNN without smoothing the input image in preprocessing. Answer the following questions:

Is the learned filter very different from the one learned before?

If so, can you figure out what “went wrong”?

Look carefully at the output of the first layer, magnifying with the loupe tool. Is the maximal filter response attained in the middle of each blob?

Hint: The Laplacian of Gaussian operator responds maximally at the centre of a blob only if the scale of the operator matches the blob size. Relate this fact to the combination of pre-smoothing the image and applying the learned 3×3 filter.

Now restore the smoothing but switch off subtracting the median from the input image.

Task: Train again the tiny CNN without subtracting the median value in preprocessing. Answer the following questions:

Does the algorithm converge?

Reduce the learning rate a hundred-fold and increase the maximum number of iterations by an equal amount. Does it get better?

Explain why adding a constant to the input image can have such a dramatic effect on the performance of the optimisation.

Hint: What constraint should the filter w satisfy if the filter output should be zero when (i) the input image is zero or (ii) the input image is a large constant? Do you think that it would be easy for gradient descent to enforce (ii) at all times?

What you have just witnessed is actually a fairly general principle: centring the data usually makes learning problems much better conditioned.

Now we will explore several parameters in the algorithms:

Task: Restore the preprocessing as given in exercise3.m. Try the following:

Try increasing the learning rate (the rate parameter). Can you achieve a better value of the energy in the 500 iterations?

Disable momentum by setting momentum = 0. Now try to beat the result obtained above by choosing the learning rate. Can you succeed?

Finally, consider the regularisation effect of shrinking:

Task: Restore the learning rate and momentum as given in exercise3.m. Then increase the shrinkage factor tenfold and a hundred-fold.

What is the effect on the convergence speed?

What is the effect on the final value of the total objective function and of the average loss part of it?

Part 4: learning a character CNN

In this part we will learn a CNN to recognise images of characters.

Part 4.1: prepare the data

Open up exercise4.m and execute Part 4.1. The code loads a structure imdb containing images of the characters a, b, …, z rendered using approximately 931 fonts downloaded from the Google Fonts Project. Look at the imdb.images substructure:

>> imdb.images
ans = 
       id: [1x24206 double]
     data: [32x32x24206 single]
    label: [1x24206 double]
      set: [1x24206 double]

imdb.images.id is a 24,206-dimensional vector of numeric IDs, one for each of the 24,206 character images in the dataset. imdb.images.data contains a 32×32 image for each character, stored as a slice of a 32×32×24,206-dimensional array. imdb.images.label is a vector of image labels, denoting which one of the 26 possible characters it is. imdb.images.set is equal to 1 for each image that should be used to train the CNN and to 2 for each image that should be used for validation.

Task: look at Figure 1 generated by the code and at the code itself and make sure that you understand what you are looking at.

Part 4.2: initialize a CNN architecture

The function initializeCharacterCNN.m creates a CNN initialised with random weights that will be trained to recognise character images.

Tasks:

By inspecting initializeCharacterCNN.m get a sense of the architecture that will be trained. How many layers are there? How big are the filters?

Use the function vl_simplenn_display to produce a table summarising the architecture.

Note that the penultimate layer has 26 output dimensions, one for each character. Character recognition looks at the maximal output to identify which character is processed by the network.

However, the last network layer is vl_nnsoftmaxloss, which in turn is a combination of the vl_nnsoftmax function and of the classification log-loss vl_nnloss. The softmax operator is given by

$$ y_{ijk'} = \frac{e^{x_{ijk'}}}{\sum_k e^{x_{ijk}}} $$

whereas the log-loss is given by

$$ y_{ij} = -\log x_{ij c_{ij}} $$

where $c_{ij}$ is the index of the ground-truth class at spatial location $(i,j)$.

Remark: While in MatConvNet all operators are convolutional, in this case the network is configured such that the output of the classification layer is a 1×1×26 -dimensional feature map, i.e. there remains only one spatial location.

Tasks:

Understand what the softmax operator does. Hint: to use the log-loss the data must be in the (0, 1] interval.

Understand what is the effect of minimising the log-loss. Which neural response should become larger?

Why do you think MatConvNet provides a third function vl_nnsoftmaxloss combining both functions into a single layer?
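The two formulas can be evaluated by hand for a single spatial location. The sketch below is plain MATLAB and does not call MatConvNet; the random scores and the ground-truth index c are made up for illustration.

```matlab
% Softmax followed by log-loss at one spatial location
K = 26 ;                         % one score per character class
x = randn(K, 1) ;                % made-up class scores
c = 3 ;                          % hypothetical ground-truth class index

y = exp(x) / sum(exp(x)) ;       % softmax: positive values summing to one
loss = -log(y(c)) ;              % log-loss of the ground-truth class
```

Try increasing x(c) and recomputing the loss to see how the two operators interact.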

Part 4.3: train and evaluate the CNN

We are now ready to train the CNN. To this end we use the example SGD implementation in MatConvNet (examples/cnn_train.m). This function requires some options:

trainOpts.batchSize = 100 ;
trainOpts.continue = true ;
trainOpts.useGpu = false ;
trainOpts.learningRate = 0.001 ;
trainOpts.numEpochs = 15 ;
trainOpts.expDir = 'data/chars-experiment' ;

This says that the function will operate on SGD mini-batches of 100 elements, it will run for 15 epochs (passes through the data), it will continue from the last epoch if interrupted, it will not use the GPU, it will use a learning rate of 0.001, and it will save any file in the data/chars-experiment subdirectory.
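To make the notions of epoch and mini-batch concrete, the following sketch shows one plausible way to split a pass through the data into batches. It is plain MATLAB and is not the actual code of cnn_train; the number of training images and the random visiting order are assumptions made for illustration.

```matlab
% Sketch of how one epoch can be split into mini-batches
numTrain = 1050 ;                    % hypothetical number of training images
batchSize = 100 ;
order = randperm(numTrain) ;         % visit the images in random order
numBatches = ceil(numTrain / batchSize) ;
seen = 0 ;
for b = 1:numBatches
  first = (b - 1) * batchSize + 1 ;
  last = min(b * batchSize, numTrain) ;
  batch = order(first:last) ;        % indexes of the images in this batch
  seen = seen + numel(batch) ;       % one SGD step would use these images
end
```

With these numbers an epoch consists of 11 batches, the last of which contains only 50 images.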

Before the training starts, the average image value is subtracted:

% Take the average image out
imageMean = mean(imdb.images.data(:)) ;
imdb.images.data = imdb.images.data - imageMean ;

This is similar to what we have done in Part 3.

The training code is called as follows:

% Call training function in MatConvNet
[net,info] = cnn_train(net, imdb, @getBatch, trainOpts) ;

Here the key, in addition to the trainOpts structure, is the @getBatch function handle. This is how cnn_train obtains a copy of the data to operate on. Examine this function (see the bottom of the exercise4.m file):

function [im, labels] = getBatch(imdb, batch)
im = imdb.images.data(:,:,batch) ;
im = 256 * reshape(im, 32, 32, 1, []) ;
labels = imdb.images.label(1,batch) ;

The function extracts the m images corresponding to the vector of indexes batch. It also reshapes them as a 32×32×1×m array (as this is the format expected by the MatConvNet functions) and multiplies the values by 256 (the resulting values match the network initialisation and learning parameters). Finally, it also returns a vector of labels, one for each image in the batch.

Task: Run the learning code and examine the plots that are produced. As training completes answer the following questions:

How many images per second can you process? (Look at the output in the MATLAB screen)

There are two sets of curves: energy and prediction error. What do you think is the difference? What is the “energy”?

Some curves are labelled “train” and some other “val”. Should they be equal? Which one should be lower than the other?

Both the top-1 and top-5 prediction errors are plotted. What do they mean? What is the difference?
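As a hint for the last question, top-k error can be computed from a matrix of class scores as sketched below. The scores and labels are synthetic, and the variable names are illustrative; this is not the code used inside cnn_train.

```matlab
% Synthetic scores: one row per class, one column per image (4 classes, 3 images)
scores = [0.10 0.70 0.20 ;
          0.60 0.15 0.30 ;
          0.25 0.10 0.40 ;
          0.05 0.05 0.10] ;
labels = [2 1 2] ;                 % ground-truth class of each image

% Rank the classes of each image from highest to lowest score
[~, ranking] = sort(scores, 1, 'descend') ;

k = 2 ;
% An image is correct at top-k if its label is among the first k ranked classes
hitTop1 = ranking(1,:) == labels ;
hitTopK = any(bsxfun(@eq, ranking(1:k,:), labels), 1) ;

top1Error = 1 - mean(hitTop1) ;
topKError = 1 - mean(hitTopK) ;
fprintf('top-1 error %.3f, top-%d error %.3f\n', top1Error, k, topKError) ;
```

By construction the top-k error can never exceed the top-1 error, since every top-1 hit is also a top-k hit.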

Once training is finished, the model is saved back:

% Save the result for later use
net.layers(end) = [] ;
net.imageMean = imageMean ;
save('data/chars-experiment/charscnn.mat', '-struct', 'net') ;

Note that we remember the imageMean for later use. Note also that the softmaxloss layer is removed from the network before saving.
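The '-struct' option of save stores each field of the structure as a separate variable, so that load returns a structure with the same fields. The round trip can be tried on a toy structure; the fields below are hypothetical stand-ins, not a real trained network.

```matlab
% Toy stand-in for a trained network (hypothetical fields, for illustration only)
net.layers = {struct('type', 'conv'), struct('type', 'softmaxloss')} ;
imageMean = 0.5 ;

net.layers(end) = [] ;          % drop the last (loss) layer
net.imageMean = imageMean ;     % keep the mean needed to preprocess new images

modelPath = [tempname() '.mat'] ;
save(modelPath, '-struct', 'net') ;   % each field of net becomes a variable in the file

loaded = load(modelPath) ;      % load returns a struct with the same fields
disp(fieldnames(loaded)) ;
delete(modelPath) ;             % clean up the temporary file
```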

Part 4.4: visualise the learned filters

The next step is to glance at the filters that have been learned:

figure(2) ; clf ; colormap gray ;
vl_imarraysc(squeeze(net.layers{1}.filters),'spacing',2)
axis equal ;
title('filters in the first layer') ;

Task: what can you say about the filters?

Part 4.5: apply the model

We now apply the model to a whole sequence of characters. This is the image data/sentence-lato.png:

% Load the CNN learned before
net = load('data/chars-experiment/charscnn.mat') ;

% Load the sentence
im = im2single(imread('data/sentence-lato.png')) ;
im = 256 * (im - net.imageMean) ;

% Apply the CNN to the larger image
res = vl_simplenn(net, im) ;

Question: The image is much wider than 32 pixels. Why can the CNN learned before for 32×32 patches be applied to it?

Task: examine the size of the CNN output using size(res(end).x). Does this match your expectation?
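One way to form an expectation is to track how each layer shrinks the feature map. The sketch below does this for a hypothetical stack of unpadded convolution and pooling layers and a hypothetical image width of 576 pixels; both are assumptions for illustration and should be replaced with the values of the actual network and image.

```matlab
% Hypothetical layer list as [kernelSize stride]; NOT read from the trained model
layers = [5 1 ;   % convolution 5x5
          2 2 ;   % max pooling 2x2, stride 2
          5 1 ;   % convolution 5x5
          2 2 ;   % max pooling 2x2, stride 2
          4 1 ;   % convolution 4x4
          2 1] ;  % convolution 2x2 (classifier)

inputSize = [32 576] ;   % image height and a hypothetical image width

sz = inputSize ;
for i = 1:size(layers, 1)
  k = layers(i, 1) ; s = layers(i, 2) ;
  sz = floor((sz - k) / s) + 1 ;   % no padding: each layer shrinks the map
end
disp(sz) ;   % spatial size of the final feature map
```

Under these assumptions a 32-pixel-high input collapses to a single row, while the width yields one output per horizontal position, spaced by the product of the layer strides.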

Now use the decodeCharacters() function to visualise the results:

% Visualize the results
figure(3) ; clf ;
decodeCharacters(net, imdb, im, res) ;

Tasks: inspect the output of the decodeCharacters() function and answer the following:

Is the quality of the recognition any good?

Does this match your expectation given the recognition rate in your validation set (as reported by cnn_train during training)?

Part 4.6: training with jitter

A key issue with the previous CNN is that it is not trained to recognise characters in the context of other characters. Furthermore, characters are perfectly centred in the patch. We can relax these assumptions by making the training data “more realistic”. In this part we will train a second network applying data jittering by:

Randomly adding a character to the left and to the right of the one recognised and
Randomly shifting the characters by up to ±5 pixels horizontally and ±2 pixels vertically.

This is implemented by the getBatchWithJitter() function (note that jittering is applied on the fly as it is so fast).
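The shifting step alone can be sketched as follows. This is an illustration on a random patch, not the code of getBatchWithJitter(); in particular it does not add neighbouring characters, and filling the uncovered pixels with zeros is an assumption.

```matlab
% Synthetic 32x32 patch standing in for a character image
im = rand(32, 32, 'single') ;

% Random shift: up to +/-5 pixels horizontally, +/-2 pixels vertically
dx = randi([-5 5]) ;
dy = randi([-2 2]) ;

[h, w] = size(im) ;
jittered = zeros(h, w, class(im)) ;   % assumption: uncovered pixels are filled with 0

% Source rows/columns that remain inside the patch after the shift
ys = max(1, 1 - dy) : min(h, h - dy) ;
xs = max(1, 1 - dx) : min(w, w - dx) ;
jittered(ys + dy, xs + dx) = im(ys, xs) ;
```

Because the shift is just an indexed copy, it is cheap enough to be drawn afresh for every image in every mini-batch.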

Tasks:

Train a second model, using the jittered data.

Look at the training and validation errors. Is their gap as wide as it was before?

Use the new model to recognise the characters in the sentence by repeating the previous part. Does it work better?

Advanced. What else can you change to make the performance even better?

Part 4.7: Training using the GPU

Skip this part if you do not wish to experiment with training on GPU hardware.

A key challenge in deep learning is the sheer amount of computation required to train gigantic models from equally gigantic data collections. State-of-the-art vision models, for example, take weeks to train on specialised hardware such as GPUs, and they are essentially untrainable on CPU (unless you have access to a very large cluster). Thus it is practically important to learn how to use this hardware.

In MatConvNet this is almost trivial as it builds on the easy-to-use GPU support in MATLAB. You can follow this list of steps to try it out:

Clear the models generated and cached in the previous steps. To do this, rename or delete the directories data/chars-experiment and data/chars-jit-experiment.
Make sure that MatConvNet is compiled with GPU support. To do this, use
> setup('useGpu', true) ;
Train the model of exercise4.m again, this time setting the useGpu flag to true.

Task: Follow the steps above and note the speed of training. How many images per second can you process now?

For these small images, the GPU speedup is probably modest (perhaps 2-5 fold). However, for larger models it becomes really dramatic (>10 fold).

Part 5: using pretrained models

A characteristic of deep learning is that it constructs representations of the data. These representations tend to have a universal value, or at least to be applicable to an array of problems that transcends the particular task a model was trained for. This is fortunate as training complex models requires weeks of work on one or more GPUs or hundreds of CPUs; these models can then be frozen and reused for a number of additional applications, with no or minimal additional work.

In this part we will see how MatConvNet can be used to download and run high-performance CNN models for image classification. These models are trained on 1.2M images from the ImageNet dataset to discriminate 1,000 different object categories.

Several pretrained models can be downloaded from the MatConvNet website, including several trained using other CNN implementations such as Caffe. One such model is included in the practical as the file data/imagenet-vgg-verydeep-16.mat. This is one of the best models from the ImageNet ILSVRC Challenge 2014.

Part 5.1: load a pre-trained model

The first step is to load the model itself. This is in the format of the vl_simplenn CNN wrapper, and ships as a MATLAB .mat file:

net = load('data/imagenet-vgg-verydeep-16.mat') ;
vl_simplenn_display(net) ;

Tasks:

Look at the output of vl_simplenn_display and understand the structure of the model. Can you understand why it is called “very deep”?

Look at the size of the file data/imagenet-vgg-verydeep-16.mat on disk. This is just the model.

Part 5.2: use the model to classify an image

We can now use the model to classify an image. We start from peppers.png, a MATLAB stock image:

% obtain and preprocess an image
im = imread('peppers.png') ;
im_ = single(im) ; % note: 255 range
im_ = imresize(im_, net.normalization.imageSize(1:2)) ;
im_ = im_ - net.normalization.averageImage ;

The code normalises the image into a format compatible with the model net. This amounts to: converting the image to single format (but with range 0,…,255 rather than [0, 1] as typical in MATLAB), resizing the image to a fixed size, and then subtracting an average image.

It is now possible to call the CNN:

% run the CNN
res = vl_simplenn(net, im_) ;

As usual, res contains the results of the computation, including all intermediate layers. The last one can be used to perform the classification:

% show the classification result
scores = squeeze(gather(res(end).x)) ;
[bestScore, best] = max(scores) ;

figure(1) ; clf ; imagesc(im) ;
title(sprintf('%s (%d), score %.3f',...
  net.classes.description{best}, best, bestScore)) ;
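The same scores vector can also be ranked to list several candidate classes rather than only the best one. The sketch below uses a made-up vector of six scores and made-up class descriptions in place of res(end).x and net.classes.description, so the names and values are purely illustrative.

```matlab
% Synthetic stand-ins (hypothetical): 6 class scores and their descriptions
scores = [0.02 0.50 0.08 0.25 0.05 0.10]' ;
descriptions = {'cat', 'bell pepper', 'dog', 'cucumber', 'car', 'orange'} ;

% Rank all classes from the highest to the lowest score
[sortedScores, order] = sort(scores, 'descend') ;

topN = 5 ;
for i = 1:topN
  fprintf('%d. %s (%d), score %.3f\n', ...
    i, descriptions{order(i)}, order(i), sortedScores(i)) ;
end
```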

That completes this practical.

Links and further work

The code for this practical is written using the software package MatConvNet. This is a software library written in MATLAB, C++, and CUDA and is freely available as source code and binary.
The ImageNet model is the VGG very deep 16 of Karen Simonyan and Andrew Zisserman.

Acknowledgements

Beta testing by: Karel Lenc and Carlos Arteta.
Bugfixes/typos by: Sun Yushi.

History

Used in the Oxford AIMS CDT, 2014-15.

A two-dimensional lattice is a discrete grid embedded in ℝ², similar for example to a checkerboard.

from: http://www.robots.ox.ac.uk/~vgg/practicals/cnn/
