dh0029314

Neural machine translation with attention - v3

Neural Machine Translation

Welcome to your first programming assignment for this week!

You will build a Neural Machine Translation (NMT) model to translate human readable dates (“25th of June, 2009”) into machine readable dates (“2009-06-25”). You will do this using an attention model, one of the most sophisticated sequence to sequence models.

This notebook was produced together with NVIDIA’s Deep Learning Institute.

Let’s load all the packages you will need for this assignment.

from keras.layers import Bidirectional, Concatenate, Permute, Dot, Input, LSTM, Multiply
from keras.layers import RepeatVector, Dense, Activation, Lambda
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.models import load_model, Model
import keras.backend as K
import numpy as np

from faker import Faker
import random
from tqdm import tqdm
from babel.dates import format_date
from nmt_utils import *
import matplotlib.pyplot as plt
%matplotlib inline

Using TensorFlow backend.

1 - Translating human readable dates into machine readable dates

The model you will build here could be used to translate from one language to another, such as translating from English to Hindi. However, language translation requires massive datasets and usually takes days of training on GPUs. To give you a place to experiment with these models even without using massive datasets, we will instead use a simpler “date translation” task.

The network will input a date written in a variety of possible formats (e.g. “the 29th of August 1958”, “03/30/1968”, “24 JUNE 1987”) and translate them into standardized, machine readable dates (e.g. “1958-08-29”, “1968-03-30”, “1987-06-24”). We will have the network learn to output dates in the common machine-readable format YYYY-MM-DD.
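To make the task concrete, here is a minimal rule-based baseline that handles a few of these formats with the standard library. It is only a sketch: the `FORMATS` list and the `to_machine_date` helper are hypothetical and are not part of the assignment. Every new surface form needs a new hand-written rule, which is exactly the work the attention model is meant to learn from data instead.

```python
from datetime import datetime

# Hypothetical, hand-picked list of formats; the real dataset contains more variants.
FORMATS = ["%d %B %Y", "%m/%d/%Y", "%d %b %Y", "%A %B %d %Y"]

def to_machine_date(text):
    """Try each known format in turn; return YYYY-MM-DD, or None if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

for example in ["9 may 1998", "03/30/1968", "24 JUNE 1987", "the 29th of August 1958"]:
    print(example, "->", to_machine_date(example))
```

The last example returns None because no rule in the list covers "the 29th of ...".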

1.1 - Dataset

We will train the model on a dataset of 10000 human readable dates and their equivalent, standardized, machine readable dates. Let’s run the following cells to load the dataset and print some examples.

m = 10000
dataset, human_vocab, machine_vocab, inv_machine_vocab = load_dataset(m)

100%|██████████| 10000/10000 [00:01<00:00, 8431.37it/s]

dataset[:10]

[('9 may 1998', '1998-05-09'),
 ('10.09.70', '1970-09-10'),
 ('4/28/90', '1990-04-28'),
 ('thursday january 26 1995', '1995-01-26'),
 ('monday march 7 1983', '1983-03-07'),
 ('sunday may 22 1988', '1988-05-22'),
 ('tuesday july 8 2008', '2008-07-08'),
 ('08 sep 1999', '1999-09-08'),
 ('1 jan 1981', '1981-01-01'),
 ('monday may 22 1995', '1995-05-22')]

You’ve loaded:
- dataset: a list of tuples of (human readable date, machine readable date)
- human_vocab: a python dictionary mapping all characters used in the human readable dates to an integer-valued index
- machine_vocab: a python dictionary mapping all characters used in machine readable dates to an integer-valued index. These indices are not necessarily consistent with human_vocab.
- inv_machine_vocab: the inverse dictionary of machine_vocab, mapping from indices back to characters.
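As a rough illustration of what such dictionaries look like, the sketch below builds a character vocabulary and its inverse from a handful of the target dates printed above. `build_vocab` is a hypothetical helper, not the function used inside load_dataset(), and the real index assignment may differ.

```python
def build_vocab(strings):
    """Assign each distinct character an integer index, in sorted order."""
    chars = sorted(set("".join(strings)))
    return {ch: i for i, ch in enumerate(chars)}

# A few machine readable dates taken from the sample printed above.
sample_targets = ["1998-05-09", "1970-09-10", "1990-04-28",
                  "1995-01-26", "1983-03-07", "2008-07-08"]

demo_vocab = build_vocab(sample_targets)
inv_demo_vocab = {i: ch for ch, i in demo_vocab.items()}

# Round trip: characters -> indices -> characters.
encoded = [demo_vocab[ch] for ch in "1998-05-09"]
decoded = "".join(inv_demo_vocab[i] for i in encoded)
print(len(demo_vocab), encoded, decoded)
```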

Let’s preprocess the data and map the raw text data into the index values. We will also use Tx=30 (which we assume is the maximum length of the human readable date; if we get a longer input, we would have to truncate it) and Ty=10 (since “YYYY-MM-DD” is 10 characters long).

Tx = 30
Ty = 10
X, Y, Xoh, Yoh = preprocess_data(dataset, human_vocab, machine_vocab, Tx, Ty)

print("X.shape:", X.shape)
print("Y.shape:", Y.shape)
print("Xoh.shape:", Xoh.shape)
print("Yoh.shape:", Yoh.shape)

X.shape: (10000, 30)
Y.shape: (10000, 10)
Xoh.shape: (10000, 30, 37)
Yoh.shape: (10000, 10, 11)

You now have:
- X: a processed version of the human readable dates in the training set, where each character is replaced by the index it is mapped to in human_vocab. Each date is further padded to Tx values with a special character (<pad>). X.shape = (m, Tx)
- Y: a processed version of the machine readable dates in the training set, where each character is replaced by the index it is mapped to in machine_vocab. You should have Y.shape = (m, Ty).
- Xoh: one-hot version of X, the “1” entry’s index is mapped to the character thanks to human_vocab. Xoh.shape = (m, Tx, len(human_vocab))
- Yoh: one-hot version of Y, the “1” entry’s index is mapped to the character thanks to machine_vocab. Yoh.shape = (m, Ty, len(machine_vocab)). Here, len(machine_vocab) = 11 since there are 11 characters (‘-’ as well as 0-9).
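The padding and one-hot steps can be sketched in a few lines of NumPy. The `encode` and `one_hot` helpers and the toy vocabulary below are assumptions made for illustration; preprocess_data() in nmt_utils may differ in its details.

```python
import numpy as np

def encode(text, length, vocab):
    """Map characters to indices, truncate to `length`, then pad with '<pad>'."""
    ids = [vocab.get(ch, vocab["<unk>"]) for ch in text.lower()[:length]]
    ids += [vocab["<pad>"]] * (length - len(ids))
    return ids

def one_hot(ids, vocab_size):
    """Turn a list of indices into an array of shape (len(ids), vocab_size)."""
    return np.eye(vocab_size)[ids]

# Toy vocabulary, much smaller than the real human_vocab.
toy_vocab = {" ": 0, "1": 1, "9": 2, "a": 3, "m": 4, "y": 5, "<unk>": 6, "<pad>": 7}

ids = encode("9 may 19", 12, toy_vocab)
onehot = one_hot(ids, len(toy_vocab))
print(ids)
print(onehot.shape)
```

Each row of `onehot` contains a single 1, at the column given by the corresponding entry of `ids`.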

Let’s also look at some preprocessed training examples. Feel free to play with index in the cell below to navigate the dataset and see how source/target dates are preprocessed.

index = 0
print("Source date:", dataset[index][0])
print("Target date:", dataset[index][1])
print()
print("Source after preprocessing (indices):", X[index])
print("Target after preprocessing (indices):", Y[index])
print()
print("Source after preprocessing (one-hot):", Xoh[index])
print("Target after preprocessing (one-hot):", Yoh[index])

Source date: 9 may 1998
Target date: 1998-05-09

Source after preprocessing (indices): [12  0 24 13 34  0  4 12 12 11 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36
 36 36 36 36 36]
Target after preprocessing (indices): [ 2 10 10  9  0  1  6  0  1 10]

Source after preprocessing (one-hot): [[ 0.  0.  0. ...,  0.  0.  0.]
 [ 1.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ..., 
 [ 0.  0.  0. ...,  0.  0.  1.]
 [ 0.  0.  0. ...,  0.  0.  1.]
 [ 0.  0.  0. ...,  0.  0.  1.]]
Target after preprocessing (one-hot): [[ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
 [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
 [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]

2 - Neural machine translation with attention

If you had to translate a book’s paragraph from French to English, you would not read the whole paragraph, then close the book and translate. Even during the translation process, you would read/re-read and focus on the parts of the French paragraph corresponding to the parts of the English you are writing down.

The attention mechanism tells a Neural Machine Translation model where it should pay attention at each step.

2.1 - Attention mechanism

In this part, you will implement the attention mechanism presented in the lecture videos. Here is a figure to remind you how the model works. The diagram on the left shows the attention model. The diagram on the right shows what one “Attention” step does to calculate the attention variables α⟨t,t′⟩ , which are used to compute the context variable context⟨t⟩ for each timestep in the output ( t=1,…,Ty ).

Figure 1: Neural machine translation with attention

Here are some properties of the model that you may notice:

- There are two separate LSTMs in this model (see diagram on the left). Because the one at the bottom of the picture is a bidirectional LSTM and comes before the attention mechanism, we will call it the pre-attention Bi-LSTM. The LSTM at the top of the diagram comes after the attention mechanism, so we will call it the post-attention LSTM. The pre-attention Bi-LSTM goes through Tx time steps; the post-attention LSTM goes through Ty time steps.
- The post-attention LSTM passes s⟨t⟩,c⟨t⟩ from one time step to the next. In the lecture videos, we were using only a basic RNN for the post-attention sequence model, so the state was captured by the RNN’s output activations s⟨t⟩. But since we are using an LSTM here, the LSTM has both the output activation s⟨t⟩ and the hidden cell state c⟨t⟩. However, unlike previous text generation examples (such as Dinosaurus in week 1), in this model the post-attention LSTM at time t does not take the specific generated y⟨t−1⟩ as input; it takes only context⟨t⟩ together with the previous states s⟨t−1⟩ and c⟨t−1⟩. We have designed the model this way because (unlike language generation, where adjacent characters are highly correlated) there isn’t as strong a dependency between the previous character and the next character in a YYYY-MM-DD date.
- We use a⟨t⟩=[a→⟨t⟩;a←⟨t⟩] to represent the concatenation of the activations of both the forward and backward directions of the pre-attention Bi-LSTM.
- The diagram on the right uses a RepeatVector node to copy s⟨t−1⟩’s value Tx times, and then Concatenation to concatenate s⟨t−1⟩ and a⟨t′⟩ to compute e⟨t,t′⟩, which is then passed through a softmax to compute α⟨t,t′⟩. We’ll explain how to use RepeatVector and Concatenation in Keras below.

Let’s implement this model. You will start by implementing two functions: one_step_attention() and model().

1) one_step_attention(): At step t , given all the hidden states of the Bi-LSTM ( [a<1>,a<2>,...,a<Tx>] ) and the previous hidden state of the second LSTM ( s<t−1> ), one_step_attention() will compute the attention weights ( [α<t,1>,α<t,2>,...,α<t,Tx>] ) and output the context vector (see Figure 1 (right) for details):

$$context^{\langle t \rangle} = \sum_{t'=1}^{T_x} \alpha^{\langle t,t' \rangle} a^{\langle t' \rangle} \tag{1}$$

Note that in this notebook we denote the context vector by context⟨t⟩. In the lecture videos, the context was denoted c⟨t⟩, but here we call it context⟨t⟩ to avoid confusion with the (post-attention) LSTM’s internal memory cell variable, which is sometimes also denoted c⟨t⟩.
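Equation (1) is a weighted average of the Bi-LSTM activations, with weights that sum to 1. The NumPy sketch below shows this for a single example. The sizes and energies are made up for illustration; they are not the values used in the notebook.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

Tx_demo, n_feat = 4, 3    # toy sizes, not the notebook's Tx and 2*n_a
a_demo = np.arange(Tx_demo * n_feat, dtype=float).reshape(Tx_demo, n_feat)
energies_demo = np.array([0.0, 0.0, 2.0, 0.0])    # made-up energies e<t,t'>

alphas_demo = softmax(energies_demo, axis=0)      # attention weights over t'
context_demo = (alphas_demo[:, None] * a_demo).sum(axis=0)    # equation (1)

print(alphas_demo)
print(context_demo)
```

Because the third energy is the largest, the third row of `a_demo` contributes most to the context.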

2) model(): Implements the entire model. It first runs the input through a Bi-LSTM to get back [a<1>,a<2>,...,a<Tx>] . Then, it calls one_step_attention() Ty times (for loop). At each iteration of this loop, it gives the computed context vector context<t> to the second LSTM, and runs the output of the LSTM through a dense layer with softmax activation to generate a prediction y^<t> .

Exercise: Implement one_step_attention(). The function model() will call the layers in one_step_attention() Ty times using a for-loop, and it is important that all Ty copies have the same weights, i.e. it should not re-initialize the weights every time. In other words, all Ty steps should have shared weights. Here’s how you can implement layers with shareable weights in Keras:
1. Define the layer objects (as global variables, for example).
2. Call these objects when propagating the input.
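The difference between reusing a layer object and creating a new one at every step can be seen without Keras. In the sketch below, `TinyDense` is a hypothetical stand-in for a Dense layer: its weights are created once in the constructor, so calling the same object repeatedly reuses them, while constructing a new object inside the loop draws new weights each time.

```python
import numpy as np

class TinyDense:
    """Hypothetical stand-in for a Dense layer; weights are created once, in __init__."""
    def __init__(self, n_in, n_out, rng):
        self.W = rng.standard_normal((n_in, n_out))
        self.b = np.zeros(n_out)

    def __call__(self, x):
        return np.tanh(x @ self.W + self.b)

rng = np.random.default_rng(0)
x = np.ones((2, 5))

# Shared weights: one object, called at every step.
shared = TinyDense(5, 3, rng)
shared_outputs = [shared(x) for _ in range(4)]

# Not shared: a new object (and new random weights) at every step.
fresh_outputs = [TinyDense(5, 3, rng)(x) for _ in range(4)]

print(np.array_equal(shared_outputs[0], shared_outputs[3]))
print(np.array_equal(fresh_outputs[0], fresh_outputs[3]))
```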

We have defined the layers you need as global variables. Please run the following cells to create them. Please check the Keras documentation to make sure you understand what these layers are: RepeatVector(), Concatenate(), Dense(), Activation(), Dot().

# Defined shared layers as global variables
repeator = RepeatVector(Tx,name='rep')
concatenator = Concatenate(axis=-1,name='con')
densor1 = Dense(10, activation = "tanh",name='den1')
densor2 = Dense(1, activation = "relu",name='den2')
activator = Activation(softmax, name='attention_weights') # We are using a custom softmax(axis = 1) loaded in this notebook
dotor = Dot(axes = 1,name='dot')

Now you can use these layers to implement one_step_attention(). To propagate a Keras tensor object X through one of these layers, use layer(X) (or layer([X,Y]) if it requires multiple inputs), e.g. densor2(X) will propagate X through the Dense(1) layer defined above.
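Two details of these layers are easy to miss: the softmax must normalize over the Tx axis (axis 1), and Dot(axes = 1) contracts that same axis. The NumPy sketch below reproduces the shapes with toy sizes. It imitates the arithmetic only and does not call Keras.

```python
import numpy as np

m_demo, Tx_demo, n_feat = 2, 5, 4    # toy sizes
rng = np.random.default_rng(1)
a_demo = rng.standard_normal((m_demo, Tx_demo, n_feat))   # plays the role of "a"
energies = rng.standard_normal((m_demo, Tx_demo, 1))      # plays the role of densor2's output

# Softmax over axis 1 (the Tx axis): the weights of each example sum to 1.
exp = np.exp(energies - energies.max(axis=1, keepdims=True))
alphas = exp / exp.sum(axis=1, keepdims=True)             # shape (m, Tx, 1)

# A softmax over the last axis (size 1) would make every weight equal to 1.
wrong = np.exp(energies) / np.exp(energies).sum(axis=-1, keepdims=True)

# Contract the Tx axis of alphas with the Tx axis of a.
context = np.einsum("mti,mtf->mif", alphas, a_demo)       # shape (m, 1, n_feat)

print(alphas.shape, context.shape)
```

The resulting context shape, (m, 1, n_feat), has the same layout as the (None, 1, 64) output that the model summary reports for the dot layer.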


# GRADED FUNCTION: one_step_attention

def one_step_attention(a, s_prev):
    """
    Performs one step of attention: Outputs a context vector computed as a dot product of the attention weights
    "alphas" and the hidden states "a" of the Bi-LSTM.

    Arguments:
    a -- hidden state output of the Bi-LSTM, tensor of shape (m, Tx, 2*n_a)
    s_prev -- previous hidden state of the (post-attention) LSTM, tensor of shape (m, n_s)

    Returns:
    context -- context vector, input of the next (post-attention) LSTM cell
    """

    ### START CODE HERE ###
    # Use repeator to repeat s_prev to be of shape (m, Tx, n_s) so that you can concatenate it with all hidden states "a" (≈ 1 line)
    s_prev = repeator(s_prev)
    # Use concatenator to concatenate a and s_prev on the last axis (≈ 1 line)
    concat = concatenator([a,s_prev])
    # Use densor1 to propagate concat through a small fully-connected neural network to compute the "intermediate energies" variable e. (≈1 lines)
    e = densor1(concat)
    # Use densor2 to propagate e through a small fully-connected neural network to compute the "energies" variable energies. (≈1 lines)
    energies = densor2(e)
    # Use "activator" on "energies" to compute the attention weights "alphas" (≈ 1 line)
    alphas = activator(energies)
    # Use dotor together with "alphas" and "a" to compute the context vector to be given to the next (post-attention) LSTM-cell (≈ 1 line)
    context = dotor([alphas,a])
    ### END CODE HERE ###

    return context

You will be able to check the expected output of one_step_attention() after you’ve coded the model() function.

Exercise: Implement model() as explained in Figure 1 and the text above. Again, we have defined global layers that will share weights to be used in model().

n_a = 32
n_s = 64
post_activation_LSTM_cell = LSTM(n_s, return_state = True)
output_layer = Dense(len(machine_vocab), activation=softmax)

Now you can use these layers Ty times in a for loop to generate the outputs, and their parameters will not be reinitialized. You will have to carry out the following steps:

Propagate the input into a Bidirectional LSTM.
Iterate for t=0,…,Ty−1 :
1. Call one_step_attention() on [a<1>,a<2>,...,a<Tx>] and s<t−1> to get the context vector context<t> .
2. Give context<t> to the post-attention LSTM cell. Remember to pass in the previous hidden state s⟨t−1⟩ and cell state c⟨t−1⟩ of this LSTM using initial_state= [previous hidden state, previous cell state]. Get back the new hidden state s<t> and the new cell state c<t> .
3. Apply a softmax layer to s<t> to get the output.
4. Save the output by adding it to the list of outputs.
Create your Keras model instance. It should have three inputs (“inputs”, s<0> and c<0> ) and output the list of “outputs”.
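The loop described above can be sketched in plain NumPy to show how the states and outputs flow. This is an illustration under simplifying assumptions, not the graded solution: the attention scorer is a single linear layer instead of the two Dense layers, all weights are random, and every name and size below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy sizes; feat plays the role of 2*n_a.
m_d, Tx_d, Ty_d, feat, n_s_d, vocab_size_d = 3, 6, 4, 8, 5, 11

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Parameters are created once and reused at every step (shared weights).
W_att = rng.standard_normal((feat + n_s_d, 1)) * 0.1
W_lstm = rng.standard_normal((feat + n_s_d, 4 * n_s_d)) * 0.1
b_lstm = np.zeros(4 * n_s_d)
W_out = rng.standard_normal((n_s_d, vocab_size_d)) * 0.1

def attention_step(a, s_prev):
    """Simplified one-layer scorer followed by a softmax over the Tx axis."""
    s_rep = np.repeat(s_prev[:, None, :], a.shape[1], axis=1)   # (m, Tx, n_s)
    energies = np.concatenate([a, s_rep], axis=-1) @ W_att      # (m, Tx, 1)
    alphas = softmax(energies, axis=1)
    return (alphas * a).sum(axis=1)                             # (m, feat)

def lstm_step(x, s_prev, c_prev):
    """One standard LSTM cell update."""
    z = np.concatenate([x, s_prev], axis=-1) @ W_lstm + b_lstm
    i, f, o, g = np.split(z, 4, axis=-1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    s = sigmoid(o) * np.tanh(c)
    return s, c

a = rng.standard_normal((m_d, Tx_d, feat))   # stands in for the Bi-LSTM outputs
s = np.zeros((m_d, n_s_d))                   # s<0>
c = np.zeros((m_d, n_s_d))                   # c<0>
outputs = []
for t in range(Ty_d):
    context = attention_step(a, s)                # step 1
    s, c = lstm_step(context, s, c)               # step 2
    outputs.append(softmax(s @ W_out, axis=-1))   # steps 3 and 4

print(len(outputs), outputs[0].shape)
```

The result is a list of Ty arrays, one per output position, each holding a probability distribution over the output vocabulary for every example.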

# GRADED FUNCTION: model

def model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size):
    """
    Arguments:
    Tx -- length of the input sequence
    Ty -- length of the output sequence
    n_a -- hidden state size of the Bi-LSTM
    n_s -- hidden state size of the post-attention LSTM
    human_vocab_size -- size of the python dictionary "human_vocab"
    machine_vocab_size -- size of the python dictionary "machine_vocab"

    Returns:
    model -- Keras model instance
    """

    # Define the inputs of your model with a shape (Tx, human_vocab_size)
    # Define s0 and c0, the initial hidden and cell states of the decoder LSTM, each of shape (n_s,)
    X = Input(shape=(Tx, human_vocab_size))
    s0 = Input(shape=(n_s,), name='s0')
    c0 = Input(shape=(n_s,), name='c0')
    s = s0
    c = c0

    # Initialize empty list of outputs
    outputs = []

    ### START CODE HERE ###

    # Step 1: Define your pre-attention Bi-LSTM. Remember to use return_sequences=True. (≈ 1 line)
    a = Bidirectional(LSTM(n_a, return_sequences=True))(X)
    print(a.shape)
    print(Ty)
    # Step 2: Iterate for Ty steps
    for t in range(Ty):

        # Step 2.A: Perform one step of the attention mechanism to get back the context vector at step t (≈ 1 line)
        context =  one_step_attention(a, s)

        # Step 2.B: Apply the post-attention LSTM cell to the "context" vector.
        # Don't forget to pass: initial_state = [hidden state, cell state] (≈ 1 line)
        s, _, c =  post_activation_LSTM_cell(context,initial_state = [s, c] )

        # Step 2.C: Apply Dense layer to the hidden state output of the post-attention LSTM (≈ 1 line)
        out = output_layer(s)

        # Step 2.D: Append "out" to the "outputs" list (≈ 1 line)
        outputs.append(out)

    # Step 3: Create model instance taking three inputs and returning the list of outputs. (≈ 1 line)
    model = Model(inputs=[X,s0,c0],outputs=outputs)

    ### END CODE HERE ###

    return model

Run the following cell to create your model.

model = model(Tx, Ty, n_a, n_s, len(human_vocab), len(machine_vocab))

(?, ?, 64)
10

Let’s get a summary of the model to check if it matches the expected output.

model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_20 (InputLayer)            (None, 30, 37)        0                                            
____________________________________________________________________________________________________
s0 (InputLayer)                  (None, 64)            0                                            
____________________________________________________________________________________________________
bidirectional_19 (Bidirectional) (None, 30, 64)        17920       input_20[0][0]                   
____________________________________________________________________________________________________
rep (RepeatVector)               (None, 30, 64)        0           s0[0][0]                         
                                                                   lstm_24[10][0]                   
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[18][0]                   
____________________________________________________________________________________________________
con (Concatenate)                (None, 30, 128)       0           bidirectional_19[0][0]           
                                                                   rep[10][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[11][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[12][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[13][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[14][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[15][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[16][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[17][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[18][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[19][0]                       
____________________________________________________________________________________________________
den1 (Dense)                     (None, 30, 10)        1290        con[10][0]                       
                                                                   con[11][0]                       
                                                                   con[12][0]                       
                                                                   con[13][0]                       
                                                                   con[14][0]                       
                                                                   con[15][0]                       
                                                                   con[16][0]                       
                                                                   con[17][0]                       
                                                                   con[18][0]                       
                                                                   con[19][0]                       
____________________________________________________________________________________________________
den2 (Dense)                     (None, 30, 1)         11          den1[10][0]                      
                                                                   den1[11][0]                      
                                                                   den1[12][0]                      
                                                                   den1[13][0]                      
                                                                   den1[14][0]                      
                                                                   den1[15][0]                      
                                                                   den1[16][0]                      
                                                                   den1[17][0]                      
                                                                   den1[18][0]                      
                                                                   den1[19][0]                      
____________________________________________________________________________________________________
attention_weights (Activation)   (None, 30, 1)         0           den2[10][0]                      
                                                                   den2[11][0]                      
                                                                   den2[12][0]                      
                                                                   den2[13][0]                      
                                                                   den2[14][0]                      
                                                                   den2[15][0]                      
                                                                   den2[16][0]                      
                                                                   den2[17][0]                      
                                                                   den2[18][0]                      
                                                                   den2[19][0]                      
____________________________________________________________________________________________________
dot (Dot)                        (None, 1, 64)         0           attention_weights[10][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[11][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[12][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[13][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[14][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[15][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[16][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[17][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[18][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[19][0]         
                                                                   bidirectional_19[0][0]           
____________________________________________________________________________________________________
c0 (InputLayer)                  (None, 64)            0                                            
____________________________________________________________________________________________________
lstm_24 (LSTM)                   [(None, 64), (None, 6 33024       dot[10][0]                       
                                                                   s0[0][0]                         
                                                                   c0[0][0]                         
                                                                   dot[11][0]                       
                                                                   lstm_24[10][0]                   
                                                                   lstm_24[10][2]                   
                                                                   dot[12][0]                       
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[11][2]                   
                                                                   dot[13][0]                       
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[12][2]                   
                                                                   dot[14][0]                       
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[13][2]                   
                                                                   dot[15][0]                       
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[14][2]                   
                                                                   dot[16][0]                       
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[15][2]                   
                                                                   dot[17][0]                       
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[16][2]                   
                                                                   dot[18][0]                       
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[17][2]                   
                                                                   dot[19][0]                       
                                                                   lstm_24[18][0]                   
                                                                   lstm_24[18][2]                   
____________________________________________________________________________________________________
dense_11 (Dense)                 (None, 11)            715         lstm_24[10][0]                   
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[18][0]                   
                                                                   lstm_24[19][0]                   
====================================================================================================
Total params: 52,960
Trainable params: 52,960
Non-trainable params: 0
____________________________________________________________________________________________________
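The parameter counts in the summary printed above can be checked by hand. The sketch below uses the standard formulas for LSTM and Dense layers with the sizes used in this run (n_a = 32, n_s = 64, 37 input characters, 11 output characters). The helper names are ours.

```python
def lstm_params(n_in, n_units):
    """4 gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (n_units * (n_in + n_units) + n_units)

def dense_params(n_in, n_out):
    """Weight matrix plus bias vector."""
    return n_in * n_out + n_out

n_a, n_s = 32, 64
human_vocab_size, machine_vocab_size = 37, 11

counts = {
    "bidirectional": 2 * lstm_params(human_vocab_size, n_a),   # forward + backward
    "den1": dense_params(2 * n_a + n_s, 10),
    "den2": dense_params(10, 1),
    "post-attention LSTM": lstm_params(2 * n_a, n_s),
    "output dense": dense_params(n_s, machine_vocab_size),
}
for name, count in counts.items():
    print(name, count)
print("total", sum(counts.values()))
```

The shared layers are counted once, even though each is called Ty times.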

Expected Output:

Here is the summary you should see

Total params:	185,484
Trainable params:	185,484
Non-trainable params:	0
bidirectional_1’s output shape	(None, 30, 128)
repeat_vector_1’s output shape	(None, 30, 128)
concatenate_1’s output shape	(None, 30, 256)
attention_weights’s output shape	(None, 30, 1)
dot_1’s output shape	(None, 1, 128)
dense_2’s output shape	(None, 11)

As usual, after creating your model in Keras, you need to compile it and define what loss, optimizer and metrics you want to use. Compile your model using categorical_crossentropy loss, a custom Adam optimizer (learning rate = 0.005, β1=0.9 , β2=0.999 , decay = 0.01) and ['accuracy'] metrics:

### START CODE HERE ### (≈2 lines)
opt = Adam(lr=0.005, beta_1=0.9, beta_2=0.999,decay=0.01)
model.compile(loss='categorical_crossentropy', optimizer=opt,metrics=['accuracy'])
### END CODE HERE ###
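To get a feel for what decay = 0.01 does, the sketch below assumes the time-based rule lr_t = lr / (1 + decay * t), where t counts parameter updates. As we understand it, this is how the legacy decay argument of Keras optimizers behaves, but please check the documentation for your Keras version. `decayed_lr` is a hypothetical helper.

```python
def decayed_lr(lr0, decay, iteration):
    """Time-based decay (assumed rule): lr_t = lr0 / (1 + decay * t)."""
    return lr0 / (1.0 + decay * iteration)

lr0, decay = 0.005, 0.01
for iteration in (0, 100, 1000):
    print(iteration, decayed_lr(lr0, decay, iteration))
```

With m = 10000 and batch_size = 100, one epoch is 100 updates, so under this assumption the learning rate has halved by the end of the first epoch.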

The last step is to define all your inputs and outputs to fit the model:
- You already have Xoh of shape (m = 10000, Tx = 30, len(human_vocab) = 37) containing the one-hot encoded training examples.
- You need to create s0 and c0 to initialize your post_activation_LSTM_cell with 0s.
- Given the model() you coded, you need the “outputs” to be a list of Ty = 10 elements, each of shape (m, len(machine_vocab)) = (m, 11), so that outputs[0][i], ..., outputs[Ty−1][i] are the true labels (one-hot encoded characters) of the ith training example (Xoh[i]). More generally, outputs[j][i] is the true label of the jth character in the ith training example.

s0 = np.zeros((m, n_s))
c0 = np.zeros((m, n_s))
outputs = list(Yoh.swapaxes(0,1))
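
To see what the swapaxes call does to the shapes, here is a small sketch on a stand-in array. Yoh_demo and outputs_demo are hypothetical names, and m is set to 4 only to keep the example small; the layout (m, Ty, 11) is assumed to match Yoh.

```python
import numpy as np

m, Ty, n_classes = 4, 10, 11            # small m, for illustration only
rng = np.random.default_rng(0)
labels = rng.integers(0, n_classes, size=(m, Ty))
Yoh_demo = np.eye(n_classes)[labels]    # one-hot labels, shape (m, Ty, n_classes)

# Move the time axis to the front, then split it into a Python list.
outputs_demo = list(Yoh_demo.swapaxes(0, 1))

print(len(outputs_demo), outputs_demo[0].shape)  # 10 (4, 11)
```

Each list element holds the labels of one output position for the whole batch, which is the layout a Keras model with Ty separate outputs expects.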

Let’s now fit the model and run it for one epoch.

model.fit([Xoh, s0, c0], outputs, epochs=1, batch_size=100)

Epoch 1/1
10000/10000 [==============================] - 38s - loss: 17.0479 - dense_11_loss_1: 1.2586 - dense_11_loss_2: 1.0343 - dense_11_loss_3: 1.7487 - dense_11_loss_4: 2.7333 - dense_11_loss_5: 0.8554 - dense_11_loss_6: 1.3875 - dense_11_loss_7: 2.7841 - dense_11_loss_8: 1.0104 - dense_11_loss_9: 1.6879 - dense_11_loss_10: 2.5477 - dense_11_acc_1: 0.4415 - dense_11_acc_2: 0.6585 - dense_11_acc_3: 0.3054 - dense_11_acc_4: 0.0765 - dense_11_acc_5: 0.9349 - dense_11_acc_6: 0.2551 - dense_11_acc_7: 0.0418 - dense_11_acc_8: 0.9262 - dense_11_acc_9: 0.3003 - dense_11_acc_10: 0.1159
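
The total loss in the log line above is the sum of the ten per-position losses, which is how Keras reports the loss of a multi-output model when all loss weights are left at their default of 1. A quick check with the numbers copied from the log:

```python
# Per-position losses copied from the epoch log above (dense_11_loss_1 ... _10).
per_output_losses = [1.2586, 1.0343, 1.7487, 2.7333, 0.8554,
                     1.3875, 2.7841, 1.0104, 1.6879, 2.5477]

total_loss = sum(per_output_losses)
print(round(total_loss, 4))  # 17.0479
```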

While training you can see the loss as well as the accuracy on each of the 10 positions of the output. The table below gives you an example of what the accuracies could be if the batch had 2 examples:

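Thus, dense_2_acc_8: 0.89 means that you are predicting the 8th character of the output (index 7 when counting from 0) correctly 89% of the time in the current batch of data. The number in the layer name (dense_2 here, dense_11 in the log above) depends on how many layers have been created in your session.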
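
A minimal sketch of how such per-position accuracies can be computed by hand. The labels and predictions below are made-up toy values for a batch of 2 examples, not results from the model.

```python
import numpy as np

# Made-up integer labels for a batch of 2 examples with Ty = 10 positions.
y_true = np.array([[2, 1, 1, 10, 0, 1, 6, 0, 1, 10],
                   [2, 10, 8, 9, 0, 1, 9, 0, 3, 1]])

# Start from perfect predictions, then introduce a few errors.
y_pred = y_true.copy()
y_pred[0, 7] = 5   # one of two examples wrong at position index 7
y_pred[0, 2] = 4   # both examples wrong at position index 2
y_pred[1, 2] = 4

# Fraction of the batch predicted correctly at each output position.
per_position_acc = (y_true == y_pred).mean(axis=0)

for k, acc in enumerate(per_position_acc):
    print(f"acc_{k + 1}: {acc:.2f}")
```

With only 2 examples in the batch, each accuracy can only be 0.0, 0.5 or 1.0.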

We have run this model for longer, and saved the weights. Run the next cell to load our weights. (By training a model for several minutes, you should be able to obtain a model of similar accuracy, but loading our model will save you time.)

model.load_weights('models/model.h5')

You can now see the results on new examples.

EXAMPLES = ['3 May 1979', '5 April 09', '21th of August 2016', 'Tue 10 Jul 2007', 'Saturday May 9 2018', 'March 3 2001', 'March 3rd 2001', '1 March 2001']
for example in EXAMPLES:

    source = string_to_int(example, Tx, human_vocab)
    source = np.array(list(map(lambda x: to_categorical(x, num_classes=len(human_vocab)), source))).swapaxes(0,1)
    prediction = model.predict([source, s0, c0])
    prediction = np.argmax(prediction, axis = -1)
    output = [inv_machine_vocab[int(i)] for i in prediction]

    print("source:", example)
    print("output:", ''.join(output))

source: 3 May 1979
output: 1979-05-03
source: 5 April 09
output: 2009-05-05
source: 21th of August 2016
output: 2016-08-21
source: Tue 10 Jul 2007
output: 2007-07-10
source: Saturday May 9 2018
output: 2018-05-09
source: March 3 2001
output: 2001-03-03
source: March 3rd 2001
output: 2001-03-03
source: 1 March 2001
output: 2001-03-01
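
The decoding step in the loop above (argmax over the class axis, then a vocabulary lookup) can be tried without a trained model. In this sketch the prediction is faked with one-hot vectors, and the index-to-character mapping inv_vocab_demo is an assumption made for illustration; it is not taken from nmt_utils.

```python
import numpy as np

# Assumed vocabulary: '-' plus the ten digits (hypothetical ordering).
chars = "-0123456789"
inv_vocab_demo = dict(enumerate(chars))
vocab_demo = {c: i for i, c in inv_vocab_demo.items()}

# Fake model output: a list of Ty arrays, each of shape (1, 11).
target = "1979-05-03"
fake_prediction = [np.eye(len(chars))[[vocab_demo[c]]] for c in target]

# Same decoding as in the loop above: argmax, then look up each index.
ids = np.argmax(fake_prediction, axis=-1)          # shape (Ty, 1)
decoded = "".join(inv_vocab_demo[int(i)] for i in ids.ravel())

print(ids.shape, decoded)
```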

You can also change these examples to test with your own examples. The next part will give you a better sense of what the attention mechanism is doing, i.e., which part of the input the network is paying attention to when generating a particular output character.

3 - Visualizing Attention (Optional / Ungraded)

Since the problem has a fixed output length of 10, it is also possible to carry out this task using 10 different softmax units to generate the 10 characters of the output. But one advantage of the attention model is that each part of the output (say the month) knows it needs to depend only on a small part of the input (the characters in the input giving the month). We can visualize what part of the output is looking at what part of the input.

Consider the task of translating “Saturday 9 May 2018” to “2018-05-09”. If we visualize the computed α⟨t,t′⟩ we get this:

Figure 8: Full Attention Map

Notice how the output ignores the “Saturday” portion of the input. None of the output timesteps are paying much attention to that portion of the input. We also see that 9 has been translated as 09 and May has been correctly translated into 05, with the output paying attention to the parts of the input it needs to make the translation. The year mostly requires it to pay attention to the input’s “18” in order to generate “2018”.

3.1 - Getting the activations from the network

Let’s now visualize the attention values in your network. We’ll propagate an example through the network, then visualize the values of α⟨t,t′⟩.

To figure out where the attention values are located, let’s start by printing a summary of the model.

model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_20 (InputLayer)            (None, 30, 37)        0                                            
____________________________________________________________________________________________________
s0 (InputLayer)                  (None, 64)            0                                            
____________________________________________________________________________________________________
bidirectional_19 (Bidirectional) (None, 30, 64)        17920       input_20[0][0]                   
____________________________________________________________________________________________________
rep (RepeatVector)               (None, 30, 64)        0           s0[0][0]                         
                                                                   lstm_24[10][0]                   
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[18][0]                   
____________________________________________________________________________________________________
con (Concatenate)                (None, 30, 128)       0           bidirectional_19[0][0]           
                                                                   rep[10][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[11][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[12][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[13][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[14][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[15][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[16][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[17][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[18][0]                       
                                                                   bidirectional_19[0][0]           
                                                                   rep[19][0]                       
____________________________________________________________________________________________________
den1 (Dense)                     (None, 30, 10)        1290        con[10][0]                       
                                                                   con[11][0]                       
                                                                   con[12][0]                       
                                                                   con[13][0]                       
                                                                   con[14][0]                       
                                                                   con[15][0]                       
                                                                   con[16][0]                       
                                                                   con[17][0]                       
                                                                   con[18][0]                       
                                                                   con[19][0]                       
____________________________________________________________________________________________________
den2 (Dense)                     (None, 30, 1)         11          den1[10][0]                      
                                                                   den1[11][0]                      
                                                                   den1[12][0]                      
                                                                   den1[13][0]                      
                                                                   den1[14][0]                      
                                                                   den1[15][0]                      
                                                                   den1[16][0]                      
                                                                   den1[17][0]                      
                                                                   den1[18][0]                      
                                                                   den1[19][0]                      
____________________________________________________________________________________________________
attention_weights (Activation)   (None, 30, 1)         0           den2[10][0]                      
                                                                   den2[11][0]                      
                                                                   den2[12][0]                      
                                                                   den2[13][0]                      
                                                                   den2[14][0]                      
                                                                   den2[15][0]                      
                                                                   den2[16][0]                      
                                                                   den2[17][0]                      
                                                                   den2[18][0]                      
                                                                   den2[19][0]                      
____________________________________________________________________________________________________
dot (Dot)                        (None, 1, 64)         0           attention_weights[10][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[11][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[12][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[13][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[14][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[15][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[16][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[17][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[18][0]         
                                                                   bidirectional_19[0][0]           
                                                                   attention_weights[19][0]         
                                                                   bidirectional_19[0][0]           
____________________________________________________________________________________________________
c0 (InputLayer)                  (None, 64)            0                                            
____________________________________________________________________________________________________
lstm_24 (LSTM)                   [(None, 64), (None, 6 33024       dot[10][0]                       
                                                                   s0[0][0]                         
                                                                   c0[0][0]                         
                                                                   dot[11][0]                       
                                                                   lstm_24[10][0]                   
                                                                   lstm_24[10][2]                   
                                                                   dot[12][0]                       
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[11][2]                   
                                                                   dot[13][0]                       
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[12][2]                   
                                                                   dot[14][0]                       
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[13][2]                   
                                                                   dot[15][0]                       
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[14][2]                   
                                                                   dot[16][0]                       
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[15][2]                   
                                                                   dot[17][0]                       
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[16][2]                   
                                                                   dot[18][0]                       
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[17][2]                   
                                                                   dot[19][0]                       
                                                                   lstm_24[18][0]                   
                                                                   lstm_24[18][2]                   
____________________________________________________________________________________________________
dense_11 (Dense)                 (None, 11)            715         lstm_24[10][0]                   
                                                                   lstm_24[11][0]                   
                                                                   lstm_24[12][0]                   
                                                                   lstm_24[13][0]                   
                                                                   lstm_24[14][0]                   
                                                                   lstm_24[15][0]                   
                                                                   lstm_24[16][0]                   
                                                                   lstm_24[17][0]                   
                                                                   lstm_24[18][0]                   
                                                                   lstm_24[19][0]                   
====================================================================================================
Total params: 52,960
Trainable params: 52,960
Non-trainable params: 0
____________________________________________________________________________________________________
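
The parameter counts in this summary can be reproduced by hand. The sketch below uses the standard formulas for LSTM and Dense layers; the layer sizes are read off the output shapes above (for example, 32 units per direction is inferred from the Bidirectional output width of 64), so treat them as inferences rather than values stated in the notebook.

```python
def lstm_params(input_dim, units):
    """4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * units * (input_dim + units + 1)


def dense_params(input_dim, units):
    """One weight per input plus one bias, for each unit."""
    return units * (input_dim + 1)


counts = {
    "bidirectional_19": 2 * lstm_params(37, 32),  # forward + backward LSTM
    "den1": dense_params(128, 10),                # input is the concatenation
    "den2": dense_params(10, 1),
    "lstm_24": lstm_params(64, 64),               # input is the context vector
    "dense_11": dense_params(64, 11),
}

for name, n in counts.items():
    print(name, n)
print("total", sum(counts.values()))
```

The per-layer numbers match the Param # column, and their sum matches the reported total of 52,960.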

Navigate through the output of model.summary() above. You can see that the layer named attention_weights outputs the alphas of shape (m, 30, 1) before the dot layer computes the context vector for every time step t = 0, …, Ty − 1. Let’s get the activations from this layer.
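
To make the shapes concrete, here is a NumPy sketch of one attention step for a single example: a softmax over the Tx input positions followed by a weighted sum of the encoder activations. The random values stand in for real activations, and softmax_over_time is a hypothetical helper, not a function from the notebook.

```python
import numpy as np

Tx, n_features = 30, 64   # matches the Bidirectional output (None, 30, 64)
rng = np.random.default_rng(0)
a = rng.standard_normal((Tx, n_features))   # stand-in encoder activations
energies = rng.standard_normal((Tx, 1))     # stand-in scores before the softmax


def softmax_over_time(e):
    """Softmax along axis 0, so the weights over input positions sum to 1."""
    e = e - e.max(axis=0, keepdims=True)    # subtract the max for stability
    w = np.exp(e)
    return w / w.sum(axis=0, keepdims=True)


alphas = softmax_over_time(energies)        # shape (Tx, 1)
context = alphas.T @ a                      # shape (1, n_features)

print(alphas.shape, context.shape)          # (30, 1) (1, 64)
```

The two printed shapes correspond to the per-example shapes of the attention_weights and dot layers in the summary.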

The function plot_attention_map() pulls out the attention values from your model and plots them.

attention_map = plot_attention_map(model, human_vocab, inv_machine_vocab, "Tuesday 09 Oct 1993", num = 7, n_s = 64)

On the generated plot you can observe the values of the attention weights for each character of the predicted output. Examine this plot and check that where the network is paying attention makes sense to you.

In the date translation application, you will observe that most of the time attention helps predict the year, and does not have much impact on predicting the day or month.

Congratulations!

You have come to the end of this assignment.

Here’s what you should remember from this notebook:

- Machine translation models can be used to map from one sequence to another. They are useful not just for translating human languages (like French->English) but also for tasks like date format translation.
- An attention mechanism allows a network to focus on the most relevant parts of the input when producing a specific part of the output.
- A network using an attention mechanism can translate from inputs of length Tx to outputs of length Ty, where Tx and Ty can be different.
- You can visualize attention weights α⟨t,t′⟩ to see what the network is paying attention to while generating each output.

Congratulations on finishing this assignment! You are now able to implement an attention model and use it to learn complex mappings from one sequence to another.

repeat_vector_1’s output shape	(None, 30, 128)
concatenate_1’s output shape	(None, 30, 256)
attention_weights’s output shape	(None, 30, 1)
dot_1’s output shape	(None, 1, 128)
dense_2’s output shape	(None, 11)
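
To see where the shapes in this table come from, here is a minimal NumPy sketch of one attention step. It is not the notebook's Keras code. The sizes are assumptions inferred from the table (a Bi-LSTM with 64 units per direction, a decoder state of size 128, 11 output classes), the batch size `m` is arbitrary, and single random linear maps stand in for the learned scoring and output layers.

```python
import numpy as np

# Assumed sizes, inferred from the table above (not taken from the notebook code).
m, Tx = 4, 30        # hypothetical batch size; Tx matches the 30 time steps
n_a, n_s = 64, 128   # Bi-LSTM units per direction; decoder state size
n_out = 11           # number of output classes (dense_2)

rng = np.random.default_rng(0)

a = rng.standard_normal((m, Tx, 2 * n_a))   # Bi-LSTM states, like bidirectional_1
s_prev = rng.standard_normal((m, n_s))      # previous decoder hidden state

# RepeatVector: copy s_prev across all Tx time steps.
s_rep = np.repeat(s_prev[:, None, :], Tx, axis=1)

# Concatenate along the feature axis.
concat = np.concatenate([a, s_rep], axis=-1)

# Stand-in scoring layer: one random linear map to a scalar energy per time step.
W_score = rng.standard_normal((2 * n_a + n_s, 1)) * 0.01
energies = concat @ W_score

# Softmax over the time axis (axis=1) gives the attention weights.
e = np.exp(energies - energies.max(axis=1, keepdims=True))
alphas = e / e.sum(axis=1, keepdims=True)

# Dot over the time axis: weighted sum of the Bi-LSTM states (the context vector).
context = np.einsum("mtk,mtd->mkd", alphas, a)

# Stand-in output layer: project a decoder state of size n_s to n_out classes.
W_out = rng.standard_normal((n_s, n_out)) * 0.01
logits = s_prev @ W_out

for name, arr in [("a", a), ("s_rep", s_rep), ("concat", concat),
                  ("alphas", alphas), ("context", context), ("logits", logits)]:
    print(name, arr.shape)
```

With `None` replaced by the batch size, the printed shapes line up with the rows above: 128 features after the bidirectional layer and the repeat, 256 after concatenation, one weight per time step, a single 128-dimensional context vector, and 11 outputs.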