[cs224n] Lecture 3 – Neural Networks



1. Course plan: coming up


Homeworks


A note on your experience!


Lecture Plan



2. Classification setup and notation

Classification intuition


Details of the softmax classifier

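The softmax classifier turns a vector of class scores into a probability distribution: p(y = j | x) = exp(W_j · x) / Σ_c exp(W_c · x). A minimal sketch (the score values are made up for illustration), using the standard max-shift for numerical stability:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: shift by the max before exponentiating."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical scores W @ x for a 3-class problem
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)   # a valid distribution; largest score -> largest probability
```

Shifting by the max changes nothing mathematically (it cancels in the ratio) but prevents overflow when scores are large.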

Training with softmax and cross-entropy loss


Background: What is “cross entropy” loss/error?

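Cross entropy between a true distribution p and a predicted distribution q is H(p, q) = −Σ_c p_c log q_c. Because the true distribution in classification is one-hot, this reduces to −log q at the correct class. A small sketch (the distributions are hypothetical):

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_c p_c * log q_c; with a one-hot p this is -log q_true."""
    return -np.sum(p * np.log(q))

p = np.array([0.0, 1.0, 0.0])   # one-hot true distribution (correct class = 1)
q = np.array([0.2, 0.7, 0.1])   # model's predicted distribution
loss = cross_entropy(p, q)      # reduces to -log 0.7
```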

Classification over a full dataset


Traditional ML optimization



3. Neural Network Classifiers


Neural Nets for the Win!


Classification difference with word vectors


Neural computation


An artificial neuron


A neuron can be a binary logistic regression unit


A neural network = running several logistic regressions at the same time


Matrix notation for a layer

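Running several logistic regressions at once is just one matrix expression: z = W x + b, a = f(z), with f applied elementwise. A sketch with hypothetical small shapes (3 inputs, 2 hidden units):

```python
import numpy as np

def layer(x, W, b, f=np.tanh):
    """One layer in matrix notation: z = W x + b, then a = f(z) elementwise."""
    return f(W @ x + b)

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))   # 2 hidden units, each a row of weights over 3 inputs
b = np.zeros(2)
x = rng.standard_normal(3)
a = layer(x, W, b)                # shape (2,): one activation per hidden unit
```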

Non-linearities (aka “f”): Why they’re needed

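The reason non-linearities are needed: a composition of linear maps is itself linear, so without f a deep stack collapses to a single matrix. A quick numerical demonstration (random hypothetical weights):

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Two stacked linear layers...
two_linear = W2 @ (W1 @ x)
# ...equal one linear layer with the product matrix W2 W1
one_linear = (W2 @ W1) @ x
# So without a non-linearity between layers, extra depth adds no expressive power.
```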


4. Named Entity Recognition (NER)


Named Entity Recognition on word sequences


Why might NER be hard?



5. Binary word window classification


Window classification


Window classification: Softmax


Simplest window classifier: Softmax

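The simplest window classifier concatenates the word vectors in a window around the center word and feeds that single long vector through a softmax. A sketch with hypothetical sizes (5-word window, 4-dimensional vectors, 3 classes):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

d, window, n_classes = 4, 5, 3
rng = np.random.default_rng(2)
word_vecs = rng.standard_normal((window, d))   # the 5 word vectors in the window

x = word_vecs.reshape(-1)                       # concatenate: shape (5 * 4,) = (20,)
W = rng.standard_normal((n_classes, window * d))
probs = softmax(W @ x)                          # class distribution for the center word
```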

Binary classification with unnormalized scores


Binary classification for NER Location


Neural Network Feed-forward Computation

Main intuition for extra layer


The max-margin loss

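The max-margin objective compares the score s of a true window against the score s_c of a corrupt window: J = max(0, 1 − s + s_c). The loss (and hence the gradient) is zero once the true window outscores the corrupt one by at least the margin. A minimal sketch:

```python
def max_margin_loss(s_true, s_corrupt, margin=1.0):
    """J = max(0, margin - s_true + s_corrupt): zero once the true window
    beats the corrupt window's score by at least the margin."""
    return max(0.0, margin - s_true + s_corrupt)

# Separated by more than the margin -> zero loss, no gradient flows
safe = max_margin_loss(3.0, 1.0)      # 0.0
# Inside the margin -> positive loss pushes the scores apart
inside = max_margin_loss(1.0, 0.5)    # 0.5
```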

Simple net for score


Remember: Stochastic Gradient Descent

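The SGD update is θ ← θ − α ∇J(θ), with the gradient estimated on a small minibatch. A sketch that uses a toy objective f(θ) = θ² (exact gradient 2θ stands in for a minibatch estimate):

```python
import numpy as np

def sgd_step(theta, grad, lr=0.1):
    """One SGD update: theta <- theta - lr * grad."""
    return theta - lr * grad

# Minimize f(theta) = theta^2; each step shrinks theta toward the minimum at 0
theta = np.array([1.0])
for _ in range(100):
    theta = sgd_step(theta, 2 * theta)
```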

Computing Gradients by Hand

Gradients


Jacobian Matrix: Generalization of the Gradient

Chain Rule


Example Jacobian: Elementwise activation Function

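For h = f(z) applied elementwise, output i depends only on input i, so the Jacobian is diagonal: (∂h/∂z)_ij = f′(z_i) if i = j, else 0. A sketch with the sigmoid (whose derivative is σ(z)(1 − σ(z))), checked against a finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([0.5, -1.0, 2.0])
h = sigmoid(z)

# Elementwise activation -> diagonal Jacobian with f'(z_i) on the diagonal
jacobian = np.diag(h * (1.0 - h))   # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))

# Finite-difference check of one diagonal entry
eps = 1e-6
num = (sigmoid(z[0] + eps) - sigmoid(z[0] - eps)) / (2 * eps)
```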

Other Jacobians


Back to our Neural Net!


1. Break up equations into simple pieces


2. Apply the chain rule

3. Write out the Jacobians


Re-using Computation


Derivative with respect to Matrix: Output shape


Derivative with respect to Matrix

Why the Transposes?

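For a score s = uᵀ f(Wx + b), the gradient with respect to W is the outer product δ xᵀ, where δ = u ∘ f′(z) is the upstream error: δ has one entry per row of W and x one entry per column, and the transpose on x is exactly what makes the result land in the shape of W. A sketch with hypothetical small shapes, verified against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 4
W = rng.standard_normal((n, m))
b = rng.standard_normal(n)
x = rng.standard_normal(m)
u = rng.standard_normal(n)

def score(W):
    """s = u^T tanh(W x + b)."""
    return u @ np.tanh(W @ x + b)

z = W @ x + b
delta = u * (1.0 - np.tanh(z) ** 2)   # upstream error: u elementwise tanh'(z)
grad_W = np.outer(delta, x)           # delta x^T: shape (n, m) matches W

# Finite-difference check of one entry of the gradient
eps = 1e-6
Wp = W.copy(); Wp[1, 2] += eps
Wm = W.copy(); Wm[1, 2] -= eps
num = (score(Wp) - score(Wm)) / (2 * eps)
```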

What shape should derivatives be?


Next time: Backpropagation

