TensorFlow 2.x Study Notes (5): Neural Networks and Fully Connected Layers

Table of Contents

  • Data Loading
    • keras.datasets
      • boston housing
      • mnist/fashion mnist
      • cifar10/100
      • imdb
    • tf.data.Dataset
    • Example
    • Fully Connected Layer
      • Dense
      • Sequential
    • Output
      • y ∈ R^d
      • y_i ∈ [0, 1]
        • tf.sigmoid
      • y_i ∈ [0, 1], Σ y_i = 1
        • softmax
        • Classification
      • y_i ∈ [-1, 1]
        • tanh
    • Loss Functions
      • MSE
      • Cross Entropy Loss
        • Entropy
        • Cross Entropy
        • Classification
          • Single output (binary)
          • Multi output
          • Categorical Cross Entropy
          • Numerical Stability

Data Loading

keras.datasets

boston housing

  • Boston housing price regression dataset
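
A minimal loading sketch; the shapes reflect the default 404/102 train/test split:

import tensorflow as tf
from tensorflow import keras

# 13 numeric features per house, scalar price target
(x, y), (x_test, y_test) = keras.datasets.boston_housing.load_data()
x.shape, y.shape  # (404, 13), (404,)
x_test.shape, y_test.shape  # (102, 13), (102,)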

mnist/fashion mnist

  • MNIST/Fashion-MNIST dataset

MNIST

import tensorflow as tf
from tensorflow import keras

(x, y), (x_test, y_test) = keras.datasets.mnist.load_data()
x.shape  # (60000, 28, 28)
y.shape  # (60000,)
x_test.shape, y_test.shape
# ((10000, 28, 28), (10000,))
y_onehot = tf.one_hot(y, depth=10)  # labels -> one-hot vectors of length 10

cifar10/100

  • small images classification dataset
  • 10 vs. 100 refers to label granularity: CIFAR-100 subdivides the coarse classes into finer labels
  • each image is [32, 32, 3]
(x, y), (x_test, y_test) = keras.datasets.cifar10.load_data()
x.shape, y.shape, x_test.shape, y_test.shape
# ((50000, 32, 32, 3), (50000, 1), (10000, 32, 32, 3), (10000, 1))

imdb

  • sentiment classification dataset
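
A minimal loading sketch; num_words=10000 is an illustrative choice that keeps only the most frequent words:

from tensorflow import keras

# each sample is a list of word indices; labels are 0 (negative) / 1 (positive)
(x, y), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=10000)
len(x), len(x_test)  # 25000, 25000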

tf.data.Dataset

  • from_tensor_slices()
(x, y), (x_test, y_test) = keras.datasets.cifar10.load_data()
db = tf.data.Dataset.from_tensor_slices(x_test)
next(iter(db)).shape
# [32, 32, 3]
db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
next(iter(db))[0].shape
# [32, 32, 3]
  • .shuffle
db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db = db.shuffle(10000)  # buffer size; 10000 covers the whole test set
  • .map
def preprocess(x, y):
	x = tf.cast(x, dtype=tf.float32) / 255.  # scale pixels to [0, 1]
	y = tf.cast(y, dtype=tf.int32)
	y = tf.one_hot(y, depth=10)  # (1,) label -> (1, 10) one-hot
	return x, y

db2 = db.map(preprocess)
res = next(iter(db2))
res[0].shape, res[1].shape
# [32, 32, 3], [1, 10]
res[1][:2]
# shape: (1, 10), dtype: float32
  • batch
db3 = db2.batch(32)
res = next(iter(db3))
res[0].shape, res[1].shape
# [32, 32, 32, 3], [32, 1, 10]
  • StopIteration
db_iter = iter(db3)
while True:
	x, y = next(db_iter)
# raises once the dataset is exhausted (StopIteration in eager mode;
# the underlying op reports OutOfRangeError)
  • .repeat()
db4 = db3.repeat()   # repeat indefinitely
db4 = db3.repeat(4)  # repeat for 4 epochs

Example

from tensorflow.keras import datasets

def pre_mnist(x, y):
	x = tf.cast(x, tf.float32) / 255.0
	y = tf.cast(y, tf.int64)
	return x, y

def mnist_dataset():
	# note: loads Fashion-MNIST despite the function name
	(x, y), (x_val, y_val) = datasets.fashion_mnist.load_data()
	y = tf.one_hot(y, depth=10)
	y_val = tf.one_hot(y_val, depth=10)

	ds = tf.data.Dataset.from_tensor_slices((x, y))
	ds = ds.map(pre_mnist)
	ds = ds.shuffle(60000).batch(100)
	ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
	ds_val = ds_val.map(pre_mnist)
	ds_val = ds_val.shuffle(10000).batch(100)
	return ds, ds_val
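
A short usage sketch of the pipeline above:

ds, ds_val = mnist_dataset()
for step, (x, y) in enumerate(ds):
	# x: (100, 28, 28) float32 in [0, 1]; y: (100, 10) one-hot
	if step == 0:
		print(x.shape, y.shape)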

Fully Connected Layer

Dense

x = tf.random.normal([4, 784])
net = tf.keras.layers.Dense(512)
out = net(x)
out.shape
# TensorShape([4, 512])
net.kernel.shape, net.bias.shape
# [784, 512], [512]
net = tf.keras.layers.Dense(10)
net.bias  # variables are not created until the layer is built

net.get_weights()  # []
net.weights        # []

net.build(input_shape=(None, 4))
net.kernel.shape, net.bias.shape
# [4, 10], [10]

net.build(input_shape=(None, 20))
net.kernel.shape, net.bias.shape
# [20, 10], [10]

Sequential

x = tf.random.normal([2, 3])
model = keras.Sequential([
		keras.layers.Dense(2, activation='relu'),
		keras.layers.Dense(2, activation='relu'),
		keras.layers.Dense(2)
	])
model.build(input_shape=[None, 3])
model.summary()

for p in model.trainable_variables:
	print(p.name, p.shape)
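
For reference, the loop should print each variable's name and shape along these lines (layer numbering can vary between sessions):

# dense/kernel:0 (3, 2)
# dense/bias:0 (2,)
# dense_1/kernel:0 (2, 2)
# dense_1/bias:0 (2,)
# dense_2/kernel:0 (2, 2)
# dense_2/bias:0 (2,)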

Output

y ∈ R^d

  • linear regression
  • naive classification with MSE
  • other general prediction
  • out = relu(X@W + b)
    • logits
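
A minimal sketch of such a linear output head (the variable names are illustrative):

import tensorflow as tf

x = tf.random.normal([4, 784])
head = tf.keras.layers.Dense(10)  # no activation: raw scores ("logits") in R^d
logits = head(x)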

y_i ∈ [0, 1]

  • binary classification
    • y > 0.5 → 1
    • y ≤ 0.5 → 0
  • Image Generation
    • rgb pixel intensities

tf.sigmoid

$\sigma(x) = \frac{1}{1+e^{-x}}$

a = tf.linspace(-6., 6, 10)
tf.sigmoid(a)  # values squashed into (0, 1)

x = tf.random.normal([1, 28, 28]) * 5
tf.reduce_min(x), tf.reduce_max(x)  # values far outside [0, 1]

x = tf.sigmoid(x)
tf.reduce_min(x), tf.reduce_max(x)  # now within (0, 1)

y_i ∈ [0, 1], Σ y_i = 1

softmax

a = tf.linspace(-2., 2, 5)
tf.nn.softmax(a)

Classification

logits = tf.random.uniform([1, 10], minval=-2, maxval=2)
prob = tf.nn.softmax(logits, axis=1)
tf.reduce_sum(prob, axis=1)  # [1.0]: the probabilities sum to 1

y_i ∈ [-1, 1]

tanh

a = tf.linspace(-2., 2, 5)
tf.tanh(a)
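
A quick identity check: tanh is a scaled, shifted sigmoid, which is why its output lands in [-1, 1]:

a = tf.linspace(-2., 2, 5)
# tanh(x) = 2*sigmoid(2x) - 1
tf.debugging.assert_near(tf.tanh(a), 2. * tf.sigmoid(2. * a) - 1.)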

Loss Functions

MSE

  • $MSE = \frac{1}{N} \sum{(y - out)^2}$
  • $L_{2\text{-}norm} = \sqrt{\sum{(y - out)^2}}$
y = tf.constant([1, 2, 3, 0, 2])
y = tf.one_hot(y, depth=4)
y = tf.cast(y, dtype=tf.float32)

out = tf.random.normal([5, 4])

loss1 = tf.reduce_mean(tf.square(y - out))
loss2 = tf.square(tf.norm(y - out))/(5 * 4)
loss3 = tf.reduce_mean(tf.losses.MSE(y, out))  # tf.losses.MSE returns shape (5,), one value per sample; the mean averages over the batch
# loss1 == loss2 == loss3

Cross Entropy Loss

Entropy

  • $H(P) = -\sum\limits_{i}{p(i)\log p(i)} = -E_{x\sim p}[\log p(x)]$
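
A small numeric check of the formula, assuming base-2 logs (the notes don't fix the base) so the result is in bits:

import tensorflow as tf

p = tf.constant([0.25, 0.25, 0.25, 0.25])
-tf.reduce_sum(p * tf.math.log(p) / tf.math.log(2.))
# 2.0: a uniform distribution over 4 outcomes carries 2 bits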

Cross Entropy

  • $H(p,q) = -\sum{p(x)\log q(x)} = -E_{x\sim p}[\log q(x)]$
  • $H(p,q) = H(p) + D_{KL}(p||q)$
  • KL divergence measures how much two distributions differ (checked numerically below)
  • $D_{KL}(p||q) = E_{x\sim p}[\log{\frac{p(x)}{q(x)}}]$
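
A numeric check of the decomposition above, with natural logs and made-up distributions:

p = tf.constant([0.4, 0.6])
q = tf.constant([0.5, 0.5])
h_p = -tf.reduce_sum(p * tf.math.log(p))    # entropy H(p)
h_pq = -tf.reduce_sum(p * tf.math.log(q))   # cross entropy H(p, q)
kl = tf.reduce_sum(p * tf.math.log(p / q))  # D_KL(p||q)
# h_pq == h_p + kl (up to float rounding)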

Classification

Single output (binary)

$loss = -(y\log\hat{y} + (1-y)\log(1-\hat{y}))$
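
A quick hand check of this formula against the library call (values are illustrative):

y, y_hat = 1., 0.1
-(y * tf.math.log(y_hat) + (1 - y) * tf.math.log(1 - y_hat))
# ≈ 2.3026, matching tf.losses.binary_crossentropy([1.], [0.1])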

Multi output

With one-hot encoding:
$loss = -\log q_i \ \text{where}\ p_i = 1$

Categorical Cross Entropy

tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.25, 0.25, 0.25, 0.25])
# 1.3862: a uniform prediction gives loss log(4)
tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.01, 0.97, 0.01, 0.01])
# 0.0304: a confident correct prediction gives a small loss
tf.losses.BinaryCrossentropy()([1], [0.1])  # class-style API
tf.losses.binary_crossentropy([1], [0.1])   # functional API
Numerical Stability

x = tf.random.normal([1, 784])
w = tf.random.normal([784, 2])
b = tf.zeros([2])
logits = x @ w + b
prob = tf.math.softmax(logits, axis=1)
tf.losses.categorical_crossentropy([0, 1], logits, from_logits=True)
tf.losses.categorical_crossentropy([0, 1], prob)  # same value as above, but not recommended: the explicit softmax can under/overflow
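
To see why from_logits=True matters, push the logits to extremes; a sketch assuming the default Keras epsilon of 1e-7 for probability clipping:

big = tf.constant([[-100., 100.]])
tf.losses.categorical_crossentropy([[1., 0.]], big, from_logits=True)
# ≈ 200: computed stably from the logits
tf.losses.categorical_crossentropy([[1., 0.]], tf.math.softmax(big))
# saturates near -log(1e-7) ≈ 16.1, because the true-class probability underflows to 0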

With the TF 2.1 update, these losses were consolidated into tf.keras.losses.
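
For example, the class-style equivalent of the call above (a minimal sketch):

loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss_fn(tf.constant([[0., 1.]]), logits)  # logits from the Numerical Stability example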
