Andrew Ng Deep Learning Assignment (Week 2) - (1)

Getting started

Assignment repository: https://github.com/robbertliu/deeplearning.ai-andrewNG

Videos: the Andrew Ng Deep Learning course on bilibili.

Recommended way to work through it

def basic_sigmoid(x):
    """
    Compute sigmoid of x.

    Arguments:
    x -- A scalar

    Return:
    s -- sigmoid(x)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    s = 1.0 / (1.0 + 1.0 / math.exp(x))
    ### END CODE HERE ###

    return s

Whenever ### START CODE HERE ### appears, what follows is the answer; I recommend not reading it until you have run out of ideas of your own. A def with Arguments and Return in its docstring basically marks a programming exercise, which you can try to solve from the material covered so far.


1 - Building basic functions with numpy

1.1 - sigmoid function, np.exp()

sigmoid(x) = 1 / (1 + e^-x)

An activation function that maps values into (0, 1). Because the slope of the curve tends to 0 at both ends (the function saturates), the input values are usually kept fairly small.

def basic_sigmoid(x):
    """
    Compute sigmoid of x.

    Arguments:
    x -- A scalar

    Return:
    s -- sigmoid(x)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    s = 1.0 / (1.0 + 1.0 / math.exp(x))
    ### END CODE HERE ###

    return s

This computes the sigmoid function using the exp function from the math package.

import numpy as np
import math
import matplotlib.pyplot as plt  # import the plotting package


def basic_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # i.e. 1/(1 + e^-x)


# test the sigmoid function
print(basic_sigmoid(3))

x = np.linspace(-5, 5, 100)  # split the interval [-5, 5] into 100 evenly spaced points, returned as an array

y = []

for i in range(100):
    y.append(basic_sigmoid(x[i]))
plt.plot(x, y, c="r")  # x on the X axis, y on the Y axis; c is the color, "r" means red
plt.show()

Next, the exp function from the numpy package.

import numpy as np
import matplotlib.pyplot as mpl

x = np.array([1, 2, 3])  # with ([[1, 2, 3]]) this would instead be a 1x3 row vector; feel free to change it here and rerun
print(np.exp(x))

print(x)
print(x.shape)

# shape[0] is the number of rows and shape[1] the number of columns; since x is a 1-D array, shape[0] is simply its length
print(x.shape[0])

print(x + 3)  # with a matrix this would rely on broadcasting (explained later): if one dimension matches, the 3 is stretched to the larger shape

x = np.linspace(-5, 5, 1000)


def numpy_Sigmoid(i):
    return 1.0 / (1.0 + np.exp(-i))


mpl.plot(x, numpy_Sigmoid(x), c='green')
mpl.show()

The print() calls above can be deleted.

math works well enough, but it cannot accept vectors, matrices, or other array-like inputs, whereas numpy can, so numpy is the one we mainly use.
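A quick check of that difference (a minimal sketch: math.exp only accepts a scalar, while np.exp applies elementwise to an array):

import math
import numpy as np

v = np.array([1, 2, 3])
print(np.exp(v))    # elementwise: [ 2.71828183  7.3890561  20.08553692]

try:
    math.exp(v)     # math.exp expects a single scalar value
except TypeError as e:
    print("math.exp fails on arrays:", e)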

1.2 - Sigmoid gradient

The gradient of sigmoid; taken one element at a time, it is just the derivative.

The derivative of sigmoid: writing a(x) = sigmoid(x) = 1/(1 + e^-x), we get a'(x) = e^-x / (1 + e^-x)^2 = a(x) * (1 - a(x)).

# GRADED FUNCTION: sigmoid_derivative

def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """

import numpy as np
import matplotlib.pyplot as mpl


def numpy_Sigmoid(i):
    return 1.0 / (1.0 + np.exp(-i))


def sigmoidGradient(i):
    return numpy_Sigmoid(i) * (1 - numpy_Sigmoid(i))
    # if i is a vector, what exactly does this return? The * here is an elementwise product


a = np.array([1, 2, 3])

print(sigmoidGradient(a))

x = np.linspace(-15, 15, 1000)

mpl.plot(x, sigmoidGradient(x), 'black')
mpl.show()

For reference, here is the answer:

    ### START CODE HERE ### (≈ 2 lines of code)
    s = 1.0 / (1.0 + np.exp(-1.0 * x))
    ds = s * (1 - s)
    ### END CODE HERE ###

1.3 - Reshaping arrays

Reshape a vector or matrix to new dimensions.

# GRADED FUNCTION: image2vector
def image2vector(image):
    """
    Argument:
    image -- a numpy array of shape (length, height, depth)
    
    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """

import numpy as np

def image2vector(image):
    v = image
    # note: this flattens (length, height, depth) into shape (length*height, depth), i.e. (9, 2) for the example below
    return v.reshape((v.shape[0] * v.shape[1]), (v.shape[2]))

image = np.array([[[ 0.67826139,  0.29380381],
        [ 0.90714982,  0.52835647],
        [ 0.4215251 ,  0.45017551]],

       [[ 0.92814219,  0.96677647],
        [ 0.85304703,  0.52351845],
        [ 0.19981397,  0.27417313]],

       [[ 0.60659855,  0.00533165],
        [ 0.10820313,  0.49978937],
        [ 0.34144279,  0.94630077]]])
print(image.shape)
print(image2vector(image).shape)
print(np.array(image2vector(image)).shape)

The details of this conversion deserve a closer look, and collapsing a 3-D array into a 2-D one like this is also a bit confusing.
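For reference, the assignment itself asks for a column vector of shape (length*height*depth, 1); a minimal sketch of that version (the helper name here is just for illustration):

def image2vector_column(image):
    # flatten all three dimensions into a single column of shape (length*height*depth, 1)
    return image.reshape(image.shape[0] * image.shape[1] * image.shape[2], 1)

print(image2vector_column(image).shape)    # (18, 1) for the 3x3x2 example above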

1.4 - Normalizing rows

Normalization here means dividing by a norm.

Normalized rows help gradient descent converge better.

In effect, every element of a row is divided by that row's L2 norm, so each row ends up with unit length; for example, the row [0, 3, 4] has norm 5 and becomes [0, 0.6, 0.8].

# GRADED FUNCTION: normalizeRows
​
def normalizeRows(x):
    """
    Implement a function that normalizes each row of the matrix x (to have unit length).
    
    Argument:
    x -- A numpy matrix of shape (n, m)
    
    Returns:
    x -- The normalized (by row) numpy matrix. You are allowed to modify x.
    """

import numpy as np

def normalizingRows(x):
    # axis=1 works across each row (horizontally); axis=0 works down each column (vertically)
    # keepdims keeps the result in matrix/vector form, e.g. shape (2, 1) rather than (2,)
    # normalizingRows2 below is the version without the keepdims parameter
    b = np.linalg.norm(x, ord=None, axis=1, keepdims=True)
    print(b)
    print(b.shape)
    return x / b


a = np.array([[0, 3, 4], [2, 6, 4]])
print(a.shape)

print(normalizingRows(a))
print(normalizingRows(a).shape)

print('`````````````````````')

def normalizingRows2(x):
    # axis=1 works across each row (horizontally); axis=0 works down each column (vertically)
    b = np.linalg.norm(x, ord=None, axis=1)
    c = b.reshape(2, 1)  # hard-coded for the 2-row example above
    return x / c
print(normalizingRows2(a))

1.5 - Broadcasting and the softmax function

Broadcasting and the softmax function.

Softmax: given m values x1, ..., xm, the softmax of x1 is e^x1 / (e^x1 + e^x2 + e^x3 + ... + e^xm).

def softmax(x):
    """Calculates the softmax for each row of the input x.
​
    Your code should work for a row vector and also for matrices of shape (n, m).
​
    Argument:
    x -- A numpy matrix of shape (n,m)
​
    Returns:
    s -- A numpy matrix equal to the softmax of x, of shape (n,m)
    """
import numpy as np
x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0, 0]])


def softmax(x):
    a = np.exp(x)
    b = np.sum(a, axis=1, dtype=None, out=None, keepdims=True)  # sum each row, giving a column vector of row totals
    print(b)
    print(b.shape)
    print(a / b)
    return a / b


print(softmax(x))

On broadcasting: two 2-D arrays can broadcast only if, in each dimension, the sizes are equal or one of them is 1; if neither the rows nor the columns line up this way, broadcasting fails.

When A and B have the same number of rows and B has a single column, the smaller array is automatically replicated (stretched) along that dimension to match the larger one.

Broadcasting between (2, 3) and (2, 1) works: the row counts match and the remaining dimension is 1.

But what about (3, 2) and (1, 3)? That raises an error.

import numpy as np

a = np.array([[2, 2, 4], [3, 4, 5]])
print(a.shape)

b = np.array([[3], [2], [4]])
print(b.shape)

c = np.array([[3, 4, 5]])
print(c.shape)

print(a / c)

# a is (2, 3) and b is (3, 1): 2 vs 3 in the first dimension neither match nor are 1, so this would raise an error
# print(a/b)

One last small tip:

When creating a matrix or a row/column vector, use double square brackets [[ ]].

b = np.array([[3],[2],[4]])
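A quick comparison of the resulting shapes (a minimal sketch):

import numpy as np

print(np.array([1, 2, 3]).shape)          # (3,)   -- a plain 1-D array
print(np.array([[1, 2, 3]]).shape)        # (1, 3) -- a row vector
print(np.array([[3], [2], [4]]).shape)    # (3, 1) -- a column vector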

####

2 - Vectorization

This section introduces vectorization.

DOT (dot product): a row vector times a column vector, giving a scalar (a (1, 1) result).

OUTER PRODUCT: a column vector times a row vector, giving a matrix.

ELEMENTWISE: the two vectors are multiplied element by element, giving a new vector of length len(x1).

import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
​
### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i]*x2[i]
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i]*x2[j]
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i]*x2[i]
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j]*x1[j]
toc = time.process_time()
print ("gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

For the last one, W can be viewed as 3 data points, each with len(x1) features.

gdot is then the array of the three dot products between those feature rows and x1 (it may well behave like a vector; it's worth checking its shape yourself, as shown below).

gdot[i] += W[i,j]*x1[j] can be read as feature times weight.
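Since the note above suggests checking the shape yourself, a minimal check (assuming W and gdot from the snippet above):

print(W.shape)     # (3, 15): 3 rows of len(x1) features
print(gdot.shape)  # (3,): a 1-D array rather than a (3, 1) column vector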


In practice none of this verbosity is necessary; it is all already wrapped up in numpy:

import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
​
### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1,x2)
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1,x2)
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1,x2)
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")
​
### VECTORIZED GENERAL DOT PRODUCT ###
tic = time.process_time()
dot = np.dot(W,x1)  # W is the random 3 x len(x1) matrix from the previous snippet
toc = time.process_time()
print ("gdot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

2.1 - Implement the L1 and L2 loss functions

These are loss (cost) functions: over m examples they measure the gap between ŷ (the predicted value) and y (the true value, the label), and our goal is to minimize them.

#
# def L1(yhat, y):
#     """
#     Arguments:
#     yhat -- vector of size m (predicted labels)
#     y -- vector of size m (true labels)
#
#     Returns:
#     loss -- the value of the L1 loss function defined above
#     """
import numpy as np
yhat = np.array([.9, 0.2, 0.1, .4, .9])

y = np.array([1, 0, 0, 1, 1])


def L1(yhat, y):
    z = yhat - y
    for i in range(len(z)):
        if z[i] < 0:
            z[i] = -z[i]

    print(z)
    return z.sum()

# reference answer: loss = np.sum(np.abs(y - yhat))

print("L1 = " + str(L1(yhat, y)))

### sum-of-squares loss, i.e. the sum of (yhat - y)^2
def L3(yhat, y):
    z = yhat - y
    z.reshape(5, 1)  # note: reshape returns a new array, so z itself is unchanged here
    print(z)
    z = np.dot(z, z)  # dot of a 1-D array with itself is the sum of squares
    return z
print("L3 = " + str(L3(yhat, y)))

Thanks! Best regards.
