Andrew Ng Machine Learning course lab C1_W2_Lab01_Python_Numpy_Vectorization_Soln (vectors and matrices)

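Creating vectors and matrices

  • Code block 1 (creation)
  • Code block 2
  • Code block 3
  • Code block 4
  • Code block 5
  • Code block 6
  • Code block 7
  • Code block 8
  • Code block 9
  • Code block 10
  • Code block 11
  • Code block 12
  • Code block 13 (creating matrices)
  • Code block 14
  • Code block 15
  • Code block 16
  • Summary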

Code block 1 (creation)

import numpy as np

a = np.zeros(4);                print(f"np.zeros(4) :   a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.zeros((4,));             print(f"np.zeros(4,) :  a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.random.random_sample(4); print(f"np.random.random_sample(4): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

Output:
np.zeros(4) : a = [0. 0. 0. 0.], a shape = (4,), a data type = float64
np.zeros(4,) : a = [0. 0. 0. 0.], a shape = (4,), a data type = float64
np.random.random_sample(4): a = [0.24660509 0.65617874 0.20796114 0.99258047], a shape = (4,), a data type = float64

Code block 2

a = np.arange(4.);              print(f"np.arange(4.):     a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.random.rand(4);          print(f"np.random.rand(4): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

Output:
np.arange(4.): a = [0. 1. 2. 3.], a shape = (4,), a data type = float64
np.random.rand(4): a = [0.94521094 0.35495575 0.85364377 0.14953348], a shape = (4,), a data type = float64

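np.arange()
This function returns evenly spaced values over a half-open interval: the start is included and the stop is excluded. For example, [1, 2, 3, 4, 5] has start 1, stop 6 and step 1.
np.arange() can be called with one, two or three arguments:
1) One argument: the value is the stop; the start defaults to 0 and the step defaults to 1.
2) Two arguments: the first is the start and the second is the stop; the step defaults to 1.
3) Three arguments: the first is the start, the second is the stop and the third is the step. The step may be a non-integer value.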
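The three calling forms can be checked with a short sketch. The variable names below are illustrative and do not come from the lab.

```python
import numpy as np

one_arg = np.arange(4)              # stop only: start defaults to 0, step to 1
two_args = np.arange(1, 6)          # start and stop: the stop value 6 is excluded
three_args = np.arange(0, 1, 0.25)  # start, stop and a fractional step

print(one_arg)     # values 0, 1, 2, 3
print(two_args)    # values 1, 2, 3, 4, 5
print(three_args)  # values 0, 0.25, 0.5, 0.75
```

With fractional steps the number of elements can be affected by floating-point rounding, so the NumPy documentation suggests np.linspace when an exact element count matters.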

Code block 3

# NumPy routines which allocate memory and fill with user specified values
a = np.array([5,4,3,2]);  print(f"np.array([5,4,3,2]):  a = {a},     a shape = {a.shape}, a data type = {a.dtype}")
a = np.array([5.,4,3,2]); print(f"np.array([5.,4,3,2]): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

Output:
np.array([5,4,3,2]): a = [5 4 3 2], a shape = (4,), a data type = int32
np.array([5.,4,3,2]): a = [5. 4. 3. 2.], a shape = (4,), a data type = float64
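You can also specify the data type yourself with the dtype argument. Otherwise NumPy infers it from the elements: in the second call the first element, 5., is a float, so the whole array becomes float64. The int32 shown for the first call is the platform's default integer type and may be int64 on other systems.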
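A minimal sketch of how the data type is inferred or set explicitly. The variable names are illustrative, and the exact integer type printed depends on the platform.

```python
import numpy as np

ints = np.array([5, 4, 3, 2])                      # all integers: an integer dtype is inferred
mixed = np.array([5., 4, 3, 2])                    # one float promotes every element to float64
forced = np.array([5, 4, 3, 2], dtype=np.float64)  # dtype given explicitly

print(ints.dtype, mixed.dtype, forced.dtype)
```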

#vector indexing operations on 1-D vectors
a = np.arange(10)
print(a)
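#access an element
print(f"a[2].shape: {a[2].shape} a[2]  = {a[2]}, Accessing an element returns a scalar")
#a negative index counts back from the end
print(f"a[-1] = {a[-1]}")
#an index must be within the range of the vector, otherwise an error is raised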
try:
    c = a[10]
except Exception as e:
    print("The error message you'll see is:")
    print(e)

Output:
[0 1 2 3 4 5 6 7 8 9]
a[2].shape: () a[2] = 2, Accessing an element returns a scalar
a[-1] = 9
The error message you'll see is:
index 10 is out of bounds for axis 0 with size 10

Code block 4

#vector slicing operations
a = np.arange(10)
print(f"a         = {a}")

#access 5 consecutive elements (start:stop:step)
c = a[2:7:1];     print("a[2:7:1] = ", c)

#access 3 elements separated by two
c = a[2:7:2];     print("a[2:7:2] = ", c)

#access all elements from index 3 onward
c = a[3:];        print("a[3:]    = ", c)

#access all elements below index 3
c = a[:3];        print("a[:3]    = ", c)

#access all elements
c = a[:];         print("a[:]     = ", c)

Output:
a = [0 1 2 3 4 5 6 7 8 9]
a[2:7:1] = [2 3 4 5 6]
a[2:7:2] = [2 4 6]
a[3:] = [3 4 5 6 7 8 9]
a[:3] = [0 1 2]
a[:] = [0 1 2 3 4 5 6 7 8 9]

Code block 5

a = np.array([1,2,3,4])
print(f"a             : {a}")
# negate every element
b = -a
print(f"b = -a        : {b}")

# sum all elements
b = np.sum(a)
print(f"b = np.sum(a) : {b}")
# mean of the elements
b = np.mean(a)
print(f"b = np.mean(a): {b}")
# square every element
b = a**2
print(f"b = a**2      : {b}")

Output:
a : [1 2 3 4]
b = -a : [-1 -2 -3 -4]
b = np.sum(a) : 10
b = np.mean(a): 2.5
b = a**2 : [ 1 4 9 16]

Code block 6

NumPy arrays can also be added directly; the addition is applied element by element.

a = np.array([ 1, 2, 3, 4])
b = np.array([-1,-2, 3, 4])
print(f"Binary operators work element wise: {a + b}")

Output:
Binary operators work element wise: [0 0 6 8]

Code block 7

#try a mismatched vector operation
c = np.array([1, 2])
try:
    d = a + c
except Exception as e:
    print("The error message you'll see is:")
    print(e)

Output:
The error message you'll see is:
operands could not be broadcast together with shapes (4,) (2,)

Arrays whose shapes do not match, and cannot be broadcast to a common shape, cannot be added.
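A minimal sketch of which shapes can be combined. It goes slightly beyond the lab cell by also showing a length-1 array, which NumPy's broadcasting rules stretch to match the other operand. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

same_shape = a + np.array([10, 20, 30, 40])  # (4,) + (4,): element-wise addition
length_one = a + np.array([10])              # (4,) + (1,): the single value is broadcast

try:
    a + np.array([1, 2])                     # (4,) + (2,): incompatible shapes
    mismatch_raised = False
except ValueError as e:
    mismatch_raised = True
    print(e)

print(same_shape)
print(length_one)
```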

Code block 8

a = np.array([1, 2, 3, 4])

# multiply a by a scalar
b = 5 * a 
print(f"b = 5 * a : {b}")

Output:
b = 5 * a : [ 5 10 15 20]

Code block 9

def my_dot(a, b): 
    """
    Compute the dot product of two vectors

    Args:
      a (ndarray (n,)):  input vector
      b (ndarray (n,)):  input vector with same dimension as a

    Returns:
      x (scalar): dot product of a and b
    """
    x=0
    for i in range(a.shape[0]):
        x = x + a[i] * b[i]
    return x

# test 1-D
a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])
print(f"my_dot(a, b) = {my_dot(a, b)}")

Output:
my_dot(a, b) = 24
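Written out for the two test vectors, the loop in my_dot accumulates the following sum (indices start at 0, as in the code):

```latex
\[
a \cdot b = \sum_{i=0}^{n-1} a_i b_i
          = (1)(-1) + (2)(4) + (3)(3) + (4)(2)
          = -1 + 8 + 9 + 8
          = 24
\]
```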

Code block 10

We can also call np.dot directly:

c = np.dot(a, b); print(f"NumPy 1-D np.dot(a, b) = {c}, np.dot(a, b).shape = {c.shape} ")
c = np.dot(b, a); print(f"NumPy 1-D np.dot(b, a) = {c}, np.dot(a, b).shape = {c.shape} ")

Output:
NumPy 1-D np.dot(a, b) = 24, np.dot(a, b).shape = ()
NumPy 1-D np.dot(b, a) = 24, np.dot(a, b).shape = ()

Code block 11

import time

np.random.seed(1)
a = np.random.rand(10000000)  # very large arrays
b = np.random.rand(10000000)

tic = time.time()  # capture start time
c = np.dot(a, b)
toc = time.time()  # capture end time

print(f"np.dot(a, b) =  {c:.4f}")
print(f"Vectorized version duration: {1000*(toc-tic):.4f} ms ")

tic = time.time()  # capture start time
c = my_dot(a,b)
toc = time.time()  # capture end time

print(f"my_dot(a, b) =  {c:.4f}")
print(f"loop version duration: {1000*(toc-tic):.4f} ms ")

del(a);del(b)  #remove these big arrays from memory

Output:
np.dot(a, b) = 2501072.5817
Vectorized version duration: 67.3420 ms
my_dot(a, b) = 2501072.5817
loop version duration: 2653.7542 ms

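Explanation: in this example, vectorization gives a large speedup, roughly 40 times in the run above (about 67 ms versus 2654 ms). This is because NumPy makes better use of the data parallelism available in the underlying hardware. GPUs and modern CPUs implement Single Instruction, Multiple Data (SIMD) pipelines, which allow multiple operations to be issued in parallel. This is critical in machine learning, where data sets are often very large.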
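A smaller, self-contained sketch of the same comparison, which also checks that both versions return the same value. The helper name loop_dot, the array size and the use of time.perf_counter are choices made for this sketch, not part of the lab, and the timings will differ from machine to machine.

```python
import time
import numpy as np

def loop_dot(a, b):
    """Dot product with an explicit Python loop (same idea as my_dot)."""
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total

rng = np.random.default_rng(1)
a = rng.random(100_000)  # smaller than the lab's arrays so the loop finishes quickly
b = rng.random(100_000)

tic = time.perf_counter()
vectorized = np.dot(a, b)
vectorized_ms = 1000 * (time.perf_counter() - tic)

tic = time.perf_counter()
looped = loop_dot(a, b)
looped_ms = 1000 * (time.perf_counter() - tic)

print(f"np.dot:   {vectorized:.4f} in {vectorized_ms:.4f} ms")
print(f"loop_dot: {looped:.4f} in {looped_ms:.4f} ms")
print(f"results agree: {np.isclose(vectorized, looped)}")
```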

Code block 12

# show common Course 1 example
X = np.array([[1],[2],[3],[4]])
w = np.array([2])
c = np.dot(X[1], w)
#X is a 2-D array, but the row X[1] is 1-D
print(f"X[1] has shape {X[1].shape}")
#w is a 1-D array
print(f"w has shape {w.shape}")
#np.dot of two 1-D arrays returns a scalar, which has no dimensions
print(f"c has shape {c.shape}")

Output:
X[1] has shape (1,)
w has shape (1,)
c has shape ()

Code block 13 (creating matrices)

#a matrix with one row and five columns
a = np.zeros((1, 5))
print(f"a shape = {a.shape}, a = {a}")
#a matrix with two rows and one column
a = np.zeros((2, 1))                                                                   
print(f"a shape = {a.shape}, a = {a}") 

a = np.random.random_sample((1, 1))  
print(f"a shape = {a.shape}, a = {a}") 

Output:
a shape = (1, 5), a = [[0. 0. 0. 0. 0.]]
a shape = (2, 1), a = [[0.] [0.]]
a shape = (1, 1), a = [[0.77390955]]

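Code block 14

You can also specify the data by hand. The dimensions are given by additional brackets, matching the format of the printed output above, for example: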

# NumPy routines which allocate memory and fill with user specified values
a = np.array([[5], [4], [3]]);   print(f" a shape = {a.shape}, np.array: a = {a}")
a = np.array([[5],   # One can also
              [4],   # separate values
              [3]]); #into separate rows
print(f" a shape = {a.shape}, np.array: a = {a}")

Output:
a shape = (3, 1), np.array: a = [[5]
[4]
[3]]
a shape = (3, 1), np.array: a = [[5]
[4]
[3]]

Code block 15

#vector indexing operations on matrices
a = np.arange(6).reshape(-1, 2)   #reshape is a convenient way to create matrices; note the -1, explained below
print(f"a.shape: {a.shape}, \na= {a}")

#access an element
print(f"\na[2,0].shape:   {a[2, 0].shape}, a[2,0] = {a[2, 0]},     type(a[2,0]) = {type(a[2, 0])} Accessing an element returns a scalar\n")

#access a row
print(f"a[2].shape:   {a[2].shape}, a[2]   = {a[2]}, type(a[2])   = {type(a[2])}")

Output:
a.shape: (3, 2),
a= [[0 1]
[2 3]
[4 5]]

a[2,0].shape: (), a[2,0] = 4, type(a[2,0]) = <class 'numpy.int32'> Accessing an element returns a scalar

a[2].shape: (2,), a[2] = [4 5], type(a[2]) = <class 'numpy.ndarray'>

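Note: the -1 here is a placeholder that means "infer this dimension". For example, reshape(2, -1) fixes two rows and lets NumPy compute the number of columns from the number of elements; likewise, reshape(-1, 2) fixes two columns and computes the number of rows. Only one dimension may be left unknown, and -1 is the documented value for it. If the number of elements cannot be divided evenly, NumPy raises an error.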
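The shape inference described in the note can be sketched as follows. The variable names are illustrative.

```python
import numpy as np

a = np.arange(6)

rows_inferred = a.reshape(-1, 2)  # 2 columns fixed, rows inferred: shape (3, 2)
cols_inferred = a.reshape(2, -1)  # 2 rows fixed, columns inferred: shape (2, 3)

try:
    a.reshape(-1, 4)              # 6 elements cannot be split into rows of 4
    reshape_failed = False
except ValueError as e:
    reshape_failed = True
    print(e)

print(rows_inferred.shape, cols_inferred.shape)
```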

Code block 16

#vector 2-D slicing operations
a = np.arange(20).reshape(-1, 10)
print(f"a = \n{a}")

#access 5 consecutive elements (start:stop:step)
print("a[0, 2:7:1] = ", a[0, 2:7:1], ",  a[0, 2:7:1].shape =", a[0, 2:7:1].shape, "a 1-D array")

#access 5 consecutive elements (start:stop:step) in two rows
print("a[:, 2:7:1] = \n", a[:, 2:7:1], ",  a[:, 2:7:1].shape =", a[:, 2:7:1].shape, "a 2-D array")

# access all elements
print("a[:,:] = \n", a[:,:], ",  a[:,:].shape =", a[:,:].shape)

# access all elements in one row (very common usage)
print("a[1,:] = ", a[1,:], ",  a[1,:].shape =", a[1,:].shape, "a 1-D array")
# same as
print("a[1]   = ", a[1],   ",  a[1].shape   =", a[1].shape, "a 1-D array")

Output:
a =
[[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]]
a[0, 2:7:1] = [2 3 4 5 6] , a[0, 2:7:1].shape = (5,) a 1-D array
a[:, 2:7:1] =
[[ 2 3 4 5 6]
[12 13 14 15 16]] , a[:, 2:7:1].shape = (2, 5) a 2-D array
a[:,:] =
[[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]] , a[:,:].shape = (2, 10)
a[1,:] = [10 11 12 13 14 15 16 17 18 19] , a[1,:].shape = (10,) a 1-D array
a[1] = [10 11 12 13 14 15 16 17 18 19] , a[1].shape = (10,) a 1-D array

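Summary

This lab:
1. showed how to create and manipulate arrays and multi-dimensional matrices in Python with NumPy;
2. showed that vectorization gives a large speedup, because NumPy makes better use of the data parallelism available in the underlying hardware. GPUs and modern CPUs implement Single Instruction, Multiple Data (SIMD) pipelines, which allow multiple operations to be issued in parallel. This is critical in machine learning, where data sets are often very large.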
