Most of this assignment is based on the following blog post: https://blog.csdn.net/pengjian444/article/details/71075544 . For the theory behind gradient descent, refer back to Andrew Ng's machine learning course videos; below, the code from that post is worked through step by step.
The Rosenbrock function:
$$f(x, y) = (1 - x)^2 + 100\,(y - x^2)^2$$
Taking the partial derivatives of $f$ with respect to $x$ and $y$ gives:
$$\frac{\partial f(x, y)}{\partial x} = -2(1 - x) - 2 \cdot 100\,(y - x^2) \cdot 2x = -2(1 - x) - 400x(y - x^2)$$
$$\frac{\partial f(x, y)}{\partial y} = 2 \cdot 100\,(y - x^2) = 200(y - x^2)$$
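These hand-derived gradients can be sanity-checked symbolically; the short sketch below assumes SymPy is installed (it is not used anywhere else in this assignment):
import sympy as sp
x, y = sp.symbols('x y')
f = (1 - x) ** 2 + 100 * (y - x ** 2) ** 2
# Both differences should simplify to 0 if the derivatives above are correct
print(sp.simplify(sp.diff(f, x) - (-2 * (1 - x) - 400 * x * (y - x ** 2))))
print(sp.simplify(sp.diff(f, y) - 200 * (y - x ** 2)))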
Gradient descent is implemented below with NumPy. The initial values of $x$ and $y$ are set to 0, and the functions cal_rosenbrock_prax and cal_rosenbrock_pray compute the gradient components along the x and y directions.
import numpy as np
def cal_rosenbrock(x1, x2):
    """
    Compute the value of the Rosenbrock function.
    :param x1:
    :param x2:
    :return:
    """
    return (1 - x1) ** 2 + 100 * (x2 - x1 ** 2) ** 2

def cal_rosenbrock_prax(x1, x2):
    """
    Partial derivative with respect to x1.
    """
    return -2 + 2 * x1 - 400 * (x2 - x1 ** 2) * x1

def cal_rosenbrock_pray(x1, x2):
    """
    Partial derivative with respect to x2.
    """
    return 200 * (x2 - x1 ** 2)

def for_rosenbrock_func(max_iter_count=100000, step_size=0.001):
    pre_x = np.zeros((2,), dtype=np.float32)
    loss = 10
    iter_count = 0
    while loss > 0.001 and iter_count < max_iter_count:
        error = np.zeros((2,), dtype=np.float32)
        error[0] = cal_rosenbrock_prax(pre_x[0], pre_x[1])
        error[1] = cal_rosenbrock_pray(pre_x[0], pre_x[1])
        for j in range(2):
            pre_x[j] -= step_size * error[j]
        loss = cal_rosenbrock(pre_x[0], pre_x[1])  # the minimum value is 0
        print("iter_count: ", iter_count, "the loss:", loss)
        iter_count += 1
    return pre_x

if __name__ == '__main__':
    w = for_rosenbrock_func()
    print(w)
The function is known to attain its minimum value of 0 at $(1, 1)$; the gradient descent run produces the following output:
iter_count: 5759 the loss: 0.0010091979556820377
iter_count: 5760 the loss: 0.0010083502093514948
iter_count: 5761 the loss: 0.0010075028288032883
iter_count: 5762 the loss: 0.0010066566073887067
iter_count: 5763 the loss: 0.0010058099570385917
iter_count: 5764 the loss: 0.0010049644657556017
iter_count: 5765 the loss: 0.0010041200851937761
iter_count: 5766 the loss: 0.0010032768613008629
iter_count: 5767 the loss: 0.0010024340007618712
iter_count: 5768 the loss: 0.0010015915035627475
iter_count: 5769 the loss: 0.0010007493696895125
iter_count: 5770 the loss: 0.0009999075991282626
[0.96840495 0.9376793 ]
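The loop exits as soon as the loss drops below 0.001, so the iterate stops slightly short of the true minimizer $(1, 1)$. As an independent cross-check (a minimal sketch, assuming SciPy is available and reusing cal_rosenbrock from above), a general-purpose optimizer started from the same initial point converges to essentially $(1, 1)$:
import numpy as np
from scipy.optimize import minimize

# Cross-check with SciPy's general-purpose minimizer, starting from the same point (0, 0)
res = minimize(lambda p: cal_rosenbrock(p[0], p[1]), x0=np.zeros(2))
print(res.x)  # expected to be very close to [1. 1.]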
Next, a linear regression model is built with PyTorch, following this article: https://jvn.io/aakashns/e556978bda9343f3b30b3a9fd2a25012
import numpy as np
import torch
#Input (temp, rainfall, humidity)
inputs = np.array([[73, 67, 43],
                   [91, 88, 64],
                   [87, 134, 58],
                   [102, 43, 37],
                   [69, 96, 70]], dtype='float32')
#Targets (apples, oranges)
targets = np.array([[56, 70],
                    [81, 101],
                    [119, 133],
                    [22, 37],
                    [103, 119]], dtype='float32')
#Convert inputs and targets to tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
print(inputs)
print(targets)
#Weights and biases
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
print(w)
print(b)
def model(x):
    return x @ w.t() + b
#Generate predictions
preds = model(inputs)
print(preds)
#Compare with targets
print(targets)
#MSE loss
def mse(t1, t2):
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()
#Compute loss
loss = mse(preds, targets)
print(loss)
#Compute gradients
loss.backward()
#Gradients for weights
print(w)
print(w.grad)
w.grad.zero_()
b.grad.zero_()
print(w.grad)
print(b.grad)
#Generate predictions
preds = model(inputs)
print(preds)
#Calculate the loss
loss = mse(preds, targets)
print(loss)
#Compute gradients
loss.backward()
print(w.grad)
print(b.grad)
#Adjust weights & reset gradients
with torch.no_grad():
    w -= w.grad * 1e-5
    b -= b.grad * 1e-5
    w.grad.zero_()
    b.grad.zero_()
print(w)
print(b)
#Calculate loss
preds = model(inputs)
loss = mse(preds, targets)
print(loss)
#Train for 100 epochs
for i in range(100):
    preds = model(inputs)
    loss = mse(preds, targets)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()
#Calculate loss
preds = model(inputs)
loss = mse(preds, targets)
print(loss)
print(preds)
print(targets)
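The referenced tutorial goes on to replace the hand-written model, loss, and update step with PyTorch's built-in modules. A minimal sketch in the same spirit (reusing the inputs and targets tensors defined above; the learning rate and epoch count simply mirror the manual loop and are otherwise arbitrary):
import torch.nn as nn
import torch.nn.functional as F

net = nn.Linear(3, 2)  # built-in linear layer: 3 input features -> 2 outputs
opt = torch.optim.SGD(net.parameters(), lr=1e-5)

for epoch in range(100):
    preds = net(inputs)
    loss = F.mse_loss(preds, targets)
    loss.backward()
    opt.step()       # adjust weights and biases using the accumulated gradients
    opt.zero_grad()  # reset gradients for the next iteration

print(F.mse_loss(net(inputs), targets))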