A Python implementation of gradient descent

From simulating a dataset to fitting a curve.

# -*- coding: utf-8 -*-
"""
Created on Tue Sep  5 21:21:58 2017
@author: wjw
Simulate a dataset, then fit a curve to it
"""
def normalization(X):  # without normalization the values in gradient descent get too large and overflow
    maxX = max(X)
    minX = min(X)
    normalized_X = []
    for x in X:
        normalized_X.append((x - minX) / (maxX - minX))
    return normalized_X
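
For example, this min-max scaling maps the smallest input to 0 and the largest to 1:

>>> normalization([2, 4, 6, 10])
[0.0, 0.25, 0.5, 1.0]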

def gradient_descent(X_train, Y_train, a, b, c, d):
    # hypothesis function: a*x**3 + b*x**2 + c*x + d
    n = 0
    max_iter = 20000
    alpha = 0.02
    epsilon = 1e-8
    error1 = 0
    error2 = 0

    while True:
        n += 1
        if n > max_iter:
            break

        for i in range(len(X_train)):  # one training sample at a time
            x = X_train[i]
            y = Y_train[i]
            diff = a*x**3 + b*x**2 + c*x + d - y  # residual, computed once per sample
            a -= alpha * diff * x**3
            b -= alpha * diff * x**2
            c -= alpha * diff * x
            d -= alpha * diff
            error2 += diff**2  # accumulate the squared error; averaged below

        if n % 1000 == 0:
            print('times:%d' % n)
            # difference between the mean errors of consecutive passes
            print('error:%f, train_accuracy:%f' % (abs(error2 - error1) / len(X_train),
                                                   calculate_accuracy(a, b, c, d, X_train, Y_train)))

        if abs(error2 - error1) < epsilon:  # stop once the error no longer changes
            break
        error1 = error2
        error2 = 0

    return a, b, c, d
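
The training loop above reports calculate_accuracy, but its definition is not shown in the original post. A minimal stand-in sketch, assuming "accuracy" here means the fraction of samples predicted within a small relative tolerance (the name, signature, and tol parameter are this sketch's assumptions):

def calculate_accuracy(a, b, c, d, X_train, Y_train, tol=0.05):
    # hypothetical helper: fraction of samples predicted within tol (5%) of the true value
    correct = 0
    for x, y in zip(X_train, Y_train):
        prediction = a*x**3 + b*x**2 + c*x + d
        if abs(prediction - y) <= tol * (abs(y) + 1e-12):
            correct += 1
    return correct / len(X_train)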

Pitfall: during gradient descent the interpreter warned that the numbers exceeded the representable range unless alpha was set very small. The cause: computing the gradient involves x**3, and the squared errors are accumulated on top of that, so intermediate values can overflow. The fix: normalize x before it enters the computation, which solves the problem.
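
Putting the pieces together, here is a minimal end-to-end sketch of the workflow the post describes (the true coefficients, noise level, and plotting code are my assumptions): simulate noisy samples of a known cubic, normalize x first to avoid the overflow just mentioned, then fit. Since min-max scaling is an affine map, the data remains exactly cubic in the normalized x, so the model can still fit it.

import random
import matplotlib.pyplot as plt

random.seed(0)
raw_X = [i * 0.1 for i in range(50)]  # raw inputs in [0, 4.9]
Y = [2*x**3 - 3*x**2 + x + 1 + random.gauss(0, 0.5) for x in raw_X]  # assumed noisy cubic
X = normalization(raw_X)  # scale x into [0, 1] before fitting

a, b, c, d = gradient_descent(X, Y, 0, 0, 0, 0)  # start from zero coefficients
print('fitted coefficients: a=%f, b=%f, c=%f, d=%f' % (a, b, c, d))

plt.scatter(X, Y, s=10, label='simulated data')
plt.plot(X, [a*x**3 + b*x**2 + c*x + d for x in X], 'r', label='fitted curve')
plt.legend()
plt.show()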

The result looks like this:
[Figure 1: fitting result]

 
