What Can a Dense Neural Network Learn? (Deep Learning Series)

This article demonstrates what a dense (fully connected) neural network can do through hypotheses, theoretical argument, and experimental results. The examples use Keras.

Assume a feature vector of the form [a, b, c]: the feature dimension is 3, and the feature values a, b, c are discrete integers. For example, an input feature vector is [1, 2, 4] and the output label is 0 or 1.
Our goal is to find out which logical relations can be learned. For example, we label every feature vector with a + b = 3 as 1. If a neural network Q can, after training, predict the output 1 for the input vector [2, 1, 4], we say the relation a + b = 3 can be learned by network Q.

So, what can a Dense network learn?
In what follows, every neuron uses the sigmoid activation function.
 
1. Fitting addition, subtraction, and division relations
Relations of the form a + b > 1, a − b < 3, or a / b > 5 can all be rewritten as x·a + y·b > z (for the division case, assuming b > 0). This is exactly what one Dense layer computes before its activation, so a single layer of connections suffices; no further explanation is needed.
Q definition: output layer: Dense(1, input_dim=2)
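As a sanity check that such a boundary fits inside one neuron, here is a minimal NumPy sketch with hand-picked (not trained) weights; the weight scale 20 is an arbitrary choice that just makes the sigmoid sharp:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One Dense(1) neuron: w·[a, b] + bias, then sigmoid.
# With w = (20, 20) and bias = -20, the output approximates 1[a + b > 1].
def neuron(a, b, w=(20.0, 20.0), bias=-20.0):
    return sigmoid(w[0] * a + w[1] * b + bias)

print(neuron(0.8, 0.9))  # a + b = 1.7 > 1 -> close to 1
print(neuron(0.1, 0.2))  # a + b = 0.3 < 1 -> close to 0
```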

2. AND/OR logic
Relations of the form a > 1 and b > 1, and AND/OR logic in general: the first hidden layer needs two neurons, fitting a > 1 (neuron 1) and b > 1 (neuron 2) respectively; the output neuron then fits (neuron 1) AND (neuron 2).
Q definition: hidden layer: Dense(2, input_dim=2)
             output layer: Dense(1)
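Hand-picked weights again show that this structure is enough. The sketch below uses thresholds of 0.5 (as in the modelData4 experiment further down) rather than 1; the weight scale 20 and the output bias −30 are arbitrary choices that make the AND fire only when both hidden neurons are near 1:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def and_net(a, b):
    # Hidden layer (Dense(2)): neuron 1 ~ 1[a > 0.5], neuron 2 ~ 1[b > 0.5].
    h1 = sigmoid(20.0 * (a - 0.5))
    h2 = sigmoid(20.0 * (b - 0.5))
    # Output layer (Dense(1)): 20*(h1 + h2) - 30 > 0 requires h1 + h2 > 1.5,
    # which only happens when both hidden neurons are close to 1.
    return sigmoid(20.0 * (h1 + h2) - 30.0)

print(and_net(0.9, 0.9))  # both conditions hold -> close to 1
print(and_net(0.9, 0.1))  # only one holds      -> close to 0
```

An OR gate only needs a different output bias, e.g. 20*(h1 + h2) − 10, which fires as soon as either hidden neuron is near 1.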
 
3. Fitting multiplication
In general a and b are continuous; for ease of analysis, assume a and b each take values in (1, 2, 3, 4, 5), and the relation to fit is a·b > 12. Multiplication looks like the most complex relation, but it actually needs only two layers of connections. I went through an experimental phase here, which I describe below.
 
Experimental hypothesis:
With a, b ∈ {1, …, 5}, the condition a·b > 12 holds for exactly 6 combinations: (3,5), (5,3), (4,4), (4,5), (5,4), (5,5). A combination such as (3,5) can be picked out by a linear expression, e.g. (1/3)·a + (1/5)·b = 2. Each such linear expression needs one neuron, so if some hidden layer has 6 neurons and each learns one of these expressions, the output neuron can reach the target by learning something like (hidden neuron 1) + (hidden neuron 2) + … + (hidden neuron 6) > 0.

The question then becomes: how can such a linear equality s = c be learned? A single linear transformation cannot learn it, because sigmoid is monotone; one hidden layer can only learn the logic of section 1. But the equality s = c can be rewritten as (s > c − ε) AND (s < c + ε) for some tiny ε. That is, the first hidden layer fits s > c − ε and s < c + ε, and the second hidden layer fits their AND; testing confirmed this works. By this reasoning, multiplication should need 3 layers of connections: the first hidden layer expresses the inequalities s > c − ε and s < c + ε for each combination; the second hidden layer expresses each equality s = c; and the output layer expresses (3,5) OR (5,3) OR (4,4) OR (4,5) OR (5,4) OR (5,5).
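The equality-as-two-inequalities trick can be checked with hand-picked weights. This NumPy sketch detects the band |a + b − 1| < ε, the same target as modelData7 in the code below; the scales k = 100 and 20 are arbitrary sharpness choices, not trained values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def band_net(a, b, eps=0.05, k=100.0):
    s = a + b
    # Hidden layer: neuron 1 ~ 1[s > 1 - eps], neuron 2 ~ 1[s < 1 + eps].
    h1 = sigmoid(k * (s - (1.0 - eps)))
    h2 = sigmoid(-k * (s - (1.0 + eps)))
    # Output layer: AND of the two inequalities ~ 1[|s - 1| < eps].
    return sigmoid(20.0 * (h1 + h2) - 30.0)

print(band_net(0.5, 0.5))  # a + b = 1.0, inside the band  -> close to 1
print(band_net(0.2, 0.2))  # a + b = 0.4, outside the band -> close to 0
print(band_net(0.9, 0.9))  # a + b = 1.8, outside the band -> close to 0
```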

Q definition: hidden layer: Dense(2 × 6 combinations, input_dim=2)
              hidden layer: Dense(6)
              output layer: Dense(1)


Experimental result:
However, when I actually trained, a single hidden layer plus an output layer was already enough: precision and recall were both above 99%. I then increased the number of features, from just a, b, c all the way to z, and precision and recall stayed very high.
Q definition: hidden layer: Dense(2)
              output layer: Dense(1)
This shows that a nonlinear relation of the form a·b > 12 can indeed be fitted by sigmoid(c3·sigmoid(c1·a + d1·b + e1) + c4·sigmoid(c2·a + d2·b + e2) + e3). As for why this works, it relates to homeomorphic transformations; I will cover that in a later post.
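One plausible reading of why Dense(2) + Dense(1) suffices: on [0,1]², the region a·b > 0.25 (the target of modelData6 below) is convex, since it equals log a + log b > log 0.25, so the AND of just two well-placed half-planes approximates it, e.g. the tangents to the hyperbola a·b = 0.25 at (0.25, 1) and (1, 0.25). A hand-weighted NumPy sketch of exactly the nested-sigmoid form above (weights chosen by hand, not trained; it misclassifies points near the curved boundary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def product_net(a, b):
    # Hidden layer (Dense(2)): two half-planes tangent to a*b = 0.25:
    #   a + 0.25*b > 0.5  (tangent at (0.25, 1))
    #   0.25*a + b > 0.5  (tangent at (1, 0.25))
    h1 = sigmoid(20.0 * (a + 0.25 * b - 0.5))
    h2 = sigmoid(20.0 * (0.25 * a + b - 0.5))
    # Output layer (Dense(1)): AND of the two half-planes.
    return sigmoid(20.0 * (h1 + h2) - 30.0)

print(product_net(0.8, 0.8))  # a*b = 0.64 > 0.25 -> close to 1
print(product_net(0.2, 0.2))  # a*b = 0.04 < 0.25 -> close to 0
print(product_net(0.9, 0.1))  # a*b = 0.09 < 0.25 -> close to 0
```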
 
 
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
samples=20000
features = 3
def makeData(label_fn):
    # Generate `samples` feature vectors with uniform values in [0, 1),
    # compute binary labels with label_fn, and split into train/test halves.
    np.random.seed(7)
    X = np.random.random((samples, features))
    Y = np.array([1 if label_fn(s) else 0 for s in X])
    split = samples // 2
    return X[:split], Y[:split], X[split:], Y[split:]

def modelData1():  # a > b
    return makeData(lambda s: s[0] > s[1])

def modelData2():  # a + b > 1
    return makeData(lambda s: s[0] + s[1] > 1)

def modelData3():  # a + b + c > 1.5
    return makeData(lambda s: s[0] + s[1] + s[2] > 1.5)

def modelData4():  # (a > 0.5) and (b > 0.5)
    return makeData(lambda s: (s[0] > 0.5) & (s[1] > 0.5))

def modelData5():  # a / b > 0.5
    return makeData(lambda s: s[0] / s[1] > 0.5)

def modelData6():  # a * b > 0.25
    return makeData(lambda s: s[0] * s[1] > 0.25)

def modelData7():  # 0.95 < a + b < 1.05
    return makeData(lambda s: (s[0] + s[1] > 0.95) & (s[0] + s[1] < 1.05))

def model1(data):
    x_train,y_train,x_test,y_test = data
    model = Sequential()
    model.add(Dense(1,input_shape=(x_train.shape[1],),activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
    return  model
def model2(data):
    x_train,y_train,x_test,y_test = data
    model = Sequential()
    model.add(Dense(32,input_shape=(x_train.shape[1],),activation='sigmoid'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
    return  model
def model3(data):
    x_train,y_train,x_test,y_test = data
    model = Sequential()
    model.add(Dense(128,input_shape=(x_train.shape[1],),activation='sigmoid'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
    return  model
def model4(data):
    x_train,y_train,x_test,y_test = data
    model = Sequential()
    model.add(Dense(2,input_shape=(x_train.shape[1],),activation='sigmoid'))
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
    return  model
def fitmode(model,data,weight):
    x_train, y_train, x_test, y_test = data
    model.fit(x_train,y_train,epochs=500,batch_size=32,verbose=2)
    model.save_weights(weight)
    return  model
def evaluatemode(model,data,weight):
    x_train, y_train, x_test, y_test = data
    model.load_weights(weight)
    result = model.predict(x_test[:20])
    for a,b in zip(result,y_test[:20]):
        print(a,np.round(a),b)
    result = model.predict(x_test)
    TP = 0
    FP = 0
    FN = 0
    TN = 0
    for a,b in zip(result,y_test):
        predict = int(np.round(a))
        real = int(b)
        if (predict==1)&(real==1):
            TP = TP + 1
        if (predict==1)&(real==0):
            FP = FP + 1
        if (predict == 0) & (real == 0):
            TN = TN + 1
        if (predict==0)&(real==1):
            FN = FN + 1
    print("positives:", TP + FN)
    print("negatives:", TN + FP)
    print("precision: (of the predicted positives, how many are truly positive)")
    # add 1 to avoid division by zero
    print((TP+1)/(TP+FP+1))
    print("recall: (of the true positives, how many were found)")
    print((TP+1)/(TP+FN+1))
def sigmoid(inX):
    return 1.0/(1+np.exp(-inX))

if __name__ == '__main__':
    pass
    #weight1 = "weights/testdense1.h5"
    #data = modelData1()
    #model = fitmode(model1(data),data,weight1)
    #evaluatemode(model1(data),data,weight1)

    #weight2 = "weights/testdense2.h5"
    #data = modelData2()
    #model = fitmode(model1(data),data,weight2)
    #evaluatemode(model1(data),data,weight2)

    # weight3 = "weights/testdense3.h5"
    # data = modelData3()
    # model = fitmode(model1(data),data,weight3)
    # evaluatemode(model1(data),data,weight3)

    # learns poorly
    #weight4 = "weights/testdense4.h5"
    #data = modelData4()
    #model = fitmode(model1(data),data,weight4)
    #evaluatemode(model1(data),data,weight4)

    # with more samples, it can learn to a decent level
    #weight5 = "weights/testdense5.h5"
    #data = modelData4()
    #model = fitmode(model2(data),data,weight5)
    #evaluatemode(model2(data),data,weight5)

    #weight6 = "weights/testdense6.h5"
    #data = modelData3()
    #model = fitmode(model1(data),data,weight6)
    #evaluatemode(model1(data),data,weight6)

    # learns poorly
    #weight7 = "weights/testdense7.h5"
    #data = modelData6()
    #model = fitmode(model1(data),data,weight7)
    #evaluatemode(model1(data),data,weight7)

    # learns well
    # precision: 0.99
    # recall: 0.99
    #weight8 = "weights/testdense8.h5"
    #data = modelData6()
    #model = fitmode(model3(data),data,weight8)
    #evaluatemode(model3(data),data,weight8)
    #print(model3(data).get_weights())

    # fails to learn
    # recall is 0
    #weight9 = "weights/testdense9.h5"
    #data = modelData7()
    #model = fitmode(model1(data),data,weight9)
    #evaluatemode(model1(data),data,weight9)

    # 992 positives, 9008 negatives
    # precision 0.92, recall 0.83
    #weight10 = "weights/testdense10.h5"
    #data = modelData7()
    #model = fitmode(model2(data),data,weight10)
    #evaluatemode(model2(data),data,weight10)

    # precision 0.99
    # recall 0.99
    #weight11 = "weights/testdense11.h5"
    #data = modelData6()
    #model = fitmode(model4(data),data,weight11)
    #evaluatemode(model4(data),data,weight11)



