As is well known, "regression" in machine learning usually refers to predicting continuous values, but logistic regression is actually a classification model for discrete labels, evolved from linear regression.

As with linear regression, solving a logistic regression model means solving for the parameters w and b. The difference is that in Zhou Zhihua's textbook the parameters of the linear regression model are found by least squares (in plain terms, taking the derivative and finding the extremum), though gradient descent also works. This is why, when you call a Python package that fits by gradient descent, you have to set parameters such as epoch and learning_rate; if you solve it yourself with least squares there is no such hassle: a single closed-form formula yields the parameters directly.

The gradient is essentially a derivative: gradient descent moves against the gradient, the direction in which the function value decreases fastest. It applies to both linear and nonlinear models, but it can get stuck in a local optimum. Least squares does not, but least squares can only handle linear models.
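To make the contrast concrete, here is a minimal sketch of the closed-form least-squares solution for one-variable linear regression; note that no epoch or learning_rate is needed. The data, the class name `LeastSquaresDemo`, and the `fit` helper are illustrative, not taken from the original post.

```java
public class LeastSquaresDemo {
    // Closed-form least squares for y = w * x + b:
    //   w = sum((x_i - meanX) * (y_i - meanY)) / sum((x_i - meanX)^2)
    //   b = meanY - w * meanX
    static double[] fit(double[] x, double[] y) {
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;
        double num = 0, den = 0;
        for (int i = 0; i < x.length; i++) {
            num += (x[i] - meanX) * (y[i] - meanY);
            den += (x[i] - meanX) * (x[i] - meanX);
        }
        double w = num / den;
        return new double[]{w, meanY - w * meanX};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {3, 5, 7, 9, 11};   // exactly y = 2x + 1
        double[] wb = fit(x, y);
        System.out.println("w = " + wb[0] + ", b = " + wb[1]); // w = 2.0, b = 1.0
    }
}
```

One pass over the data, one formula, done: this is what "直接一个公式" means in practice.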
Code
The running example is a girl evaluating blind dates. She judges a man on four attributes: (1) height; (2) looks; (3) income; (4) job, each encoded as -1 (poor), 0 (average), or 1 (good). In the label set, 0 means she declines the date and 1 means she accepts.
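Before the full listing, the model itself can be summarized: logistic regression computes a linear score z = w·x + b and squashes it through the sigmoid σ(z) = 1/(1+e^(-z)) to get a probability in (0, 1); probabilities above 0.5 are classified as 1 (she accepts). A tiny sketch of just this squashing step (the class name `SigmoidDemo` is illustrative):

```java
public class SigmoidDemo {
    // Sigmoid maps any real score into the open interval (0, 1)
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        System.out.println(sigmoid(0));         // prints 0.5: a zero score is a coin flip
        System.out.println(sigmoid(5) > 0.5);   // prints true: positive score -> class 1
        System.out.println(sigmoid(-5) < 0.5);  // prints true: negative score -> class 0
    }
}
```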
public class logistic {
    public static void main(String[] args) {
        // Build the data set X and label set Y from the blind-date example.
        // The data set contains 19 samples; each sample has 4 attributes:
        // height, looks, income, job,
        // each taking the value -1, 0, or 1,
        // standing for poor, average, and good respectively.
        // Label 1 means the girl accepts the date; label 0 means she declines.
        // Training set:
        double[][] train_set = {{1, 1, 1, 1},
                {1, 1, 1, 0},
                {1, 1, 0, 1},
                {1, 0, 1, 1},
                {0, 1, 1, 1},
                {1, 1, 1, -1},
                {1, 1, -1, 1},
                {1, -1, 1, 1},
                {-1, 1, 1, 1},
                {0, 0, 0, 0},
                {0, 0, 0, 1},
                {0, 0, 1, 0},
                {0, 1, 0, 0},
                {1, 0, 0, 0},
                {-1, -1, -1, -1},
                {-1, -1, -1, 1},
                {-1, -1, 1, -1},
                {-1, 1, -1, -1},
                {1, -1, -1, -1}};
        double[] trainlabel_set = {1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0};
        // Construct the test set (kept commented out; it is never used below)
        // int[][] test_set = {{1, 0, 0, 1},
        //                     {0, 1, 1, 0},
        //                     {-1, 1, 0, -1},
        //                     {0, 1, -1, 1},
        //                     {-1, 0, 0, -1}};
        // Initialize the parameters
        double[] w = new double[train_set[0].length];
        double b = 0;
        double[] y_pre = log(train_set, trainlabel_set, w, b, 10000, 0.01);
        for (int i = 0; i < y_pre.length; i++) {
            System.out.println(y_pre[i]);
        }
    }
    // Matrix-vector product: sum[i] = x[i] . y
    public static double[] dot(double[][] x, double[] y) {
        double[] sum = new double[x.length];
        if (x[0].length == y.length) {
            for (int i = 0; i < x.length; i++) {
                for (int j = 0; j < x[i].length; j++) {
                    sum[i] = sum[i] + x[i][j] * y[j];
                }
            }
        } else {
            System.out.println("Input array dimensions do not match");
        }
        return sum;
    }
    // Sum of x[i][j] * y[i] over every row i and column j.
    // Note this collapses the result into a single scalar; the original
    // version also mixed up row and column indices and never reset its
    // running total, both fixed here.
    public static double matrix(double[][] x, double[] y) {
        double sum = 0;
        if (x.length == y.length) {
            for (int i = 0; i < x.length; i++) {
                for (int j = 0; j < x[i].length; j++) {
                    sum = sum + x[i][j] * y[i];
                }
            }
        } else {
            System.out.println("Input array dimensions do not match");
        }
        return sum;
    }
    // Sigmoid function: maps any real number into (0, 1)
    public static double sigmoid(double x) {
        return 1 / (1 + Math.exp(-x));
    }
    // Train logistic regression by gradient descent and return the
    // thresholded predictions on the training set.
    public static double[] log(double[][] trainset, double[] trainlabel, double[] w, double b, int epoch, double learning_rate) {
        double[] y_pre = new double[trainlabel.length];
        for (int iter_num = 0; iter_num < epoch; iter_num++) {
            int num_train = trainset.length;
            // Forward pass: y_hat[i] = sigmoid(w . x[i] + b);
            // the matrix product is hoisted out of the per-sample loop
            double[] z = dot(trainset, w);
            double[] y_hat = new double[trainlabel.length];
            for (int i = 0; i < y_hat.length; i++) {
                y_hat[i] = sigmoid(z[i] + b);
            }
            // Mean cross-entropy loss; -1.0 avoids the integer division
            // (-1 / num_train == 0) that silently zeroed every loss term
            double loss = 0;
            for (int i = 0; i < trainlabel.length; i++) {
                loss += trainlabel[i] * Math.log(y_hat[i]) + (1 - trainlabel[i]) * Math.log(1 - y_hat[i]);
            }
            loss = -1.0 / num_train * loss;
            double[] y = new double[trainlabel.length];
            for (int i = 0; i < y.length; i++) {
                y[i] = y_hat[i] - trainlabel[i];
            }
            // Gradient: dw[j] = (1/m) * sum_i trainset[i][j] * (y_hat[i] - y[i]),
            // one component per weight (the original collapsed this into a
            // single scalar shared by every weight)
            double[] dw = new double[w.length];
            for (int j = 0; j < w.length; j++) {
                for (int i = 0; i < num_train; i++) {
                    dw[j] += trainset[i][j] * y[i];
                }
                dw[j] /= num_train;
            }
            double db = 0;
            for (int i = 0; i < y_hat.length; i++) {
                db = db + y_hat[i] - trainlabel[i];
            }
            db = db / num_train;
            System.out.println("Iteration " + (iter_num + 1) + " begins:");
            // Parameter update
            for (int j = 0; j < w.length; j++) {
                w[j] = w[j] - learning_rate * dw[j];
            }
            b = b - learning_rate * db;
            System.out.println("Iteration " + (iter_num + 1) + " dw: " + java.util.Arrays.toString(dw));
            System.out.println("Iteration " + (iter_num + 1) + " db: " + db);
            // Recompute predictions with the updated parameters
            // (the unused params array from the original is dropped)
            double[] r = dot(trainset, w);
            for (int i = 0; i < y_hat.length; i++) {
                y_pre[i] = sigmoid(r[i] + b);
            }
            // Threshold the probabilities at 0.5 to get class labels
            for (int i = 0; i < y_pre.length; i++) {
                if (y_pre[i] > 0.5) {
                    y_pre[i] = 1;
                } else {
                    y_pre[i] = 0;
                }
            }
}
return y_pre;
}
}
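Once w and b have been learned, scoring a new candidate is just one dot product, one sigmoid, and one threshold. A minimal sketch of that final step; the weights, the `predict` helper, and the class name `PredictDemo` are illustrative assumptions, not the trained values produced by the code above:

```java
public class PredictDemo {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Score one candidate with learned weights w and bias b,
    // then threshold the probability at 0.5
    static int predict(double[] x, double[] w, double b) {
        double z = b;
        for (int i = 0; i < x.length; i++) {
            z += w[i] * x[i];
        }
        return sigmoid(z) > 0.5 ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] w = {2.0, 1.5, 1.8, 1.2};     // made-up weights for illustration
        double b = -1.0;
        double[] goodMatch = {1, 1, 1, 1};      // tall, handsome, rich, good job
        double[] poorMatch = {-1, -1, -1, -1};
        System.out.println(predict(goodMatch, w, b)); // prints 1: she accepts
        System.out.println(predict(poorMatch, w, b)); // prints 0: she declines
    }
}
```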
With learning_rate set to 0.01 and epoch set to 10000, part of the final results is shown below:
A C implementation is also available; leave a comment if you need it.