Generalized Linear Models and Generalized Additive Models
Overview
Both the documentation and the code are heavily inspired by pyGLMnet.
The first thing I did was to separate all the calculations into new functions in a separate class called GLMCostFunction. At first I derived this class from the FirstOrderCostFunction class, but I later dropped that inheritance because it proved to be more of a hindrance than a help.
The original GLM class dealt with the gradient descent that actually minimizes the loss function. I made it derive from the IterativeMachine class, which makes it really convenient to write any iterative algorithm: there are just two functions, init_model() and iteration(), that you have to override, and your work is done. The train_machine() method defined in IterativeMachine takes care of everything by first calling init_model() and then calling iteration() a fixed number of times, or until the algorithm converges.
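A minimal standalone sketch of that pattern, with illustrative names rather than Shogun's exact API, looks roughly like this:

#include <cstdint>

// Illustrative stand-in for the IterativeMachine pattern described above;
// the real Shogun class has additional responsibilities.
class IterativeMachineSketch
{
public:
    virtual ~IterativeMachineSketch() = default;

    // train_machine() drives everything: init_model() once, then iteration()
    // until the iteration budget runs out or the model converges.
    void train_machine(int32_t max_iterations)
    {
        init_model();
        for (int32_t i = 0; i < max_iterations && !converged(); ++i)
            iteration();
    }

protected:
    // A subclass such as the GLM only has to override these two hooks.
    virtual void init_model() = 0; // set up weights, bias, internal state
    virtual void iteration() = 0;  // perform one optimization step
    virtual bool converged() const { return false; }
};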
This was the main pull request that I opened:
In the future, this GLM is supposed to support all the different kinds of distributions, like BINOMIAL, GAMMA, SOFTPLUS, PROBIT and POISSON.
I made an enum for this, which decides which distribution is being used:
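A sketch of what such an enum could look like (the name and exact member list here are illustrative, not necessarily what ended up in the Shogun codebase):

// Which distribution (and hence which link function and likelihood) the GLM uses.
enum class DistributionType
{
    POISSON,
    BINOMIAL,
    GAMMA,
    SOFTPLUS,
    PROBIT
};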
There were several things I learned along the way, like using cmath instead of math.h, as well as a lot of ways to write better C++ code, such as having parameterized constructors call the non-parameterized constructor to help with abstraction.
I was first using the
Also, Shogun's optimization classes have been really useful when optimizing the algorithm. There are entire classes for tasks like gradient descent updates or for managing constant and variable learning rates, and IterativeMachine itself for that matter. All of this really takes the load off the coder's shoulders, so he or she can focus on getting the code to work.
Another really neat thing about Shogun is linalg. Since C++ has no standard library for linear algebra, linalg makes it a breeze: it avoids the computation time of hand-written loops by running Eigen3 or ViennaCL in the background.
One point to note is that Shogun stores matrices with each row corresponding to one feature and each column to one example. This follows from its column-major ordering, which is the opposite of most Python libraries.
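As a small illustration of that layout (plain Eigen is used here just to show the convention described above):

#include <Eigen/Dense>

int main()
{
    const int num_features = 3;
    const int num_examples = 5;

    // features(i, j) = value of feature i for example j
    Eigen::MatrixXd features = Eigen::MatrixXd::Random(num_features, num_examples);

    // One training example is therefore one column, not one row.
    Eigen::VectorXd first_example = features.col(0);
    (void)first_example;

    return 0;
}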
Let’s see how the code works:
init_model()
There is a simple init_model() function which runs once before the iterations begin. All it does is take care of initializing the weights and the bias:
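A simplified standalone sketch of what init_model() boils down to (the member names are illustrative, and a plain seeded std::mt19937 stands in for Shogun's random-number machinery):

#include <Eigen/Dense>
#include <random>

struct PoissonGLMSketch
{
    Eigen::VectorXd m_w;   // one weight per feature
    double m_bias = 0.0;

    void init_model(int num_features, unsigned int seed = 42)
    {
        // Draw the initial weights from a standard normal distribution
        // and start the bias off at zero.
        std::mt19937 prng(seed);
        std::normal_distribution<double> dist(0.0, 1.0);

        m_w.resize(num_features);
        for (int i = 0; i < num_features; ++i)
            m_w[i] = dist(prng);
        m_bias = 0.0;
    }
};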
Here we are setting the weights to random values drawn from a distribution. That is where RandomMixin comes in, which is used when inheriting from the LinearMachine class.
iteration()
Let’s look at the iteration function. This function performs simple gradient descent:
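As a sketch, one step of that update written as a free function over Eigen types (a constant learning rate is assumed here; Shogun's learning-rate classes can take over that job):

#include <Eigen/Dense>

// One gradient-descent step: move the parameters a small step against the gradient.
void gradient_descent_step(Eigen::VectorXd& w, double& bias,
                           const Eigen::VectorXd& gradient_w,
                           double gradient_bias, double learning_rate)
{
    w -= learning_rate * gradient_w;
    bias -= learning_rate * gradient_bias;
}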
You may be wondering where we get our gradient_w and gradient_bias from; that is where all of the math comes in.
First, I defined a few helper functions to better abstract the code.
compute_z()
The compute_z function:
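A hedged sketch, again using plain Eigen instead of Shogun's linalg wrappers, with the feature matrix laid out as features x examples:

#include <Eigen/Dense>

// z_j = w . x_j + bias, where x_j is the j-th column of the feature matrix.
Eigen::VectorXd compute_z(const Eigen::MatrixXd& features,
                          const Eigen::VectorXd& w, double bias)
{
    Eigen::VectorXd z = features.transpose() * w; // one dot product per example
    z.array() += bias;                            // add the bias to every entry
    return z;
}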
non_linearity()
The function to implement the non-linearity:
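A sketch for the Poisson case, assuming the canonical log link (other distributions from the enum would plug in their own non-linearity here):

#include <Eigen/Dense>

// lambda_i = exp(z_i): the conditional mean of the Poisson model.
Eigen::VectorXd non_linearity(const Eigen::VectorXd& z)
{
    return z.array().exp().matrix();
}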
gradient_non_linearity()
And finally the gradient of this non-linearity:
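With the exponential link the derivative is the exponential again, so the sketch is almost identical:

#include <Eigen/Dense>

// d/dz exp(z) = exp(z)
Eigen::VectorXd gradient_non_linearity(const Eigen::VectorXd& z)
{
    return z.array().exp().matrix();
}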
With these simple functions out of the way, let’s get to the actual gradient calculation. We will be using the expression we derived in the previous blog post.
get_gradient()
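A sketch of the weight gradient for the Poisson/log-link case, reusing the pieces above; regularization terms are left out, and the gradient with respect to the bias mentioned next is the same expression with each x_i replaced by 1:

#include <Eigen/Dense>

// Mean negative log-likelihood gradient: (1/n) * sum_i (lambda_i - y_i) * x_i,
// written as a single matrix-vector product over the (features x examples) matrix.
Eigen::VectorXd get_gradient(const Eigen::MatrixXd& features,
                             const Eigen::VectorXd& labels,
                             const Eigen::VectorXd& w, double bias)
{
    Eigen::VectorXd z = features.transpose() * w;             // compute_z()
    z.array() += bias;
    const Eigen::VectorXd lambda = z.array().exp().matrix();  // non_linearity(z)
    const double n = static_cast<double>(labels.size());

    return features * (lambda - labels) / n;
}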
I wrote similar code for finding the gradient with respect to the bias; it's just that we don't have to worry about all the complex linear algebra and can simply deal with scalars.
apply_regression()
Finally comes the stage where, once the model is trained, you have to apply it to a dataset. This is done using the apply_regression() method:
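For the Poisson case this boils down to running compute_z() on the test data and pushing the result through the non-linearity; a standalone sketch:

#include <Eigen/Dense>

// Predicted Poisson means, one per example (column) of the test feature matrix.
Eigen::VectorXd apply_regression(const Eigen::MatrixXd& features,
                                 const Eigen::VectorXd& w, double bias)
{
    Eigen::VectorXd z = features.transpose() * w;
    z.array() += bias;
    return z.array().exp().matrix();
}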
This was how I implemented Poisson Regression in Shogun.
Translated from: https://medium.com/@tejsukhatme/generalized-linear-models-code-d7b26117a1b9