Implementing the Box-Cox transformation with PyTorch

I've discussed the Box-Cox transformation before in an earlier post: 用pyspark学习《应用预测建模》(二)进一步讨论BoxCox变换 (littlehuangnan's blog on CSDN).

The key to the Box-Cox transformation is finding a suitable lambda. The article above walks through scipy's source code: scipy estimates lambda by maximizing the log-likelihood of the transformed data, with optimize.brent as the optimizer. This time, let's give it a shot with automatic differentiation plus gradient descent, haha.
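For reference, this is roughly what the scipy route looks like from the caller's side. A minimal, self-contained sketch; the small positive sample array is made up purely for illustration, and stats.boxcox fits lambda by maximizing the same log-likelihood internally.

import numpy as np
from scipy import stats

sample = np.array([1.0, 2.5, 3.7, 10.2, 55.0])  # hypothetical positive data, illustration only
transformed, lmbda = stats.boxcox(sample)       # lmbda is the maximum-likelihood estimate
print(lmbda)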

import pandas as pd
segmentationOriginal = pd.read_csv('Documents/segmentationOriginal.csv')
py_segmentationOriginal = segmentationOriginal[segmentationOriginal['Case']=='Train']

import torch

dtype = torch.float64  # double precision, to match numpy/scipy defaults
x = torch.tensor(py_segmentationOriginal['AreaCh1'].values, dtype=dtype)
a = torch.randn((), dtype=dtype, requires_grad=True)  # lambda, the parameter we will fit

def boxcox_llf(lmb, data):
    # PyTorch port of scipy's boxcox_llf; the numpy originals are kept as comments.
    N = data.size()[0]

    logdata = torch.log(data)  # np.log(data)

    # Compute the variance of the transformed data.
    if lmb == 0:
        variance = torch.var(logdata)  # np.var(logdata, axis=0)
    else:
        # Transform without the constant offset 1/lmb.  The offset does
        # not affect the variance, and subtracting it can lead to loss
        # of precision.
        variance = torch.var(data.pow(lmb) / lmb)  # np.var(data**lmb / lmb, axis=0)

    # (lmb - 1) * np.sum(logdata, axis=0) - N/2 * np.log(variance)
    return (lmb - 1) * logdata.sum() - N / 2 * torch.log(variance)

learning_rate = 1e-6
for t in range(20000):
    # Forward pass: evaluate the Box-Cox log-likelihood at the current lambda.
    llf = boxcox_llf(a, x)

    # We want to maximize the log-likelihood, so minimize its negative.
    # loss is a 0-dim tensor; loss.item() gets the scalar value it holds.
    loss = -llf
    if t % 100 == 99:
        print(t, loss.item())

    # Use autograd to compute the backward pass. This call computes the
    # gradient of loss with respect to all tensors with requires_grad=True,
    # i.e. a.grad will hold d(loss)/d(a).
    loss.backward()

    # Manually update the parameter using gradient descent. Wrap in
    # torch.no_grad() because a has requires_grad=True, but we don't need
    # to track the update itself in autograd.
    with torch.no_grad():
        a -= learning_rate * a.grad

        # Manually zero the gradient after updating.
        a.grad = None

print(a.item())  # the fitted lambda

After 20000 steps, lambda comes out to about -0.8647, which is basically consistent with scipy's result.
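With the fitted lambda in hand, applying the transform is just the textbook Box-Cox formula: (x^lambda - 1)/lambda for lambda != 0, and log(x) for lambda = 0. A minimal sketch reusing the x and a tensors from above; the 1e-8 threshold for treating lambda as zero is an arbitrary choice, and the last line is just one way to cross-check the fitted value against scipy.

with torch.no_grad():
    lmbda = a.item()
    if abs(lmbda) < 1e-8:
        # lambda ~ 0: the Box-Cox transform degenerates to log(x)
        x_transformed = torch.log(x)
    else:
        x_transformed = (x.pow(lmbda) - 1) / lmbda
print(x_transformed[:5])

# Cross-check with scipy (method='mle' maximizes the same log-likelihood):
from scipy import stats
print(stats.boxcox_normmax(py_segmentationOriginal['AreaCh1'].values, method='mle'))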
