Notes on the gradient chain rule for PyTorch fully connected layers

Note: this article is only a summary and my own reasoning after reading many blogs online; it does not claim to be authoritative.

The Jacobian matrix

When a row vector $X_{1\times n}$ is right-multiplied by $W_{n\times m}$ to produce the vector $Y_{1\times m}$, the partial derivative (Jacobian) of $Y$ with respect to $X$ is $W^T_{m\times n}$.
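To see why, write out one component (under the convention used here, where the Jacobian of a $1\times m$ output with respect to a $1\times n$ input is $m\times n$):

$Y_j = \sum_{i=1}^{n} X_i W_{ij} \;\Rightarrow\; \frac{\partial Y_j}{\partial X_i} = W_{ij} \;\Rightarrow\; \frac{\partial Y}{\partial X} = W^T_{m\times n}$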
Jacobian matrices satisfy the chain rule.
Note, however, that this holds only when every derivative in the chain is taken between vectors; once we differentiate with respect to a matrix, the plain chain rule no longer applies.

An example illustrating the Jacobian chain rule:
$X_{1\times n}\cdot W^1_{n\times m}=Y_{1\times m} \quad (1)$
$Y_{1\times m}\cdot W^2_{m\times p}=Z_{1\times p} \quad (2)$
The chain rule says the following holds:
$dZ/dX = dZ/dY \cdot dY/dX = W^{2T}_{p\times m}\cdot W^{1T}_{m\times n}$

This can be verified by substituting (1) into (2):
$X_{1\times n}\cdot W^1_{n\times m}\cdot W^2_{m\times p}=Z_{1\times p}$

By the definition of the Jacobian, $dZ/dX = W^{2T}_{p\times m}\cdot W^{1T}_{m\times n}$, which matches the chain-rule result.

PyTorch verification code follows. Once the gradient of a vector is computed, it is meant to be scaled by a learning rate and subtracted directly from that vector during learning, so the gradient must have the same shape as the vector itself. The simplest way to arrange this is to seed the chain rule with a 1×(output dim) Jacobian of ones, which is exactly the `loss.backward(torch.ones_like(loss))` call in the PyTorch code.
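Put differently (a standard fact about reverse-mode autodiff), `backward(v)` computes a vector–Jacobian product: with a seed $v_{1\times p}$ and output $Z_{1\times p}$,

$\texttt{x.grad} = v \cdot \frac{dZ}{dX} = v \cdot W^{2T}_{p\times m} \cdot W^{1T}_{m\times n}$

and seeding with all ones, as below, simply sums the rows of the Jacobian.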

# torch grad test
# -------------------------------------------
import numpy as np
import torch

input_dim = 2
output_dim1 = 3
output_dim2 = 2

x = torch.randn(input_dim, requires_grad=True)
w1 = torch.randn((input_dim, output_dim1), requires_grad=True)
w2 = torch.randn((output_dim1, output_dim2), requires_grad=True)
label = torch.randn(output_dim2)
y1 = x.matmul(w1)
y1.retain_grad()
y2 = y1.matmul(w2)
loss = y2 - label
loss.backward(torch.ones_like(loss))
print('y1.grad: ', y1.grad)
print('our y1.grad: ', np.ones_like(loss.detach().numpy()).dot(w2.detach().numpy().T))
print('x.grad: ', x.grad)
print('our x.grad: ', np.ones_like(loss.detach().numpy()).dot(w2.detach().numpy().T).dot(w1.detach().numpy().T))

--------------------output------------------------------
y1.grad:  tensor([-2.8673, -0.0941, -1.0313])
our y1.grad:  [-2.8672557  -0.09413806 -1.0312822 ]
x.grad:  tensor([-5.1732,  2.5583])
our x.grad:  [-5.17317    2.5583477]

Differentiating with respect to the weights

With the above in place, note what happens when the loss is differentiated with respect to some layer's weights: we first apply the Jacobian chain rule to get the derivative with respect to that layer's output, and then differentiate the layer's output with respect to its weights. Strictly speaking, this last step is no longer the vector chain rule.

$X_{1\times n}\cdot W^1_{n\times m}=Y_{1\times m} \quad (1)$

From (1) we compute $dY/dW^1$.
Since the gradient must match the shape of the weights, it is easy to see that $dY/dW = [X^T, X^T, \dots, X^T]_{n\times m}$, i.e. $X^T$ repeated as all $m$ columns.

When the gradient with respect to the weights is reached via backpropagation, say the network output is $Z_{1\times p}$, then $dZ/dY = M_{1\times m}$ (here the chain rule has already been seeded with the loss-shaped ones matrix).

Then $dZ/dW^1 = [X^T, X^T, \dots, X^T]_{n\times m} * M_{n\times m}$

Here $M_{n\times m}$ denotes the matrix built by stacking $M_{1\times m}$ as $n$ identical rows, and $*$ is an elementwise product, not a matrix multiplication.
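Equivalently, this stacked elementwise product is nothing but an outer product, which is the compact form usually written for this gradient:

$\frac{dZ}{dW^1} = X^T_{n\times 1} \cdot M_{1\times m}$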

The PyTorch verification code is as follows.

# torch grad test
# -------------------------------------------
import numpy as np
import torch

input_dim = 2
output_dim1 = 3
output_dim2 = 2

x = torch.randn(input_dim, requires_grad=True)
w1 = torch.randn((input_dim, output_dim1), requires_grad=True)
w2 = torch.randn((output_dim1, output_dim2), requires_grad=True)
label = torch.randn(output_dim2)
y1 = x.matmul(w1)
y1.retain_grad()
y2 = y1.matmul(w2)
loss = y2 - label
loss.backward(torch.ones_like(loss))
print('w2.grad: ', w2.grad)
last_jac = np.ones_like(loss.detach().numpy())
last_jacs = last_jac
for i in range(w2.size()[0]-1):
    last_jacs = np.vstack((last_jacs, last_jac))
input_array = y1.detach().numpy().reshape(-1, 1)
input_arrays = input_array
for i in range(w2.size()[1]-1):
    input_arrays = np.hstack((input_arrays, input_array))
print('our w2.grad: ', input_arrays * last_jacs)
print('y1.grad: ', y1.grad)
print('our y1.grad: ', np.ones_like(loss.detach().numpy()).dot(w2.detach().numpy().T))
print('w1.grad: ', w1.grad)
last_jac = y1.grad.numpy()
last_jacs = last_jac
for i in range(w1.size()[0]-1):
    last_jacs = np.vstack((last_jacs, last_jac))
input_array = x.detach().numpy().reshape(-1, 1)
input_arrays = input_array
for i in range(w1.size()[1]-1):
    input_arrays = np.hstack((input_arrays, input_array))
print('our w1.grad: ', input_arrays * last_jacs)
print('x.grad: ', x.grad)
print('our x.grad: ', np.ones_like(loss.detach().numpy()).dot(w2.detach().numpy().T).dot(w1.detach().numpy().T))

--------------------------output----------------------------
w2.grad:  tensor([[ 0.2581,  0.2581],
        [-0.4644, -0.4644],
        [-1.0143, -1.0143]])
our w2.grad:  [[ 0.2581184   0.2581184 ]
 [-0.46440646 -0.46440646]
 [-1.0143005  -1.0143005 ]]
y1.grad:  tensor([ 0.5938, -0.6923,  1.2869])
our y1.grad:  [ 0.59384376 -0.6922908   1.2869087 ]
w1.grad:  tensor([[-0.3074,  0.3583, -0.6661],
        [ 0.3067, -0.3576,  0.6647]])
our w1.grad:  [[-0.3073509   0.35830334 -0.66605496]
 [ 0.3067317  -0.35758147  0.6647131 ]]
x.grad:  tensor([2.7224, 1.1200])
our x.grad:  [2.7224424 1.12001  ]
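As an aside, the vstack/hstack loops above can be collapsed into a single `np.outer` call; a minimal sketch of the same two weight-gradient checks, reusing the tensors defined above:

w2_up = np.ones_like(loss.detach().numpy())                   # upstream seed dL/dy2, shape (2,)
print('our w2.grad: ', np.outer(y1.detach().numpy(), w2_up))  # (3,1)x(1,2) -> (3,2)
w1_up = y1.grad.numpy()                                       # upstream dL/dy1, shape (3,)
print('our w1.grad: ', np.outer(x.detach().numpy(), w1_up))   # (2,1)x(1,3) -> (2,3)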

A simple autograd implementation in Python + NumPy

Having read the PyTorch source and the autograd example provided in the official PyTorch tutorial, my inferred guess at the structure of PyTorch's autograd is as follows.

Basic principle: PyTorch records every operation performed during the forward pass, then during backpropagation applies the chain rule to obtain the derivative with respect to each parameter.

A network is in essence just a pile of operations; the primitives include addition, multiplication, sum (dimension-reducing) and expand (dimension-raising). In theory these alone suffice for the forward and backward passes. In practice matrix operations can be accelerated, so many higher-level ops are wrapped around the primitives, partly for speed and partly to give users a richer, more convenient API.
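As a small illustration of that point (my own example, not from the tutorial), a vector–matrix product can be rebuilt from broadcasting, elementwise multiply, and sum alone:

import torch
x = torch.randn(4)
w = torch.randn(4, 5)
# x @ w decomposed into primitives: broadcast x across columns, multiply, sum over the input dim
y = (x.unsqueeze(1) * w).sum(dim=0)
assert torch.allclose(y, x.matmul(w))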

Following this principle, we first need a tape that records the forward operations. The tape stores each operation's inputs, outputs, and a function that applies the chain rule; a plain list works, appended in execution order.
At backprop time, `loss.backward` in the PyTorch source calls the C++ autograd engine. Judging from the provided sample code, every differentiable PyTorch function contains something like the example's `propagate` function: a closure defined inside `forward` that takes the gradient flowing back from the layer above and combines it with the op's local derivatives with respect to each input (how they combine depends on the op; it may be an elementwise or a matrix product, cf. the derivations for vectors and weights in the two sections above).
The tape is then walked in reverse; for each entry, the corresponding `propagate` produces the gradients, and gradients arriving at the same variable are summed (the multivariate chain rule also requires this sum: with y = a + b and c = y*b, dc/db = dy/db * b + db/db * y).
Finally, some structure records the gradient of each parameter.

In summary, my guess is that `loss.backward()` invokes the autograd engine, which walks the tape in reverse and, via each differentiable function's propagate-like closure, performs the automatic differentiation, saving the results into the `tensor.grad` attribute.

Of course PyTorch's actual autograd machinery is far more complex than this; it is only a simplified mental model.

#!/usr/bin/env python3.6
# -*- coding:utf-8 -*-

import torch
from typing import List, NamedTuple, Callable, Dict, Optional


_name: int = 0
def fresh_name() -> str:
    """ create a new unique name for a variable: v0, v1, v2 """
    global _name
    r = f'v{_name}'
    _name += 1
    return r


class Variable:
    def __init__(self, value: torch.Tensor, name: str = None):
        self.value = value
        self.name = name or fresh_name()
        print(f'{self.name} = {value}')

    # We need to start with some tensors whose values were not computed
    # inside the autograd. This function constructs leaf nodes.
    @staticmethod
    def constant(value: torch.Tensor, name: str = None):
        r = Variable(value, name)
        print(f'{r.name} = {value}')
        return r

    def __repr__(self):
        return repr(self.value)

    # This performs a pointwise multiplication of a Variable, tracking gradients
    def __mul__(self, rhs: 'Variable') -> 'Variable':
        # defined later in the notebook
        return operator_mul(self, rhs)

    def __add__(self, rhs: 'Variable') -> 'Variable':
        return operator_add(self, rhs)

    def sum(self, name: Optional[str] = None) -> 'Variable':
        return operator_sum(self, name)

    def expand(self, sizes: List[int]) -> 'Variable':
        return operator_expand(self, sizes)


class TapeEntry(NamedTuple):
    # names of the inputs to the original computation
    inputs : List[str]
    # names of the outputs of the original computation
    outputs: List[str]
    # apply chain rule
    propagate: 'Callable[[List[Variable]], List[Variable]]'


gradient_tape : List[TapeEntry] = []


def reset_tape():
  gradient_tape.clear()
  global _name
  _name = 0 # reset variable names too to keep them small.


def operator_mul(self: Variable, rhs: Variable) -> Variable:
    if isinstance(rhs, float) and rhs == 1.0:
        # peephole optimization
        return self

    # define forward
    r = Variable(self.value * rhs.value)
    print(f'{r.name} = {self.name} * {rhs.name}')

    # record what the inputs and outputs of the op were
    inputs = [self.name, rhs.name]
    outputs = [r.name]

    # define backprop
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs

        dr_dself = rhs  # partial derivative of r = self*rhs
        dr_drhs = self  # partial derivative of r = self*rhs

        # chain rule propagation from outputs to inputs of multiply
        dL_dself = dL_dr * dr_dself
        dL_drhs = dL_dr * dr_drhs
        dL_dinputs = [dL_dself, dL_drhs]
        return dL_dinputs

    # finally, we record the compute we did on the tape
    gradient_tape.append(TapeEntry(inputs=inputs, outputs=outputs, propagate=propagate))
    return r


def operator_add(self : Variable, rhs: Variable) -> Variable:
    # Add follows a similar pattern to Mul, but it doesn't end up
    # capturing any variables.
    r = Variable(self.value + rhs.value)
    print(f'{r.name} = {self.name} + {rhs.name}')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        dr_dself = 1.0
        dr_drhs = 1.0
        dL_dself = dL_dr * dr_dself
        dL_drhs = dL_dr * dr_drhs
        return [dL_dself, dL_drhs]
    gradient_tape.append(TapeEntry(inputs=[self.name, rhs.name], outputs=[r.name], propagate=propagate))
    return r


# sum is used to turn our matrices into a single scalar to get a loss.
# expand is the backward of sum, so it is added to make sure our Variable
# is closed under differentiation. Both have rules similar to mul above.
def operator_sum(self: Variable, name: Optional[str]) -> 'Variable':
    r = Variable(torch.sum(self.value), name=name)
    print(f'{r.name} = {self.name}.sum()')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        size = self.value.size()
        return [dL_dr.expand(*size)]
    gradient_tape.append(TapeEntry(inputs=[self.name], outputs=[r.name], propagate=propagate))
    return r


def operator_expand(self: Variable, sizes: List[int]) -> 'Variable':
    assert(self.value.dim() == 0) # only works for scalars
    r = Variable(self.value.expand(sizes))
    print(f'{r.name} = {self.name}.expand({sizes})')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        return [dL_dr.sum()]
    gradient_tape.append(TapeEntry(inputs=[self.name], outputs=[r.name], propagate=propagate))
    return r


def grad(L, desired_results: List[Variable]) -> List[Variable]:
    # this map holds dL/dX for all values X
    dL_d : Dict[str, Variable] = {}
    # It starts by initializing the 'seed' dL/dL, which is 1
    dL_d[L.name] = Variable(torch.ones(()))
    print(f'd{L.name} ------------------------')

    # look up dL_dentries. If a variable is never used to compute the loss,
    # we consider its gradient None, see the note below about zeros for more information.
    def gather_grad(entries: List[str]):
        # When a partial derivative is zero there is no point wasting compute propagating it further,
        # so the backward pass first checks whether the upstream partial even exists.
        # Testing a large matrix for all-zeros is slower than testing for None,
        # so returning None for absent gradients is the more efficient choice.
        return [dL_d[entry] if entry in dL_d else None for entry in entries]

    # propagate the gradient information backward
    for entry in reversed(gradient_tape):
        dL_doutputs = gather_grad(entry.outputs)
        if all(dL_doutput is None for dL_doutput in dL_doutputs):
            # Suppose we have:
            #   c = a * b
            #   e = a + f
            # Then de/db is, in principle, an all-zero partial derivative.
            # Walk through the backward pass as coded:
            # step 1 yields de/da and de/df;
            # step 2 first checks whether de/dc exists -- step 1 never produced it, so it does not.
            # In principle de/dc is all zeros, so propagating it back would make de/db all zeros too,
            # but propagating a known-zero gradient is wasted compute.
            # Hence the existence check; and since testing a tensor for all-zeros costs more than
            # testing for None, None is returned instead of an explicit zero tensor.
            # optimize for the case where some gradient pathways are zero. See
            # The note below for more details.
            continue

        # perform chain rule propagation specific to each compute
        dL_dinputs = entry.propagate(dL_doutputs)

        # Accumulate the gradient produced for each input.
        # Each use of a variable produces some gradient dL_dinput for that
        # use. The multivariate chain rule tells us it is safe to sum
        # all the contributions together.
        for input, dL_dinput in zip(entry.inputs, dL_dinputs):
            if input not in dL_d:
                dL_d[input] = dL_dinput
            else:
                dL_d[input] += dL_dinput

    # print some information to understand the values of each intermediate
    for name, value in dL_d.items():
        print(f'd{L.name}_d{name} = {value.name}')
    print(f'------------------------')

    return gather_grad(desired.name for desired in desired_results)


a_global, b_global = torch.rand(4), torch.rand(4)

def simple(a, b):
    t = a + b
    return t * b

reset_tape() # reset any compute from other cells
a = Variable.constant(a_global, name='a')
b = Variable.constant(b_global, name='b')
loss = simple(a, b)
da, db = grad(loss, [a, b])
print("da", da)
print("db", db)

Note to self: pay attention to my own comments inside the code above.

A fully connected layer in Python + NumPy

Building on the simple autograd framework above, a little analysis shows that backprop for a fully connected layer only requires implementing one more op: the backward rule for matrix multiplication.
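Concretely, for $Y_{1\times m} = X_{1\times n} \cdot W_{n\times m}$ with upstream gradient $G_{1\times m} = dL/dY$, the two backward rules needed (and implemented in `operator_vectorMatMul` below) are:

$\frac{dL}{dX} = G \cdot W^T_{m\times n}, \qquad \frac{dL}{dW} = X^T_{n\times 1} \cdot G_{1\times m}$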

Next, the hand-rolled fully connected layers are used to actually train a nonlinear model to fit the sin function. This additionally requires a nonlinear activation and a loss: for gradient simplicity ReLU is chosen as the activation and MSE as the loss, and backward rules (automatic differentiation) are implemented for both.

With network backprop in place, an optimizer is still needed. The simplest SGD is used, and the simplest single-sample variant at that.
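For reference, single-sample SGD is just the update

$w \leftarrow w - \eta \, \frac{\partial L}{\partial w}$

applied after every individual training example, which is exactly what `SGDOptimizer.step` below does.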

Once implemented, the network is trained and compared against PyTorch.

Weight initialization matters a great deal here. SGD also performs poorly on this problem: it is unstable and demands careful learning-rate control (Adam would do better), and with bad initialization the network fails to train at all. The default initialization of PyTorch's fully connected layers is adopted.

PyTorch code
Note that for the comparison experiment, batch_size=1, shuffle=False, and the SGD optimizer are used.

import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
from torch.utils.data import TensorDataset, DataLoader


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1, 10)
        self.fc2 = nn.Linear(10, 100)
        self.fc3 = nn.Linear(100, 10)
        self.fc4 = nn.Linear(10, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        x = F.relu(x)
        x = self.fc4(x)

        return x


if __name__ == '__main__':
    net = Net()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
    schedule = torch.optim.lr_scheduler.StepLR(optimizer, step_size=500, gamma=0.1)
    losses = []
    x = np.linspace(-10, 10, 500)
    x = x.reshape(-1, 1)
    y = np.sin(x)
    y = y.reshape(-1, 1)
    x = np.asarray(x, np.float32)
    x = torch.from_numpy(x)
    y = np.asarray(y, np.float32)
    label = torch.from_numpy(y)
    dataset = TensorDataset(x, label)
    dataloader = DataLoader(dataset, batch_size=1, shuffle=False)
    for i in range(10):
        for input, output in dataloader:
            pred = net(input)
            loss = F.mse_loss(pred, output)
            losses.append(loss.item())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        schedule.step()
    plt.plot(losses)
    plt.show()
    plt.plot(x.numpy(), y, 'g')
    plt.plot(x.numpy(), net(x).detach().numpy(), 'r')

    plt.show()

The results are as follows.
[Figure 1: the training-loss curve, and the fitted curve (red) against sin (green)]
The hand-rolled fully connected layer adopts the autograd framework from the PyTorch tutorial: the required ops and their backward rules are implemented on top of the original framework, and then the FC layer and the optimizer are built.

simplenaiveautogradimplementation.py

#!/usr/bin/env python3.6
# -*- coding:utf-8 -*-

import torch
import numpy as np
from typing import List, NamedTuple, Callable, Dict, Optional


_name: int = 0
def fresh_name() -> str:
    """ create a new unique name for a variable: v0, v1, v2 """
    global _name
    r = f'v{_name}'
    _name += 1
    return r


class Variable:
    def __init__(self, value: torch.Tensor, name: str = None):
        self.value = value
        self.name = name or fresh_name()
        print(f'{self.name} = {value}')

    # We need to start with some tensors whose values were not computed
    # inside the autograd. This function constructs leaf nodes.
    @staticmethod
    def constant(value: torch.Tensor, name: str = None):
        r = Variable(value, name)
        print(f'{r.name} = {value}')
        return r

    def __repr__(self):
        return repr(self.value)

    # This performs a pointwise multiplication of a Variable, tracking gradients
    def __mul__(self, rhs: 'Variable') -> 'Variable':
        # defined later in the notebook
        return operator_mul(self, rhs)

    def __add__(self, rhs: 'Variable') -> 'Variable':
        return operator_add(self, rhs)

    def sum(self, name: Optional[str] = None) -> 'Variable':
        return operator_sum(self, name)

    def expand(self, sizes: List[int]) -> 'Variable':
        return operator_expand(self, sizes)

    def vectorMatMul(self, weights):
        return operator_vectorMatMul(self, weights)

    def mseLoss(self, label):
        return mseLoss(self, label)


class TapeEntry(NamedTuple):
    # names of the inputs to the original computation
    inputs : List[str]
    # names of the outputs of the original computation
    outputs: List[str]
    # apply chain rule
    propagate: 'Callable[[List[Variable]], List[Variable]]'


gradient_tape : List[TapeEntry] = []


def reset_tape():
    gradient_tape.clear()
    global _name
    _name = 0 # reset variable names too to keep them small.


def operator_mul(self: Variable, rhs: Variable) -> Variable:
    if isinstance(rhs, float) and rhs == 1.0:
        # peephole optimization
        return self

    # define forward
    r = Variable(self.value * rhs.value)
    print(f'{r.name} = {self.name} * {rhs.name}')

    # record what the inputs and outputs of the op were
    inputs = [self.name, rhs.name]
    outputs = [r.name]

    # define backprop
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs

        dr_dself = rhs  # partial derivative of r = self*rhs
        dr_drhs = self  # partial derivative of r = self*rhs

        # chain rule propagation from outputs to inputs of multiply
        dL_dself = dL_dr * dr_dself
        dL_drhs = dL_dr * dr_drhs
        dL_dinputs = [dL_dself, dL_drhs]
        return dL_dinputs

    # finally, we record the compute we did on the tape
    gradient_tape.append(TapeEntry(inputs=inputs, outputs=outputs, propagate=propagate))
    return r


def operator_add(self : Variable, rhs: Variable) -> Variable:
    # Add follows a similar pattern to Mul, but it doesn't end up
    # capturing any variables.
    r = Variable(self.value + rhs.value)
    print(f'{r.name} = {self.name} + {rhs.name}')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        dr_dself = 1.0
        dr_drhs = 1.0
        dL_dself = dL_dr * dr_dself
        dL_drhs = dL_dr * dr_drhs
        return [dL_dself, dL_drhs]
    gradient_tape.append(TapeEntry(inputs=[self.name, rhs.name], outputs=[r.name], propagate=propagate))
    return r


def operator_vectorMatMul(self, weights):
    assert self.value.ndim == 1
    assert self.value.size()[0] == weights.value.size()[0]
    r = Variable(torch.from_numpy(self.value.numpy().dot(weights.value.numpy())))
    print(f'{r.name} = {self.name}.dot({weights.name})')
    def propagate(dL_doutputs):
        dL_dr, = dL_doutputs
        dL_dself_val = dL_dr.value.numpy().dot(weights.value.numpy().T)
        dL_dself = Variable(torch.from_numpy(dL_dself_val))
        last_jac = dL_dr.value.numpy()
        last_jacs = last_jac
        for i in range(weights.value.size()[0] - 1):
            last_jacs = np.vstack((last_jacs, last_jac))
        input_array = self.value.numpy().reshape(-1, 1)
        input_arrays = input_array
        for i in range(weights.value.size()[1] - 1):
            input_arrays = np.hstack((input_arrays, input_array))
        dL_dweights_val = torch.from_numpy(input_arrays * last_jacs)
        dL_dweights = Variable(dL_dweights_val)
        return [dL_dself, dL_dweights]
    gradient_tape.append(TapeEntry(inputs=[self.name, weights.name], outputs=[r.name], propagate=propagate))
    return r


def mseLoss(self, label):
    r_val = torch.from_numpy(self.value.numpy() - label.value.numpy()) ** 2
    r = Variable(r_val)
    print(f"{r.name} = {self.name}.mseLoss({label.name})")
    def propagate(dL_doutputs):
        dL_dr, = dL_doutputs
        dL_dself_val = dL_dr.value.numpy() * 2 * (self.value.numpy() - label.value.numpy())
        dL_dself = Variable(torch.from_numpy(dL_dself_val))
        dL_dlabel_val = torch.from_numpy(dL_dr.value.numpy() * -2 * (self.value.numpy() - label.value.numpy()))
        dL_dlabel = Variable(dL_dlabel_val)
        return [dL_dself, dL_dlabel]
    gradient_tape.append(TapeEntry(inputs=[self.name, label.name], outputs=[r.name], propagate=propagate))
    return r


def relu(self):
    # copy: .numpy() shares memory with the tensor, and mutating it in place
    # would corrupt the forward value recorded for the previous op
    val = self.value.numpy().copy()
    val[val < 0] = 0
    r_val = torch.from_numpy(val)
    r = Variable(r_val)
    print(f"{r.name} = relu({self.name})")
    def propagate(dL_doutputs):
        dL_dr, = dL_doutputs
        dL_dself_val = torch.from_numpy(dL_dr.value.numpy() * (np.asarray(self.value.numpy() > 0, np.float32)))
        dL_dself = Variable(dL_dself_val)
        return [dL_dself]
    gradient_tape.append(TapeEntry(inputs=[self.name], outputs=[r.name], propagate=propagate))
    return r


# sum is used to turn our matrices into a single scalar to get a loss.
# expand is the backward of sum, so it is added to make sure our Variable
# is closed under differentiation. Both have rules similar to mul above.
def operator_sum(self: Variable, name: Optional[str]) -> 'Variable':
    r = Variable(torch.sum(self.value), name=name)
    print(f'{r.name} = {self.name}.sum()')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        size = self.value.size()
        return [dL_dr.expand(*size)]
    gradient_tape.append(TapeEntry(inputs=[self.name], outputs=[r.name], propagate=propagate))
    return r


def operator_expand(self: Variable, sizes: List[int]) -> 'Variable':
    assert(self.value.dim() == 0) # only works for scalars
    r = Variable(self.value.expand(sizes))
    print(f'{r.name} = {self.name}.expand({sizes})')
    def propagate(dL_doutputs: List[Variable]):
        dL_dr, = dL_doutputs
        return [dL_dr.sum()]
    gradient_tape.append(TapeEntry(inputs=[self.name], outputs=[r.name], propagate=propagate))
    return r


def grad(L, desired_results: List[Variable]) -> List[Variable]:
    # this map holds dL/dX for all values X
    dL_d : Dict[str, Variable] = {}
    # It starts by initializing the 'seed' dL/dL, which is 1
    dL_d[L.name] = Variable(torch.ones_like(L.value))
    print(f'd{L.name} ------------------------')

    # look up dL_dentries. If a variable is never used to compute the loss,
    # we consider its gradient None, see the note below about zeros for more information.
    def gather_grad(entries: List[str]):
        # When a partial derivative is zero there is no point wasting compute propagating it further,
        # so the backward pass first checks whether the upstream partial even exists.
        # Testing a large matrix for all-zeros is slower than testing for None,
        # so returning None for absent gradients is the more efficient choice.
        return [dL_d[entry] if entry in dL_d else None for entry in entries]

    # propagate the gradient information backward
    for entry in reversed(gradient_tape):
        dL_doutputs = gather_grad(entry.outputs)
        if all(dL_doutput is None for dL_doutput in dL_doutputs):
            # Suppose we have:
            #   c = a * b
            #   e = a + f
            # Then de/db is, in principle, an all-zero partial derivative.
            # Walk through the backward pass as coded:
            # step 1 yields de/da and de/df;
            # step 2 first checks whether de/dc exists -- step 1 never produced it, so it does not.
            # In principle de/dc is all zeros, so propagating it back would make de/db all zeros too,
            # but propagating a known-zero gradient is wasted compute.
            # Hence the existence check; and since testing a tensor for all-zeros costs more than
            # testing for None, None is returned instead of an explicit zero tensor.
            # optimize for the case where some gradient pathways are zero. See
            # The note below for more details.
            continue

        # perform chain rule propagation specific to each compute
        dL_dinputs = entry.propagate(dL_doutputs)

        # Accumulate the gradient produced for each input.
        # Each use of a variable produces some gradient dL_dinput for that
        # use. The multivariate chain rule tells us it is safe to sum
        # all the contributions together.
        for input, dL_dinput in zip(entry.inputs, dL_dinputs):
            if input not in dL_d:
                dL_d[input] = dL_dinput
            else:
                dL_d[input] += dL_dinput

    # print some information to understand the values of each intermediate
    for name, value in dL_d.items():
        print(f'd{L.name}_d{name} = {value.name}')
    print(f'------------------------')

    return gather_grad(desired.name for desired in desired_results)


if __name__ == '__main__':
    # a_global, b_global = torch.rand(4), torch.rand(4)
    #
    # def simple(a, b):
    #     t = a + b
    #     return t * b
    #
    # reset_tape() # reset any compute from other cells
    # a = Variable.constant(a_global, name='a')
    # b = Variable.constant(b_global, name='b')
    # loss = simple(a, b)
    # da, db = grad(loss, [a, b])
    # print("da", da)
    # print("db", db)

    a_val = torch.rand(4)
    weights_val = torch.rand((4, 5))
    bias_val = torch.rand(5)

    reset_tape()
    a = Variable(a_val)
    weights = Variable(weights_val)
    bias = Variable(bias_val)
    loss = (a.vectorMatMul(weights)) + bias
    da, dweights, dbias = grad(loss, [a, weights, bias])
    print('da: ', da)
    print('dweights: ', dweights)
    print('dbias: ', dbias)

simplenaivefc.py

#!/usr/bin/env python3.6
# -*- coding=utf-8 -*-


import sys
sys.path.append('..')
import torch
import numpy as np
from temp.simplenaiveautogradimplementation import gradient_tape, grad, Variable, reset_tape, relu
import matplotlib.pyplot as plt


class FCLayer(object):
    names = []
    def __init__(self, inputDim, outputDim, name):
        if name in FCLayer.names:
            raise Exception('duplicate layer name!')
        FCLayer.names.append(name)
        super(FCLayer, self).__init__()
        self.weights = Variable(torch.from_numpy(np.asarray(np.random.uniform(-1, 1, (inputDim, outputDim)) / np.sqrt(inputDim), dtype=np.float32)), name=name+"_weights")
        self.bias = Variable(torch.from_numpy(np.asarray(np.random.uniform(-1, 1, outputDim) / np.sqrt(inputDim), dtype=np.float32)), name=name+"_bias")

    def forward(self, x: Variable):
        return x.vectorMatMul(weights=self.weights) + self.bias


class SGDOptimizer(object):
    def __init__(self, params, lr=0.01):
        self._lr = lr
        self._params = params

    def setLr(self, lr):
        self._lr = lr

    def step(self, loss):
        grads = grad(loss, self._params)
        for param, gradient in zip(self._params, grads):
            new_val = param.value.numpy() - self._lr * gradient.value.numpy()
            param.value = torch.from_numpy(new_val)
            # print(f'--{param.name}----{param.name}----{param.name}----{param.name}----{param.name}----{param.name}--')
            # print(param.value)
            # print(f'--{param.name}----{param.name}----{param.name}----{param.name}----{param.name}----{param.name}--')


if __name__ == '__main__':
    fc_layer1 = FCLayer(1, 10, 'fc1')
    fc_layer2 = FCLayer(10, 100, 'fc2')
    fc_layer3 = FCLayer(100, 10, 'fc3')
    fc_layer4 = FCLayer(10, 1, 'fc4')
    optimizer = SGDOptimizer(params=[fc_layer1.weights, fc_layer1.bias, fc_layer2.weights, fc_layer2.bias,
                                     fc_layer3.weights, fc_layer3.bias, fc_layer4.weights, fc_layer4.bias])
    losses = []
    sample_num = 500
    x = np.linspace(-10, 10, sample_num)
    x = x.reshape(-1, 1)
    y = np.sin(x)
    y = y.reshape(-1, 1)
    x = np.asarray(x, np.float32)
    y = np.asarray(y, np.float32)
    for epoch in range(10):  # renamed from i to avoid shadowing the inner sample index
        for i in range(sample_num):
            reset_tape()
            cur_sample = x[i]
            cur_label = y[i]
            cur_sample = Variable(torch.from_numpy(cur_sample))
            cur_label = Variable(torch.from_numpy(cur_label))
            pred_y = fc_layer4.forward(relu(fc_layer3.forward(relu(fc_layer2.forward(relu(fc_layer1.forward(cur_sample)))))))
            loss = pred_y.mseLoss(cur_label)
            losses.append(loss.value)
            optimizer.step(loss)
    plt.plot(losses)
    plt.show()
    x = np.linspace(-10, 10, sample_num)
    x = x.reshape(-1, 1)
    y = np.sin(x)
    y = y.reshape(-1, 1)
    x = np.asarray(x, np.float32)
    y = np.asarray(y, np.float32)
    plt.plot(x, y, 'g')
    # plt.show()
    pred_y = []
    for i in range(sample_num):
        cur_sample = Variable(torch.from_numpy(x[i]))
        pred_y.append(fc_layer4.forward(relu(
            fc_layer3.forward(relu(
                fc_layer2.forward(relu(
                    fc_layer1.forward(cur_sample)
                )
                )
            )
            )
        )
        ).value)
    plt.plot(x, pred_y, 'r')
    plt.show()

The results are as follows.
[Figure 2: the training-loss curve and the fitted curve for the hand-rolled version]
The result is nearly identical to the PyTorch version, which confirms the backward pass is correct. The result is of course not reproducible, since SGD converges somewhere different each run; this was one of the better-converged runs.

A more naive FC layer using NumPy + Python

The autograd above records gradient values in a dictionary, but I want to try storing the gradient directly in a `grad` attribute on the variable, the way PyTorch's Tensor does.
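A minimal sketch of that direction (my own illustration, not the eventual implementation): let each variable accumulate into its own `grad` attribute instead of the shared `dL_d` dictionary.

import torch

class GradVariable:
    def __init__(self, value: torch.Tensor):
        self.value = value
        self.grad = None  # populated during the backward pass, like Tensor.grad

    def accumulate_grad(self, g: torch.Tensor):
        # replaces the dL_d[name] bookkeeping: contributions from multiple uses
        # of this variable are summed directly on the variable itself
        self.grad = g if self.grad is None else self.grad + g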
