To initialize a block or layer from NDArrays, I came up with a few approaches. Assume we have two arrays, weight and bias, and the layer to initialize is called conv.
- mx.init.Load()
conv.initialize(init=mx.init.Load({"weight": weight, "bias": bias}), ctx=ctx)
- mx.init.register() with a custom initializer
@mx.init.register
class myInitializer(mx.init.Initializer):
    def __init__(self, weight, bias):
        super(myInitializer, self).__init__()
        self.weight = weight
        self.bias = bias

    def _init_weight(self, _, arr):
        arr[:] = self.weight

    def _init_bias(self, _, arr):
        arr[:] = self.bias

conv.initialize(init=myInitializer(weight, bias), ctx=ctx)
- mx.init.Constant()
- conv.weight.set_data()
References
Notes
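- After either of the first two initialize() calls, conv.weight._data holds no data (it is None); the pending initialization is stored in conv.weight._deferred_init instead.
  My current understanding is that the data are only really initialized after one forward pass, so a forward pass has to be added after initialize(). This is Gluon's deferred initialization, which applies when a parameter's shape is not yet known (for example, when in_channels is not given).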