TensorRT/parsers/caffe/caffeParser/opParsers/parseInnerProduct.cpp source code reading

  • TensorRT/parsers/caffe/caffeParser/opParsers/parseInnerProduct.cpp
  • std::normal_distribution
  • weight initialization
  • Reference links

TensorRT/parsers/caffe/caffeParser/opParsers/parseInnerProduct.cpp

/* Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include "opParsers.h"

using namespace nvinfer1;

namespace nvcaffeparser1
{
/*
Fetch or randomly initialize the weights and biases and record them in mTmpAllocs,
then add a fully connected layer to the network
*/
ILayer* parseInnerProduct(INetworkDefinition& network, const trtcaffe::LayerParameter& msg, CaffeWeightFactory& weightFactory, BlobNameToTensor& tensors)
{
    /*
    trtcaffe::InnerProductParameter
    is defined in TensorRT/parsers/caffe/proto/trtcaffe.proto
    */
    const trtcaffe::InnerProductParameter& p = msg.inner_product_param();

    /*
    parserutils::volume
    is defined in TensorRT/parsers/common/parserUtils.h
    */
    /*
    parserutils::getCHW
    is defined in TensorRT/parsers/common/parserUtils.h
    Extracts the lengths of the C, H and W dimensions from a Dims and returns them
    */
    //So the input of a TensorRT fully connected layer can be multi-dimensional?
    int64_t nbInputs = parserutils::volume(parserutils::getCHW(tensors[msg.bottom(0)]->getDimensions()));
    int64_t nbOutputs = p.num_output();

    float std_dev = 1.0F / sqrtf(nbInputs * nbOutputs);
    /*
    WeightType::kGENERIC, WeightType::kBIAS:
    types for convolution, deconv, fully connected
    representing the weights and the biases, respectively
    */
    /*
    CaffeWeightFactory::operator()
    is defined in TensorRT/parsers/caffe/caffeWeightFactory/caffeWeightFactory.cpp
    Fetches the int(weightType)-th blob (i.e. the weights or the biases) of the
    layer named layerName; if there is none, returns null weights of length 0
    */
    /*
    CaffeWeightFactory::allocateWeights
    Weights CaffeWeightFactory::allocateWeights(int64_t elems, std::normal_distribution distribution)
    is defined in TensorRT/parsers/caffe/caffeWeightFactory/caffeWeightFactory.cpp
    Allocates space for elems elements of type getDataType(), builds a new
    Weights object from it and returns it; the values are randomly initialized
    from a normal distribution
    */
    /*
    CaffeWeightFactory::getNullWeights
    is defined in TensorRT/parsers/caffe/caffeWeightFactory/caffeWeightFactory.cpp
    Returns an empty weight array
    */
    //Fetch or randomly initialize the weights and biases
    Weights kernelWeights = weightFactory.isInitialized() ? weightFactory(msg.name(), WeightType::kGENERIC) : weightFactory.allocateWeights(nbInputs * nbOutputs, std::normal_distribution<float>(0.0F, std_dev));
    //Biases are used when bias_term is absent (it defaults to true in Caffe) or
    //explicitly true; only an explicit bias_term: false yields null weights
    Weights biasWeights = !p.has_bias_term() || p.bias_term() ? (weightFactory.isInitialized() ? weightFactory(msg.name(), WeightType::kBIAS) : weightFactory.allocateWeights(nbOutputs)) : weightFactory.getNullWeights();

    /*
    CaffeWeightFactory::convert
    Converts the contents of weights to the mDataType type that was set at
    construction time, and records weights.values in mTmpAllocs
    */
    weightFactory.convert(kernelWeights);
    weightFactory.convert(biasWeights);
    /*
    INetworkDefinition::addFullyConnected
    is defined in TensorRT/include/NvInfer.h
    virtual IFullyConnectedLayer* addFullyConnected(
        ITensor& input, int nbOutputs, Weights kernelWeights, Weights biasWeights) TRTNOEXCEPT = 0;
    */
    return network.addFullyConnected(*tensors[msg.bottom(0)], p.num_output(), kernelWeights, biasWeights);
}
} //namespace nvcaffeparser1
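
For intuition, here is a minimal sketch of the nbInputs computation above. CHW and volume are simplified stand-ins for the result of parserutils::getCHW and for parserutils::volume (the real definitions live in TensorRT/parsers/common/parserUtils.h), not the actual TensorRT types.

#include <cstdint>

// Stand-in for the result of parserutils::getCHW: just the C/H/W extents of a Dims.
struct CHW { int64_t c, h, w; };

// Stand-in for parserutils::volume: the product of the extents.
int64_t volume(const CHW& d) { return d.c * d.h * d.w; }

int main()
{
    CHW chw{3, 224, 224};           // e.g. a 3x224x224 input blob
    int64_t nbInputs = volume(chw); // 3 * 224 * 224 = 150528 inputs to the FC layer
    return nbInputs == 150528 ? 0 : 1;
}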

std::normal_distribution

std::normal_distribution is used in the function parseInnerProduct:

std::normal_distribution<float>(0.0F, std_dev)

For more on std::normal_distribution, see C++ uniform_real_distribution and normal_distribution.
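
A minimal, self-contained example of drawing samples the same way (mean 0, custom standard deviation); the seed and the std_dev value here are arbitrary choices for illustration:

#include <iostream>
#include <random>

int main()
{
    float std_dev = 0.05F;                               // arbitrary illustrative value
    std::mt19937 gen(42);                                // fixed seed for reproducibility
    std::normal_distribution<float> dist(0.0F, std_dev); // N(0, std_dev^2)

    for (int i = 0; i < 5; ++i)
        std::cout << dist(gen) << "\n";                  // each call draws one sample
    return 0;
}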

weight initialization

float std_dev = 1.0F / sqrtf(nbInputs * nbOutputs);
Weights kernelWeights = weightFactory.isInitialized() ? weightFactory(msg.name(), WeightType::kGENERIC) : weightFactory.allocateWeights(nbInputs * nbOutputs, std::normal_distribution<float>(0.0F, std_dev));

Xavier initialization uses std_dev = 1/sqrt(num_inputs), while He initialization uses std_dev = sqrt(2/num_inputs). So which initialization scheme is this?
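
As a quick numeric comparison (the layer sizes below are hypothetical): 1.0F / sqrtf(nbInputs * nbOutputs) matches neither formula exactly. It is smaller than both, and unlike the two-sided Glorot form sqrt(2/(fan_in + fan_out)) it uses the product of the fan sizes rather than their sum.

#include <cmath>
#include <cstdio>

int main()
{
    float nbInputs = 256.0F, nbOutputs = 10.0F;          // hypothetical layer sizes

    float xavier = 1.0F / sqrtf(nbInputs);               // Xavier: 1/sqrt(num_inputs)
    float he = sqrtf(2.0F / nbInputs);                   // He: sqrt(2/num_inputs)
    float trt = 1.0F / sqrtf(nbInputs * nbOutputs);      // this parser's choice

    printf("xavier=%f he=%f trt=%f\n", xavier, he, trt); // 0.062500 0.088388 0.019764
    return 0;
}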

Reference links

C++ uniform_real_distribution and normal_distribution

Why better weight initialization is important in neural networks?

CNN xavier weight initialization
