Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Paper title: Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

  • Abstract
  • 1 Related Work
    • Model compression
    • Compressed NLP model
  • 2 Methodology
    • 2.1 Quantization process
    • 2.2 Mixed precision quantization
    • 2.3 Group-wise Quantization
  • 3 Experiment
    • 3.1 Main Results
    • 3.2 Effects of group-wise quantization
