Paper notes: YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

Abstract: The computational effort of today's CNNs requires power-hungry parallel processors or GP-GPUs (general-purpose GPUs). Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1.5 TOp/s at 1.2 V on a core area of only 1.33 MGE (million gate equivalents) or 1.9 mm² and with a power dissipation of 895 μW in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency, achieving 61.2 TOp/s/W @ 0.6 V and 1.1 TOp/s/MGE @ 1.2 V, respectively.
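
To make the "no expensive multiplications" point concrete, here is a minimal software sketch of a binary-weight dot product. It is only an illustration of the idea in the abstract, not the YodaNN datapath: the function names `binary_weight_dot` and `pack_weights` and the sample values are assumptions made up for this note. With weights restricted to +1/-1, each multiply-accumulate collapses into an add or a subtract, and each weight can be stored in a single bit.

```python
def binary_weight_dot(activations, weights):
    """Dot product with +1/-1 weights: add or subtract, never multiply.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")
    acc = 0
    for x, w in zip(activations, weights):
        if w == 1:
            acc += x      # weight +1: accumulate the activation
        elif w == -1:
            acc -= x      # weight -1: accumulate its negation
        else:
            raise ValueError("weights must be +1 or -1")
    return acc


def pack_weights(weights):
    """Pack +1/-1 weights into one integer, one bit per weight (bit set = +1)."""
    bits = 0
    for i, w in enumerate(weights):
        if w == 1:
            bits |= 1 << i
    return bits


# Sample values chosen only for this example.
activations = [3, -2, 5, 7]
weights = [1, -1, -1, 1]

result = binary_weight_dot(activations, weights)
reference = sum(x * w for x, w in zip(activations, weights))  # ordinary multiply version
packed = pack_weights(weights)
```

Here `result` equals `reference`, so the add/subtract form computes the same value as the multiply form, and `packed` holds all four weights in four bits. This mirrors the two savings the abstract names: simpler arithmetic, and less weight I/O and storage.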
