Publication Details
ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining
Mrázek Vojtěch, Ing., Ph.D. (DCSY)
Vašíček Zdeněk, doc. Ing., Ph.D. (DCSY)
Sekanina Lukáš, prof. Ing., Ph.D. (DCSY)
Hanif Muhammad Abdullah
Shafique Muhammad
approximate computing, deep neural networks, computational path, ResNet, CIFAR-10
The state-of-the-art approaches employ approximate computing to reduce the energy
consumption of DNN hardware. Approximate DNNs then require extensive retraining
to recover from the accuracy loss caused by the use of approximate operations.
However, retraining of complex DNNs does not scale well. In this paper, we
demonstrate that efficient approximations can be introduced into the
computational path of DNN accelerators while retraining can be completely
avoided.
ALWANN provides highly optimized implementations of DNNs for custom low-power
accelerators in which the number of computing units is lower than the number of
DNN layers. First, a fully trained DNN is converted to operate with 8-bit weights
and 8-bit multipliers in convolutional layers. A suitable approximate multiplier
is then selected for each computing element from a library of approximate
multipliers in such a way that (i) one approximate multiplier serves several
layers, and (ii) the overall classification error and energy consumption are
minimized. The optimizations, including the multiplier selection problem, are
solved by means of the multiobjective NSGA-II genetic algorithm. In order to
completely avoid the computationally expensive retraining of DNNs, which is
usually employed to improve the classification accuracy, we propose a simple
weight updating scheme that compensates for the inaccuracy introduced by
employing approximate multipliers. The proposed approach is evaluated for two architectures
of DNN accelerators with approximate multipliers from the open-source "EvoApprox"
library. We report that the proposed approach saves 30% of the energy needed
for multiplication in the convolutional layers of ResNet-50, while the accuracy
is degraded by only 0.6%. The proposed technique and approximate layers are
available as an open-source extension of TensorFlow
at https://github.com/ehw-fit/tf-approximate.
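To picture the assignment problem the NSGA-II search operates on, the sketch below encodes a candidate solution as one multiplier index (into the approximate-multiplier library) per group of layers that share a computing element, and reduces a set of random candidates to the non-dominated (Pareto) set with respect to estimated classification error and energy. This is a simplified random-search stand-in for NSGA-II, not the paper's implementation; the names `Assignment`, `pareto_front`, `random_assignment`, `estimate_error`, and `estimate_energy` are hypothetical.

```python
# Simplified stand-in for the multiplier-assignment search (illustration only,
# not the paper's NSGA-II implementation): a candidate assigns one approximate
# multiplier (library index) to each group of layers mapped to the same
# computing element; candidates are filtered to the Pareto-optimal set with
# respect to (classification error, energy), both minimized.
import random
from typing import Callable, List, Tuple

Assignment = Tuple[int, ...]       # one multiplier index per layer group
Objectives = Tuple[float, float]   # (error, energy), lower is better

def random_assignment(n_groups: int, n_multipliers: int) -> Assignment:
    return tuple(random.randrange(n_multipliers) for _ in range(n_groups))

def pareto_front(candidates: List[Assignment],
                 evaluate: Callable[[Assignment], Objectives]):
    scored = [(c, evaluate(c)) for c in candidates]
    front = []
    for c, (err, eng) in scored:
        # c is dominated if some other candidate is no worse in both
        # objectives and strictly better in at least one
        dominated = any(e2 <= err and g2 <= eng and (e2, g2) != (err, eng)
                        for _, (e2, g2) in scored)
        if not dominated:
            front.append((c, (err, eng)))
    return front

# Hypothetical usage: estimate_error would run inference with the selected
# approximate multipliers, estimate_energy would sum their energy costs.
# front = pareto_front([random_assignment(4, 30) for _ in range(200)],
#                      lambda a: (estimate_error(a), estimate_energy(a)))
```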
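The weight-updating step can be pictured as a per-multiplier lookup that remaps each quantized weight to the value whose approximate products track the exact products most closely. The sketch below is a minimal illustration of that idea, not the authors' exact criterion: it assumes unsigned 8-bit operands and a multiplier supplied as a 256x256 product table, and the helpers `build_weight_map` and `update_layer_weights` are hypothetical names.

```python
# Minimal sketch of a retraining-free weight update (illustrative only): given
# an approximate 8-bit multiplier described by a 256x256 product table, each
# quantized weight w is replaced by the candidate w' whose approximate products
# deviate least, averaged over all possible 8-bit activations, from the exact
# products of w.
import numpy as np

def build_weight_map(approx_table: np.ndarray) -> np.ndarray:
    """approx_table[a, w] = approximate product of activation a and weight w
    (both unsigned 8-bit). Returns m such that weight w is replaced by m[w]."""
    acts = np.arange(256, dtype=np.int32)
    exact = np.outer(acts, np.arange(256, dtype=np.int32))   # exact[a, w] = a * w
    # err[w, w'] = mean over activations a of |approx(a, w') - exact(a, w)|
    diff = approx_table.astype(np.int32)[:, None, :] - exact[:, :, None]
    err = np.abs(diff).mean(axis=0)
    return err.argmin(axis=1).astype(np.uint8)

def update_layer_weights(quant_weights: np.ndarray,
                         approx_table: np.ndarray) -> np.ndarray:
    """Remap one layer's quantized (uint8) weights for its assigned multiplier."""
    return build_weight_map(approx_table)[quant_weights]
```

A product table like `approx_table` can be obtained by exhaustively evaluating the chosen multiplier model, e.g. one of the 8-bit multipliers distributed with the EvoApprox library.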
@inproceedings{BUT161445,
author="MRÁZEK, V. and VAŠÍČEK, Z. and SEKANINA, L. and HANIF, M. and SHAFIQUE, M.",
title="ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining",
booktitle="Proceedings of the IEEE/ACM International Conference on Computer-Aided Design",
year="2019",
pages="1--8",
publisher="Institute of Electrical and Electronics Engineers",
address="Denver",
doi="10.1109/ICCAD45719.2019.8942068",
isbn="978-1-7281-2350-9",
url="https://arxiv.org/abs/1907.07229"
}