Publication Details

Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks

PIŇOS, M.; MRÁZEK, V.; VAVERKA, F.; VAŠÍČEK, Z.; SEKANINA, L. Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2023, vol. 13, no. 1, p. 212-224. ISSN: 2156-3357.
Czech title
Akcelerační techniky pro automatizovaný návrh aproximativních konvolučních neuronových sítí
Type
journal article
Language
English
Authors
Michal Piňos; Vojtěch Mrázek; Filip Vaverka; Zdeněk Vašíček; Lukáš Sekanina
URL
https://ieeexplore.ieee.org/document/10011413
Keywords

   - approximate computing
   - convolutional neural network
   - neural architecture search
   - energy efficiency
   - quantization
   - acceleration

Abstract

The main issue with using approximate components, such as approximate
multipliers, in deep convolutional neural networks (CNNs) during the design
process is the need to emulate them, because modern CPUs and GPUs lack native
support for approximate operations, making the emulation computationally
expensive. To accelerate the emulation of approximate operations of CNNs on
GPUs, we propose TFApprox4IL, a software library supporting symmetric and
asymmetric quantization modes, approximate 8×N-bit multipliers emulated using
lookup tables, a new type of approximate layer known as approximate depthwise
convolution, and quantization-aware training. The performance of TFApprox4IL is
extensively evaluated by simulating approximate implementations of MobileNetV2
and ResNet networks on Nvidia Pascal and Tesla GPU architectures. Furthermore,
TFApprox4IL is also evaluated in neural architecture search (NAS) algorithms
that automatically design CNN architectures directly employing approximate
multipliers. On two different NAS methods, EvoApproxNAS and Google Model Search
(GMS), we show how approximate multipliers can effectively be incorporated into
the CNN design process. To estimate the energy consumption of the approximate
CNNs, the AxMultAT tool, based on Timeloop and Accelergy, is introduced.
Compared with a highly optimized GPU-based CNN simulation implemented using the
exact arithmetic operations available in TensorFlow, the average overhead
introduced by TFApprox4IL is 13.6× for inference and 8.0× for training,
considering the ResNet50V2 and MobileNetV2 CNNs on the ImageNet and CIFAR-10
data sets. This overhead is one order of magnitude lower than that of previous
methods.
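
The core mechanism the abstract describes, emulating an approximate 8×N-bit
multiplier with a lookup table inside quantized arithmetic, can be illustrated
with a short Python/NumPy sketch. This is a minimal illustration of the idea,
not the actual TFApprox4IL API: quantize_symmetric, build_mult_lut, approx_dot,
and the toy truncated_mult multiplier are all assumed names, and the low-bit
truncation merely stands in for a real approximate multiplier design (e.g., one
from the EvoApprox library).

import numpy as np

def quantize_symmetric(x, num_bits=8):
    # Symmetric per-tensor quantization: zero-point is 0, values are
    # mapped to [-(2^(b-1)-1), 2^(b-1)-1]. Asymmetric mode (also
    # supported by the library per the abstract) would add a zero-point.
    qmax = 2 ** (num_bits - 1) - 1
    peak = np.max(np.abs(x))
    scale = peak / qmax if peak > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def truncated_mult(a, b, dropped_bits=4):
    # Toy approximate multiplier: exact product with the low bits zeroed.
    return ((a * b) >> dropped_bits) << dropped_bits

def build_mult_lut(approx_mult, bits_a=8, bits_b=8):
    # Tabulate every input combination of the approximate multiplier
    # once; at runtime every multiplication becomes a table lookup.
    a = np.arange(-(2 ** (bits_a - 1)), 2 ** (bits_a - 1))
    b = np.arange(-(2 ** (bits_b - 1)), 2 ** (bits_b - 1))
    return approx_mult(a[:, None], b[None, :])  # shape (2^bits_a, 2^bits_b)

def approx_dot(qa, qb, lut, bits=8):
    # Dot product where each multiply is emulated via the LUT; the
    # offset shifts signed operands to non-negative table indices.
    offset = 2 ** (bits - 1)
    return np.sum(lut[qa + offset, qb + offset])

# Usage: quantize two vectors, accumulate their product through the
# emulated approximate multiplier, and dequantize the result.
w = np.random.randn(64).astype(np.float32)
x = np.random.randn(64).astype(np.float32)
qw, sw = quantize_symmetric(w)
qx, sx = quantize_symmetric(x)
lut = build_mult_lut(truncated_mult)
acc = approx_dot(qw, qx, lut)
print("approx:", acc * sw * sx, "exact:", float(w @ x))

In an actual GPU implementation, the table would reside in device memory and be
indexed from a custom convolution kernel, which is where the reported 13.6×
inference and 8.0× training overheads over exact TensorFlow arithmetic arise.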

Published
2023
Pages
212–224
Journal
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 1, ISSN 2156-3357
DOI
10.1109/JETCAS.2023.3235204
UT WoS
000965262200001
BibTeX
@article{BUT180721,
  author="Michal {Piňos} and Vojtěch {Mrázek} and Filip {Vaverka} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks",
  journal="IEEE Journal on Emerging and Selected Topics in Circuits and Systems",
  year="2023",
  volume="13",
  number="1",
  pages="212--224",
  doi="10.1109/JETCAS.2023.3235204",
  issn="2156-3357",
  url="https://ieeexplore.ieee.org/document/10011413"
}