Publication Details

Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators

KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L. Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators. In 2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS). Kielce: Institute of Electrical and Electronics Engineers, 2024. p. 1-6. ISBN: 979-8-3503-5934-3.
Czech title
Výzkum synergie kvantizace a mapování v oblasti hardwarových akcelerátorů hlubokých neuronových sítí
Type
conference paper
Language
English
Authors
Jan Klhůfek, Miroslav Šafář, Vojtěch Mrázek, Zdeněk Vašíček, Lukáš Sekanina
URL
https://arxiv.org/abs/2404.05368
Keywords

Quantization, Neural networks, Hardware accelerator

Abstract

Energy efficiency and memory footprint of a convolutional neural network (CNN)
implemented on a CNN inference accelerator depend on many factors, including
a weight quantization strategy (i.e., data types and bit-widths) and mapping
(i.e., placement and scheduling of DNN elementary operations on hardware units of
the accelerator). We show that enabling rich mixed quantization schemes during
the implementation can open a previously hidden space of mappings that utilize
the hardware resources more effectively. CNNs utilizing quantized weights and
activations and suitable mappings can significantly improve trade-offs among the
accuracy, energy, and memory requirements compared to less carefully optimized
CNN implementations. To find, analyze, and exploit these mappings, we: (i) extend
a general-purpose state-of-the-art mapping tool (Timeloop) to support mixed
quantization, which is not currently available; (ii) propose an efficient
multi-objective optimization algorithm to find the most suitable bit-widths and
mapping for each DNN layer executed on the accelerator; and (iii) conduct
a detailed experimental evaluation to validate the proposed method. On two CNNs
(MobileNetV1 and MobileNetV2) and two accelerators (Eyeriss and Simba) we show
that for a given quality metric (such as the accuracy on ImageNet), energy
savings are up to 37% without any accuracy drop. 

Published
2024
Pages
1–6
Proceedings
2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS)
Conference
International Symposium on Design and Diagnostics of Electronic Circuits and Systems, Kielce, PL
ISBN
979-8-3503-5934-3
Publisher
Institute of Electrical and Electronics Engineers
Place
Kielce
DOI
10.1109/DDECS60919.2024.10508920
BibTeX
@inproceedings{BUT188463,
  author="Jan {Klhůfek} and Miroslav {Šafář} and Vojtěch {Mrázek} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators",
  booktitle="2024 27th International Symposium on Design \& Diagnostics of Electronic Circuits \& Systems (DDECS)",
  year="2024",
  pages="1--6",
  publisher="Institute of Electrical and Electronics Engineers",
  address="Kielce",
  doi="10.1109/DDECS60919.2024.10508920",
  isbn="979-8-3503-5934-3",
  url="https://arxiv.org/abs/2404.05368"
}