Artificial intelligence (AI) systems have entered many aspects of human life. Many of these systems rely on deep neural networks (DNNs), which are computationally expensive and power-hungry. This poses a problem for both edge AI devices and datacenter applications: edge devices are constrained in available power, and datacenter applications scale to the point where power consumption dominates the cost. Decreasing the power consumption of DNNs is thus imperative, both to enable new applications at the edge and to reduce the cost and increase the sustainability of datacenter AI applications. This need has sparked the development of approximate hardware, in which approximations of numerical computations are embedded in the hardware design. AI models have to be compressed to be compatible with such systems, and this compression is often tuned to the approximate nature of the hardware design. This requires emulating the approximate behavior on traditional hardware, which is expensive: it requires custom implementations of low-level mathematical operations that cannot take advantage of the built-in standard operations that are heavily optimized at the hardware level. This work uses three approaches to efficiently emulate approximate hardware on graphics processing units. First, an existing implementation of approximate hardware emulation is improved via code optimizations. Second, alternative algorithms for computing 2D image convolutions are introduced and evaluated. Third, a method is developed to incorporate approximate emulation of approximate hardware into a model-training pipeline, as approximate emulation is faster than exact emulation. This effort resulted in an overall speedup of 1.8x compared to the previous state of the emulation library. This enables faster model deployment and shortens the iteration cycle for further development of hardware-aware compression techniques.