Artificial intelligence (AI) systems have entered many aspects of human life. Many of these systems rely on deep neural networks (DNNs), which are computationally expensive and power-hungry. This poses a problem for both edge AI devices and datacenter applications: edge devices are constrained in available power, and datacenter applications scale to the point where power consumption dominates the cost. Decreasing the power consumption of DNNs is thus imperative, both to enable new applications at the edge and to reduce the cost and increase the sustainability of datacenter AI applications. This need has sparked the development of approximate hardware, in which approximations of numerical computations are embedded in the hardware design. AI models have to be compressed to be compatible with such systems, and this compression is often tuned to the approximate nature of the hardware design. This requires emulating the approximate behavior on traditional hardware, which is expensive: it requires custom implementations of low-level mathematical operations that cannot take advantage of the built-in standard operations that are heavily optimized at the hardware level. This work uses three approaches to efficiently emulate approximate hardware on graphics processing units. First, an existing implementation of approximate hardware emulation is improved via code optimizations. Second, alternative algorithms for computing 2D image convolutions are introduced and evaluated. Third, a method is developed to incorporate approximate emulation of approximate hardware into a model-training pipeline, as approximate emulation is faster than exact emulation. This effort resulted in an overall speedup of 1.8x compared to the previous state of the emulation library. This enables faster model deployment and shortens the iteration cycle for further development of hardware-aware compression techniques.