Accelerating Semantic Image Segmentation on FPGA

Mitra, Saptarshi




Document type:: Master's thesis
Author(s):: Mitra, Saptarshi
Title:: Accelerating Semantic Image Segmentation on FPGA
Abstract:: From healthcare to autonomous driving, Deep Neural Networks (DNNs) outperform traditional Computer Vision (CV) approaches in terms of accuracy and efficiency. The exponential increase in DNN applications requires tremendous computational power from the underlying hardware resources. Naturally, the superior performance of DNN models comes at the cost of a huge memory footprint and complex calculations. Although Graphics Processing Units (GPUs) are the main workhorse for training DNNs because of their massive computational capabilities, they are not suitable for mobile deployments. Field-Programmable Gate Arrays (FPGAs) provide the best trade-off between performance, power consumption and design flexibility. Semantic image segmentation is one of the most complex tasks in computer vision, providing pixel-wise annotations for complete scene understanding. For a critical application like autonomous driving, the DeepLabV3+ model provides state-of-the-art Mean Intersection-over-Union (mIoU) for semantic segmentation on the Cityscapes dataset. In this work, a fully pipelined hardware accelerator implementing a novel dilated convolution was introduced. Using this accelerator, an end-to-end DeepLabV3+ deployment on an FPGA was made possible. The architecture exploits hardware optimizations such as 3-D loop unrolling and memory tiling to maximize the use of computational resources, and provides a 2.34-fold latency improvement over the baseline architecture. Further, a Genetic Algorithm (GA) based automated channel pruning technique was used to jointly optimize hardware usage and model accuracy. Finally, hardware awareness was incorporated into the pruning search through hardware heuristics and an accurate model of the custom accelerator.
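The abstract names dilated convolution as the operation at the core of the accelerator. As a reading aid only, the following is a minimal software sketch of what a single-channel dilated convolution computes. It is not the thesis's hardware design: the function name `dilated_conv2d`, the valid (no padding) boundary handling, and the list-of-lists data layout are assumptions made for illustration.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """Valid (no padding) 2-D cross-correlation with a dilated kernel.

    Hypothetical helper for illustration; image and kernel are lists of
    lists of numbers, and dilation is the spacing between kernel taps
    (dilation=1 is an ordinary convolution).
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Extent of the dilated kernel on the input grid.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    oh, ow = ih - eff_h + 1, iw - eff_w + 1
    if oh <= 0 or ow <= 0:
        raise ValueError("input is smaller than the dilated kernel")
    out = [[0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0
            # Each kernel tap reads the input at a stride of `dilation`.
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[y + i * dilation][x + j * dilation]
            out[y][x] = acc
    return out


# 5x5 ramp image (values 0..24) and a 3x3 kernel that sums its four corners.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]
result = dilated_conv2d(image, kernel, dilation=2)
print(result)  # [[48]]: corners 0 + 4 + 20 + 24 of the 5x5 input
```

With `dilation=2` the 3x3 kernel covers a 5x5 input window without adding weights, which is how dilation enlarges the receptive field. The four nested loops here are the kind of loop nest that hardware techniques such as unrolling and tiling typically target.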
Keywords:: CNN; DNN; Semantic Segmentation; Hardware Acceleration; FPGA; HLS; OpenCL; Pruning
Subject area:: ELT Electrical Engineering
DDC:: 000 Computer science, knowledge, systems
Supervisor:: Vemparala, Manoj Rohit
Reviewer:: Stechele, Walter (Prof. Dr.)
Year:: 2021
Language:: en
University:: Technische Universität München
Faculty:: Fakultät für Elektrotechnik und Informationstechnik
Presentation date:: 15.07.2021
Publication date:: 01.07.2021
