Numerical approximation of kernels in convolutional neural networks

Daniel Calle Castrillon



Document type:: Master's thesis
Author(s):: Daniel Calle Castrillon
Title:: Numerical approximation of kernels in convolutional neural networks
Abstract:: Convolutional neural networks have become a standard in image classification, object detection, and other pattern recognition problems with different data types, such as time series, images, and videos. These networks are trained mainly via iterative gradient-based algorithms, and improving their runtime and cost efficiency is an active research field. Randomly sampled networks are a faster, non-iterative, although data-agnostic, alternative in which the weights of all layers before the last are sampled; this approach has been applied successfully to shallow networks. Nevertheless, sampling deep networks remains a challenge regarding accuracy. This thesis aims to approximate the weights of convolutional neural networks, particularly those of the convolutional layers of shallow and deep architectures, in a gradient-free manner using the training data distribution. In particular, this work tackles the image classification problem on two well-known datasets, CIFAR10 and MNIST, focusing mainly on the latter. I propose three algorithms that use Principal Component Analysis (PCA) and a version of the Sample Where It Matters (SWIM) algorithm adapted to image data. First, I analyze image patches and apply PCA to them to define the convolutional layer weights. Second, I introduce an alternative view on patches and a metric to measure their importance; I then introduce two modified versions of SWIM, together with probability distributions for sampling a reduced subset of patches, and propose convolutional kernels from them. Third, I propose an algorithm that combines the previous two, further adapts SWIM, and is able to overcome some of their limitations. For many of the tested architectures, the results indicate an average validation accuracy less than 5% below the corresponding baselines, which were fully trained with iterative methods, in the case of two and three classes in CIFAR10, and less than 10% below the baselines in the case of 10 classes. These results suggest that convolutional kernels can be approximated using approximately 1.7% of the training data, and that exploring the high-dimensional distribution of feature maps requires further research to achieve better accuracy.
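The first idea in the abstract, applying PCA to image patches to define convolutional weights, can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details are in the full text; it only assumes that kernels are taken as the leading principal directions of all k x k patches. The function names `extract_patches` and `pca_kernels`, the patch size, the number of kernels, and the synthetic input are all hypothetical choices made for illustration.

```python
import numpy as np


def extract_patches(images, k):
    """Collect every k x k patch (stride 1, no padding) from a batch of
    grayscale images of shape (n, h, w); returns shape (n_patches, k * k)."""
    windows = np.lib.stride_tricks.sliding_window_view(
        images, (k, k), axis=(1, 2)
    )
    return windows.reshape(-1, k * k)


def pca_kernels(images, k, n_kernels):
    """Return n_kernels filters of shape (k, k): the leading principal
    directions of the centered patch distribution (an assumed construction)."""
    patches = extract_patches(images, k).astype(np.float64)
    patches -= patches.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(patches, full_matrices=False)
    return vt[:n_kernels].reshape(n_kernels, k, k)


# Synthetic stand-in for MNIST-sized images; not real data.
rng = np.random.default_rng(0)
images = rng.random((16, 28, 28))
kernels = pca_kernels(images, k=3, n_kernels=4)
```

Because the kernels come from an SVD of the patch matrix, they are mutually orthonormal when flattened, and no gradient step is involved. How the thesis selects patches, scales the kernels, or handles deeper layers is not specified in the abstract.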
Supervisor:: Felix Dietrich
Year:: 2025
Quarter:: 1st quarter
Year / month:: 2025-02
Month:: Feb
Language:: en
University:: Technical University of Munich
Faculty:: TUM School of Computation, Information and Technology

Occurrences:

mediaTUM complete collection > Institutions > Schools > TUM School of Computation, Information and Technology > Departments > Computer Science > Informatik 5 - Lehrstuhl für Scientific Computing (Prof. Bungartz) > 2025