Convolutional neural networks have become a standard tool for image classification, object detection, and other pattern recognition problems across data types such as time series, images, and videos. These networks are trained mainly with iterative gradient-based algorithms, and improving their runtime and cost efficiency is an active research field. Randomly sampled networks offer a faster, non-iterative, albeit data-agnostic alternative in which the weights of all layers before the last are sampled rather than trained; this approach has been applied successfully to shallow networks. Nevertheless, sampling deep networks remains a challenge with respect to accuracy. This thesis aims to approximate the weights of convolutional neural networks, particularly those of the convolutional layers of shallow and deep architectures, in a gradient-free manner using the training data distribution. In particular, this work tackles the image classification problem on two well-known datasets, CIFAR10 and MNIST, focusing mainly on the latter. I propose three algorithms that use Principal Component Analysis (PCA) and a version of the Sample Where It Matters (SWIM) algorithm adapted to image data. First, I analyze image patches and apply PCA to them to define the convolutional layer weights. Second, I introduce an alternative view of patches and a metric to measure their importance; I then introduce two modified versions of SWIM, together with probability distributions, to sample a reduced subset of patches and construct convolutional kernels from them. Third, I propose an algorithm that combines the previous two, further adapts SWIM, and overcomes some of their limitations. For many of the tested architectures, the results indicate an average validation accuracy less than 5% below the corresponding baselines, which were fully trained with iterative methods, in the case of two and three classes in CIFAR10 and less than 10% below the baselines in the case of 10 classes.
These results suggest that convolutional kernels can be approximated using approximately 1.7% of the training data, and that exploring the high-dimensional distribution of feature maps requires further research to achieve better accuracy.
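
To make the first idea concrete, the following is a minimal sketch of how PCA on image patches can yield convolutional kernels without gradients. It is not the algorithm developed in this thesis: the function names `extract_patches` and `pca_kernels`, the channels-last image layout, the stride of one, and the choice of the leading principal directions as filters are all assumptions made for illustration.

```python
import numpy as np


def extract_patches(images, k):
    """Collect every k x k patch (stride 1) from images of shape (N, H, W, C).

    Returns an array of shape (num_patches, k * k * C).
    """
    # Windows over the two spatial axes: shape (N, H-k+1, W-k+1, C, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(
        images, (k, k), axis=(1, 2)
    )
    channels = images.shape[-1]
    # Move channels last so each patch is laid out as (k, k, C), then flatten.
    return windows.transpose(0, 1, 2, 4, 5, 3).reshape(-1, k * k * channels)


def pca_kernels(images, k, n_filters):
    """Hypothetical helper: use the top principal directions of the patch
    distribution as convolutional kernels of shape (n_filters, k, k, C)."""
    channels = images.shape[-1]
    if n_filters > k * k * channels:
        raise ValueError("n_filters cannot exceed the patch dimension k*k*C")
    patches = extract_patches(images, k).astype(np.float64)
    # Center the patches so the SVD yields the principal directions.
    patches -= patches.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(patches, full_matrices=False)
    return vt[:n_filters].reshape(n_filters, k, k, channels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data; real use would pass a (sub)sample of training images.
    images = rng.normal(size=(8, 10, 10, 3))
    kernels = pca_kernels(images, k=3, n_filters=5)
    print(kernels.shape)
```

In this sketch the number of filters is bounded by the patch dimension `k * k * C`, and the resulting kernels are mutually orthonormal by construction. Both properties are consequences of using plain PCA here, not claims about the methods proposed in this work.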