In this work, we compare different clustering algorithm implementations and optimize them with regard to their distortion behavior. The preferred clustering method for lossy data compression depends on the scenario or application at hand. After implementing the K-means algorithm and optimizing it towards the excess distortion performance criterion for a binary memoryless source, we compare our results with the achievability and converse results from the related literature. We discover an (n, M, d, ε)-code that operates below the achievability bound for certain blocklengths n between 2 and 8. In particular, at blocklengths n where the achievability bound exhibits striking peaks, we outperform it in excess distortion by a significant margin. This underpins the presumption that motivated further investigation and this Bachelor's Thesis, namely that the K-means algorithm can yield a close-to-optimal vector quantizer in certain scenarios (e.g., specific blocklength ranges) for equiprobable binary sources, and extends it to biased binary sources. The corresponding index codebook can be extracted efficiently. K-means can be interpreted as an optimized version of random coding, a proof technique that provides fundamental limits of data compression and from which the achievability and converse bounds we compared our results to originate. Relatively short blocklengths n are often advantageous due to their better delay performance, smaller memory requirements, and lower computational complexity.
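To make the setup concrete, the following is a minimal sketch (not the thesis implementation) of K-means used as a vector quantizer for an i.i.d. binary source under Hamming distortion, estimating the empirical excess-distortion probability ε̂ = P[d(X, c(X)) > d]. The parameters n, M, p, the sample count N, and the threshold d_max are illustrative assumptions; binary centroids are maintained via a component-wise majority vote.

```python
import numpy as np

rng = np.random.default_rng(0)

n, M = 4, 4       # blocklength and codebook size (illustrative values)
p = 0.5           # source bias; p = 0.5 is the equiprobable case
N = 20000         # number of training blocks (assumption)

# Draw N blocks of an i.i.d. Bernoulli(p) source.
X = (rng.random((N, n)) < p).astype(int)

# K-means with Hamming distortion; centroids are kept binary
# by a component-wise majority vote in the update step.
C = X[rng.choice(N, M, replace=False)].copy()
for _ in range(50):
    # Assignment step: nearest codeword in Hamming distance.
    a = (X[:, None, :] != C[None, :, :]).sum(axis=2).argmin(axis=1)
    # Update step: majority vote within each non-empty cluster.
    for m in range(M):
        if (a == m).any():
            C[m] = (X[a == m].mean(axis=0) >= 0.5).astype(int)

# Final assignment and empirical excess-distortion probability
# at an illustrative distortion threshold d_max.
a = (X[:, None, :] != C[None, :, :]).sum(axis=2).argmin(axis=1)
dist = (X != C[a]).sum(axis=1)
d_max = 1
eps_hat = (dist > d_max).mean()
print(eps_hat)
```

The M codewords in C play the role of the reproduction codebook; encoding a block amounts to transmitting the index of its nearest codeword, which is the index codebook mentioned above.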