In this work, we compare different clustering algorithm implementations and optimize them with regard to their distortion behavior. The preferred clustering method for lossy data compression depends on the scenario or application at hand. After implementing the K-means algorithm and optimizing it towards the excess distortion performance criterion for a binary memoryless source, we compare our results with the achievability and converse results from the related literature. We discover an (n, M, d, ε)-code that operates below the achievability bound for certain blocklengths n between 2 and 8. In particular, at blocklengths n where the achievability bound exhibits striking peaks, we outperform it in excess distortion by a significant margin. This underpins the presumption that motivated further investigation and this Bachelor's Thesis, namely that the K-means algorithm can yield a close-to-optimal vector quantizer in certain scenarios (e.g., specific blocklength ranges) for equiprobable binary sources, and extends it to biased binary sources. The corresponding index codebook can be extracted efficiently. K-means can be interpreted as an optimized version of random coding, a proof technique that provides fundamental limits of data compression and from which the achievability and converse bounds we compared our results to originate. Relatively short blocklengths n are often advantageous due to their better delay performance, smaller memory requirements, and lower computational complexity.
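To make the setup concrete, the following is a minimal sketch (not the thesis implementation) of K-means used as a vector quantizer for an i.i.d. binary source under Hamming distortion, estimating the empirical excess-distortion probability ε̂ = P[d(X, c(X)) > d]. The parameters n, M, p, the sample count N, and the threshold d_max are illustrative assumptions; binary centroids are maintained via a component-wise majority vote.

```python
import numpy as np

rng = np.random.default_rng(0)

n, M = 4, 4       # blocklength and codebook size (illustrative values)
p = 0.5           # source bias; p = 0.5 is the equiprobable case
N = 20000         # number of training blocks (assumption)

# Draw N blocks of an i.i.d. Bernoulli(p) source.
X = (rng.random((N, n)) < p).astype(int)

# K-means with Hamming distortion; centroids are kept binary
# by a component-wise majority vote in the update step.
C = X[rng.choice(N, M, replace=False)].copy()
for _ in range(50):
    # Assignment step: nearest codeword in Hamming distance.
    a = (X[:, None, :] != C[None, :, :]).sum(axis=2).argmin(axis=1)
    # Update step: majority vote within each non-empty cluster.
    for m in range(M):
        if (a == m).any():
            C[m] = (X[a == m].mean(axis=0) >= 0.5).astype(int)

# Final assignment and empirical excess-distortion probability
# at an illustrative distortion threshold d_max.
a = (X[:, None, :] != C[None, :, :]).sum(axis=2).argmin(axis=1)
dist = (X != C[a]).sum(axis=1)
d_max = 1
eps_hat = (dist > d_max).mean()
print(eps_hat)
```

The M codewords in C play the role of the reproduction codebook; encoding a block amounts to transmitting the index of its nearest codeword, which is the index codebook mentioned above.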