Abstract
The rapid development of DNA sequencing technologies is increasing the amount of publicly
available genomic data exponentially. Whole-genome multiple sequence alignments represent a particularly
voluminous, frequently downloaded static dataset. In this work we propose an asymmetric
source coding scheme for such alignments using evolutionary prediction in combination
with lossless black-and-white image compression. Compared with the Lempel-Ziv algorithm used
so far, the compression rates are almost halved.