Please use this identifier to cite or link to this item: http://repository.i3l.ac.id/jspui/handle/123456789/1080
Title: The Effect of Fasta Sequence Splitting with The Addition of Binary Position Mapping to LZ78 Algorithm for Single Genome Compression
Authors: Stepanus, Agustinus
Keywords: FASTA
LZ78
compression
Issue Date: 1-Sep-2024
Publisher: Indonesia International Institute for Life Sciences
Abstract: As genome sequencers become more widely available, the amount of genome data that must be stored increases exponentially, and a strategy to manage this growth is needed. One solution is a genome compressor. This study proposes a data-splitting step on the input of the LZ78 algorithm to increase redundancy, which could improve the algorithm’s compression performance. AT and GC characters are split into two different substrings, and the additional ambiguity characters are also assigned to these substrings. The last substring maps the positions in the original sequence of the characters held in the first and second substrings. The proposed algorithm reduced compression time and peak decompression memory. However, its compression ratio did not differ significantly from that of the original LZ78 algorithm, and it was not competitive with currently available FASTA compressors. Even so, the effect of the splitting step on LZ78 might carry over to other LZ-based algorithms.
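The splitting step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names are hypothetical, the rule that every non-G/C character (including ambiguity codes) goes to the first substring is an assumption, and encoding the position mapping as a string of '0' and '1' characters is also an assumption, since the abstract does not specify either detail. The LZ78 encoder and decoder are the textbook versions, included only so the round trip can be checked.

```python
def split_sequence(seq, gc_group=frozenset("GC")):
    """Split seq into an AT substring, a GC substring, and a binary mask.

    Assumption: any character outside gc_group (including ambiguity
    codes such as N) is placed in the first (AT) substring.
    """
    at_chars, gc_chars, mask = [], [], []
    for ch in seq:
        if ch in gc_group:
            gc_chars.append(ch)
            mask.append("1")  # position belongs to the GC substring
        else:
            at_chars.append(ch)
            mask.append("0")  # position belongs to the AT substring
    return "".join(at_chars), "".join(gc_chars), "".join(mask)


def merge_sequence(at_sub, gc_sub, mask):
    """Rebuild the original sequence from the two substrings and the mask."""
    at_iter, gc_iter = iter(at_sub), iter(gc_sub)
    return "".join(next(gc_iter) if bit == "1" else next(at_iter) for bit in mask)


def lz78_encode(text):
    """Textbook LZ78: emit (dictionary index, next character) pairs."""
    dictionary = {}  # phrase -> index, indices start at 1
    phrase = ""
    output = []
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate  # keep extending the current match
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase: flush it.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decode(pairs):
    """Invert lz78_encode."""
    phrases = [""]  # index 0 is the empty phrase
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


if __name__ == "__main__":
    seq = "ATGCGATTACAGGC"
    at_sub, gc_sub, mask = split_sequence(seq)
    # Each of the three streams is compressed independently.
    encoded = [lz78_encode(part) for part in (at_sub, gc_sub, mask)]
    decoded = [lz78_decode(pairs) for pairs in encoded]
    assert merge_sequence(*decoded) == seq
```

Each of the three streams draws on at most a two-symbol core alphabet, which is the source of the extra redundancy the abstract refers to; whether that helps the compression ratio in practice is the question the study evaluates.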
URI: http://repository.i3l.ac.id/jspui/handle/123456789/1080
Appears in Collections: Bioinformatics

Files in This Item:
File: Agustinus Stepanus.pdf (Restricted Access)
Description: Full text
Size: 995.24 kB
Format: Adobe PDF

