Authors: Ellert, Jonas
Title: Efficient string algorithmics across alphabet realms
Language (ISO): en
Abstract: Stringology is a subfield of computer science dedicated to analyzing and processing sequences of symbols. It plays a crucial role in various applications, including lossless compression, information retrieval, natural language processing, and bioinformatics. Recent algorithms often assume that the strings to be processed are over polynomial integer alphabet, i.e., each symbol is an integer that is at most polynomial in the lengths of the strings. In contrast to that, the earlier days of stringology were shaped by the weaker comparison model, in which strings can only be accessed by mere equality comparisons of symbols, or (if the symbols are totally ordered) order comparisons of symbols. Nowadays, these flavors of the comparison model are respectively referred to as general unordered alphabet and general ordered alphabet. In this dissertation, we dive into the realm of both integer alphabets and general alphabets. We present new algorithms and lower bounds for classic problems, including Lempel-Ziv compression, computing the Lyndon array, and the detection of squares and runs. Our results show that, instead of only assuming the standard model of computation, it is important to also consider both weaker and stronger models. Particularly, we should not discard the older and weaker comparison-based models too quickly, as they are not only powerful theoretical tools, but also lead to fast and elegant practical solutions, even by today's standards.
Subject Headings: String algorithmics
Square-freeness
Runs
Lyndon array
Lempel-Ziv
Rightmost Lempel-Ziv
LZ77
General ordered alphabet
General unordered alphabet
Word RAM
Word-packing
Subject Headings (RSWK): Zeichenkette
Wort <Informatik>
Datenkompression
Ordnungsreduktion
Effizienter Algorithmus
URI: http://hdl.handle.net/2003/42419
http://dx.doi.org/10.17877/DE290R-24255
Issue Date: 2024
Appears in Collections:LS 11

Files in This Item:
File Description SizeFormat 
Dissertation_Ellert.pdfDNB3.06 MBAdobe PDFView/Open


This item is protected by original copyright



This item is protected by original copyright rightsstatements.org