Structures Of String Matching And Data Compression by Jesper Larsson
Читать

Structures Of String Matching And Data Compression by Jesper Larsson

This doctoral dissertation presents a range of results concerning efficient algorithms and data structures for string processing, including several schemes contributing to sequential data compression. It comprises both theoretic results and practical implementations. We study the suffix tree data structure, presenting an efficient representation and several generalizations. This includes augmenting the suffix tree to fully support sliding window indexing (including a practical implementation) in linear time. Furthermore, we consider a variant that indexes naturally word-partitioned data, and present a linear-time construction algorithm for a tree that represents only suffixes starting at word boundaries, requiring space linear in the number of words. By applying our sliding window indexing techniques, we achieve an efficient implementation for dictionary-based compression based on the LZ-77 algorithm. Furthermore, considering predictive source