Basic Concepts
The paper gives an efficient algorithm for computing the Longest Common Prefix (LCP) array of a labeled graph. This array supports efficient pattern matching and navigation on the graph's paths.
Summary
The paper presents an efficient algorithm for computing the Longest Common Prefix (LCP) array of a labeled graph G with n nodes and m edges.
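For intuition only, the classical string version of the LCP array can be sketched as follows. This is the textbook notion that the graph LCP array generalizes, not the paper's algorithm. The function names are illustrative, and the quadratic-time construction is chosen for readability rather than efficiency.

```python
def suffix_array(text):
    """Return the start positions of the suffixes of text in sorted order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    """Length of the longest common prefix of strings a and b."""
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


def lcp_array(text):
    """LCP array for strings: entry i is the longest common prefix length of
    the i-th and (i-1)-th suffixes in sorted order; entry 0 is set to 0."""
    sa = suffix_array(text)
    lcp = [0]
    for i in range(1, len(sa)):
        lcp.append(common_prefix_len(text[sa[i - 1]:], text[sa[i]:]))
    return lcp


print(suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
print(lcp_array("banana"))     # [0, 1, 3, 0, 0, 2]
```

In the graph setting, each node stands for a set of strings rather than a single suffix, which is why the pre-processing step described next reduces every node to its smallest and largest entering strings.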
The key steps are:
Pre-processing: The input graph G is transformed into a deterministic Wheeler pseudoforest G_is that compactly encodes the lexicographically smallest and largest strings entering each node of G. This step runs in O(min{m log n, m + n^2}) time on arbitrary labeled graphs, and in O(m) time on Wheeler semi-DFAs.
LCP Computation: A new compact-space algorithm is introduced to compute the reduced LCP array LCP*_{G_is} of the Wheeler pseudoforest G_is in O(n log σ) time and O(n log σ) bits of working space, where σ is the alphabet size.
Post-processing: The LCP array of the original graph G is derived from the LCP*_{G_is} array in O(m) time and O(m) words of space.
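The structural shape targeted by the pre-processing step can be sketched as a simple check. This sketch rests on two assumptions not stated in the summary: that "pseudoforest" means every node has exactly one incoming edge, and that "deterministic" means no node has two outgoing edges with the same label. It does not test the Wheeler property, and the function name and edge encoding are hypothetical.

```python
from collections import Counter


def is_deterministic_pseudoforest(nodes, edges):
    """Check two structural properties of a labeled graph.

    nodes: list of node identifiers.
    edges: list of (source, target, label) triples.

    Assumption: a pseudoforest has exactly one incoming edge per node.
    Assumption: determinism forbids two equally labeled edges leaving one node.
    """
    in_degree = Counter(target for _, target, _ in edges)
    out_labels = Counter((source, label) for source, _, label in edges)
    one_incoming = all(in_degree[v] == 1 for v in nodes)
    deterministic = all(count == 1 for count in out_labels.values())
    return one_incoming and deterministic


# A labeled 3-cycle satisfies both properties under these assumptions.
cycle = [(0, 1, "a"), (1, 2, "b"), (2, 0, "a")]
print(is_deterministic_pseudoforest([0, 1, 2], cycle))  # True
```

Adding a second edge into any node, or a second edge with the same label out of a node, makes the check fail.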
The overall algorithm computes the LCP array of the input graph G in O(n log σ + min{m log n, m + n^2}) time and O(m) words of space. If G is a Wheeler semi-DFA, the running time reduces to O(n log σ + m).
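To see which term of min{m log n, m + n^2} governs the pre-processing cost, the two expressions can be evaluated directly. The sketch below ignores constant factors and assumes a base-2 logarithm, so it only indicates the asymptotically smaller term. The function name is illustrative.

```python
import math


def smaller_preprocessing_term(n, m):
    """Return which term of min{m log n, m + n^2} is smaller for given n and m.

    Constants hidden by the O-notation are ignored; log base 2 is an assumption.
    """
    m_log_n = m * math.log2(n)
    m_plus_n2 = m + n * n
    return "m log n" if m_log_n <= m_plus_n2 else "m + n^2"


# Sparse graph: m is linear in n, so m log n is far below n^2.
print(smaller_preprocessing_term(1000, 2000))       # m log n
# Dense graph: m is close to n^2, so m + n^2 is the smaller term.
print(smaller_preprocessing_term(1000, 1_000_000))  # m + n^2
```

Roughly, the m log n term wins on sparse graphs and the m + n^2 term wins once m grows beyond about n^2 / log n.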
The authors also show that the natural generalization of an earlier compact-space LCP-construction algorithm by Beller et al. takes Ω(nσ) time on pseudoforests, which motivates their new O(n log σ)-time algorithm.