Core Concepts
Subsequence matching under generalized gap constraints is NP-hard, but several efficiently solvable subclasses can be identified by restricting the interval structure induced by the constraints.
Abstract
The paper investigates the problem of embedding a string u as a subsequence of another string v in the presence of generalized gap constraints. A generalized gap constraint is a triple (i, j, C_{i,j}), where 1 ≤ i < j ≤ |u| and C_{i,j} is a set of strings. The embedding must satisfy the constraint that if u[i] and u[j] are mapped to v[k] and v[ℓ], respectively, then the induced gap v[k+1..ℓ-1] must be a string from C_{i,j}.
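To make the definition concrete, here is a minimal brute-force sketch in Python. It is not the authors' algorithm and takes exponential time in the worst case. The function name `embeds`, the 0-based indexing, and the representation of each constraint set as a Python predicate on the gap string are assumptions made purely for illustration.

```python
from itertools import combinations

def embeds(u, v, constraints):
    """Return an embedding of u into v that satisfies every gap constraint, or None.

    constraints: dict mapping (i, j), with 0 <= i < j < len(u), to a predicate
    that decides whether a gap string belongs to the constraint set for (i, j).
    (Hypothetical representation, chosen for illustration only.)
    """
    for positions in combinations(range(len(v)), len(u)):
        # positions is strictly increasing, so it is a candidate embedding.
        if any(u[a] != v[p] for a, p in enumerate(positions)):
            continue  # some letter of u does not match its image in v
        # The gap between the images of u[i] and u[j] excludes both endpoints.
        if all(pred(v[positions[i] + 1:positions[j]])
               for (i, j), pred in constraints.items()):
            return positions
    return None
```

For example, `embeds("ab", "axxb", {(0, 1): lambda gap: len(gap) == 2})` returns `(0, 3)`, because the gap between the two matched positions is `"xx"`. Requiring a gap of length 1 instead makes the call return `None`.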
The authors show that this subsequence matching problem under generalized gap constraints is NP-hard, and provide a thorough complexity analysis, including both upper and lower bounds. They identify several efficiently solvable subclasses that result from restricting the interval structure induced by the generalized gap constraints.
The key highlights and insights are:
The matching problem is NP-hard in general, even for binary alphabets and constant-size semilinear or regular constraints.
If the number of constraints is bounded by a constant, the matching problem can be solved in polynomial time.
The matching problem is W[1]-hard when parameterized by the length of the pattern or the number of constraints.
Structurally restricting the interval structure of the constraints, such as having non-intersecting constraints, yields polynomial-time solvable subclasses.
An algorithm is provided that solves the matching problem in time O(n^ω |C|) for the case of non-intersecting constraints, where O(n^ω) is the time needed to multiply two n × n Boolean matrices.
A conditional lower bound is shown, stating that an algorithm with running time O(|w|^g |C|^h) with g + h < 3 would refute the strong exponential time hypothesis.
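The upper bound above is stated in terms of Boolean matrix multiplication. As a rough illustration of that primitive only, and not of the paper's algorithm, the Python sketch below composes two Boolean relations over positions: entry (k, m) of the product is true exactly when some intermediate position l is related to k by the first matrix and to m by the second. The function name `bool_matmul` and the naive cubic loop are assumptions. A fast matrix-multiplication routine would have to replace the loop to reach the n^ω exponent.

```python
def bool_matmul(a, b):
    """Boolean product of two n x n matrices given as lists of lists of bool.

    Naive O(n^3) version, shown only to pin down what the product computes.
    """
    n = len(a)
    return [[any(a[k][l] and b[l][m] for l in range(n)) for m in range(n)]
            for k in range(n)]
```

If `a[0][1]` and `b[1][2]` are the only true entries of two 3 × 3 matrices, the only true entry of `bool_matmul(a, b)` is at row 0, column 2.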