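The paper studies the problem of finding a diverse set of longest common subsequences (LCSs) of a set of input strings. Diversity is measured under Hamming distance in two ways: as the sum of the pairwise distances between the selected strings (Max-Sum) and as the minimum pairwise distance (Max-Min).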
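The two diversity measures can be made concrete with a minimal Python sketch. The function names `hamming`, `sum_diversity`, and `min_diversity` and the sample strings are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(x, y))

def sum_diversity(strings) -> int:
    """Sum of Hamming distances over all unordered pairs of strings."""
    return sum(hamming(x, y) for x, y in combinations(strings, 2))

def min_diversity(strings) -> int:
    """Smallest Hamming distance over all unordered pairs (needs >= 2 strings)."""
    return min(hamming(x, y) for x, y in combinations(strings, 2))

# Made-up candidate strings of equal length, standing in for LCSs.
candidates = ["abc", "abd", "xbd"]
print(sum_diversity(candidates))  # pairwise distances 1 + 2 + 1
print(min_diversity(candidates))
```

All LCSs of a given input have the same length, so the Hamming distance between any two of them is well defined.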
The key highlights are:
When the number K of LCSs to be selected is bounded, both the Max-Sum and Max-Min versions of the problem can be solved in polynomial time using dynamic programming.
For unbounded K, the Max-Sum version admits a polynomial-time approximation scheme (PTAS), by leveraging the property that Hamming distance is a metric of negative type.
The authors also provide fixed-parameter tractable (FPT) algorithms for both the Max-Sum and Max-Min versions, parameterized by K and r, where r is the length of the candidate strings to be selected (that is, the length of the LCSs).
The paper shows that both problems become NP-hard when K is part of the input, even for constant string length r ≥ 3.
The parameterized complexity analysis reveals that the problems are W[1]-hard when parameterized by K alone.
The authors prove their positive results in a more general setting, where the input is an edge-labeled directed acyclic graph (DAG) that succinctly represents a set of strings of the same length, such as the set of all LCSs of the input strings.
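As a rough illustration of the DAG setting, the sketch below spells out the strings represented by a small edge-labeled DAG and then picks the K most diverse ones by exhaustive search. The toy graph, the names `spelled_strings` and `best_subset`, and the brute-force search are all assumptions made for illustration. This is not the authors' dynamic programming algorithm, and explicit enumeration can take exponential time on larger graphs.

```python
from itertools import combinations

# Hypothetical toy DAG: node -> list of (edge label, successor).
# Every path from "s" to "t" has 3 edges, so every spelled string has length r = 3.
DAG = {
    "s": [("a", "u"), ("b", "v")],
    "u": [("a", "w"), ("b", "w")],
    "v": [("a", "w")],
    "w": [("c", "t"), ("d", "t")],
    "t": [],
}

def spelled_strings(dag, source, sink):
    """Return the sorted set of strings spelled by all source-to-sink paths."""
    def walk(node):
        if node == sink:
            yield ""
            return
        for label, nxt in dag[node]:
            for suffix in walk(nxt):
                yield label + suffix
    return sorted(set(walk(source)))

def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))

def sum_diversity(strings):
    """Sum of pairwise Hamming distances."""
    return sum(hamming(x, y) for x, y in combinations(strings, 2))

def min_diversity(strings):
    """Minimum pairwise Hamming distance."""
    return min(hamming(x, y) for x, y in combinations(strings, 2))

def best_subset(strings, k, diversity):
    """Brute force: try every k-subset and keep one with maximum diversity."""
    return max(combinations(strings, k), key=diversity)

strings = spelled_strings(DAG, "s", "t")
print(strings)
print(best_subset(strings, 2, sum_diversity))
print(best_subset(strings, 3, min_diversity))
```

The graph has only five nodes but spells six strings, which hints at why a DAG can be a much more compact input than an explicit list.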
Key insights distilled from the paper by Yuto Shida, G... at arxiv.org, 05-02-2024
https://arxiv.org/pdf/2405.00131.pdf