Matching Patterns with Variables under Hamming Distance

Paweł Gawrychowski, Florin Manea, Stefan Siemer

A pattern $\alpha$ is a string of variables and terminal letters. We say that $\alpha$ matches a word $w$, consisting only of terminal letters, if $w$ can be obtained by replacing the variables of $\alpha$ by terminal words. The matching problem, i.e., deciding whether a given pattern matches a given word, has been heavily investigated: it is NP-complete in general, but can be solved efficiently for classes of patterns with restricted structure. In this paper, we approach this problem in a generalized setting, by considering approximate pattern matching under Hamming distance. More precisely, we are interested in the minimum Hamming distance between $w$ and any word $u$ obtained by replacing the variables of $\alpha$ by terminal words. First, we address the class of regular patterns (in which no variable occurs twice) and propose efficient algorithms for this problem, as well as matching conditional lower bounds. We then show that the problem can still be solved efficiently if we allow repeated variables, but restrict the way the different variables can be interleaved according to a locality parameter. However, as soon as a single variable may occur more than once, with its occurrences interleaved arbitrarily with those of other variables, the problem becomes intractable, even if none of the other variables occurs more than once.
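To make the problem statement concrete for regular patterns, here is a minimal sketch of a straightforward quadratic-time dynamic program. It is not the algorithm proposed in the paper, only an illustration of the quantity being computed. The encoding is an assumption made for this example: uppercase letters stand for variables, all other characters are terminals, and a variable may be replaced by the empty word. The function name `min_hamming_regular` is hypothetical.

```python
import math


def min_hamming_regular(pattern: str, word: str) -> float:
    """Minimum Hamming distance between `word` and any word obtained from a
    regular pattern by substituting its variables.

    Assumed encoding (not from the paper): uppercase letters are variables,
    everything else is a terminal; variables may be replaced by the empty word.
    Returns math.inf if no substitution yields a word of the right length.
    """
    variables = [c for c in pattern if c.isupper()]
    if len(variables) != len(set(variables)):
        raise ValueError("not a regular pattern: some variable occurs twice")

    n = len(word)
    # prev[j]: fewest mismatches when the pattern prefix processed so far
    # is aligned with word[:j].
    prev = [0] + [math.inf] * n
    for sym in pattern:
        cur = [math.inf] * (n + 1)
        if sym.isupper():
            # A variable absorbs any (possibly empty) factor at zero cost,
            # so cur[j] is the minimum of prev[0..j].
            cur[0] = prev[0]
            for j in range(1, n + 1):
                cur[j] = min(prev[j], cur[j - 1])
        else:
            # A terminal consumes exactly one letter of the word.
            for j in range(1, n + 1):
                cur[j] = prev[j - 1] + (sym != word[j - 1])
        prev = cur
    return prev[n]
```

For example, `min_hamming_regular("XabY", "ccacc")` returns 1: the best alignment places `ab` over the factor `ac`, with one mismatch. The running time is proportional to the product of the pattern length and the word length.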
