#### Space-Efficient String Indexing for Wildcard Pattern Matching

##### Moshe Lewenstein, Yakov Nekrich, Jeffrey Scott Vitter

In this paper we describe compressed indexes that support pattern matching queries for strings with wildcards. For a constant-size alphabet, our data structure uses $O(n\log^{\varepsilon}n)$ bits for any $\varepsilon>0$ and reports all $\mathrm{occ}$ occurrences of a wildcard string in $O(m+\sigma^g \cdot\mu(n) + \mathrm{occ})$ time, where $\mu(n)=o(\log\log\log n)$, $\sigma$ is the alphabet size, $m$ is the number of alphabet symbols in the query string, and $g$ is the number of wildcard symbols in the query string. We also present an $O(n)$-bit index with $O((m+\sigma^g+\mathrm{occ})\log^{\varepsilon}n)$ query time and an $O(n(\log\log n)^2)$-bit index with $O((m+\sigma^g+\mathrm{occ})\log\log n)$ query time. These are the first non-trivial data structures for this problem that need $o(n\log n)$ bits of space.
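To make the query semantics concrete, the following is a minimal brute-force sketch of what a wildcard pattern matching query returns. It is not the compressed index described above: it scans the text directly in $O(n(m+g))$ time and uses no index at all. The function name `wildcard_occurrences` and the choice of `?` as the wildcard symbol are assumptions made for illustration only.

```python
def wildcard_occurrences(text, pattern, wildcard="?"):
    """Return every start position where `pattern` matches `text`.

    A wildcard symbol in the pattern matches any single text symbol;
    every other pattern symbol must match the text exactly.
    Brute-force baseline for illustration, not the indexed solution.
    """
    n, k = len(text), len(pattern)  # k = m + g (alphabet symbols plus wildcards)
    return [
        i
        for i in range(n - k + 1)
        if all(p == wildcard or p == t for p, t in zip(pattern, text[i:i + k]))
    ]


# Example query: the pattern has m = 2 alphabet symbols and g = 1 wildcard.
positions = wildcard_occurrences("abracadabra", "a?a")
```

An index for this problem must report the same set of positions, but without scanning the whole text for each query.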
