Papers
Topics
Authors
Recent
Search
2000 character limit reached

Fully-Online Suffix Tree and Directed Acyclic Word Graph Construction for Multiple Texts

Published 28 Jul 2015 in cs.DS | (1507.07622v5)

Abstract: We consider construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection $\mathcal{T}$ of texts, where a new symbol may be appended to any text in $\mathcal{T} = {T_1, \ldots, T_K}$, at any time. This fully-online scenario, which arises in dynamically indexing multi-sensor data, is a natural generalization of the long solved semi-online text indexing problem, where texts $T_1, \ldots, T_{k}$ are permanently fixed before the next text $T_{k+1}$ is processed for each $1 \leq k < K$. We present fully-online algorithms that construct the suffix tree and the DAWG for $\mathcal{T}$ in $O(N \log \sigma)$ time and $O(N)$ space, where $N$ is the total lengths of the strings in $\mathcal{T}$ and $\sigma$ is their alphabet size. The standard explicit representation of the suffix tree leaf edges and some DAWG edges must be relaxed in our fully-online scenario, since too many updates on these edges are required in the worst case. Instead, we provide access to the updated suffix tree leaf edge labels and the DAWG edges to be redirected via auxiliary data structures, in $O(\log \sigma)$ time per added character.

Citations (6)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.