Two simple full-text indexes based on the suffix array

Published 22 May 2014 in cs.DS | (1405.5919v2)

Abstract: We propose two full-text indexes inspired by the suffix array. The first, SA-hash, augments the suffix array with a hash table that significantly narrows the search interval before the binary search phase, which speeds up pattern searches. The second, FBCSA, is a compact data structure similar to Mäkinen's compact suffix array, but working on fixed-size blocks. Experiments on the Pizza & Chili 200 MB datasets show that SA-hash is about 2 to 3 times faster in pattern searches (counts) than the standard suffix array, at the price of $0.2n$ to $1.1n$ bytes of extra space, where $n$ is the text length, and of setting a minimum pattern length. FBCSA is relatively fast in single-cell accesses (a few times faster than related indexes at about the same or better compression), but it is not competitive when many consecutive cells are to be extracted. Still, for the task of extracting, e.g., 10 successive cells, its time-space relation remains attractive.
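
The SA-hash idea described in the abstract (a hash table that narrows the suffix array interval before binary search) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of a plain `dict` keyed by the first `k` symbols, and the naive suffix array construction are all assumptions made for illustration. The parameter `k` plays the role of the minimum pattern length mentioned in the abstract.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; adequate only for small examples.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_prefix_table(text, sa, k):
    # Hypothetical stand-in for the hash table: map each k-symbol prefix to the
    # half-open suffix array interval [lo, hi) of suffixes that start with it.
    table = {}
    for rank, pos in enumerate(sa):
        prefix = text[pos:pos + k]
        if len(prefix) < k:
            continue  # suffix shorter than k cannot match any allowed pattern
        if prefix in table:
            table[prefix][1] = rank + 1  # extend the contiguous interval
        else:
            table[prefix] = [rank, rank + 1]
    return table


def _boundary(text, sa, pattern, lo, hi, upper):
    # Binary search restricted to sa[lo:hi]. With upper=False it returns the
    # first rank whose m-symbol prefix is >= pattern; with upper=True, the
    # first rank whose m-symbol prefix is > pattern.
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        piece = text[sa[mid]:sa[mid] + m]
        if piece < pattern or (upper and piece == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def count(text, sa, table, k, pattern):
    # Count occurrences of pattern; patterns shorter than k are not supported.
    if len(pattern) < k:
        raise ValueError("pattern is shorter than the minimum length k")
    interval = table.get(pattern[:k])
    if interval is None:
        return 0  # no suffix starts with the pattern's first k symbols
    lo, hi = interval
    first = _boundary(text, sa, pattern, lo, hi, upper=False)
    last = _boundary(text, sa, pattern, lo, hi, upper=True)
    return last - first
```

For example, with `text = "abracadabra"` and `k = 2`, calling `count(text, sa, table, 2, "abra")` first looks up `"ab"` in the table and then runs the binary search over only the suffixes that begin with `"ab"`, rather than over the whole suffix array.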

Citations (6)
