Papers
Topics
Authors
Recent
Search
2000 character limit reached

String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)

Published 1 Dec 2010 in cs.DS | (1012.0280v1)

Abstract: We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\bigO(nm\max(\alpha, \beta))$-time complexity in the worst case and $\bigO(\max(\alpha, \beta))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion. Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a $\bigO(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log{1-\epsilon} m)$, where $\epsilon > 0$ and $\sigma$ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in practical cases.

Citations (22)

Summary

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.