HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How to compare two circular sequences in linear time?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
lineartimetwocircularhowcomparesequences

Problem

We have two circular sequences, which means there is no start or end in the sequences, how to test if two sequence are equal or not in linear time? I have two circular sequences of E. Coli bacteria with length (4,639,221). I thought about attaching two sample of the first sequence and find the other one in it in linear time, but I was looking for a better idea, using a suffix tree is a suggestion that I think works for this problem.

Solution

Start with computing the lexicographlcally least circular substrings$^1$ of the both circular sequences and then compare them directly.

Alternatively you can check for the substring $A$ (the first seqience) in the string $BB$ (a concatenation of $B$, the second sequence) circular sequence with itself) using for example the KMP$^2$

You might also be interested in the application of the Suffix Trees (also Suffix Arrays) and this thesis reviewing the applications to DNA sequences.

$^{[1]}$(described in K. S. Booth. Lexicographically least circular substrings.
Inf. Process. Lett., 10(4/5):240-242, 1980., and here

$^{[2]}$(Donald E. Knuth, James H. Morris, Jr., and Vaughan R. Pratt, SIAM J. Comput., 6(2), 323–350. Fast Pattern Matching in Strings

Context

StackExchange Computer Science Q#67754, answer score: 4

Revisions (0)

No revisions yet.