patternMinor
Algorithm to identify contiguous repeated series of lines in a long string
Viewed 0 times
repeatedidentifylongalgorithmcontiguousseriesstringlines
Problem
I would like an algorithm that can identify repeated parts of big stack traces like this:
```
java.lang.StackOverflowError
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.$anonfun$typed1$38(Typers.scala:4752)
at typechecker.Typers$Typer.silent(Typers.scala:700)
at typechecker.Typers$Typer.normalTypedApply$1(Typers.scala:4754)
at typechecker.Typers$Typer.typedApply$1(Typers.scala:4801)
at typechecker.Typers$Typer.typedInAnyMode$1(Typers.scala:5586)
at typechecker.Typers$Typer.typed1(Typers.scala:5603)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.typedQualifier(Typers.scala:5723)
at typechecker.Typers$Typer.typedQualifier(Typers.scala:5731)
at transform.Erasure$Eraser.adaptMember(Erasure.scala:714)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.$anonfun$typed1$38(Typers.scala:4752)
at typechecker.Typers$Typer.silent(Typers.scala:700)
at typechecker.Typers$Typer.normalTypedApply$1(Typers.scala:4754)
at typechecker.Typers$Typer.typedApply$1(Typers.scala:4801)
at typechecker.Typers$Typer.typedInAnyMode$1(Typers.scala:5586)
at typechecker.Typers$Typer.typed1(Typers.scala:5603)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.
```
java.lang.StackOverflowError
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.$anonfun$typed1$38(Typers.scala:4752)
at typechecker.Typers$Typer.silent(Typers.scala:700)
at typechecker.Typers$Typer.normalTypedApply$1(Typers.scala:4754)
at typechecker.Typers$Typer.typedApply$1(Typers.scala:4801)
at typechecker.Typers$Typer.typedInAnyMode$1(Typers.scala:5586)
at typechecker.Typers$Typer.typed1(Typers.scala:5603)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.typedQualifier(Typers.scala:5723)
at typechecker.Typers$Typer.typedQualifier(Typers.scala:5731)
at transform.Erasure$Eraser.adaptMember(Erasure.scala:714)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.Typers$Typer.typedInternal(Typers.scala:5672)
at typechecker.Typers$Typer.body$2(Typers.scala:5613)
at typechecker.Typers$Typer.typed(Typers.scala:5618)
at typechecker.Typers$Typer.$anonfun$typed1$38(Typers.scala:4752)
at typechecker.Typers$Typer.silent(Typers.scala:700)
at typechecker.Typers$Typer.normalTypedApply$1(Typers.scala:4754)
at typechecker.Typers$Typer.typedApply$1(Typers.scala:4801)
at typechecker.Typers$Typer.typedInAnyMode$1(Typers.scala:5586)
at typechecker.Typers$Typer.typed1(Typers.scala:5603)
at transform.Erasure$Eraser.typed1(Erasure.scala:789)
at typechecker.Typers$Typer.runTyper$1(Typers.scala:5640)
at typechecker.
Solution
Build suffix tree using Ukkonen's algorithm, this way in $\mathcal O(n)$ you will find all substrings in provided text with indices.
In the case of approximate matching, there is also extended version of Ukkonen's algorithm.
In the case of approximate matching, there is also extended version of Ukkonen's algorithm.
Context
StackExchange Computer Science Q#102463, answer score: 5
Revisions (0)
No revisions yet.