HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Is there an elegant algorithm for applying multiple simultaneous search-and-replace operations to a string?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
operationssearchreplacesimultaneousalgorithmforelegantmultipleandapplying

Problem

The operation I'm interested in is a slight twist on your standard search-and-replace operation, in that instead of a single pair of arguments (replaceme and withme), it takes a keyed set of N [before, after] pairs as an argument, and for each pair in the set, replaces all instances of ` in the string with .

For example, if we have this initial string:

three point one four one five


... and our set of
[before, after]` pairs looks like this:

[one, two]
[two, three]
[three, four]
[four, five]
[five, six]


... then after the operation completed, our updated string would be:

four point two five two six


The intuitive (but wrong) way to implement this would be to just iterate over the set and do a traditional search-and-replace for each pair in series. That works for cases where the values in the set are unrelated, but it fails in other cases (such as our example above) because the later iterations inappropriately operate on the results of the earlier operations. So doing it that way yields an incorrect result like this:

six point six six six six


My question is, there any well-known algorithm for handling this operation efficiently? I was able to come up with an algorithm that seems to accomplish it, but I suspect there might be a better way to do it than my hackish approach.

Solution

You can build an Aho-Corasick automata for the before words, run your string through it, and whenever you find a match go back in the string and replace it with the after value, and then reset the automata. This takes linear time (in the length of both the input and output).

Context

StackExchange Computer Science Q#159240, answer score: 3

Revisions (0)

No revisions yet.