Multi-replacement in a string, independently and prioritized

Submitted by: @import:stackexchange-codereview··

Problem

I need a shorter way to do this.

- input: an ordered list of string pairs L, and a string s

- task: for each pair (a, b), replace each occurrence of a in s with b

- note 1: the replacements have to be independent. The result of one replacement must not act as a trigger for a subsequent replacement.

  Example: L=(("a","b"),("b","c")); s="ab"; desired output: "bc", and not "cc".

- note 2: L is ordered; the beginning of the list has higher priority, so if L is (("xyz", "P"), ("xy", "Q")) and s is "wxyzabc" then the result should be "wPabc". If the order of L were reversed, the result would be "wQzabc".

- bonus: it would be great if it could also handle regex patterns as replacement pairs
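To make note 1 concrete, here is a minimal sketch of what goes wrong when the pairs are simply applied one after another with `str.replace`. The name `naive_multireplace` is made up for this illustration and is not part of the question's code.

```python
def naive_multireplace(text, changes):
    # Hypothetical helper for illustration only: each pair is applied to the
    # output of the previous one, so the replacements are NOT independent.
    for old, new in changes:
        text = text.replace(old, new)
    return text


# "a" becomes "b" first, and then both "b"s become "c".
print(naive_multireplace("ab", (("a", "b"), ("b", "c"))))  # prints "cc", not the desired "bc"
```

The second pair sees the "b" produced by the first pair, which is exactly the chaining the task forbids.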

My idea:

  • iterate over L, and replace each source with a sequence of joker characters of equal length (the joker is a character that does not appear anywhere in the string), while noting the positions where it was found.

  • then split the string into a list of strings of length 1, replace the first joker of each sequence with the desired target replacement, join the list back together and remove the remaining jokers.

Here is my implementation in Python:

```
import re

def multireplace(text, changes):
    # find a possible joker character
    for num in range(256):
        joker = chr(num)
        if text.find(joker) != -1:
            continue

        jokerUsable = True
        for old, new in changes:
            if old.find(joker) != -1 or new.find(joker) != -1:
                jokerUsable = False
                break

        if jokerUsable:
            break

    # list of lists, each list is about one replacement pair,
    # and its elements will be the indices where they have to be done
    index_lists = []

    for old, new in changes:
        indices = []
        pattern = re.escape(old)
        for m in re.finditer(pattern, text):
            indices.append(m.start())

        text = text.replace(old, joker*len(old))
        index_lists.append(indices)

    character_list = list(text)
    for i in
```

Solution

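Do it recursively. Split the string by the first "from" value, recurse with the rest of the changes for each resulting part, then join the parts back together with the "to" value. The "to" values are only inserted by the final join, so later replacements never scan them. The first pair is applied to the whole string before any lower-priority pair sees it.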

def multireplace(text, changes):
    if not changes:
        return text

    from_, to = changes[0]
    parts = text.split(from_)
    return to.join((multireplace(part, changes[1:]) for part in parts))
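As a quick check, the function can be run against the two examples from the question. The block below repeats the definition so that it runs on its own; the assertions are added here and are not part of the original answer. It assumes every "from" value is a non-empty string, since `str.split` raises `ValueError` on an empty separator.

```python
def multireplace(text, changes):
    # Base case: no pairs left, so nothing to replace.
    if not changes:
        return text

    from_, to = changes[0]
    # Cutting the text at every occurrence of from_ means the remaining
    # pairs only ever see text that the first pair did not match.
    parts = text.split(from_)
    return to.join(multireplace(part, changes[1:]) for part in parts)


# Note 1: replacements are independent.
assert multireplace("ab", (("a", "b"), ("b", "c"))) == "bc"

# Note 2: earlier pairs have higher priority.
assert multireplace("wxyzabc", (("xyz", "P"), ("xy", "Q"))) == "wPabc"
assert multireplace("wxyzabc", (("xy", "Q"), ("xyz", "P"))) == "wQzabc"
```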


  • Advantage: Short and clean code.

  • Disadvantage: The recursion depth grows linearly with the number of changes, so a long list of replacement pairs can exceed Python's recursion limit and raise a RecursionError.

(Note: this answer is based on an earlier answer that somehow got deleted)

Context

StackExchange Code Review Q#64096, answer score: 2
