Multi-replacement in a string, independently and prioritized
Problem
I need a shorter way to do this.
- input: an ordered list of string pairs L, and a string s
- task: for each pair (a, b), replace each occurrence of a in s with b
- note 1: the replacements have to be independent. The result of one replacement must not be taken as a trigger for a subsequent replacement.
  Example: L=(("a","b"),("b","c")); s="ab"; desired output: "bc", and not "cc".
- note 2: L is ordered; the beginning of the list has higher priority, so if L is (("xyz", "P"), ("xy", "Q")) and s is "wxyzabc", then the result should be "wPabc". If the order of L were reversed, the result would be "wQzabc".
- bonus: it would be great if it could also handle regex patterns as replacement pairs
My idea:
- iterate over L and replace each source with a sequence of joker characters of equal length (the joker does not appear anywhere in the string), while noting the position where it was found.
- then split the string into a list of strings of length 1, replace the first joker of each sequence with the desired target replacement, join the list back together, and remove the remaining jokers.
Here is my implementation in Python:
```
import re

def multireplace(text, changes):
    # find a possible joker character
    for num in range(256):
        joker = chr(num)
        if text.find(joker) != -1:
            continue
        jokerUsable = True
        for old, new in changes:
            if old.find(joker) != -1 or new.find(joker) != -1:
                jokerUsable = False
                break
        if jokerUsable:
            break

    # list of lists, each list is about one replacement pair,
    # and its elements will be the indices where they have to be done
    index_lists = []
    for old, new in changes:
        indices = []
        pattern = re.escape(old)
        for m in re.finditer(pattern, text):
            indices.append(m.start())
        text = text.replace(old, joker * len(old))
        index_lists.append(indices)

    character_list = list(text)
    for i in
```
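The listing breaks off mid-loop, so the final step is not shown. As a rough sketch only, and not the asker's original code, the idea described under "My idea" could be completed along these lines. The name `multireplace_placeholder`, the flat `hits` list, and the rejection of empty search strings are assumptions made for this illustration.

```python
import re
import sys

def multireplace_placeholder(text, changes):
    """Mask each match with a placeholder, then fill in the replacements."""
    # Assumption: empty search strings are rejected rather than handled.
    if any(not old for old, _ in changes):
        raise ValueError("empty search strings are not supported")

    # Pick a character that occurs in neither the text nor any pair.
    used = set(text)
    for old, new in changes:
        used.update(old, new)
    joker = next(
        (chr(n) for n in range(sys.maxunicode + 1) if chr(n) not in used), None
    )
    if joker is None:
        raise ValueError("no unused placeholder character available")

    hits = []  # (start index, replacement) for every masked match
    for old, new in changes:
        for m in re.finditer(re.escape(old), text):
            hits.append((m.start(), new))
        # Masking keeps the length, so the recorded indices stay valid.
        text = text.replace(old, joker * len(old))

    chars = list(text)
    for start, new in hits:
        chars[start] = new  # first placeholder of the run becomes the target
    return "".join(c for c in chars if c != joker)
```

Because a masked region no longer contains any real characters, a later pair cannot match text that an earlier pair already claimed, which gives both the independence from note 1 and the priority order from note 2.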
Solution
Do it recursively. Split the string by the first "from" value, recurse with the rest of the changes for each resulting part, then join them back together with the "to" value.
(Note: this answer is based on an earlier answer that somehow got deleted)
```
def multireplace(text, changes):
    if not changes:
        return text
    from_, to = changes[0]
    parts = text.split(from_)
    return to.join(multireplace(part, changes[1:]) for part in parts)
```
- Advantage: Short and clean code
- Disadvantage: The recursion depth grows linearly with the number of replacement pairs, so a very long list of changes can exceed Python's recursion limit.
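If the recursion depth is a concern, the same split-and-join idea can be written as a loop. The following is a minimal sketch under that assumption, not part of the original answer. The name `multireplace_iterative` and the `(chunk, done)` segment representation are made up for this example.

```python
def multireplace_iterative(text, changes):
    """Split-and-join without recursion.

    Each segment is a (chunk, done) pair. done=True marks text that was
    inserted by an earlier replacement and must not be touched again.
    """
    segments = [(text, False)]
    for old, new in changes:
        next_segments = []
        for chunk, done in segments:
            if done:
                next_segments.append((chunk, True))
                continue
            # Split the untouched chunk and put `new` between the parts.
            for i, part in enumerate(chunk.split(old)):
                if i:
                    next_segments.append((new, True))
                next_segments.append((part, False))
        segments = next_segments
    return "".join(chunk for chunk, _ in segments)
```

For the examples in the question this returns the same results as the recursive version, for instance `multireplace_iterative("ab", (("a", "b"), ("b", "c")))` gives `"bc"`. As with the recursive version, an empty search string makes `str.split` raise `ValueError`.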
Context
StackExchange Code Review Q#64096, answer score: 2