patternpythonMinor
Is this escape unescape function pair correct?
Viewed 0 times
thisfunctionunescapecorrectpairescape
Problem
Is this example of an escape - unescape function correct?
ESCAPE = '/'
def escape(s):
rv = s
rv = rv.replace(ESCAPE, ESCAPE + ESCAPE)
rv = rv.replace('+', ESCAPE + 'PLUS')
rv = rv.replace('-', ESCAPE + 'MINUS')
return rv
def unescape(s):
rv = s
rv = rv.replace(ESCAPE + 'MINUS', '-')
rv = rv.replace(ESCAPE + 'PLUS', '+')
rv = rv.replace(ESCAPE + ESCAPE, ESCAPE)
return rvSolution
Performing string substitutions using multiple passes is almost always a bad idea, not only for performance, but for correctness. In this case, there is indeed a bug:
Therefore, you have to parse the input string in one pass:
In your original code and in the implementation above, there is a lot of repetition. If there are more symbols being escaped than just
unescape('//PLUS/PLUS') should produce '/PLUS+', but instead produces '/++'.Therefore, you have to parse the input string in one pass:
import re
def unescape(s):
def replacement(match):
return {
'/': '/',
'PLUS': '+',
'MINUS': '-',
}[match.group(1)]
return re.sub('/(/|PLUS|MINUS)', replacement, s)In your original code and in the implementation above, there is a lot of repetition. If there are more symbols being escaped than just
+ and -, then it makes sense to specify the mapping just once.ESCAPE_CHAR = '/'
ESCAPE = {
ESCAPE_CHAR: ESCAPE_CHAR,
'+': 'PLUS',
'-': 'MINUS',
}
ESCAPE_RE = re.compile('|'.join(map(re.escape, ESCAPE.keys())))
UNESCAPE = dict((v, k) for (k, v) in ESCAPE.items())
UNESCAPE_RE = re.compile(
re.escape(ESCAPE_CHAR) + '(' +
'|'.join(map(re.escape, UNESCAPE.keys())) +
')')
def escape(s):
return ESCAPE_RE.sub(lambda match: ESCAPE_CHAR + ESCAPE[match.group(0)], s)
def unescape(s):
return UNESCAPE_RE.sub(lambda match: UNESCAPE[match.group(1)], s)Code Snippets
import re
def unescape(s):
def replacement(match):
return {
'/': '/',
'PLUS': '+',
'MINUS': '-',
}[match.group(1)]
return re.sub('/(/|PLUS|MINUS)', replacement, s)ESCAPE_CHAR = '/'
ESCAPE = {
ESCAPE_CHAR: ESCAPE_CHAR,
'+': 'PLUS',
'-': 'MINUS',
}
ESCAPE_RE = re.compile('|'.join(map(re.escape, ESCAPE.keys())))
UNESCAPE = dict((v, k) for (k, v) in ESCAPE.items())
UNESCAPE_RE = re.compile(
re.escape(ESCAPE_CHAR) + '(' +
'|'.join(map(re.escape, UNESCAPE.keys())) +
')')
def escape(s):
return ESCAPE_RE.sub(lambda match: ESCAPE_CHAR + ESCAPE[match.group(0)], s)
def unescape(s):
return UNESCAPE_RE.sub(lambda match: UNESCAPE[match.group(1)], s)Context
StackExchange Code Review Q#57121, answer score: 8
Revisions (0)
No revisions yet.