Converting HEX values to Unicode characters

Submitted by: @import:stackexchange-codereview

Problem

I have a small bot for a social network that accepts messages of the form "unicode XXXX XXXX XXXX ...", where each XXXX is a hex value that I should convert to a Unicode character. Initially the command accepted only one hex value, and it worked fine and fast. After I modified the code to accept multiple values, the process slowed down (responses went from 1-2 seconds to 4-5 seconds).

Here is my code:

def unic(msg):
    if type(msg) == list:
        msg.pop(0) # removing "unicode" word
        try:
            sym = '' # final result
            correct = [] # correct HEX values to be converted
            for code in msg:
                try:
                    chr(int(code, 16))
                    correct.append(code)
                except:
                    pass
            if correct != []:
                for code in correct:
                    sym = sym + chr(int(code, 16))
            elif correct == []:
                return c_s.get('incorrect') # returning error message from common strings list
            if sym != '':
                return sym
        except:
            return c_s.get('incorrect')


What should I change here to speed up the process? Any suggestions are welcome.

Solution

Only use try-except where you need it. Your outer try should be unnecessary, because the inner try already catches conversion failures.
You should also call chr(int(code, 16)) only once per value: running it a second time gains nothing and just costs cycles.
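As a rough sketch of the single-conversion idea, the validation loop and the building loop can be merged so that each token is converted exactly once. The name convert_once is hypothetical, and the sketch keeps the original behaviour of silently skipping invalid tokens. Catching OverflowError alongside ValueError is an addition not mentioned in the answer: on CPython, chr() raises it for integers too large to fit a C int.

```python
def convert_once(codes):
    """Convert hex tokens to characters, converting each token only once.

    Hypothetical helper: invalid tokens are skipped, as in the original code.
    """
    chars = []
    for code in codes:
        try:
            # One conversion per token; the result is kept, not recomputed.
            chars.append(chr(int(code, 16)))
        except (ValueError, OverflowError):
            # Not hex, or not a valid Unicode code point: skip the token.
            continue
    return ''.join(chars)
```

There is no outer try here: the only operation expected to fail is the conversion itself, and that is the only thing wrapped.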

Your error handling is inconsistent: invalid values are silently discarded, but when no value is valid the function switches to returning an error message.
And if sym somehow ends up empty even though there were items, or if msg is not a list at all, you return None.
You need to pick how your function works: either remove invalid characters, or raise errors.
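One way to make that choice explicit is a policy parameter, loosely modelled on the errors argument of str.encode and bytes.decode. This is only a sketch: hex_to_text and its errors values are hypothetical names, not part of the original code.

```python
def hex_to_text(codes, errors='strict'):
    """Convert hex tokens to a string under an explicit error policy.

    errors='strict'  -> raise ValueError on the first invalid token
    errors='ignore'  -> drop invalid tokens
    errors='replace' -> substitute '?' for invalid tokens
    """
    if errors not in ('strict', 'ignore', 'replace'):
        raise ValueError(f'unknown errors policy: {errors!r}')
    chars = []
    for code in codes:
        try:
            chars.append(chr(int(code, 16)))
        except (ValueError, OverflowError):
            if errors == 'strict':
                raise ValueError(f'invalid hex code point: {code!r}') from None
            if errors == 'replace':
                chars.append('?')
            # 'ignore': drop the token and move on
    return ''.join(chars)
```

Whichever policy is chosen, the function then behaves the same way for one bad token as for all bad tokens, and it never returns None.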

Doing the above removes about half of your code. I'd also change the bare except to catch specific errors, e.g. ValueError; otherwise your program is prone to masking bugs.
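To see which errors are actually worth catching, a small probe like the following may help. The helper name conversion_error is hypothetical. On CPython, int() raises ValueError for non-hex text, chr() raises ValueError for values outside the Unicode range, and chr() raises OverflowError for integers too large for a C int; anything else, such as a TypeError from passing None, is a programming bug and is deliberately left to propagate.

```python
def conversion_error(code):
    """Return the name of the expected exception raised for `code`, or None.

    Only ValueError and OverflowError are treated as "bad user input";
    other exceptions (e.g. TypeError) are not caught here.
    """
    try:
        chr(int(code, 16))
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__
    return None
```

With a bare except, a call such as conversion_error(None) would be silently treated as bad input instead of surfacing the TypeError.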

Strings are immutable, so sym = sym + chr(int(code, 16)) may copy the whole string built so far on every iteration, which can get very slow for long inputs.
Instead, build a list and ''.join it.
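A minimal sketch of the list-and-join pattern, using already-parsed integer code points so that only the string building is shown (join_chars is a hypothetical name):

```python
def join_chars(code_points):
    """Build a string from integer code points without repeated concatenation."""
    parts = []
    for cp in code_points:
        parts.append(chr(cp))  # appending to a list is cheap
    return ''.join(parts)      # one final pass creates the string
```

The same thing can be written as ''.join(chr(cp) for cp in code_points) when no per-item error handling is needed.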

Finally, return c_s.get('incorrect') is a massive red flag: it makes an error message indistinguishable from a real result. Remove these and raise exceptions instead.
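One possible split, sketched under assumptions: the converter raises, and only the bot-facing layer turns the exception into a reply. InvalidHexCode, to_text, reply_for and ERROR_TEXT are all hypothetical names; ERROR_TEXT merely stands in for whatever c_s.get('incorrect') returns.

```python
class InvalidHexCode(ValueError):
    """Hypothetical error raised when a token is not a valid code point."""


def to_text(codes):
    """Convert hex tokens to a string, raising on the first bad token."""
    if not codes:
        raise InvalidHexCode('no hex values given')
    chars = []
    for code in codes:
        try:
            chars.append(chr(int(code, 16)))
        except (ValueError, OverflowError) as exc:
            raise InvalidHexCode(f'bad hex value: {code!r}') from exc
    return ''.join(chars)


ERROR_TEXT = 'incorrect input'  # placeholder for c_s.get('incorrect')


def reply_for(codes):
    """Bot-side wrapper: turn a conversion error into a user-facing reply."""
    try:
        return to_text(codes)
    except InvalidHexCode:
        return ERROR_TEXT
```

This keeps to_text reusable and testable on its own, while the decision about what to show the user lives in one place.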

Applying these changes (here replacing invalid values with '?') could give you:

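def unicode_(msg):
    new_msg = []
    for code in msg:
        try:
            # Parse the token as hex and map it to its character.
            char = chr(int(code, 16))
        except (ValueError, OverflowError):
            # Not valid hex, or not a valid code point: use a placeholder.
            char = '?'
        new_msg.append(char)
    return ''.join(new_msg)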
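Note that, unlike the original unic, this function no longer strips the leading "unicode" word, so the caller has to do that before passing the tokens in. A small sketch of that step, assuming the bot has the raw message text available (the helper name hex_tokens and the sample messages are made up):

```python
def hex_tokens(text):
    """Return the whitespace-separated tokens after the leading command word."""
    words = text.split()
    # Slicing creates a new list, so nothing the caller holds is mutated
    # (the original msg.pop(0) modified the caller's list in place).
    return words[1:]
```

The list returned for a message such as "unicode 0041 0042" is what would then be handed to unicode_.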

Context

StackExchange Code Review Q#147433, answer score: 7
