Python hexdump generator
hexdump, python, generator
Problem
I wrote the following hexdump generator function. How can I improve it?
    FMT = '{} {} |{}|'

    def hexdump_gen(byte_string, _len=16, n=0, sep='-'):
        while byte_string[n:]:
            col0, col1, col2 = format(n, '08x'), [], ''
            for i in bytearray(byte_string[n:n + _len]):
                col1 += [format(i, '02x')]
                col2 += chr(i) if 31 < i < 127 else '.'
            col1 += ['  '] * (_len - len(col1))
            col1.insert(_len // 2, sep)
            yield FMT.format(col0, ' '.join(col1), col2)
            n += _len

Example:
    In[15]: byte_string = b'W\x9a9\x81\xc2\xb5\xb9\xce\x02\x979\xb5\x19\xa0' \
       ...: b'\xb9\xca\x02\x979\xb5\x19\xa0\xb9\xca\x02\x979' \
       ...: b'\xb5\x19\xa0\xb9\xca\x8c\x969\xfb\x89\x8e\xb9' \
       ...: b'\nj\xb19\x81\x18\x84\xb9\x95j\xb19\x81\x18\x84' \
       ...: b'\xb9\x95j\xb19\x81\x18\x84\xb9\x95j\xb19\x81\x18' \
       ...: b'\x84\xb9\x95j\xb19\x81\x18\x84\xb9\x95'
       ...:
    In[16]: from hexdump import hexdump_gen
    In[17]: for i in hexdump_gen(byte_string, n=32, sep=''):
       ...:     print(i)
       ...:
    00000020 8c 96 39 fb 89 8e b9 0a 6a b1 39 81 18 84 b9 95 |..9.....j.9.....|
    00000030 6a b1 39 81 18 84 b9 95 6a b1 39 81 18 84 b9 95 |j.9.....j.9.....|
    00000040 6a b1 39 81 18 84 b9 95 6a b1 39 81 18 84 b9 95 |j.9.....j.9.....|

Tested in CPython 3.6 on Windows 10.
Solution
I think your hexdump implementation looks pretty good. I have no immediate comments on the implementation. I will however comment on the implied requirements.
Hex Dumper Definition
Most hex dumpers that I am familiar with start each row at an address that is a multiple of the stride length. Your example output looks that way, but only because it uses n=32, and 32 is an exact multiple of the stride length (16). If you pass in a different stride length, or an n that is not a multiple of the stride, the output doesn't (to my eye) look quite as nice.

So I suggest you consider adding another parameter (let's call it base_addr), which is the address of the beginning of the byte array. Then also consider adding fill at the beginning of the dump so that every row starts on a multiple of the stride length, such that:

    hexdump_gen(byte_string, base_addr=1, n=1, sep='')

Would produce:

    00000000       9a 39 81 c2 b5 b9 ce 02 97 39 b5 19 a0 b9 |  .9.......9....|
    00000010 ca 02 97 39 b5 19 a0 b9 ca 02 97 39 b5 19 a0 b9 |...9.......9....|
    00000020 ca 8c 96 39 fb 89 8e b9 0a 6a b1 39 81 18 84 b9 |...9.....j.9....|
    00000030 95 6a b1 39 81 18 84 b9 95 6a b1 39 81 18 84 b9 |.j.9.....j.9....|
    00000040 95 6a b1 39 81 18 84 b9 95 6a b1 39 81 18 84 b9 |.j.9.....j.9....|
    00000050 95                                              |.               |

One way that could be done:

    def hexdump_gen(byte_string, _len=16, base_addr=0, n=0, sep='-'):
        not_shown = ['  ']
        leader = (base_addr + n) % _len
        next_n = n + _len - leader
        while byte_string[n:]:
            col0 = format(n + base_addr - leader, '08x')
            col1 = not_shown * leader
            col2 = ' ' * leader
            leader = 0
            for i in bytearray(byte_string[n:next_n]):
                col1 += [format(i, '02x')]
                col2 += chr(i) if 31 < i < 127 else '.'
            trailer = _len - len(col1)
            if trailer:
                col1 += not_shown * trailer
                col2 += ' ' * trailer
            col1.insert(_len // 2, sep)
            yield FMT.format(col0, ' '.join(col1), col2)
            n = next_n
            next_n += _len

Symmetric parameters
The n parameter is an offset into the byte array that specifies where the dump starts, but there is no equivalent end offset. So currently the dumper always runs to the end of the byte array. From a symmetry perspective, it would seem a good idea to also provide a terminal condition.
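
A minimal sketch of such a terminal condition, under a few assumptions of mine: the parameter name end is hypothetical (it is not in the original code), it is treated as an exclusive stop offset like the stop of a slice, and None means "to the end of the data". The start and stop are normalized with slice.indices so that out-of-range or negative values behave as they would in byte_string[n:end]. The single-space FMT is copied from the question as it appears here.

```python
FMT = '{} {} |{}|'


def hexdump_gen(byte_string, _len=16, base_addr=0, n=0, end=None, sep='-'):
    """Yield aligned hexdump rows for byte_string[n:end].

    `end` is a hypothetical parameter added for illustration: an
    exclusive stop offset, where None means "dump to the end".
    """
    data = bytearray(byte_string)
    # Normalize start/stop exactly as slicing would (clamps and negatives).
    n, end, _ = slice(n, end).indices(len(data))
    not_shown = ['  ']                  # two blanks stand in for one hex byte
    leader = (base_addr + n) % _len     # blank cells before the first byte
    next_n = n + _len - leader
    while n < end:
        col0 = format(n + base_addr - leader, '08x')
        col1 = not_shown * leader
        col2 = ' ' * leader
        leader = 0
        # Never read past the requested stop offset.
        for i in data[n:min(next_n, end)]:
            col1.append(format(i, '02x'))
            col2 += chr(i) if 31 < i < 127 else '.'
        trailer = _len - len(col1)
        col1 += not_shown * trailer
        col2 += ' ' * trailer
        col1.insert(_len // 2, sep)
        yield FMT.format(col0, ' '.join(col1), col2)
        n = next_n
        next_n += _len
```

With this sketch, hexdump_gen(byte_string, n=32, end=48, sep='') would yield only the single row starting at address 00000020, and leaving end at its default keeps the current behavior of dumping to the end of the data.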
Context
StackExchange Code Review Q#161616, answer score: 3