patternMinor
Bitarithmetrics to Base X
bitarithmetricsbasestackoverflow
Problem
I've got the following theoretical problem which puzzles me a bit:
I can obtain a string of n bytes (as octets, one byte = one octet = eight bits) of random data. I need to preserve the randomness while reducing the base from 256 to x where x is below 256 (and not 0, 1, 2, 4, 8, 16, 32, 64 or 128).
As I want to preserve the randomness, I don't want to cut off (waste) any information from this string until I've obtained the number of chunks I need. The reason is that randomness can be a limited resource on the computer.
I had the idea to do this for base 64, which is simple because I can create four 6-bit numbers out of every three bytes (by shifting bits, for example: encode64()). But how do I do this with a base like 254? I cannot cut off at bit boundaries here, can I?
Do I perhaps need to build a number out of the base-2 bits that is large enough to contain both bases? (This is one of the ideas I have so far.)
It would be great to get some feedback. I normally paint pictures for such problems, but I just discovered this website yesterday and I normally use Stack Overflow, so I thought I'd give it a try :D
If you're interested in some non-theoretical background to my question, see "What is the meaning of the term “simple string” for the SALT string in Unix crypt using SHA-256 and SHA-512?". It might give you an idea why I don't want to lose any information bits from the random source.
Solution
You can use "arithmetic decoding". Interpret your random data as a random bit stream which encodes a random number between $0$ and $1$. Then write this number in base $B$.
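One way to picture the arithmetic-decoding idea is the following Python sketch. It assumes a finite byte string rather than an endless stream, and the name `decode_to_base` is made up for illustration. The bytes are read as the binary expansion of a number in $[0, 1)$, and the function emits only those base-$B$ digits that are already fully determined by the bits seen so far.

```python
def decode_to_base(random_bytes: bytes, base: int) -> list[int]:
    """Emit the base-`base` digits determined by `random_bytes`.

    The n input bytes pin the underlying number down to the interval
    [N / 256**n, (N + 1) / 256**n). A digit is emitted only while the
    whole interval agrees on it.
    """
    if base < 2:
        raise ValueError("base must be at least 2")

    den = 256 ** len(random_bytes)            # common denominator
    lo = int.from_bytes(random_bytes, "big")  # interval is [lo/den, hi/den)
    hi = lo + 1

    digits = []
    while True:
        lo *= base
        hi *= base
        d_lo = lo // den
        d_hi = (hi - 1) // den                # hi is exclusive
        if d_lo != d_hi:                      # next digit not yet determined
            break
        digits.append(d_lo)
        lo -= d_lo * den                      # keep only the fractional part
        hi -= d_lo * den
    return digits
```

For example, `decode_to_base(bytes([0xAB, 0xCD]), 16)` gives `[10, 11, 12, 13]`, since base 16 lines up with bit boundaries. For a base like 254, 32 input bytes yield at most 32 digits, because $254^{33} > 256^{32}$.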
A much simpler method is "rejection sampling". Suppose for example that $128 < B < 256$. Given a random byte $x$, if $0 \leq x < B$ then output $x$, otherwise reject it and draw the next byte. If $B$ is close to $256$ then this is pretty efficient, since a byte is accepted with probability $B/256$. (To get higher efficiency, try the same trick with some power of $B$ that lies just below a power of $2$, i.e. output several digits at once.)
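The single-byte rejection step can be sketched in a few lines of Python. The function name `rejection_digits` is hypothetical, and the sketch assumes $128 < B \leq 256$ as in the example above, so each byte is either accepted as one digit or dropped.

```python
def rejection_digits(random_bytes: bytes, base: int) -> list[int]:
    """Keep each byte that is a valid base-`base` digit, drop the rest.

    Every accepted byte is uniform on 0..base-1, at the cost of
    discarding a fraction (256 - base) / 256 of the input on average.
    """
    if not 128 < base <= 256:
        raise ValueError("this sketch assumes 128 < base <= 256")
    return [b for b in random_bytes if b < base]
```

With `base = 254`, only the byte values 254 and 255 are rejected, so about 99.2% of the input bytes are used. In practice `random_bytes` would come from a source such as `os.urandom`, and more bytes are drawn until enough digits have been collected.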
Context
StackExchange Computer Science Q#13415, answer score: 6