Resizing a discrete uniform CSRNG distribution
Problem
I have a requirement to generate a uniform distribution of cryptographically secure random numbers. To generate these numbers I only know of the RNGCryptoServiceProvider class, which has the method GetBytes(byte[] data). This is good, but I then have to change this discrete uniform distribution, ranging over 0 to 255, into a different discrete uniform distribution ranging over 0 to limit - 1.

And so I thought of three ways to do this. The first two ways were using % and /, which don't work if limit is not a divisor of 256, as this leads to a non-uniform distribution. (For example, with limit = 10 the residues 0 through 5 each arise from 26 of the 256 byte values, while 6 through 9 arise from only 25.) Filtering out all numbers that are not less than the limit does give a uniform distribution, but it has terrible performance when limit is a small number, because most bytes are rejected (for limit = 10, about 246 of every 256 bytes are thrown away).

I'm currently using:

```
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Endless stream of a single (dummy) value; used to drive SelectMany below.
private static IEnumerable<object> Cycle(object value = null)
{
    while (true)
    {
        yield return value;
    }
}

// Endless stream of cryptographically secure random bytes, fetched chunkSize at a time.
private static IEnumerable<byte> RandomBytes(int chunkSize)
{
    return Cycle().SelectMany(unused =>
    {
        var b = new byte[chunkSize];
        new RNGCryptoServiceProvider().GetBytes(b);
        return b;
    });
}

// Rejection sampling: keep only bytes below the limit.
private static IEnumerable<int> RandomFilter(int chunkSize, int limit)
{
    return RandomBytes(chunkSize).Where(n => n < limit).Select(n => (int)n);
}
```

Since I tried using % and /, below are those two solutions, together with the code I used to check whether a distribution is uniform.

```
private static IEnumerable<int> RandomModulo(int chunkSize, int limit)
{
    return RandomBytes(chunkSize).Select(n => n % limit);
}

private static IEnumerable<int> RandomResize(int chunkSize, int limit)
{
    var denom = 256.0 / limit;
    return RandomBytes(chunkSize).Select(n => (int)(n / denom));
}

private static void TestDistribution(int limit, int amount, int chunkSize, Func<int, int, IEnumerable<int>> fn)
{
    var counter = new Dictionary<int, int>();
    foreach (var i in Enumerable.Range(0, limit))
    {
        counter[i] = 0;
    }
    foreach (var i in fn(chunkSize, limit).Take(amount))
    {
        counter[i] += 1;
    }
    // Print how often each value occurred.
    Console.WriteLine(string.Join(", ", counter.Select(kv => kv.Key + ": " + kv.Value)));
}
```
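For illustration, here is how the checker above can be driven; this driver is not part of the original question and the sample sizes are arbitrary:

```
// Hypothetical driver (not from the original post): tally 100,000 samples
// for limit = 10 with each strategy. With % and /, some of the ten buckets
// receive 26/256 of the mass and others 25/256, a bias large enough to see
// at this sample size; RandomFilter's buckets all land close to 10,000.
TestDistribution(10, 100000, 256, RandomModulo);
TestDistribution(10, 100000, 256, RandomResize);
TestDistribution(10, 100000, 256, RandomFilter);
```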
Solution
There are things you can do to improve the efficiency of your filter solution for small values of limit.

More efficient filtering

The most obvious improvement is actually given in the example code on the MSDN page for RNGCryptoServiceProvider: extend the accepted range up to the highest multiple of limit below 256, and then take the modulo. I.e. if the limit is 20, accept bytes below 240; the modulo of an accepted byte is still uniformly distributed.

Copied from that page:
```
private static bool IsFairRoll(byte roll, byte numSides)
{
    // There are MaxValue / numSides full sets of numbers that can come up
    // in a single byte. For instance, if we have a 6 sided die, there are
    // 42 full sets of 1-6 that come up. The 43rd set is incomplete.
    int fullSetsOfValues = Byte.MaxValue / numSides;

    // If the roll is within this range of fair values, then we let it continue.
    // In the 6 sided die case, a roll between 0 and 251 is allowed. (We use
    // < rather than <= since the = portion allows through an extra 0 value).
    // 252 through 255 would provide an extra 0, 1, 2, 3 so they are not fair
    // to use.
    return roll < numSides * fullSetsOfValues;
}
```
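To connect this back to the generators in the question, a minimal sketch follows; the RandomFairFilter name and its wiring are introduced here for illustration, assuming the RandomBytes method defined in the question:

```
// Hypothetical helper: rejection sampling that discards only the incomplete
// top range identified by IsFairRoll, then reduces with modulo.
// For numSides = 20 this rejects 16 of every 256 bytes on average,
// versus the 236 of 256 that RandomFilter rejects.
private static IEnumerable<int> RandomFairFilter(int chunkSize, byte numSides)
{
    return RandomBytes(chunkSize)
        .Where(b => IsFairRoll(b, numSides))
        .Select(b => b % numSides);
}
```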
Statistically insignificant errors

You have another alternative too, which is to reduce the modulo bias to a statistically insignificant amount. You do this by drawing significantly bigger random numbers (bigger than one byte), combining multiple source bytes for each output value; for example, using 4 bytes. Using more bytes would further reduce the bias:
```
/* Convert 4 bytes to a long */
private static IEnumerable<long> RandomLongs()
{
    var csrng = new RNGCryptoServiceProvider();
    return Cycle().Select(unused =>
    {
        var bytes = new byte[4];
        csrng.GetBytes(bytes);
        return (long)BitConverter.ToUInt32(bytes, 0);
    });
}

private static IEnumerable<int> RandomExtended(int chunkSize, int limit)
{
    // chunkSize is unused here; it is kept so the signature matches the other generators.
    foreach (long lng in RandomLongs())
    {
        var nxt = lng % limit;
        yield return (int)nxt;
    }
}
```

Using 4 bytes of source for your modulo (with max limit 256) will reduce the bias error to about 1 in \$2^{24}\$ ... which would be hard to measure.
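To see where the \$2^{24}\$ figure comes from: there are \$N = 2^{32}\$ equally likely 4-byte values, and for a limit \$L \le 256\$ each residue class modulo \$L\$ is hit either \$\lfloor N/L \rfloor\$ or \$\lfloor N/L \rfloor + 1\$ times. The most and least likely outputs therefore differ in probability by at most about one part in \$N/L \ge 2^{32}/2^{8} = 2^{24}\$.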
This version is also shown here on ideone: https://ideone.com/maiQ70
Removed
On the other hand, there's a better way that still guarantees a uniform distribution, but it relies on knowing what the previous value was. Conceptually, this solution relies on having uniform "gaps" between results: you add a uniformly random gap to the previous value and then take the modulo of that.
I implemented it as:
```
private static IEnumerable<int> RandomWrap(int chunkSize, int limit)
{
    int prev = 0;
    foreach (byte b in RandomBytes(chunkSize))
    {
        prev = (prev + b) % limit;
        yield return prev;
    }
}
```

Note that it takes the byte value as a "gap", adds it to the previous result, and then "wraps" it around within the limit range.

You can see the results here on ideone: https://ideone.com/BEbwsA
Context
StackExchange Code Review Q#157042, answer score: 3