snippetpythonMinor
Quasi-random sequences: how to improve style and tests?
Viewed 0 times
quasirandomtestsimprovestylehowandsequences
Problem
The requirements for this one were (original SO question):
The solution was actually drafted by another user, this is my implementation (influenced by random.shuffle).
I'd like to receive any input you have on improving this code, from style to comments to tests and anything else you can think of.
- Generate a random-ish sequence of items.
- Sequence should have each item N times.
- Sequence shouldn't have serial runs longer than a given number (longest below).
The solution was actually drafted by another user, this is my implementation (influenced by random.shuffle).
from random import random
from itertools import groupby # For testing the result
try: xrange
except: xrange = range
def generate_quasirandom(values, n, longest=3, debug=False):
# Sanity check
if len(values) = longest:
serial = 0
guard = 0
# We got a serial run, break it
while source[j] == latest:
j = int(random() * (sourcelen - i)) + i
guard += 1
# We just hit an infinit loop: there is no way to avoid a serial run
if guard > 10:
print("Unable to avoid serial run, disabling asserts.")
debug = False
break
else:
serial = 0
latest = source[j]
# Move the picked value to source[i:]
source[i], source[j] = source[j], source[i]
# More sanity checks
check_quasirandom(source, values, n, longest, debug)
return source
def check_quasirandom(shuffled, values, n, longest, debug):
counts = []
# We skip the last entries because breaking runs in them get too hairy
for val, count in groupby(shuffled):
counts.append(len(list(count)))
highest = max(counts)
print('Longest run: %d\nMax run lenght:%d' % (highest, longest))
# Invariants
assert len(shuffled) == len(values) * n
for val in values:
assert shuffled.count(val) == n
if debug:
# Only checked if we were able to avoid a sequential run >= longest
assert highest <= longest
for x in xrange(10, 1000):
generate_quasirandom((0, 1, 2, 3), 1000, x//10, debug=True)I'd like to receive any input you have on improving this code, from style to comments to tests and anything else you can think of.
Solution
A couple of possible code improvements that I noticed:
This seems unnecessarily complicated to me. I mean, after a second of thinking the reader of this line will realize that
That being said, I don't see why you need to store the length of
This can be written as a list comprehension like this:
Using list comprehensions is usually considered more idiomatic python than building up a list iteratively. It is also often more efficient.
As Fred Nurk points out, the list comprehension can also be written as
which avoids the creation of a temporary list and is maybe a bit more readable.
To get a random integer between
(You'll also need to import
sourcelen = len(values) * nThis seems unnecessarily complicated to me. I mean, after a second of thinking the reader of this line will realize that
len(values) * n is indeed the length of source, but that's still one step of thinking more than would be required if you just did sourcelen = len(source) (after populating source of course).That being said, I don't see why you need to store the length of
source in a variable at all. Doing for i in xrange(len(source)): isn't really less readable or less efficient than doing for i in xrange(sourcelen):, so I'd just get rid of the variable altogether.source = []
for val in values:
source += [val] * nThis can be written as a list comprehension like this:
source = [x for val in values for x in [val]*n]Using list comprehensions is usually considered more idiomatic python than building up a list iteratively. It is also often more efficient.
As Fred Nurk points out, the list comprehension can also be written as
source = [val for val in values for _ in xrange(n)]which avoids the creation of a temporary list and is maybe a bit more readable.
j = int(random() * (sourcelen - i)) + iTo get a random integer between
x (inclusive) and y (exclusive), you can use random.randrange(x,y), so the above can be written as:j = randrange(i, len(source) - i)(You'll also need to import
randrange instead of random from the random module). This makes it more immediately obviously that j is a random number between i and len(source) - i and introduces less room for mistakes.Code Snippets
sourcelen = len(values) * nsource = []
for val in values:
source += [val] * nsource = [x for val in values for x in [val]*n]source = [val for val in values for _ in xrange(n)]j = int(random() * (sourcelen - i)) + iContext
StackExchange Code Review Q#219, answer score: 4
Revisions (0)
No revisions yet.