pattern | python | Minor
Randomly split a vector into N bins of random lengths
Problem
As the title says, I want a function that takes a vector and a number of bins, and splits the vector into that many bins, with a minimum length of 1 for each bin.
I'm wondering if there's a cleaner way to split the vector into bins besides randomly selecting split locations. My method of splitting up the list given the location of the splits also seems pretty messy. Performance does matter, so I'm wondering if I should initialize the empty bins outside of the function.
I also don't want there to be any bias with regard to the size of the bins; they should all have the same size on average. I'm pretty sure this method isn't biased, however.
```python
import numpy as np

def split_into_bins(nbin, vector):
    """
    Randomly split vector into nbin number of bins, each of random size
    """
    permutation = list(np.random.permutation(vector))
    # Location of the splits
    splits = sorted(np.random.choice(range(1,len(vector)), nbin-1, replace=False))
    # Initializing empty bins
    bins = [[]]*nbin
    start = 0
    end = splits[0]
    for i in range(nbin):
        bins[i] = permutation[start:end]
        start = end
        try:
            end = splits[i+1]
        except IndexError:
            end = len(vector)
    return bins
```
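The claim that the bins all have the same size on average can be checked empirically. The following is a rough sketch rather than part of the original code: the helper name `random_bin_sizes`, the seed, and the values of `n`, `nbin`, and `trials` are all illustrative assumptions, and it uses `np.random.default_rng` instead of the legacy `np.random` functions. Only the bin sizes are simulated, since the permutation of the elements does not affect them.

```python
import numpy as np


def random_bin_sizes(n, nbin, rng):
    """Sizes of nbin non-empty bins made by nbin-1 distinct random cuts.

    Hypothetical helper for illustration; cut points are drawn from
    1..n-1 without replacement, as in the question's code.
    """
    cuts = np.sort(rng.choice(np.arange(1, n), size=nbin - 1, replace=False))
    edges = np.concatenate(([0], cuts, [n]))
    # Consecutive differences of the edges are the bin sizes.
    return np.diff(edges)


rng = np.random.default_rng(0)  # arbitrary fixed seed for repeatability
n, nbin, trials = 20, 4, 20000  # illustrative values only
sizes = np.array([random_bin_sizes(n, nbin, rng) for _ in range(trials)])
mean_sizes = sizes.mean(axis=0)
print(mean_sizes)  # each entry should be close to n / nbin
```

If the method is unbiased, the printed averages should all sit near `n / nbin`, with no systematic difference between the first, middle, and last bins.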
Solution
`np.random.choice` can take an int directly, which simplifies your code: there is no need to build up a `range()`. Keep `replace=False`, as in your original, so that the split points are distinct and no bin ends up empty:

```python
splits = sorted(1 + x
                for x in np.random.choice(len(vector)-1, nbin-1, replace=False))
```

When you build up the bins, you don't need a `try`/`except` (you know up front how many bins there are). You just need to iterate over the actual splits, then append whatever remains after the last one:

```python
bins = []
last = 0
for split in splits:
    bins.append(permutation[last:split])
    last = split
bins.append(permutation[last:])
```

Code Snippets
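Combining the two suggestions from the answer into one function, a minimal sketch might look like the following. It assumes sampling without replacement (as in the question's original code) so that every bin holds at least one element, and it slices the final bin from `last` so that `nbin == 1` also works. The `ValueError` guard is an addition that does not appear in the original.

```python
import numpy as np


def split_into_bins(nbin, vector):
    """Randomly split vector into nbin non-empty bins of random size."""
    # Assumed guard (not in the original): each bin needs at least one element.
    if not 1 <= nbin <= len(vector):
        raise ValueError("nbin must be between 1 and len(vector)")
    permutation = list(np.random.permutation(vector))
    # Distinct cut points in 1..len(vector)-1, so no bin is empty.
    splits = sorted(1 + x for x in
                    np.random.choice(len(vector) - 1, nbin - 1, replace=False))
    bins = []
    last = 0
    for split in splits:
        bins.append(permutation[last:split])
        last = split
    # The final bin takes everything after the last cut point.
    bins.append(permutation[last:])
    return bins


example_bins = split_into_bins(3, list(range(10)))
print([len(b) for b in example_bins])  # three positive sizes summing to 10
```

As a possible further simplification, `np.split` accepts a list of split indices, so the loop could likely be replaced by a single call on the permuted array; that variant returns NumPy arrays rather than lists.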
```python
splits = sorted(1 + x
                for x in np.random.choice(len(vector)-1, nbin-1, replace=False))
```

```python
bins = []
last = 0
for split in splits:
    bins.append(permutation[last:split])
    last = split
bins.append(permutation[last:])
```

Context
StackExchange Code Review Q#110935, answer score: 5