How can I optimize this Monte Carlo simulation running at 10,000,000 iterations?
Problem
I am writing this Monte Carlo simulation, and it becomes slow when I run it at 10,000,000 iterations. Any suggestions or criticism would be appreciated. Here is the code:
import random as rnd
from time import time
#get user input on number of iterations
numOfIterations = raw_input('Enter the number of iterations: ')
numOfIterations = int(numOfIterations)
start = time()
#initialize bag (44 green, 20 blue, 15 yellow, 11 red, 2 white, 1 black
#a counter
#and question counter
bag = 44*'g'+ 20*'b' + 15*'y' + 11*'r' + 2*'w' + 'k'
counter = {'g':0, 'b':0,'y':0 ,'r':0,'w':0,'k':0}
question = {'one':0,'two':0,'three':0,'four':0,'five':0}
for i in range(0,numOfIterations):
    for j in xrange(0,5):
        draw = rnd.sample(bag,5)
        for x in draw: counter[x]+=1
    if counter['w'] >0 and counter['k'] >0: question['one']+=1
    if counter['b'] > counter['r']: question['two']+=1
    if counter['b'] > counter['y']: question['three']+=1
    if counter['y'] > counter['r']: question['four']+=1
    if counter['g'] < (counter['b']+counter['y']+counter['r']+counter['w']+counter['k']): question['five']+=1
    for k in counter: counter[k] = 0
p1 = float(question['one'])/float(numOfIterations)
p2 = float(question['two'])/float(numOfIterations)
p3 = float(question['three'])/float(numOfIterations)
p4 = float(question['four'])/float(numOfIterations)
p5 = float(question['five'])/float(numOfIterations)
print 'Q1 \t Q2 \t Q3 \t Q4 \t Q5'
print str(p1)+'\t'+str(p2)+'\t'+str(p3)+'\t'+str(p4)+'\t'+str(p5)
end = time()
print 'it took ' +str(end-start)+ ' seconds'
Solution
import random as rnd

I dislike abbreviations like this; they make the code harder to read.

from time import time
#get user input on number of iterations
numOfIterations = raw_input('Enter the number of iterations: ')
numOfIterations = int(numOfIterations)

Any reason you didn't combine these two lines?

start = time()
#initialize bag (44 green, 20 blue, 15 yellow, 11 red, 2 white, 1 black
#a counter
#and question counter
bag = 44*'g'+ 20*'b' + 15*'y' + 11*'r' + 2*'w' + 'k'
counter = {'g':0, 'b':0,'y':0 ,'r':0,'w':0,'k':0}
question = {'one':0,'two':0,'three':0,'four':0,'five':0}

Looking up your data by string keys all the time is going to be somewhat slower. Instead, I'd suggest you keep lists and store the data that way.

for i in range(0,numOfIterations):

Given that numOfIterations will be very large, it's probably a good idea to use xrange here, since range builds the whole list in memory before the loop starts.

    for j in xrange(0,5):

You should generally put logic inside a function. That is especially true for any sort of loop, as it will run faster in a function: local variable lookups are cheaper than global ones.

        draw = rnd.sample(bag,5)
        for x in draw: counter[x]+=1

I dislike putting the contents of the loop on the same line. I think it makes it harder to read.

    if counter['w'] >0 and counter['k'] >0: question['one']+=1
    if counter['b'] > counter['r']: question['two']+=1
    if counter['b'] > counter['y']: question['three']+=1
    if counter['y'] > counter['r']: question['four']+=1
    if counter['g'] < (counter['b']+counter['y']+counter['r']+counter['w']+counter['k']): question['five']+=1
    for k in counter: counter[k] = 0
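The comments so far (put the loop in a function, index lists rather than string-keyed dicts, avoid building a huge list with range) might be combined roughly as follows. This is only a sketch, written for Python 3 although the code under review is Python 2. The names simulate, BAG and hits are made up for illustration, and it assumes each trial is five draws of five marbles with the checks run once per trial.

```python
import random

# Index positions for each colour; these replace the string-keyed dict.
GREEN, BLUE, YELLOW, RED, WHITE, BLACK = range(6)
BAG = [GREEN] * 44 + [BLUE] * 20 + [YELLOW] * 15 + [RED] * 11 + [WHITE] * 2 + [BLACK]


def simulate(num_iterations, rng=random):
    """Return the estimated probabilities for the five questions as a list."""
    hits = [0] * 5
    for _ in range(num_iterations):  # range is lazy in Python 3 (xrange in Python 2)
        counts = [0] * 6
        # Assumed structure: five independent draws of five marbles per trial.
        for _ in range(5):
            for colour in rng.sample(BAG, 5):
                counts[colour] += 1
        if counts[WHITE] > 0 and counts[BLACK] > 0:
            hits[0] += 1
        if counts[BLUE] > counts[RED]:
            hits[1] += 1
        if counts[BLUE] > counts[YELLOW]:
            hits[2] += 1
        if counts[YELLOW] > counts[RED]:
            hits[3] += 1
        if counts[GREEN] < sum(counts) - counts[GREEN]:
            hits[4] += 1
    # In Python 3, / is true division, so no float() conversion is needed.
    return [h / num_iterations for h in hits]
```

Creating a fresh counts list per trial also removes the need for the reset loop at the end of each iteration.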
p1 = float(question['one'])/float(numOfIterations)
p2 = float(question['two'])/float(numOfIterations)
p3 = float(question['three'])/float(numOfIterations)
p4 = float(question['four'])/float(numOfIterations)
p5 = float(question['five'])/float(numOfIterations)

Don't create five separate variables; create a list. Also, if you add the line from __future__ import division at the beginning of the file, then dividing two ints will produce a float, and you don't need to convert them to floats here.

print 'Q1 \t Q2 \t Q3 \t Q4 \t Q5'
print str(p1)+'\t'+str(p2)+'\t'+str(p3)+'\t'+str(p4)+'\t'+str(p5)

See, if you had made p a list, this would be much easier.

end = time()
print 'it took ' +str(end-start)+ ' seconds'

For speed improvements you want to look at using numpy. It allows implementing efficient operations over arrays.
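One way the numpy suggestion could look is sketched below. It is not a tuned implementation. It assumes Python 3 and NumPy 1.18 or later (for Generator.multivariate_hypergeometric), and that each trial is five draws of five marbles without replacement. The function name simulate_numpy and the chunk parameter are invented for the example; chunking is there only to keep memory bounded at 10,000,000 trials.

```python
import numpy as np

COLOURS = [44, 20, 15, 11, 2, 1]  # green, blue, yellow, red, white, black
G, B, Y, R, W, K = range(6)


def simulate_numpy(num_iterations, seed=None, chunk=100_000):
    """Estimate the five probabilities by processing trials in array-sized chunks."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(5, dtype=np.int64)
    done = 0
    while done < num_iterations:
        n = min(chunk, num_iterations - done)
        # Shape (n, 5, 6): colour counts for five draws of five marbles per trial.
        draws = rng.multivariate_hypergeometric(COLOURS, 5, size=(n, 5))
        c = draws.sum(axis=1)  # shape (n, 6): totals over the 25 marbles of a trial
        hits[0] += np.count_nonzero((c[:, W] > 0) & (c[:, K] > 0))
        hits[1] += np.count_nonzero(c[:, B] > c[:, R])
        hits[2] += np.count_nonzero(c[:, B] > c[:, Y])
        hits[3] += np.count_nonzero(c[:, Y] > c[:, R])
        # Green is outnumbered when it accounts for fewer than half of the 25 marbles.
        hits[4] += np.count_nonzero(c[:, G] < 25 - c[:, G])
        done += n
    return hits / num_iterations
```

Sampling colour counts directly, instead of sampling marbles and counting them afterwards, is what lets the per-trial Python loop disappear.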
In this precise case I'd use a multinomial distribution and solve the problem analytically rather than using Monte Carlo.
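As a small illustration of the analytical route, the first question (at least one white and at least one black marble) can be computed exactly. Note that this sketch does not use a multinomial: because each draw of five is without replacement, it uses hypergeometric-style counting plus inclusion-exclusion, which is my reading of the setup rather than necessarily the approach meant above. It assumes Python 3.8+ (for math.comb) and five independent draws of five marbles per trial; the function names are hypothetical.

```python
from math import comb

BAG_SIZE = 93   # 44 + 20 + 15 + 11 + 2 + 1
DRAW_SIZE = 5
NUM_DRAWS = 5   # assumed: five independent draws per trial


def prob_colours_absent(excluded, draws=NUM_DRAWS):
    """P(none of `excluded` specific marbles appears in any of the draws)."""
    single = comb(BAG_SIZE - excluded, DRAW_SIZE) / comb(BAG_SIZE, DRAW_SIZE)
    return single ** draws  # the draws are independent, so probabilities multiply


def prob_question_one():
    """P(at least one white AND at least one black), by inclusion-exclusion."""
    no_white = prob_colours_absent(2)   # both white marbles avoided
    no_black = prob_colours_absent(1)   # the single black marble avoided
    neither = prob_colours_absent(3)    # white and black all avoided
    return 1.0 - no_white - no_black + neither
```

Comparing prob_question_one() with the simulated value for the first question is a cheap sanity check on the simulation itself.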
Context
StackExchange Code Review Q#6311, answer score: 8