patternpythonMinor
Sieve of Eratosthenes: making it quicker
Viewed 0 times
makingquickersieveeratosthenes
Problem
I was thinking about doing problem 23 of Project Euler. It includes the difficulty where you have to factorize primes. I had already done that in problems I solved earlier, but it was only necessary to have a prime list until a relative small number.
So I had already implemented the algorithm when I took a look at this problem. I stored numbers in a list, flagged the ones that weren't prime numbers and removed them from the list, al of this in Python. It worked well for small lists, but when I went big, the memory usage was just too much.
I still wanted to use the algorithm of Eratosthenes, but I still needed a solution to my memory issue. So I decided, I don't know if it is the best solution, to write the list to a file and to constantly rewrite this file.
I succeeded in making this algorithm, but it's not really fast. To find primes up to the number 20 000 it took my program 405 seconds, or 6'45".
Is there any way I can improve this implementation? No file rewriting, cashing data in the RAM, etc.?
SieveCash.py (Python 3)
```
import tempfile
import os
import sys
def createSieve(outputFile, maxp, logging=0):
"""Creates a sieve and stores one prime per line in file outputFile."""
# Write a list of all numbers to check to a file
myNumFile = createBeginNumberList(maxp, outputFile)
# Do the looping
rfh = open(myNumFile, "r")
try:
totalRead = 0
for i in range(maxp+1):
# log if necessary
if logging > 0 and i % logging == 0:
print("Checking number", i, "(" + str(int(i/(maxp+1)*100)) + "%)")
# do thingy
bytesRead, myNum = readNumber(rfh)
if bytesRead == 0:
break
totalRead += bytesRead
if myNum != 0:
rfh.close()
rfh = None
rewriteFileWithoutMultiples(myNumFile, myNum, totalRead)
rfh = open(myNumFile, "r")
rfh.seek(totalRead)
So I had already implemented the algorithm when I took a look at this problem. I stored numbers in a list, flagged the ones that weren't prime numbers and removed them from the list, al of this in Python. It worked well for small lists, but when I went big, the memory usage was just too much.
I still wanted to use the algorithm of Eratosthenes, but I still needed a solution to my memory issue. So I decided, I don't know if it is the best solution, to write the list to a file and to constantly rewrite this file.
I succeeded in making this algorithm, but it's not really fast. To find primes up to the number 20 000 it took my program 405 seconds, or 6'45".
Is there any way I can improve this implementation? No file rewriting, cashing data in the RAM, etc.?
SieveCash.py (Python 3)
```
import tempfile
import os
import sys
def createSieve(outputFile, maxp, logging=0):
"""Creates a sieve and stores one prime per line in file outputFile."""
# Write a list of all numbers to check to a file
myNumFile = createBeginNumberList(maxp, outputFile)
# Do the looping
rfh = open(myNumFile, "r")
try:
totalRead = 0
for i in range(maxp+1):
# log if necessary
if logging > 0 and i % logging == 0:
print("Checking number", i, "(" + str(int(i/(maxp+1)*100)) + "%)")
# do thingy
bytesRead, myNum = readNumber(rfh)
if bytesRead == 0:
break
totalRead += bytesRead
if myNum != 0:
rfh.close()
rfh = None
rewriteFileWithoutMultiples(myNumFile, myNum, totalRead)
rfh = open(myNumFile, "r")
rfh.seek(totalRead)
Solution
Writing to disk is going to be really slow. Don't do that.
If you are running out of memory, try using the standard array module or the numpy library. Both of those provide efficient arrays for holding large numbers of values. Your only running out of memory because python's lists don't store their values in a memory efficient manner.
If you are running out of memory for some reason even taking that into account, post that code here and we'll be glad to point out how to improve the memory usage.
If you are running out of memory, try using the standard array module or the numpy library. Both of those provide efficient arrays for holding large numbers of values. Your only running out of memory because python's lists don't store their values in a memory efficient manner.
If you are running out of memory for some reason even taking that into account, post that code here and we'll be glad to point out how to improve the memory usage.
Context
StackExchange Code Review Q#6477, answer score: 3
Revisions (0)
No revisions yet.