HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavascriptMinor

Sieve of Eratosthenes performance; Scala very slow compared to Node.js

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
nodesieveslowscalaperformanceverycomparederatosthenes

Problem

I am new to Scala so it might show. I am learning it for one of my classes and have to write a performance benchmark. I have written the Sieve of Eratosthenes in both Node.js and Scala, and my Node.js version runs much much quicker for n larger than 100,000. At n = 100,000 the Scala version runs in around a minute where as the Node.js one runs in milliseconds.

As far as I know the Scala version is implemented correctly as it finds the first few primes correctly.

The big disparity in time leads me to believe it's an algorithmic issue and the two pieces of code likely have different time complexities, but looking at the two pieces of code I don't see any stark differences other than iterating the list a couple extra times in the scala version (with the zipIndex and mapping).

Any help would be appreciated.

This is my Scala method

// Sieve of Eratosthenes
// pseudo code from https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
def sieve (n: Integer) = {
// Let A be an array of Boolean values, indexed by integers 2 to n,
// initially all set to true.
var A = new ListBuffer[Boolean]()
var i = 2
for (i b }.map{ case (b, i) => i + 2 };
}


This is my Node.js code:

let _ = require('lodash');

// Sieve of Eratosthenes
// https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

function int (str, base=10) {
return parseInt(str, base);
}

// Input: an integer n > 1.
let n = int(process.argv[2]);

// Let A be an array of Boolean values, indexed by integers 2 to n,
// initially all set to true.
let A = _.range(n+1)
A[0] = false
A[1] = false

// for i = 2, 3, 4, ..., not exceeding √n:
// if A[i] is true:
// for j = i2, i2+i, i2+2i, i2+3i, ..., not exceeding n:
// A[j] := false.
for(let i = 2; i el);

Solution

Lists vs. Arrays

The Scala version of your code ran surprisingly slower because you used a ListBuffer. In Scala you can think of a List as a traditional linked-list. Lists don't provide random access to their contents. That is, accessing an object at index n requires O(n) time. This really slowed down your code because you index into A in both your inner and outer for-loop.

To fix this problem you can simply use a different data structure like an Array.

object Sieve extends App {
  val n = 100000
  val a = Array.fill(n + 1)(true)
  a(0)  = false
  a(1)  = false
  val k = scala.math.sqrt(n).toInt

  for (i <- 2 to k; if a(i)) 
    for (j <- i * i to n by i)
      a(j) = false

  val res = a.zipWithIndex.filter(_._1).map(_._2)
  assert(res.length == 9,592)
}


Other Notes

  • _._1 represents accessing the first element in a tuple



  • Tuples are one indexed



  • So if we have a tuple t we can access the 5th element with t._5



  • The first underscore is what is known as placeholder syntax. For example:



val a = List.range(0, 9)
    val b = a.map(i => i + 1) 
    val c = a.map(_ + 1)       // placeholder syntax


  • In the above code snippet b and c are equivalent



  • When we are using a map operation on a List of tuples we can combine the tuple element access operator (t._N) and placeholder syntax. As a result we get code that looks something like:



val res = a.zipWithIndex.filter(_._1).map(_._2)

Code Snippets

object Sieve extends App {
  val n = 100000
  val a = Array.fill(n + 1)(true)
  a(0)  = false
  a(1)  = false
  val k = scala.math.sqrt(n).toInt

  for (i <- 2 to k; if a(i)) 
    for (j <- i * i to n by i)
      a(j) = false

  val res = a.zipWithIndex.filter(_._1).map(_._2)
  assert(res.length == 9,592)
}
val a = List.range(0, 9)
    val b = a.map(i => i + 1) 
    val c = a.map(_ + 1)       // placeholder syntax
val res = a.zipWithIndex.filter(_._1).map(_._2)

Context

StackExchange Code Review Q#157493, answer score: 6

Revisions (0)

No revisions yet.