HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Kosaraju in Scala

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
scalakosarajustackoverflow

Problem

I started coding in Scala some time ago, and also learning some fundamental algorithms in CS.

Here's a terribly slow implementation of Kosaraju algorithm to find strongly connected components in a graph.

I'm looking for three things:

  • Learning how to use Scala data structures



  • Learning how to implement this algorithm in \$O(m+n)\$ time



  • Learning more about Scala's best practices, and how to implement them.



The input style is the following. The first number is a vertex, the second a directed connection in the graph:

1 2
2 3
3 1
3 4
5 4
6 4
8 6
6 7
7 8


Original implementation:

```
import scala.io.Source
import util.Random.nextInt
import scala.collection.{mutable, immutable}
import scala.runtime.ScalaRunTime._
import scala.util.control.Breaks._

object Kosaraju {
val usage = """
Usage: scala Kosaraju.scala [filename]
"""

var t: Int = 0
var s: Int = 0
var scc: Int = 0
var finish_list: mutable.HashMap[Int, Int] = mutable.HashMap[Int,Int]()
var explored_list: List[Int] = List()
var leader: mutable.HashMap[Int,mutable.Buffer[Int]] = mutable.HashMap[Int,mutable.Buffer[Int]]()
var scc_map: mutable.HashMap[Int,mutable.Buffer[Int]] = mutable.HashMap[Int,mutable.Buffer[Int]]()

def main (args: Array[String]) {
if (args.length != 1) {
println(usage)
return
}
val filename = args.toList(0)
val edges: mutable.Buffer[Array[Int]] =
Source.fromFile(filename).getLines().map(_.split(" ").map(_.toInt)).toBuffer

//edges.foreach(e => println(e(0).toString + ' ' + e(1).toString))

var double_adj_list: mutable.HashMap[Int, mutable.Buffer[mutable.Buffer[Int]]] =
mutable.HashMap[Int,mutable.Buffer[mutable.Buffer[Int]]]()

println("Building double adjacency list....")

edges.foreach { e =>
if (! double

Solution

The first thing that I see is a fundamental issue that would be a problem regardless of the programming language used - long procedures.

The folks who study the human aspects of software development have strong evidence suggesting that readability is critical to making code easier to understand and therefore maintain. One of the critical factors in readability is length - longer procedures are harder to read/understand.

One good way to improve readability is to extract out parts of a long method that do one specific thing.

Take, for example, these lines from the beginning of the main function:

if (args.length != 1) {
  println(usage)
  return
}
val filename = args.toList(0)
val edges: mutable.Buffer[Array[Int]] = 
  Source.fromFile(filename).getLines().map(_.split(" ").map(_.toInt)).toBuffer


These lines load the edges from a file. Yes, they also validate the parameter list as part of that, but mainly they load the edges. So you could extract that into a function that looked something like this.

def loadEdges(args: Array[String]): Option[mutable.Buffer[Array[Int]]] = {
  if (args.length == 1) { 
      val filename = args(0)
      Source.fromFile(filename).getLines().map(_.split(" ").map(_.toInt)).toBuffer
  } else {
      println(usage)
      None
  }
}


As a learning exercise, you should go through the rest of the code looking for other blocks that do one thing.

So in main, the beginning might look something like this:

def main (args: Array[String]) {
  val edges = loadEdges(args) match {
    case None => return
    case Some(e) => e
  }


Someone reading this can see that the first thing that happens is that the edges are loaded. It says so right there. Much clearer.

I did make edges a val rather than a var, because it is never updated, and I left off the type because it's kind of long and ugly and I'm not entirely convinced that putting it in actually says anything useful. Some organizations have coding standards that require every variable to be fully defined so Your Mileage May Vary.

You could also use a for comprehension, especially if main gets significantly shorter. That would look something like this:

def main(args: Array[String]) {
  for (edges <- loadEdges(args)) {
    // the rest of the algorithm goes here
    // edges is a "variable" defined only within the scope of the "loop"
  }
}

Code Snippets

if (args.length != 1) {
  println(usage)
  return
}
val filename = args.toList(0)
val edges: mutable.Buffer[Array[Int]] = 
  Source.fromFile(filename).getLines().map(_.split(" ").map(_.toInt)).toBuffer
def loadEdges(args: Array[String]): Option[mutable.Buffer[Array[Int]]] = {
  if (args.length == 1) { 
      val filename = args(0)
      Source.fromFile(filename).getLines().map(_.split(" ").map(_.toInt)).toBuffer
  } else {
      println(usage)
      None
  }
}
def main (args: Array[String]) {
  val edges = loadEdges(args) match {
    case None => return
    case Some(e) => e
  }
def main(args: Array[String]) {
  for (edges <- loadEdges(args)) {
    // the rest of the algorithm goes here
    // edges is a "variable" defined only within the scope of the "loop"
  }
}

Context

StackExchange Code Review Q#101339, answer score: 13

Revisions (0)

No revisions yet.