Multi-threaded domain status checking in Java

Submitted by: @import:stackexchange-codereview

Problem

I'm tinkering with the idea of crafting a search engine in my spare time. More of a learning experience than anything at this point, but still a project. A key aspect of this system is checking whether a domain is live or not. That's what this code is trying to do (and is succeeding).

It's written in Java, and using outside classes means that the code is pre-obfuscated! Yay!

```
import java.util.ArrayList;
import java.util.Iterator;

public class BaseCheckDriver extends Thread{

    public static void main(String[] args) {
        long start = System.currentTimeMillis(), end;

        String query = (args.length > 0 && args[0].equals(false) ?
            "select * from ... where is_live is null" :
            "select * from ...");

        int numThreads = 16; // can be changed to however many

        ArrayList<Object[]> results = Database.query(query, null);

        Database.update("update ... set is_live = null where is_live is not null limit " + (results.size() + 1), null); // gets around "safe updates" and allows for easy monitoring

        // distribute results to lists
        ArrayList<ArrayList<Object[]>> listContainer = new ArrayList<ArrayList<Object[]>>();
        for(int i = 0;i<numThreads;i++) listContainer.add(new ArrayList<Object[]>());

        for(Object[] row : results){
            int addTo = 0;
            for(int i=1;i<listContainer.size();i++)
                if(listContainer.get(i).size() < listContainer.get(i - 1).size()) addTo = i;
            listContainer.get(addTo).add(row);
        }

        ArrayList threadContainer = new ArrayList();
        for(int i = 0;i ... // the rest of main() is missing from the source

    private ArrayList<Object[]> results;

    public BaseCheckDriver(ArrayList<Object[]> results){
        this.results = results;
    }

    public void run(){
        System.out.println(Thread.currentThread().getName() + " Started!");

        long start = System.currentTimeMillis(), end;

        Iterator<Object[]> resultIterator = results.iterator();
        while(resultIterator.hasNext()) Indexer.indexBase(resultIterator.next());

        end = System.currentTimeMillis();

        System.out.println(Thread.currentThread().getName() + " Done!");
        System.out.println("\tTotal execution time: " + (end - start) + "ms");
        System.out.println("\tAverage execution time: " + ((end - start) / resul
```

Solution

Use constants to store constant values

int numThreads = 16;  // can be changed to however many


Instead of that, try

private static final int NUMBER_OF_THREADS = 16;  // can be changed to however many


Now you can easily see that it is a constant value. Also, moving it outside the method makes it available to other methods if you want to use it. Or you could leave it inside the method as a local final int, without the private static modifiers, which are not allowed on local variables.
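A minimal sketch of the two placements, side by side. The class name ThreadConfigDemo and the two helper methods are made up for illustration; only NUMBER_OF_THREADS and the value 16 come from the code above.

```java
public class ThreadConfigDemo {
    // Class-level constant: visible to every method in this class.
    private static final int NUMBER_OF_THREADS = 16;

    static int classLevel() {
        return NUMBER_OF_THREADS;
    }

    static int methodLevel() {
        // Local constant: only 'final' is permitted here; 'private' and
        // 'static' cannot be applied to local variables.
        final int numberOfThreads = 16;
        return numberOfThreads;
    }

    public static void main(String[] args) {
        System.out.println(classLevel() + " " + methodLevel());
    }
}
```

The class-level form is the one to prefer if more than one method needs the value.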

Favor interfaces over implementations

ArrayList<Object[]> results = Database.query(query, null);


As a general rule in Java, when defining the type of a variable, you want to use the interface rather than the implementation. That way, if you wanted to change Database.query to return a LinkedList rather than an ArrayList, you could, provided Database.query is itself declared to return List.

List<Object[]> results = Database.query(query, null);


You also might want to consider storing something other than a generic Object array. But that's set in the Database.query method.
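To make both points concrete, here is a small sketch under stated assumptions: the query method below is a stand-in for Database.query (whose real signature is not shown in the question), and DomainRow with its id and domain fields is a hypothetical replacement for the Object array.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class InterfaceDemo {
    // Hypothetical typed row standing in for Object[]; the fields are assumptions.
    static final class DomainRow {
        final int id;
        final String domain;

        DomainRow(int id, String domain) {
            this.id = id;
            this.domain = domain;
        }
    }

    // Stand-in for Database.query. It is declared to return List, so the
    // concrete list class can change without touching any caller.
    static List<DomainRow> query(boolean linked) {
        List<DomainRow> rows = linked
                ? new LinkedList<DomainRow>()
                : new ArrayList<DomainRow>();
        rows.add(new DomainRow(1, "example.com"));
        rows.add(new DomainRow(2, "example.org"));
        return rows;
    }

    public static void main(String[] args) {
        // This caller compiles unchanged whichever implementation comes back.
        List<DomainRow> results = query(true);
        for (DomainRow row : results) {
            System.out.println(row.id + " " + row.domain);
        }
    }
}
```

With a typed row, callers read row.domain instead of casting row[1], so a column mix-up becomes a compile error rather than a runtime ClassCastException.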

Don't forget what you know

for(Object[] row : results){
    int addTo = 0;
    for(int i=1;i<listContainer.size();i++)
        if(listContainer.get(i).size() < listContainer.get(i - 1).size()) addTo = i;
    listContainer.get(addTo).add(row);
}


You don't have to calculate the correct place every time. You can simply take turns:

int addTo = 0;
for (Object[] row : results) {
    if (addTo >= listContainer.size()) {
        addTo = 0;
    }

    listContainer.get(addTo).add(row);
    addTo++;
}


At first glance, this may seem like more code, but notice that it eliminates an entire for loop. Also, the increased number of lines is accompanied by a decrease in the code density. I could get this down to one line fewer than the original by following the same dense style. However, one statement per line is generally easier to read and follow.

Note that you could also do this with an iterator.

Iterator<ArrayList<Object[]>> current = listContainer.iterator();
for (Object[] row : results) {
    if (!current.hasNext()) {
        current = listContainer.iterator();
    }

    current.next().add(row);
}


That's a little more straightforward about what it is doing.
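For anyone who wants to try the turn-taking approach in isolation, here is a self-contained sketch. The class name RoundRobinDemo, the distribute method, and the sample rows are assumptions for illustration; it also declares the containers as List, in line with the earlier advice, rather than the ArrayList used in the question.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class RoundRobinDemo {
    // Deals the rows into 'buckets' lists in turn, wrapping back to the
    // first list after the last one. Assumes buckets >= 1.
    static List<List<Object[]>> distribute(List<Object[]> results, int buckets) {
        List<List<Object[]>> listContainer = new ArrayList<List<Object[]>>();
        for (int i = 0; i < buckets; i++) {
            listContainer.add(new ArrayList<Object[]>());
        }

        Iterator<List<Object[]>> current = listContainer.iterator();
        for (Object[] row : results) {
            if (!current.hasNext()) {
                current = listContainer.iterator();
            }
            current.next().add(row);
        }
        return listContainer;
    }

    public static void main(String[] args) {
        List<Object[]> results = new ArrayList<Object[]>();
        for (int i = 0; i < 10; i++) {
            results.add(new Object[] { i, "domain" + i + ".example" });
        }
        // Ten rows over four lists: sizes 3, 3, 2, 2.
        for (List<Object[]> bucket : distribute(results, 4)) {
            System.out.println(bucket.size());
        }
    }
}
```

Each row costs one iterator step, so the work no longer grows with the number of lists the way the size-comparing loop does.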

Note: if you switch to having the threads load new URLs whenever they finish, this up-front distribution becomes unnecessary. I think the point is valid regardless, though; there are other circumstances where you will have to do things like this.
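One possible way to get that behaviour is a fixed thread pool from java.util.concurrent, where each idle worker takes the next queued row. This is only a sketch of the idea, not the asker's code: PoolDemo, processAll, and the pool size of 4 are assumptions, and the counter increment is a placeholder for the real call to Indexer.indexBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    private static final int NUMBER_OF_THREADS = 4; // assumed pool size

    // Queues one task per row and returns how many rows were processed.
    static int processAll(List<Object[]> results) {
        final AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(NUMBER_OF_THREADS);

        for (final Object[] row : results) {
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    // Placeholder for the real per-row work,
                    // e.g. Indexer.indexBase(row).
                    if (row.length > 0) {
                        processed.incrementAndGet();
                    }
                }
            });
        }

        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt for the caller
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<Object[]> results = new ArrayList<Object[]>();
        for (int i = 0; i < 25; i++) {
            results.add(new Object[] { i, "domain" + i + ".example" });
        }
        System.out.println("Processed " + processAll(results) + " rows");
    }
}
```

Because slow domains no longer hold up a pre-assigned list, a few unresponsive hosts cannot leave the other threads idle while one thread works through its backlog.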

Naming

ArrayList threadContainer = new ArrayList();


Previously you used listContainer to indicate a container of lists that hold something else. Your threadContainer is just some threads.

List threads = new ArrayList();


So just name it threads. That's at least as clear about what the variable holds. And shorter.

Context

StackExchange Code Review Q#92517, answer score: 5
