HiveBrain v1.2.0
Get Started
← Back to all entries
snippetjavaMinor

Read Twitter users from a text file

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
filereadtextuserstwitterfrom

Problem

I have this code for reading Twitter users from a text file represented by integers in the format "int (space) int", where the first int is the ID of
the user doing the following and the second int is the user being followed. It works for small files but takes a really long time with larger files. What can I do to make it more efficient?

users is an ArrayList

public void open() {
    try {
        if (file.exists()) {
            FileReader fileReader = new FileReader(file);
            BufferedReader bufferedReader = new BufferedReader(fileReader);
            String line;
            String[] strings;
            while ((line = bufferedReader.readLine()) != null) {
                strings = line.split(" ");
                String user1 = strings[0];
                String user2 = strings[1];

                if (!users.toString().contains(user1)) {
                    TwitterUser tUser1 = new TwitterUser(
                            Integer.parseInt(user1));
                    users.add(tUser1);
                    TwitterUser tUser2 = new TwitterUser(
                            Integer.parseInt(user2));
                    tUser1.follow(tUser2);
                } else {
                    TwitterUser target = null;
                    for (TwitterUser i : users)
                        if (i.toString().equals(user1))
                            target = i;
                    TwitterUser tUser3 = new TwitterUser(
                            Integer.parseInt(user2));
                    target.follow(tUser3);

                }

            }
            fileReader.close();
        }

    } catch (IOException e) {
        e.printStackTrace();
    }
}

Solution

users.toString().contains(user1)


This is not a very optimal way of checking if a user from the file is in your users object. For starters, ArrayList.toString() will generate the String representation every time, and you may run into false positives. For example, if the toString() representation of a TwitterUser contains a comma, which is the delimiter used by ArrayList:

String user1 = user1.toString(); // = "John, Doe"
String user2 = user2.toString(); // = "Jane, Doe"
// Just to make sure we are using the ArrayList implementation
List users = new ArrayList<>(Arrays.asList(user1, user2));
String allUsers = users.toString(); // = "[John, Doe, Jane, Doe]"
allUsers.contains("Doe, Jane"); // = true


Sure, you are actually comparing numeric values, but consider this then:

String user11 = user11.toString(); // = "11"
String user22 = user22.toString(); // = "22"
// Just to make sure we are using the ArrayList implementation
List users = new ArrayList<>(Arrays.asList(user11, user22));
String allUsers = users.toString(); // = "[11, 22]"
allUsers.contains("1"); // = true


The point here is that ideally, your TwitterUser class should be constructed first so that you can use one of its method to test for equivalence, using equals(Object) for example:

public class TwitterUser {

    // ...

    public boolean equals(Object o) {
        // assuming id is a primitive value
        return o instanceof TwitterUser && ((TwitterUser) o).id == id;
    }
}


Again, you should avoid using toString()-based comparison below for testing equivalence due to the 'false positives' point. Furthermore, you should not skimp on braces ({ ... }), and break early once you have a match:

for (TwitterUser i : users) {
    TwitterUser twitterUser = // ...
    if (i.equals(twitterUser)) {
        target = i;
        break;
    }
}


Hold your horses! If you have implemented TwitterUser.equals() to compare by the ID, you wouldn't even need the for-loop:

TwitterUser user = new TwitterUser(Integer.parseInt(user1));
if (!users.contains(user)) {
    users.add(user);
    user.follow(new TwitterUser(Integer.parseInt(user2)));
} else {
    TwitterUser currentUser = users.get(users.indexOf(user));
    currentUser.follow(new TwitterUser(Integer.parseInt(user2)));
}


To make the performance better, you can consider using a Set instead of a List.

To avoid the huge nested if-block after if (file.exists()), consider return-ing early:

public void open() {
    if (!file.exists()) {
        return;
    }
    try {
        // ...
    }
}


Ideally, the File object should be passed to this method so that the method callers know what exact file is being opened here.

edit: If you claim that the processing still tends to infinity, consider reworking your process:

  • Read all the lines from the file (even if it runs into the millions, it's just two numbers per line).



  • Use a number of background threads to process the lines concurrently. You'll need a Runnable/Callable implementation, an ExecutorService and a thread-safe Collection implementation. This is probably more suited for another question/answer though...

Code Snippets

users.toString().contains(user1)
String user1 = user1.toString(); // = "John, Doe"
String user2 = user2.toString(); // = "Jane, Doe"
// Just to make sure we are using the ArrayList implementation
List<String> users = new ArrayList<>(Arrays.asList(user1, user2));
String allUsers = users.toString(); // = "[John, Doe, Jane, Doe]"
allUsers.contains("Doe, Jane"); // = true
String user11 = user11.toString(); // = "11"
String user22 = user22.toString(); // = "22"
// Just to make sure we are using the ArrayList implementation
List<String> users = new ArrayList<>(Arrays.asList(user11, user22));
String allUsers = users.toString(); // = "[11, 22]"
allUsers.contains("1"); // = true
public class TwitterUser {

    // ...

    public boolean equals(Object o) {
        // assuming id is a primitive value
        return o instanceof TwitterUser && ((TwitterUser) o).id == id;
    }
}
for (TwitterUser i : users) {
    TwitterUser twitterUser = // ...
    if (i.equals(twitterUser)) {
        target = i;
        break;
    }
}

Context

StackExchange Code Review Q#107986, answer score: 4

Revisions (0)

No revisions yet.