snippetjavaMinor
Read Twitter users from a text file
Viewed 0 times
filereadtextuserstwitterfrom
Problem
I have this code for reading Twitter users from a text file represented by integers in the format "int (space) int", where the first
the user doing the following and the second
int is the ID ofthe user doing the following and the second
int is the user being followed. It works for small files but takes a really long time with larger files. What can I do to make it more efficient?users is an ArrayListpublic void open() {
try {
if (file.exists()) {
FileReader fileReader = new FileReader(file);
BufferedReader bufferedReader = new BufferedReader(fileReader);
String line;
String[] strings;
while ((line = bufferedReader.readLine()) != null) {
strings = line.split(" ");
String user1 = strings[0];
String user2 = strings[1];
if (!users.toString().contains(user1)) {
TwitterUser tUser1 = new TwitterUser(
Integer.parseInt(user1));
users.add(tUser1);
TwitterUser tUser2 = new TwitterUser(
Integer.parseInt(user2));
tUser1.follow(tUser2);
} else {
TwitterUser target = null;
for (TwitterUser i : users)
if (i.toString().equals(user1))
target = i;
TwitterUser tUser3 = new TwitterUser(
Integer.parseInt(user2));
target.follow(tUser3);
}
}
fileReader.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}Solution
users.toString().contains(user1)This is not a very optimal way of checking if a user from the file is in your
users object. For starters, ArrayList.toString() will generate the String representation every time, and you may run into false positives. For example, if the toString() representation of a TwitterUser contains a comma, which is the delimiter used by ArrayList:String user1 = user1.toString(); // = "John, Doe"
String user2 = user2.toString(); // = "Jane, Doe"
// Just to make sure we are using the ArrayList implementation
List users = new ArrayList<>(Arrays.asList(user1, user2));
String allUsers = users.toString(); // = "[John, Doe, Jane, Doe]"
allUsers.contains("Doe, Jane"); // = trueSure, you are actually comparing numeric values, but consider this then:
String user11 = user11.toString(); // = "11"
String user22 = user22.toString(); // = "22"
// Just to make sure we are using the ArrayList implementation
List users = new ArrayList<>(Arrays.asList(user11, user22));
String allUsers = users.toString(); // = "[11, 22]"
allUsers.contains("1"); // = trueThe point here is that ideally, your
TwitterUser class should be constructed first so that you can use one of its method to test for equivalence, using equals(Object) for example:public class TwitterUser {
// ...
public boolean equals(Object o) {
// assuming id is a primitive value
return o instanceof TwitterUser && ((TwitterUser) o).id == id;
}
}Again, you should avoid using
toString()-based comparison below for testing equivalence due to the 'false positives' point. Furthermore, you should not skimp on braces ({ ... }), and break early once you have a match:for (TwitterUser i : users) {
TwitterUser twitterUser = // ...
if (i.equals(twitterUser)) {
target = i;
break;
}
}Hold your horses! If you have implemented
TwitterUser.equals() to compare by the ID, you wouldn't even need the for-loop:TwitterUser user = new TwitterUser(Integer.parseInt(user1));
if (!users.contains(user)) {
users.add(user);
user.follow(new TwitterUser(Integer.parseInt(user2)));
} else {
TwitterUser currentUser = users.get(users.indexOf(user));
currentUser.follow(new TwitterUser(Integer.parseInt(user2)));
}To make the performance better, you can consider using a
Set instead of a List.To avoid the huge nested
if-block after if (file.exists()), consider return-ing early:public void open() {
if (!file.exists()) {
return;
}
try {
// ...
}
}Ideally, the
File object should be passed to this method so that the method callers know what exact file is being opened here.edit: If you claim that the processing still tends to infinity, consider reworking your process:
- Read all the lines from the file (even if it runs into the millions, it's just two numbers per line).
- Use a number of background threads to process the lines concurrently. You'll need a
Runnable/Callableimplementation, anExecutorServiceand a thread-safeCollectionimplementation. This is probably more suited for another question/answer though...
Code Snippets
users.toString().contains(user1)String user1 = user1.toString(); // = "John, Doe"
String user2 = user2.toString(); // = "Jane, Doe"
// Just to make sure we are using the ArrayList implementation
List<String> users = new ArrayList<>(Arrays.asList(user1, user2));
String allUsers = users.toString(); // = "[John, Doe, Jane, Doe]"
allUsers.contains("Doe, Jane"); // = trueString user11 = user11.toString(); // = "11"
String user22 = user22.toString(); // = "22"
// Just to make sure we are using the ArrayList implementation
List<String> users = new ArrayList<>(Arrays.asList(user11, user22));
String allUsers = users.toString(); // = "[11, 22]"
allUsers.contains("1"); // = truepublic class TwitterUser {
// ...
public boolean equals(Object o) {
// assuming id is a primitive value
return o instanceof TwitterUser && ((TwitterUser) o).id == id;
}
}for (TwitterUser i : users) {
TwitterUser twitterUser = // ...
if (i.equals(twitterUser)) {
target = i;
break;
}
}Context
StackExchange Code Review Q#107986, answer score: 4
Revisions (0)
No revisions yet.