patternjavaMinor
BFS implementation to find connected components taking too long
Viewed 0 times
bfslongtoocomponentsconnectedfindimplementationtaking
Problem
This is in continuation to a question I asked here. Given the total number of nodes (employees) and the adjacency list (friendship amongst employees), I need to find all the connected components.
```
public class Main {
static HashMap> friendShips;
public static void main(String[] args) throws IOException {
BufferedReader in= new BufferedReader(new InputStreamReader(System.in));
String dataLine = in.readLine();
String[] lineParts = dataLine.split(" ");
int employeeCount = Integer.parseInt(lineParts[0]);
int friendShipCount = Integer.parseInt(lineParts[1]);
friendShips = new HashMap>();
for (int i = 0; i employees = new HashSet();
for (int i = 1; i > friendBuckets = bucketizeEmployees(employees);
System.out.println(friendBuckets.size());
}
public static void mapFriends(String friendA, String friendB, Map> friendsShipMap) {
if (friendsShipMap.containsKey(friendA)) {
friendsShipMap.get(friendA).add(friendB);
} else {
Set friends = new HashSet();
friends.add(friendB);
friendsShipMap.put(friendA, friends);
}
}
public static Vector> bucketizeEmployees(Set employees) {
Vector> friendBuckets = new Vector>();
while (!employees.isEmpty()) {
String employee = getHeadElement(employees);
Set connectedEmployeesBucket = getConnectedFriends(employee);
friendBuckets.add(connectedEmployeesBucket);
employees.removeAll(connectedEmployeesBucket);
}
return friendBuckets;
}
private static Set getConnectedFriends(String friend) {
Set connectedFriends = new HashSet();
connectedFriends.add(friend);
Set queuedFriends = new LinkedHashSet();
if (friendShips.get(friend) != null) {
queuedFriends.addAll(friendShips.get(friend));
}
while (!queuedFr
```
public class Main {
static HashMap> friendShips;
public static void main(String[] args) throws IOException {
BufferedReader in= new BufferedReader(new InputStreamReader(System.in));
String dataLine = in.readLine();
String[] lineParts = dataLine.split(" ");
int employeeCount = Integer.parseInt(lineParts[0]);
int friendShipCount = Integer.parseInt(lineParts[1]);
friendShips = new HashMap>();
for (int i = 0; i employees = new HashSet();
for (int i = 1; i > friendBuckets = bucketizeEmployees(employees);
System.out.println(friendBuckets.size());
}
public static void mapFriends(String friendA, String friendB, Map> friendsShipMap) {
if (friendsShipMap.containsKey(friendA)) {
friendsShipMap.get(friendA).add(friendB);
} else {
Set friends = new HashSet();
friends.add(friendB);
friendsShipMap.put(friendA, friends);
}
}
public static Vector> bucketizeEmployees(Set employees) {
Vector> friendBuckets = new Vector>();
while (!employees.isEmpty()) {
String employee = getHeadElement(employees);
Set connectedEmployeesBucket = getConnectedFriends(employee);
friendBuckets.add(connectedEmployeesBucket);
employees.removeAll(connectedEmployeesBucket);
}
return friendBuckets;
}
private static Set getConnectedFriends(String friend) {
Set connectedFriends = new HashSet();
connectedFriends.add(friend);
Set queuedFriends = new LinkedHashSet();
if (friendShips.get(friend) != null) {
queuedFriends.addAll(friendShips.get(friend));
}
while (!queuedFr
Solution
-
First of all, two things to read or search for: Cluster analysis (just a hint, I'm not an expert about that) and Linked: The New Science of Networks by Albert-László Barabási.
Barabási in his book shows that networks usually have some nodes which have a lot more connections than the others. The distribution is not the same in the real world as the sample shell script generates.
-
The code is quite good, I like your variable and method names and separated methods. I wonder why haven't anyone reviewed it yet.
-
I'd use a simple
-
In the
could be a Queue. It has a
-
-
Should I always use the private access modifier for class fields?; Item 13 of Effective Java 2nd Edition: Minimize the accessibility of classes and members.
-
Instead of
So, it could be removed.
-
This method calls
It's confusing: what is the difference between an employee and friend? Are they the same?
-
The following is the same:
A guard clause would be even better.
-
The
Note that insertion order is not affected if an element is re-inserted into the set.
-
The following pattern occurs more than once in the code:
It might be a microoptimization, but if you profile the code and the results shows it as a bottleneck the following structure is the same:
-
A few guard clause would help to make
First of all, two things to read or search for: Cluster analysis (just a hint, I'm not an expert about that) and Linked: The New Science of Networks by Albert-László Barabási.
Barabási in his book shows that networks usually have some nodes which have a lot more connections than the others. The distribution is not the same in the real world as the sample shell script generates.
-
The code is quite good, I like your variable and method names and separated methods. I wonder why haven't anyone reviewed it yet.
-
Vector> friendBuckets = bucketizeEmployees(employees);I'd use a simple
List or ArrayList here. Vector is considered obsolete.-
In the
getConnectedFriends method theSet queuedFriends = new LinkedHashSet();could be a Queue. It has a
poll method. As far as I tested it's faster than the currently used iterator-based remove.-
public class Main {Main isn't a good class name. Everyone can have a Main. What's it purpose? Try to find a more descriptive name.-
static HashMap> friendShips;HashMap reference types should be simply Map. See: Effective Java, 2nd edition, Item 52: Refer to objects by their interfacesShould I always use the private access modifier for class fields?; Item 13 of Effective Java 2nd Edition: Minimize the accessibility of classes and members.
-
static HashMap> friendShips;Instead of
Map> you could use Guava's Multimap (doc, javadoc) which was designed exactly for that. It would reduce the size of the mapFriends method:public static void mapFriends(final String friendA, final String friendB,
final Multimap friendsShipMap) {
friendsShipMap.put(friendA, friendB);
}So, it could be removed.
-
public static Vector> bucketizeEmployees(Set employees) {
...
}This method calls
getConnectedFriends(employee), which is the following:private static Set getConnectedFriends(String friend) {
...
}It's confusing: what is the difference between an employee and friend? Are they the same?
-
if (friendShips.get(friend) != null) {The following is the same:
if (friendShips.containsKey(friend)) {A guard clause would be even better.
if (!friendShips.containsKey(friend)) {
return connectedFriends;
}-
if (!connectedFriends.contains(directFriend) && !queuedFriends.contains(directFriend)) {
queuedFriends.add(directFriend);
}The
!queuedFriends.contains(directFriend) condition is unnecessary, it's a set which can't contain elements twice and adding an already added element to a LinkedHashSet doesn't modify anything. From the javadoc:Note that insertion order is not affected if an element is re-inserted into the set.
-
The following pattern occurs more than once in the code:
if (map.containsKey(key)) {
String value = map.get(key);
...
}
...It might be a microoptimization, but if you profile the code and the results shows it as a bottleneck the following structure is the same:
String value = map.get(key);
if (value != null) {
...
}
...-
A few guard clause would help to make
getConnectedFriends more flatten:while (!queuedFriends.isEmpty()) {
final String poppedFriend = getHeadElement(queuedFriends);
connectedFriends.add(poppedFriend);
if (!friendShips.containsKey(poppedFriend)) {
continue;
}
for (final String directFriend: friendShips.get(poppedFriend)) {
if (connectedFriends.contains(directFriend)) {
continue;
}
queuedFriends.add(directFriend);
}
}Code Snippets
Vector<Set<String>> friendBuckets = bucketizeEmployees(employees);Set<String> queuedFriends = new LinkedHashSet<String>();public class Main {static HashMap<String, Set<String>> friendShips;static HashMap<String, Set<String>> friendShips;Context
StackExchange Code Review Q#23700, answer score: 3
Revisions (0)
No revisions yet.