patternjavaMinor
Java: Given a list of edges, build a tree and return the root
Viewed 0 times
theedgesreturngivenjavarootandlistbuildtree
Problem
In an interview I was asked to write Java code to build a tree and return the root, given a list of edges. It was a fairly open ended question where the interviewer left all the decisions up to me.
I used to following logic: edges of trees are directional, so I made a Pair class which has a start node and an end node. I go over the list of Pairs and I made a HashMap from it in a way that the each key in the map is kept as a node and then the value in the map is the set of its children. Then for each key I iterate over the sets of children and see if that node is a child of any other node. If it is then it is not the node we are looking for, if a node isn't in any children set then it is the root.
There are of course edge cases where there can be a cycle within the tree, so let's say some node connects to the root. Or another case is that the root might be fine, but there are cycles within the children nodes. Or there is another case where there can be multiple distinct trees in the list of edges.
I tried to code this up in the following manner and I have tried to add comments wherever applicable so that the code can be understood more:
```
import java.util.*;
public class TreeBuilder {
public static Node returnRoot(List input){
if(input == null || input.size() == 0){
throw new IllegalArgumentException("Input is null or doesn't contain any elements");
}
// Create a list of roots, in case there are multiple roots present
List roots = new ArrayList<>();
// This boolean checks if there are edges pointing to a particular node
boolean isPresent = false;
HashMap> adjList = buildList(input);
// iterate over each key of the hash map (which is the tree) and check if there are any edges pointing to it
for(Node keyItr: adjList.keySet()) {
for (HashSet setItr : adjList.values()) {
if (setItr.contains(keyItr)) {
I used to following logic: edges of trees are directional, so I made a Pair class which has a start node and an end node. I go over the list of Pairs and I made a HashMap from it in a way that the each key in the map is kept as a node and then the value in the map is the set of its children. Then for each key I iterate over the sets of children and see if that node is a child of any other node. If it is then it is not the node we are looking for, if a node isn't in any children set then it is the root.
There are of course edge cases where there can be a cycle within the tree, so let's say some node connects to the root. Or another case is that the root might be fine, but there are cycles within the children nodes. Or there is another case where there can be multiple distinct trees in the list of edges.
I tried to code this up in the following manner and I have tried to add comments wherever applicable so that the code can be understood more:
```
import java.util.*;
public class TreeBuilder {
public static Node returnRoot(List input){
if(input == null || input.size() == 0){
throw new IllegalArgumentException("Input is null or doesn't contain any elements");
}
// Create a list of roots, in case there are multiple roots present
List roots = new ArrayList<>();
// This boolean checks if there are edges pointing to a particular node
boolean isPresent = false;
HashMap> adjList = buildList(input);
// iterate over each key of the hash map (which is the tree) and check if there are any edges pointing to it
for(Node keyItr: adjList.keySet()) {
for (HashSet setItr : adjList.values()) {
if (setItr.contains(keyItr)) {
Solution
Major point number 1: if the question is:
build a tree and return the root
Do we expect a single data node as a result? Or de we expect a root node of a tree, from which we can traverse the entire tree if we want?
I would expect an actual tree node as the result.
The major questions I would then ask the interviewers is: Can I assume that the edges form a binairy tree? Or should we allow an arbitrary amount of child nodes? Should it be an ordered tree? Can I assume correct input (and document it ofcourse) or should I check if the input will form a valid tree?
They might answer that you can choose for yourself. In which case I would just choose the easiest one to implement and explain what would might have to change if the requirements were different.
So let's say ordering doesn't matter, and that a tree can have multiple children. Then the Node class could look something like this:
That way we can just go over all your
The actual treebuilding implementation would then look something like this:
I'd also have answers ready to the alternative requirements I mentioned earlier. For example: if we should not allow multiple root nodes as input we just need to add the following check:
right before returning the root node.
Another example: if the ordering is important. I would define Comparator for
Now that we got the interview question handled let's look at your code style.
Interface > Concrete Implementation
You did it correctly on this line:
Where you use the more general
should be
The only point where you want to use the concrete implementation (for example
comments should say why, not what
Comments that say what you do are redundant. If they are not, then you should probably rename some variables/methods/... so that they become redundant. And once redundant they should be removed.
For example:
What value does this comment really add to your code? Isn't it clear enough what you do here?
Another examle:
Now I got 2 problems with this comment. First is that I still don't really understood what the variable means. And second that it shows that the name is badly chosen. What is present if the variable is true? Or what is absent if it isn't?
What happens if we rename it to
encapsulation + immutability
Immutability is nice. It often makes things easier to reason about and prevents silly mistakes. Some purists would even go all the way and make everything immutable, I usually stick with: "start immutable, and make it mutable if it's logical to change the state".
A
build a tree and return the root
Do we expect a single data node as a result? Or de we expect a root node of a tree, from which we can traverse the entire tree if we want?
I would expect an actual tree node as the result.
The major questions I would then ask the interviewers is: Can I assume that the edges form a binairy tree? Or should we allow an arbitrary amount of child nodes? Should it be an ordered tree? Can I assume correct input (and document it ofcourse) or should I check if the input will form a valid tree?
They might answer that you can choose for yourself. In which case I would just choose the easiest one to implement and explain what would might have to change if the requirements were different.
So let's say ordering doesn't matter, and that a tree can have multiple children. Then the Node class could look something like this:
public class Node {
private final String data;
private List children = new ArrayList<>();
public Node(String data){
this.data = data;
}
public void addChild(Node node){
children.add(node);
}
public List getChildren(){
return children;
}
public String getData(){
return data;
}
}That way we can just go over all your
Pairs of Nodes and add the second one to the first to actually build our tree. The only thing left after that is to figure out which is the root node.The actual treebuilding implementation would then look something like this:
/**
* This method will attempt to build a tree from the given edges.
* The result is the root node of the tree. (i.e. the Node that is not a child node of any other Node).
* If there are multiple root Nodes it is undefined which one will be returned.
* In case there are no root nodes an IllegalArgumentException is thrown.
*/
public static Node buildTree(List edges) {
Set rootNodes = new HashSet<>();
Set childNodes = new HashSet<>();
for(Pair pair: edges){
pair.start.addChild(pair.end);
rootNodes.remove(pair.end);
childNodes.add(pair.end);
if (!childNodes.contains(pair.start)) {
rootNodes.add(pair.start);
}
}
if (rootNodes.isEmpty()) {
throw new IllegalArgumentException("Input pairs contain a cycle with the root");
}
return rootNodes.iterator().next();
}I'd also have answers ready to the alternative requirements I mentioned earlier. For example: if we should not allow multiple root nodes as input we just need to add the following check:
if(rootNodes.size()>1) {
throw new IllegalArgumentException("Input pairs contains multiple ("+ roots.size() + ") roots");
}right before returning the root node.
Another example: if the ordering is important. I would define Comparator for
Node and use Collections.sort() right after adding a node (if it wasn't the first node added).Now that we got the interview question handled let's look at your code style.
Interface > Concrete Implementation
You did it correctly on this line:
List roots = new ArrayList<>();Where you use the more general
List as type for roots. But the same applies for Set over HashSet or Map over HashMap. For exampleHashMap> adjList = buildList(input);should be
Map> adjList = buildList(input);The only point where you want to use the concrete implementation (for example
ArrayList is when you actually instantiate the list (like you did correctly in your roots variable).comments should say why, not what
Comments that say what you do are redundant. If they are not, then you should probably rename some variables/methods/... so that they become redundant. And once redundant they should be removed.
For example:
// check if there are cycles within the children of this tree and return the root
checkChildCycles(adjList);
return roots.get(0);What value does this comment really add to your code? Isn't it clear enough what you do here?
Another examle:
// This boolean checks if there are edges pointing to a particular node
boolean isPresent = false;Now I got 2 problems with this comment. First is that I still don't really understood what the variable means. And second that it shows that the name is badly chosen. What is present if the variable is true? Or what is absent if it isn't?
What happens if we rename it to
hasChildNode? Or even better, invert the logic and call it isRootNode. That way, once we find a parent node of the node we're checking we say: isRootNode = false. And The following check also reads naturally:if (isRootNode) {
roots.add(keyItr);
}encapsulation + immutability
Immutability is nice. It often makes things easier to reason about and prevents silly mistakes. Some purists would even go all the way and make everything immutable, I usually stick with: "start immutable, and make it mutable if it's logical to change the state".
A
Code Snippets
public class Node {
private final String data;
private List<Node> children = new ArrayList<>();
public Node(String data){
this.data = data;
}
public void addChild(Node node){
children.add(node);
}
public List<Node> getChildren(){
return children;
}
public String getData(){
return data;
}
}/**
* This method will attempt to build a tree from the given edges.
* The result is the root node of the tree. (i.e. the Node that is not a child node of any other Node).
* If there are multiple root Nodes it is undefined which one will be returned.
* In case there are no root nodes an IllegalArgumentException is thrown.
*/
public static Node buildTree(List<Pair> edges) {
Set<Node> rootNodes = new HashSet<>();
Set<Node> childNodes = new HashSet<>();
for(Pair pair: edges){
pair.start.addChild(pair.end);
rootNodes.remove(pair.end);
childNodes.add(pair.end);
if (!childNodes.contains(pair.start)) {
rootNodes.add(pair.start);
}
}
if (rootNodes.isEmpty()) {
throw new IllegalArgumentException("Input pairs contain a cycle with the root");
}
return rootNodes.iterator().next();
}if(rootNodes.size()>1) {
throw new IllegalArgumentException("Input pairs contains multiple ("+ roots.size() + ") roots");
}List<Node> roots = new ArrayList<>();HashMap<Node, HashSet<Node>> adjList = buildList(input);Context
StackExchange Code Review Q#162292, answer score: 4
Revisions (0)
No revisions yet.