gotchaModerate
Algorithm to find diameter of a tree using BFS/DFS. Why does it work?
Viewed 0 times
whybfsdfsalgorithmdiameterusingdoesfindworktree
Problem
This link provides an algorithm for finding the diameter of an undirected tree using BFS/DFS. Summarizing:
Run BFS on any node s in the graph, remembering the node u discovered last. Run BFS from u remembering the node v discovered last. d(u,v) is the diameter of the tree.
Why does it work ?
Page 2 of this provides a reasoning, but it is confusing. I am quoting the initial portion of the proof:
Run BFS on any node s in the graph, remembering the node u discovered last. Run BFS from u remembering the node v discovered last. d(u,v) is the diameter of the tree.
Correctness: Let a and b be any two nodes such that d(a,b) is the diameter of the tree. There is a unique path from a to b. Let t be the first node on that path discovered by BFS. If the paths $p_1$ from s to u and $p_2$ from a to b do not share edges, then the path from t to u includes s. So
$d(t,u) \ge d(s,u)$
$d(t,u) \ge d(s,a)$
....(more inequalities follow ..)
The inequalities do not make sense to me.
Run BFS on any node s in the graph, remembering the node u discovered last. Run BFS from u remembering the node v discovered last. d(u,v) is the diameter of the tree.
Why does it work ?
Page 2 of this provides a reasoning, but it is confusing. I am quoting the initial portion of the proof:
Run BFS on any node s in the graph, remembering the node u discovered last. Run BFS from u remembering the node v discovered last. d(u,v) is the diameter of the tree.
Correctness: Let a and b be any two nodes such that d(a,b) is the diameter of the tree. There is a unique path from a to b. Let t be the first node on that path discovered by BFS. If the paths $p_1$ from s to u and $p_2$ from a to b do not share edges, then the path from t to u includes s. So
$d(t,u) \ge d(s,u)$
$d(t,u) \ge d(s,a)$
....(more inequalities follow ..)
The inequalities do not make sense to me.
Solution
All parts of proving the claim hinge on 2 crucial properties of trees with undirected edges:
Choose an arbitrary tree node $s$. Assume $u, v \in V(G)$ are nodes with $d(u,v) = diam(G)$. Assume further that the algorithm finds a node $x$ starting at $s$ first, some node $y$ starting at $x$ next. wlog $d(s,u) \geq d(s,v)$. note that $d(s,x) \geq d(s,y)$ must hold, unless the algorithm's first stage wouldn't end up at $x$. We will see that $d(x,y) = d(u,v)$.
The most general configuration of all nodes involved can be seen in the following pseudo-graphics ( possibly $s = z_{uv}$ or $s = z_{xy}$ or both ):
we know that:
1) and 2) imply $\, \\ d(u,v) = d(z_{uv},v) + d(z_{uv},u) \\ \qquad\geq d(z_{uv},x) + d(z_{uv},y) = d(x,y) + 2\, d(z_{uv}, z_{xy}) \\ \qquad\qquad\geq d(x,y)$.
3) and 4) imply $\, \\ d(z_{xy},y) + d(s,z_{xy}) + d(z_{xy},x) \\ \qquad\geq d(s,z_{uv}) + d(z_{uv},u) + d(v,z_{uv}) + d(z_{uv},z_{xy}) \qquad\qquad\qquad\qquad \\ \, $ equivalent to $\, \\ d(x,y) = d(z_{xy},y) + d(z_{xy},x) \\ \qquad\geq 2*\,d(s,z_{uv}) + d(v,z_{uv}) + d(u,z_{uv}) \\ \qquad\qquad\geq d(u,v)$.
therefore $d(u,v) = d(x,y)$.
analogue proofs hold for the alternative configurations
and
these are all possible configurations. in particular, $x \not\in path(s,u), x \not\in path(s,v)$ due to the result of stage 1 of the algorithm and $y \not\in path(x,u), y \not\in path(x,v)$ due to stage 2.
- 1-connectedness (ie. between any 2 nodes in a tree there is exactly one path)
- any node can serve as the root of the tree.
Choose an arbitrary tree node $s$. Assume $u, v \in V(G)$ are nodes with $d(u,v) = diam(G)$. Assume further that the algorithm finds a node $x$ starting at $s$ first, some node $y$ starting at $x$ next. wlog $d(s,u) \geq d(s,v)$. note that $d(s,x) \geq d(s,y)$ must hold, unless the algorithm's first stage wouldn't end up at $x$. We will see that $d(x,y) = d(u,v)$.
The most general configuration of all nodes involved can be seen in the following pseudo-graphics ( possibly $s = z_{uv}$ or $s = z_{xy}$ or both ):
(u) (x)
\ /
\ /
\ /
( z_uv )---------( s )----------( z_xy )
/ \
/ \
/ \
(v) (y)we know that:
- $d(z_{uv},y) \leq d(z_{uv},v)$. otherwise $d(u,v)
- $d(z_{uv},x) \leq d(z_{uv},u)$. otherwise $d(u,v)
- $d(s,z_{xy}) + d(z_{xy},x) \geq d(s,z_{uv}) + d(z_{uv},u)$, otherwise stage 1 of the algorithm wouldn't have stopped at $x$.
- $d(z_{xy},y) \geq d(v,z_{uv}) + d(z_{uv},z_{xy})$, otherwise stage 2 of the algorithm wouldn't have stopped at $y$.
1) and 2) imply $\, \\ d(u,v) = d(z_{uv},v) + d(z_{uv},u) \\ \qquad\geq d(z_{uv},x) + d(z_{uv},y) = d(x,y) + 2\, d(z_{uv}, z_{xy}) \\ \qquad\qquad\geq d(x,y)$.
3) and 4) imply $\, \\ d(z_{xy},y) + d(s,z_{xy}) + d(z_{xy},x) \\ \qquad\geq d(s,z_{uv}) + d(z_{uv},u) + d(v,z_{uv}) + d(z_{uv},z_{xy}) \qquad\qquad\qquad\qquad \\ \, $ equivalent to $\, \\ d(x,y) = d(z_{xy},y) + d(z_{xy},x) \\ \qquad\geq 2*\,d(s,z_{uv}) + d(v,z_{uv}) + d(u,z_{uv}) \\ \qquad\qquad\geq d(u,v)$.
therefore $d(u,v) = d(x,y)$.
analogue proofs hold for the alternative configurations
(u) (x)
\ /
\ /
\ /
( s )---------( z_uv )----------( z_xy )
/ \
/ \
/ \
(v) (y)and
(x) (u)
/ \
/ \
/ \
( s )---------( z_xy )----------( z_uv )
\ /
\ /
\ /
(y) (v)these are all possible configurations. in particular, $x \not\in path(s,u), x \not\in path(s,v)$ due to the result of stage 1 of the algorithm and $y \not\in path(x,u), y \not\in path(x,v)$ due to stage 2.
Code Snippets
(u) (x)
\ /
\ /
\ /
( z_uv )---------( s )----------( z_xy )
/ \
/ \
/ \
(v) (y)(u) (x)
\ /
\ /
\ /
( s )---------( z_uv )----------( z_xy )
/ \
/ \
/ \
(v) (y)(x) (u)
/ \
/ \
/ \
( s )---------( z_xy )----------( z_uv )
\ /
\ /
\ /
(y) (v)Context
StackExchange Computer Science Q#22855, answer score: 14
Revisions (0)
No revisions yet.