patternMinor

Why say that breadth-first search runs in time $O(|V|+|E|)$?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

search-algorithms algorithm-analysis cs stackoverflow

whysearchbreadthsaytimefirstthatruns

Problem

It's often stated (e.g., in Wikipedia) that the running time of breadth-first search (BFS) on a graph $G=(V,E)$ is $O(|V|+|E|)$. However, any connected graph has $|V|\leq |E|+1$ and, even in a non-connected graph, BFS will never look at a vertex outside the component that contains the start vertex. That component contains at most $|E|$ edges, so it contains at most $|E|+1$ vertices, and those are the only ones that the algorithm will visit.

This means that $|V|+|E|\leq 2|E|+1$, so why don't we say that the running time is just $O(|E|)$?

This came up in comments on a question about the running time of Disjkstra's algorithm.

Solution

BFS is usually described something like the following (from Wikipedia).

1  procedure BFS(G,start_v):
 2      let Q be a queue
 3      label start_v as discovered
 4      Q.enqueue(start_v)
 5      while Q is not empty
 6          v = Q.dequeue()
 7          if v is the goal:
 8              return v
 9          for all edges from v to w in G.adjacentEdges(v) do
10             if w is not labeled as discovered:
11                 label w as discovered
12                 w.parent = v
13                 Q.enqueue(w)

The issue is a somewhat subtle one: it's hiding in line 3! The question is, what data structure are we going to use to store which vertices have been discovered?

The simplest solution is to use a Boolean array with one entry per vertex. In this case, we must initialize every element of the array to false and this takes time $\Theta(|V|)$. This applies for every graph, even if there are no edges at all, so we can't assume any relationship between $|V|$ and $|E|$ and we get a running time of $O(|V|+|E|)$.

Can we avoid having a data structure with $\Theta(|V|)$ initialization time? Our first attempt might be to use a linked list. However, now testing if a vertex has been discovered (line 10) takes time linear in the number of vertices visited, instead of constant time as before. This means that the running time becomes $O(|V|\,|E|)$, which is much worse in the worst case. (Note that we don't want to rewrite that as $O(|E|^2)$ since that's even worse: it could be as bad as $|V|^4$, whereas $|V|\,|E|\leq |V|^3$.)

Using a dynamically resized array would allow us to keep the list sorted, so now look-ups would only take time $O(\log|V|)$ but that still gives a running time of only $O(|E|\log|V|)$, which is still worse than standard.

Finally, we could use a dynamically sized hash table: start with a table of constant size $c$ and double it every time it gets half-full. This means that the final size of the table is at most twice the number of vertices that are discovered before the algorithm terminates, and this is at most $|E|+1$ because we never discover anything outside the component of the start vertex. Furthermore, the total amount of work done copying the hash table to expand it is at most $c + 2c + 4c + \dots + 2|E|\leq 4|E|$. Lookups and insertions to the hash table are amortized $O(1)$ so we do indeed obtain a running time of $O(|E|)$.

So $O(|E|)$ is possible, but would want to do that in a real implementation? I would say probably not. Unless you have reason to believe that your input graphs will have many small components, the overhead of maintaining the hash table is going to add a noticeable constant factor to the running time. Growing the hash table could take time $4|E|$ and lookups will require you to compute the hash function and, on average, look at more than one slot in the table. The poor cache performance of hash tables might also hurt you on a real computer. In most cases with the standard array implementation, the $O(|E|)$ part is the dominant term of the $O(|V|+|E|)$ running time so it's not worth using a hash table to remove the dominated term, given the practical cost of doing this.

Code Snippets

1  procedure BFS(G,start_v):
 2      let Q be a queue
 3      label start_v as discovered
 4      Q.enqueue(start_v)
 5      while Q is not empty
 6          v = Q.dequeue()
 7          if v is the goal:
 8              return v
 9          for all edges from v to w in G.adjacentEdges(v) do
10             if w is not labeled as discovered:
11                 label w as discovered
12                 w.parent = v
13                 Q.enqueue(w)

Context

StackExchange Computer Science Q#112546, answer score: 9

Revisions (0)

No revisions yet.