HiveBrain v1.2.0

Ideal value of d in a d-ary heap for Dijkstra's algorithm

Submitted by: @import:stackexchange-cs

Problem

I stumbled upon the following statement:


By using a $d$-ary heap with $d = m/n$, the total times for these two types of operations may be balanced against each other, leading to a total time of $O(m \log_{m/n} n)$ for the algorithm, an improvement over the $O(m \log n)$ running time of binary heap versions of these algorithms whenever the number of edges is significantly larger than the number of vertices.

Applications of d-ary heaps

I don't understand why we choose a heap where each node has exactly $m/n$ children to speed up Dijkstra's algorithm. Remove-min takes $O(n \log_d n)$ time overall and decrease-key takes $O(m \log_d n)$, so the total runtime is $O(m \log_d n)$.

What I don't understand is this: say we have $m=3, n=1$. Then $m/n$ gives 3, but $O(m \log_3 n)$ is slower than $O(m \log_4 n)$, so why not choose 4 as the value of $d$ instead? What motivates taking $d = m/n$?

Thanks!

Solution

Please note that $m \leq n(n-1)/2$ for a simple graph, thus $m/n < n/2$. So the example $m=3, n=1$ cannot occur.

First, the article explicitly claims that this improvement is achieved only when $m$ is significantly larger than $n$:


By using a $d$-ary heap with $d = m/n$, ..., an improvement over the $O(m \log n)$ running time of binary heap versions of these algorithms whenever the number of edges is significantly larger than the number of vertices.

Secondly, the article claims that $d$ must balance the costs of the two operation types in order to improve the running time:


By using a $d$-ary heap with $d = m/n$, the total times for these two types of operations may be balanced against each other...

Generally, we need $n$ extract-min and $m$ decrease-key operations. With $d=2$, each of these operations takes $O(\log n)$ time. As $d$ increases, decrease-key becomes faster: it only bubbles a key up toward the root, comparing against one parent per level, so it costs $O(\log_d n)$. Extract-min, however, becomes slower: the min-heapify inside it must examine all $d$ children at every level to find the smallest, so it costs $O(d \log_d n)$. Choosing $d = m/n$ equates the two totals, $n \cdot d \log_d n = m \log_d n$, which is exactly the balance the article refers to.
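To make the asymmetry concrete, here is a minimal array-based $d$-ary min-heap sketch in Python (my own illustration, not from the article; a real Dijkstra implementation would also keep an index map so decrease-key can locate its node). The comments mark where the factor of $d$ enters each operation.

```python
class DAryHeap:
    """Minimal d-ary min-heap sketch (array-based, illustrative only)."""

    def __init__(self, d):
        assert d >= 2
        self.d = d
        self.a = []

    def push(self, key):
        self.a.append(key)
        self._sift_up(len(self.a) - 1)

    def pop_min(self):
        a = self.a
        a[0], a[-1] = a[-1], a[0]
        smallest = a.pop()
        if a:
            self._sift_down(0)
        return smallest

    def _sift_up(self, i):
        # Bubbling up compares against ONE parent per level:
        # O(log_d n) steps -- this is why decrease-key gets cheaper as d grows.
        a, d = self.a, self.d
        while i > 0:
            parent = (i - 1) // d
            if a[i] >= a[parent]:
                break
            a[i], a[parent] = a[parent], a[i]
            i = parent

    def _sift_down(self, i):
        # Sifting down must scan ALL d children per level to find the minimum:
        # O(d log_d n) steps -- this is why extract-min gets slower as d grows.
        a, d, n = self.a, self.d, len(self.a)
        while True:
            first = d * i + 1
            if first >= n:
                break
            child = min(range(first, min(first + d, n)), key=a.__getitem__)
            if a[child] >= a[i]:
                break
            a[i], a[child] = a[child], a[i]
            i = child
```

Decrease-key in this scheme is exactly a `_sift_up` from the changed position, so the one-parent-per-level versus $d$-children-per-level contrast above is the entire trade-off.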

In the extreme case $d=n$, we would have $m \log_n n = O(m) < O(m \log_{m/n} n)$ for the decrease-key operations, but the overall running time would not improve. You still need up to $n$ extract-min operations, and each one runs a min-heapify that now scans all $n$ children in the single level below the root, costing $O(n)$. The extract-min operations alone therefore cost $O(n^2)$, so the running time of the algorithm degrades to $O(n^2)$.
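The balance can also be seen numerically. The sketch below uses a rough operation-count model (constants and floors ignored; the graph size $n = 1000$, $m = 100{,}000$ is a hypothetical choice of mine) and compares a few values of $d$:

```python
import math

def total_cost(n, m, d):
    """Rough operation-count model for Dijkstra with a d-ary heap:
    n extract-mins at d * log_d(n) each, plus m decrease-keys at log_d(n) each."""
    levels = math.log(n, d)          # height of the heap, log_d(n)
    return n * d * levels + m * levels

n, m = 1000, 100_000                 # dense-ish graph: m/n = 100
costs = {d: total_cost(n, m, d) for d in (2, 10, 100, 1000)}
best = min(costs, key=costs.get)

# d = m/n = 100 balances the two terms; d = n = 1000 makes the
# extract-min term alone n * d = n^2 = 10^6 before the log factor.
print(best, {d: round(c) for d, c in costs.items()})
```

Under this model $d = 2$ is dominated by the decrease-key term, $d = n$ by the extract-min term, and $d = m/n$ sits at the balance point between them.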

Context

StackExchange Computer Science Q#55611, answer score: 4

Revisions (0)

No revisions yet.