patternMinor
Confusion with analysis of hashing with chaining
Viewed 0 times
withanalysischainingconfusionhashing
Problem
I was attending a class on analysis of hash tables implemented using chaining, and the professor said that:
In a hash table in which collisions are resolved by
chaining, an search (successful or unsuccessful) takes average-case time θ(1 + α),
under the assumption of simple uniform hashing.
and
The worst-case time for searching is θ(n) plus the time to
compute the hash function.
These are quoted from out textbook, ITA. Here α is the load factor, which is equal to n/m where n is the total number of elements to be inserted to the hash table and m is the size of the hash table (which is a constant for each implementation).
My confusion arises from the thinking that
θ(1 + α) = θ(α) = θ(n/m) = θ(n)
So I claimed that θ(1 + α) = θ(n), ie the average case and worst case time complexities are the same and the professor disagrees with me.
She asked me to find out by myself why I am wrong. I did some brainstorming and the only deduction I can make is that either I am wrong in assuming that m is a constant or asymptotic notations can't be used to compare between worst-case and average-case running time in this case. Please help me find what is wrong here. Also I don't understand why θ(1 + α) is not written simply as θ(α) and also why the professor insists on the value of α being less than, equal to, or greater than 1
In a hash table in which collisions are resolved by
chaining, an search (successful or unsuccessful) takes average-case time θ(1 + α),
under the assumption of simple uniform hashing.
and
The worst-case time for searching is θ(n) plus the time to
compute the hash function.
These are quoted from out textbook, ITA. Here α is the load factor, which is equal to n/m where n is the total number of elements to be inserted to the hash table and m is the size of the hash table (which is a constant for each implementation).
My confusion arises from the thinking that
θ(1 + α) = θ(α) = θ(n/m) = θ(n)
So I claimed that θ(1 + α) = θ(n), ie the average case and worst case time complexities are the same and the professor disagrees with me.
She asked me to find out by myself why I am wrong. I did some brainstorming and the only deduction I can make is that either I am wrong in assuming that m is a constant or asymptotic notations can't be used to compare between worst-case and average-case running time in this case. Please help me find what is wrong here. Also I don't understand why θ(1 + α) is not written simply as θ(α) and also why the professor insists on the value of α being less than, equal to, or greater than 1
Solution
The reason that we are using $\Theta(1+\alpha)$ rather than $\Theta(\alpha)$ is that $\alpha$ could be very small. Imagine for example that the hidden constant in both $\Theta$s is one, and that $\alpha=0.000001$. There is a large difference between $1.000001$ and $0.000001$.
Regarding your wrong deduction, note that the first paragraph mentions average case and the second worst case. It's true that the average case is at most the worst case, but not the other way around.
Regarding your wrong deduction, note that the first paragraph mentions average case and the second worst case. It's true that the average case is at most the worst case, but not the other way around.
Context
StackExchange Computer Science Q#53892, answer score: 4
Revisions (0)
No revisions yet.