patternMinor
Why can't hash tables provide O(n) sorting?
Viewed 0 times
sortingwhycantablesprovidehash
Problem
Since a sufficiently large hash table takes constant time to both insert and retrieve data, should it not be possible to sort an array by simply inserting each element into the hash table, and then retrieving them in order?
You just insert each number into the hash table, and remember the lowest and highest number inserted. Then for each number in that range, in order, test if it is present in the hash table.
If the array being sorted contains no gaps between values (i.e. it can be [1,3,2] but NOT [1,3,4]), this should give you O(N) time complexity.
Is this correct? I don't think I've ever heard of hash tables being used this way - am I missing something? Or are the restrictions (numeric array with no gaps) too much for it to be practically useful?
You just insert each number into the hash table, and remember the lowest and highest number inserted. Then for each number in that range, in order, test if it is present in the hash table.
If the array being sorted contains no gaps between values (i.e. it can be [1,3,2] but NOT [1,3,4]), this should give you O(N) time complexity.
Is this correct? I don't think I've ever heard of hash tables being used this way - am I missing something? Or are the restrictions (numeric array with no gaps) too much for it to be practically useful?
Solution
The reason you've never heard of hash tables being used like this is that hash tables are either "too much" or "not enough" in this situation.
-
If the range of elements being sorted is small, then you can use counting sort, or something similar. But for counting sort, you would almost certainly want to use a simple array rather than a hash table. If you know the max and min values of the numbers, then the array can be of size max-min+1, and value x would be associated with index x-min. By using an array, you avoid the extra complications of hash tables. Those extra complications are buying you nothing in this application, so hash tables are "too much".
Notice that your "no gaps" restriction ensures that the range of numbers is small (no larger than the number of elements in your original input).
-
However, if gaps can be present then the range of elements is potentially MUCH larger than the number of elements in your original input. Then your running time is dominated by the size of that range, NOT by the size of your original input. Hash tables do not help you deal with those gaps efficiently, so in this case hash tables are "not enough".
-
If the range of elements being sorted is small, then you can use counting sort, or something similar. But for counting sort, you would almost certainly want to use a simple array rather than a hash table. If you know the max and min values of the numbers, then the array can be of size max-min+1, and value x would be associated with index x-min. By using an array, you avoid the extra complications of hash tables. Those extra complications are buying you nothing in this application, so hash tables are "too much".
Notice that your "no gaps" restriction ensures that the range of numbers is small (no larger than the number of elements in your original input).
-
However, if gaps can be present then the range of elements is potentially MUCH larger than the number of elements in your original input. Then your running time is dominated by the size of that range, NOT by the size of your original input. Hash tables do not help you deal with those gaps efficiently, so in this case hash tables are "not enough".
Context
StackExchange Computer Science Q#43709, answer score: 7
Revisions (0)
No revisions yet.