patternMinor

Linear sorting algorithm

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

sorting time-complexity cs stackoverflow

sortinglinearalgorithm

Problem

I'm trying to understand sorting algorithms. And I'm thinking about a totally different way to tackle sorting: Use array indexes to sort a given set of integers.

Suppose We want to sort a list of positive integers (for now). Let's call it toSortArray

Create a boolean array that has a length equal to maximum integer in toSortArray + 1. Lets call it boolArray. All items in the array are false initially.

Loop through toSortArray and for each integer item, set the boolean at index item true: boolArraty[toSortArray[i]] = true;

Loop through boolArray and if boolArray[i] is true, set the first element of toSortArray to i

Here is the pseudo-code if the algorithm is not clear:

var boolArray = new bool[toSortArray.Max() + 1];

    for (int i = 0; i < toSortArray.Length; i++)
    {
        boolArray[toSortArray[i]] = true;
    }

    var index = 0;
    for (int i = 0; i < boolArray.Length; i++)
    {
        if (boolArray[i])
        {
            toSortArray[index] = i;
            index++;
        }
    }

First, Does every body agree that time complexity is O(N)?

Second, Am I missing something about this algorithm (As I don't think I'm the first one thinking about it)?

Solution

This is algorithm called bucket sort, integer sort, counting sort, or Pigeonhole sort (naming is a bit inconsistent across sources and/or different names refer to slight variations of the same idea).

Its time complexity is $O(m)$ (assuming distinct values) where $m$ is the largest among the $n$ integers you are trying to sort. Your algorithm takes $O(n)$ time as long as $m=O(n)$, for example if you know you that you want to sort a large-enough subset of the integers from $1$ to $n$.

If the integers are not distinct, you can replace booleans with counters to obtain an algorithm with complexity $O(m + n)$. In general, you can sort arbitrary items with integer keys by keeping an array of lists, where the $i$-th list will contain all items with key $i$. The output is just concatenation of these lists (and hence this sorting algorithm is stable as long as you append new elements at the end of the lists).

If you apply the above (stable) algorithm multiple times on your input, where the $j$-th execution sorts the integers w.r.t. the $j$-th least significant digit you obtain Radix Sort.

For example, in base $10$, you'd need $\lceil \log_{10} m \rceil$ iterations, thus obtaining an algorithm with time complexity $O( n \log m )$, which is in $O(n \log n)$ even when $m$ is exponentially larger in $n$ (on a unit-cost RAM machine). For a general base $b$, the time needed would be $O((b+n) \log_b m) = O((b+n) \frac{\log m}{\log b})$, which is optimized (in an asymptotic sense) by choosing $b = \Theta(n)$, resulting in a time complexity of $O(n \frac{\log m}{\log n})$.

Context

StackExchange Computer Science Q#116451, answer score: 3

Revisions (0)

No revisions yet.