snippetcppMinor

Radix sort (LSD) and counting sort

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

countinglsdradixandsort

Problem

I'm reading CLRS, and to practice, I rolled my version of radix sort and counting sort. I was looking at some reference implementations, particularly the LSD one from Rosetta Code, and mine performs significantly better (~10 times), especially on 64 bit inputs and if the maximum input range is known.

I think one place that could be improved is the creation of res inside counting sort. It'd be faster if I could somehow use reserve as right now I default initialize everything and then assign over them. However, I can't grow it using push_back since in the following loop, res is grown randomly, rather than linearly as required by push_back. I don't know of any data structure that can have a runtime set size with unintialized values that also gets optimized nicely in the copy line after the loop.

I experimented with variable length arrays such as T res[n] and it works up to the stack limit. However, limiting input size to the size of my stack is not acceptable. Anyway, this change did not have a noticeable impact on performance.

Suggestions for improvements would be appreciated.

Repository of algorithms

Counting sort

// counting sort, assumes each input is integral between 0 to k
// O(n) if k = O(n)
template 
void cnt_sort(Iter begin, Iter end, size_t k, Op op) {
    vector counts(k);   // init to 0
    for (auto i = begin; i != end; ++i) // count # elems == i
        ++counts[op(*i)]; 
    for (size_t i = 1; i  res(distance(begin, end)); // doing useless initialization here
    for (auto j = end;;) {
        --j;
        res[--counts[op(*j)]] = *j;
        if (j == begin) break;
    }
    copy(res.begin(), res.end(), begin); // compiler optimizes this out
}

Radix sort

```
// radix sort, more practical than counting sort
// O(d(n + k)) running time where d is # digits, k is size of digit
class Digit_cmp { // functor for comparing a "digit" (particular bits)
const long long mask; // 0..63 bitfield to test against
const size_t to_sh

Solution

I found a list a few things that you could improve, but nothing that would significantly change the way your sorts work:

-
At the end of the counting sort, you could use std::move instead of std::copy to move the elements to the original collection. For simple integers, it shouldn't change anything but it could make a significant difference if you try to sort big integers.

-
There is a more idiomatic way to compute sizeof(typename Iter::value_type)*CHAR_BIT in the standard library thanks to std::numeric_limits::digits, so you can actually rewrite rdx_sort as follows:

template    // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
    int bits {std::numeric_limits::digits};
    rdx_sort(begin, end, bits);
}

Any number library (like a big number library) is allowed to specialize std::numeric_limits and every decent numbers library does so, so I shouldn't make your code less portable. By the way, the returned value is not always int, so the best thing you could do is to use a type alias to make things clearer:

template    // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
    using value_type = typename Iter::value_type
    value_type bits {std::numeric_limits::digits};
    rdx_sort(begin, end, bits);
}

-
Be careful of wild assumptions:

const long long mask; // 0..63 bitfield to test against

long long is not guaranteed to be a 64-bit integer, even if it is more often the case than not. If you want to avoid problems, you should use std::int64_t from `

. This type only exists if the architecture actually has a 64-bit integer, bug I guess that you will have more problems anyway if your implementation does not.

const std::int64_t mask; // 0..63 bitfield to test against


- 
It seems that you are using

using namespace std;

, which is something you should really avoid in a header-only library like yours. Using this directive will import every name from

std::

 into the global namespace for anyone using your library and you will for sure end with name clashes at some point. Consider

std::-qualifying everything from the C and C++ standard libraries (except macros of course), even std::size_t

: some implementations don't import the names from the C standard library into the global namespace.

- 
Dropping random vowels from function names isn't going to help anymore. Consider using full names like

counting_sort and radix_sort` instead.

Code Snippets

template <typename Iter>   // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
    int bits {std::numeric_limits<typename Iter::value_type>::digits};
    rdx_sort(begin, end, bits);
}

template <typename Iter>   // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
    using value_type = typename Iter::value_type
    value_type bits {std::numeric_limits<value_type>::digits};
    rdx_sort(begin, end, bits);
}

const long long mask; // 0..63 bitfield to test against

const std::int64_t mask; // 0..63 bitfield to test against

Context

StackExchange Code Review Q#70359, answer score: 3

Revisions (0)

No revisions yet.