patterncppMinor
Melsort — Sorting with encroaching lists
Viewed 0 times
sortingwithmelsortencroachinglists
Problem
For the nth time already I come back with yet another alien sorting algorithm. Today I will present Melsort, implemented from the description I found in the paper Encroaching Lists as a Measure of Presortedness by Skiena.
Melsort
Melsort is a two-step algorithm: first it constructs a set of encroaching lists, then it merges them all until there is only one sorted list left. The way it creates list is as follows: it takes the first element to sort and puts it in the first list, then for every other element it takes it, compares it to the head and the tail of the first list; if it is lesser than the head, it prepends it to the head of this list, and if it is greater than the tail, it appends it to the end of the list. If the element to sort is neither lesser than the head nor greater than the tail, it tries to perform the same checks and insertions with the second list, then the third, etc... If it can't fit any of the existing lists, it creates a new one and appends the element to it.
To make it clearer, imagine that we want to sort the following list: \$\{1, 9, 2, 3, 4, 6, 7, 0, 5, 8\}\$. The resulting lists \$S_i\$ will be \$S_1 = \{0, 1, 9\}\$, \$S_2 = \{2, 3, 4, 6, 7, 8\}\$ and \$S_3 = \{5\}\$. Then melsort will merge \$S_3\$ into \$S_2\$, then the resulting \$S_2\$ into \$S_1\$ and the resulting list will be sorted.
Note that for every resulting list, we have \$head(S_n) \le head(S_{n+1})\$ and \$tail(S_n) \ge tail(S_{n+1})\$. The authors decide to call the lists « encroaching » because of these specific patterns.
Pseudocode
I wish I could copy and paste here the pseudo-code present in the paper, but I'm not sure about the legal ramifications... However, the paper describing neatsort also provides some pseucode for melsort, and I guess there isn't a problem with that one, so here we go:
```
Input: A list X of length n.
Output: the ordered version of the input list.
listCount := 1;
put X_1 in list_1;
for i := 2 to n do
for j := 1 to listCount do
Melsort
Melsort is a two-step algorithm: first it constructs a set of encroaching lists, then it merges them all until there is only one sorted list left. The way it creates list is as follows: it takes the first element to sort and puts it in the first list, then for every other element it takes it, compares it to the head and the tail of the first list; if it is lesser than the head, it prepends it to the head of this list, and if it is greater than the tail, it appends it to the end of the list. If the element to sort is neither lesser than the head nor greater than the tail, it tries to perform the same checks and insertions with the second list, then the third, etc... If it can't fit any of the existing lists, it creates a new one and appends the element to it.
To make it clearer, imagine that we want to sort the following list: \$\{1, 9, 2, 3, 4, 6, 7, 0, 5, 8\}\$. The resulting lists \$S_i\$ will be \$S_1 = \{0, 1, 9\}\$, \$S_2 = \{2, 3, 4, 6, 7, 8\}\$ and \$S_3 = \{5\}\$. Then melsort will merge \$S_3\$ into \$S_2\$, then the resulting \$S_2\$ into \$S_1\$ and the resulting list will be sorted.
Note that for every resulting list, we have \$head(S_n) \le head(S_{n+1})\$ and \$tail(S_n) \ge tail(S_{n+1})\$. The authors decide to call the lists « encroaching » because of these specific patterns.
Pseudocode
I wish I could copy and paste here the pseudo-code present in the paper, but I'm not sure about the legal ramifications... However, the paper describing neatsort also provides some pseucode for melsort, and I guess there isn't a problem with that one, so here we go:
```
Input: A list X of length n.
Output: the ordered version of the input list.
listCount := 1;
put X_1 in list_1;
for i := 2 to n do
for j := 1 to listCount do
Solution
I have not benchmarked this, so take this with a grain of salt.
In the merge step you are iterating over the lists repeatedly. If you consider a pathological case that results in \$\frac{n}{2}\$ lists of \$2\$ elements each. Then on the first step you reduce to \$\frac{n}{4}\$ lists of \$4\$ elements each if I read your code correctly, then \$\frac{n}{8}\$ lists and so forth.
This means you will do \$\log_2\left(n\right)\$ passes and in each pass you will iterate once over all \$n\$ elements. Giving you \$\mathcal{O}\left(n\cdot\log_2\left(n\right)\right)\$ steps through the linked list.
Iterating through all elements in a linked list is \$\mathcal{O}\left(n\right)\$ just as it is for a vector. But in my experience the constant factor for iterating through a list is is a lot larger than for a vector, I'm not talking about 10-20% but more like 300-2000% larger.
This is due to the difference in locality of reference, for a vector it is contiguous memory accesses and the CPU's prefetcher is going to ❤❤❤ you for it. However for a list you have to chase the pointers and worst case is you get a cache miss on every pointer (can happen if your list size exceeds the CPU cache size) in the list resulting in a \$\approx200\$ cycle delay on each node. A custom allocation strategy for list nodes can mitigate this to some degree but it is like putting a band-aid on a stab wound, you need stitches.
My gut feeling is that the \$\mathcal{O}\left(n\right)\$ insertion time at the head of a vector in the encroaching phase is going to be recovered and then some in the merge phase due to higher locality of reference. But as always you should benchmark to find out.
In fact I think you might be able to use
In the merge step you are iterating over the lists repeatedly. If you consider a pathological case that results in \$\frac{n}{2}\$ lists of \$2\$ elements each. Then on the first step you reduce to \$\frac{n}{4}\$ lists of \$4\$ elements each if I read your code correctly, then \$\frac{n}{8}\$ lists and so forth.
This means you will do \$\log_2\left(n\right)\$ passes and in each pass you will iterate once over all \$n\$ elements. Giving you \$\mathcal{O}\left(n\cdot\log_2\left(n\right)\right)\$ steps through the linked list.
Iterating through all elements in a linked list is \$\mathcal{O}\left(n\right)\$ just as it is for a vector. But in my experience the constant factor for iterating through a list is is a lot larger than for a vector, I'm not talking about 10-20% but more like 300-2000% larger.
This is due to the difference in locality of reference, for a vector it is contiguous memory accesses and the CPU's prefetcher is going to ❤❤❤ you for it. However for a list you have to chase the pointers and worst case is you get a cache miss on every pointer (can happen if your list size exceeds the CPU cache size) in the list resulting in a \$\approx200\$ cycle delay on each node. A custom allocation strategy for list nodes can mitigate this to some degree but it is like putting a band-aid on a stab wound, you need stitches.
My gut feeling is that the \$\mathcal{O}\left(n\right)\$ insertion time at the head of a vector in the encroaching phase is going to be recovered and then some in the merge phase due to higher locality of reference. But as always you should benchmark to find out.
In fact I think you might be able to use
std::deque that has \$\mathcal{O}\left(1\right)\$ insertion at the front and random access. However I believe that std::deque is implemented as a list of vectors, in which case the size of the vectors is important, this is one of these cases where a custom data structure might benefit you.Context
StackExchange Code Review Q#124980, answer score: 3
Revisions (0)
No revisions yet.