patterncppModerate
Simple comparison of sorting algorithms in C++
Viewed 0 times
algorithmssortingsimplecomparison
Problem
I know this has been done a million times before, but this is my implementation of bubble-sort, insertion-sort, merge-sort, heap-sort, and quicksort. Any feedback on how to make it better would be most appreciated. Also, I seem to have a large number of functions in the code before
edit: Also, I don't explicitly use pointers, is there a way that I could do so to be more efficient?
``
//pre-sorted lists and merges them to a single sorted list. This is called on by merge_sort,
//which also recursively calls itself.
void merge(int list[], int p, int q, int r)
{
//n1 and n2 are the lengths of the pre-sorted sublists, list[p..q] and list[q+1..r]
int n1=q-p+1;
int n2=r-q;
//copy these pre-sorted lists to L and R
int L[n1+1];
int R[n2+1];
for(int i=0;i list.nodes[index])
{
largest = left;
} else
{
largest = index;
}
main(). Is it preferable to put these in another file, a header file perhaps? How would I do that?edit: Also, I don't explicitly use pointers, is there a way that I could do so to be more efficient?
``
#include
#include
#include
//Is this less offensive than using the entire std namespace?
using std::cout;
using std::endl;
//These little functions are used by the heap-sort algorithm
#define PARENT(i) ((i - 1) / 2)
#define LEFT(i) (2 * i + 1)
#define RIGHT(i) (2 * i + 2)
//First comes bubble-sort, the most brute-force sorting method.
//Bubble-sort is a simple sorting algorithm that repeatedly steps
//through the list to be sorted, compares each pair of adjacent items
//and swaps them if they are in the wrong order
void bubble_sort(int list[], int size)
{
int temp;
for(int i=0; ii; j--)
{
if(list[j]-1 and list[i]>key)
{
list[i+1]=list[i];
i=i-1;
}
list[i+1]=key;
}
}
//Merge-sort is much faster than insertion-sort in general, and works by
//dividing the array successively into smaller arrays, sorting them, and then
//merging the results. merge_sort is written as two functions, merge` which takes two//pre-sorted lists and merges them to a single sorted list. This is called on by merge_sort,
//which also recursively calls itself.
void merge(int list[], int p, int q, int r)
{
//n1 and n2 are the lengths of the pre-sorted sublists, list[p..q] and list[q+1..r]
int n1=q-p+1;
int n2=r-q;
//copy these pre-sorted lists to L and R
int L[n1+1];
int R[n2+1];
for(int i=0;i list.nodes[index])
{
largest = left;
} else
{
largest = index;
}
Solution
Using
Yes. But still on the lazy size. Is it that hard to type 5 extra characters.
Prefer functions to macros.
If you are going to treat them as functions may as well define them like functions:
Macros stomp all over the namespace/scope rules (and thus are slightly dangerous). They don't work well with any parameters that are complex (because they just use text substitution).
void bubble_sort(int list[], int size)
Sure bubble sort is the most brute force and worst complexity on average. But for a small number of values it is usually the fastest (as it has the lowest overhead). Check out your graphs when the number of values you want to sort is in the range [1-100].
Also bubble sort has a best case of
Ranges in C++
Ranges in C++ are usually done from beginning to one past the end. This convention is so ingrained that when you don't use it you get people noticing and wondering why.
Why not have q by one past the end of the first range and r by one past the end of the second range.
Then your list splits nicely as
It also makes working out sizes easier as
If you think this way then you can often take advantage of the standard algorithms (which are organized like this).
Use iterators for your interface.
In C++ the interface between storage and algorithms is done via iterators. This allows you to perform your algorithm on different types of container without changing the code.
So here I would have used:
Merge Not optimal
This is not the optimal implementation of the merge algorithm. Once one side has been completely merged you can move the content of the other rather than continuing to test values.
I would have written merge like this:
Avoid using dynamic allocation
Use C++ fstream
Prefer to use C++ fstream object (it is excepion safe unlike fopen/fclose).
Now admittedly the C++ stream operators are much much much more verbose then the C code for printing. But the main advantage is that they are TYPE SAFE so you have a much less chance of getting it wrong (though modern compilers actually check this in C now).
To mitigate the verbosity you can use boost::format see Which C I/O library should be used in C++ code?
A basic translation into C++
```
std::ofstream resultsfile("results-comparison_sort-noBS.dat");
for(int j=0;j< npoints;j++) {
resultsfile << boost::format("%5e \t %10.2f \t %10.2f \t %10.2f \t %10.2f \t %10.2f \n")
% nplist[j]
% bubble_timelist[j]
% insertion_timelist[j]
% merge_timelist[j]
% heap_timelist[j]
% quick_tim
//Is this less offensive than using the entire std namespace?
using std::cout;
using std::endl;Yes. But still on the lazy size. Is it that hard to type 5 extra characters.
Prefer functions to macros.
//These little functions are used by the heap-sort algorithm
#define PARENT(i) ((i - 1) / 2)
#define LEFT(i) (2 * i + 1)
#define RIGHT(i) (2 * i + 2)If you are going to treat them as functions may as well define them like functions:
//These little functions are used by the heap-sort algorithm
inline int parent(int i) {return ((i - 1) / 2);}
inline int left(int i) {return (2 * i + 1);}
inline int right(int i) {return (2 * i + 2);}Macros stomp all over the namespace/scope rules (and thus are slightly dangerous). They don't work well with any parameters that are complex (because they just use text substitution).
void bubble_sort(int list[], int size)
Sure bubble sort is the most brute force and worst complexity on average. But for a small number of values it is usually the fastest (as it has the lowest overhead). Check out your graphs when the number of values you want to sort is in the range [1-100].
Also bubble sort has a best case of
O(n) you forgot to add this standard optimization for quick exit when the data is already sorted.Ranges in C++
Ranges in C++ are usually done from beginning to one past the end. This convention is so ingrained that when you don't use it you get people noticing and wondering why.
//n1 and n2 are the lengths of the pre-sorted sublists, list[p..q] and list[q+1..r]Why not have q by one past the end of the first range and r by one past the end of the second range.
Then your list splits nicely as
list[p..q)
list[q..r)It also makes working out sizes easier as
n1 = q - p;
n2 = r - q;If you think this way then you can often take advantage of the standard algorithms (which are organized like this).
Use iterators for your interface.
In C++ the interface between storage and algorithms is done via iterators. This allows you to perform your algorithm on different types of container without changing the code.
void merge(int list[], int p, int q, int r)So here I would have used:
template
void merge(I p, I q, I r)Merge Not optimal
//Merge the L and R lists
int i=0;
int j=0;
for(int k=p; k<=r; k++)
{
if (L[i]<=R[j])
{
list[k]=L[i];
i++;
} else
{
list[k]=R[j];
j++;
}
}
}This is not the optimal implementation of the merge algorithm. Once one side has been completely merged you can move the content of the other rather than continuing to test values.
I would have written merge like this:
template
void merge(I p, I q, I r)
{
int leftSize = std::distance(p, q);
int rightSize = std::distance(q, r);
int L[leftSize]; // Technically not legal but most compilers support it.
int R[rightSize]; // Normally use vectors here. But I am using the same
// technique as shown by the OP
std::move(p, q, L);
std::move(q, r, R);
int left = 0;
int right = 0;
I d = p;
while(left < leftSize && right < rightSize)
{
(*d) = std::move((L[left] <= R[right])
? L[left++]
: R[right++]);
++d;
}
// Note only one of these copies will actually do anything.
std::move(L + left, L + leftSize, d);
std::move(R + right, R + rightSize, d);
}Avoid using dynamic allocation
int *rlist1= new int[npointsmax];
int *rlist2= new int[npointsmax];
int *rlist3= new int[npointsmax];
int *rlist4= new int[npointsmax];
int *rlist5= new int[npointsmax];- Prefer to use automatic variables.
- Prefer standard containers to raw arrays.
Use C++ fstream
FILE * resultsfile;
resultsfile=fopen("results-comparison_sort-noBS.dat","w");
for(int j=0;j< npoints;j++) fprintf(resultsfile, "%5e \t %10.2f \t %10.2f \t %10.2f \t %10.2f \t %10.2f \n",nplist[j], bubble_timelist[j], insertion_timelist[j], merge_timelist[j], heap_timelist[j], quick_timelist[j]);
fclose(resultsfile);Prefer to use C++ fstream object (it is excepion safe unlike fopen/fclose).
Now admittedly the C++ stream operators are much much much more verbose then the C code for printing. But the main advantage is that they are TYPE SAFE so you have a much less chance of getting it wrong (though modern compilers actually check this in C now).
To mitigate the verbosity you can use boost::format see Which C I/O library should be used in C++ code?
A basic translation into C++
```
std::ofstream resultsfile("results-comparison_sort-noBS.dat");
for(int j=0;j< npoints;j++) {
resultsfile << boost::format("%5e \t %10.2f \t %10.2f \t %10.2f \t %10.2f \t %10.2f \n")
% nplist[j]
% bubble_timelist[j]
% insertion_timelist[j]
% merge_timelist[j]
% heap_timelist[j]
% quick_tim
Code Snippets
//Is this less offensive than using the entire std namespace?
using std::cout;
using std::endl;//These little functions are used by the heap-sort algorithm
#define PARENT(i) ((i - 1) / 2)
#define LEFT(i) (2 * i + 1)
#define RIGHT(i) (2 * i + 2)//These little functions are used by the heap-sort algorithm
inline int parent(int i) {return ((i - 1) / 2);}
inline int left(int i) {return (2 * i + 1);}
inline int right(int i) {return (2 * i + 2);}//n1 and n2 are the lengths of the pre-sorted sublists, list[p..q] and list[q+1..r]list[p..q)
list[q..r)Context
StackExchange Code Review Q#87085, answer score: 15
Revisions (0)
No revisions yet.