snippetjavaMinor
External bottom up merge sort too slow
Viewed 0 times
mergetooslowbottomsortexternal
Problem
My external sort function works but it is way too slow. I think the bottleneck is the merge function. How can I make it faster. Is there a better merge function I can use for merging into a file using random access file objects and buffers?
private static void merge(String a, String b,
int l, int m, int r)throws FileNotFoundException, IOException
{
RandomAccessFile a1f = new RandomAccessFile(a,"rw");
RandomAccessFile b1f = new RandomAccessFile(b,"rw");
a1f.seek(l*4);
b1f.seek(l*4);
for(int i = l; i<m;i++){
a1f.seek(i*4);
b1f.seek(i*4);
int num = a1f.readInt();
b1f.writeInt(num);
}
for(int j=m; j<r; j++){
a1f.seek((m + r - j - 1)*4);
b1f.seek(j*4);
int num = a1f.readInt();
b1f.writeInt(num);
}
int i = l;
int j = r - 1;
//for (int i = l; i < m; i++) b[i] = a[i];
//for (int j = m; j < r; j++) b[j] = a[m + r - j - 1];
//if (b[j] < b[i]) a[k] = b[j--];
//else a[k] = b[i++];
for (int k = l; k < r; k++){
b1f.seek(j*4);
int b_j = b1f.readInt();
b1f.seek(i*4);
int b_i = b1f.readInt();
if(b_j<b_i){
b1f.seek((j--)*4);
a1f.seek(k*4);
a1f.writeInt(b1f.readInt());
}
else{
b1f.seek((i++)*4);
a1f.seek(k*4);
a1f.writeInt(b1f.readInt());
}
}
a1f.close();
b1f.close();
}Solution
The most glaring problem in your code here, is that you open and close the file on each call to the
I recommend that you re-work that part of the solution, then re-post as a 'follow on', but then also include the calling part of the function in to your question so that more of the overall structure of your solution can be reviewed.
The variables
I struggle to understand the difference between the
I would recommend instead having two read-poitns on the b file and then write back to the A. The extra copy is slowing you down too. There's no need to copy from a-to b if you can jsut swap them...
merge function. You should instead open it once in the calling function, and pass in the already-open files to the merge method. The open and close operaions are both slow, and should be minimized.I recommend that you re-work that part of the solution, then re-post as a 'follow on', but then also include the calling part of the function in to your question so that more of the overall structure of your solution can be reviewed.
The variables
a, b, l, r, m i, j, and k are also all horrible names... you should rename them to be 'self-documenting'. They should be called what they are. The variable l is particularly problematic because it is so easy to confuse with 1.I struggle to understand the difference between the
a and b files. It appears that you are copying the first chunk of data from the a-file to the b-file, and then merging from two different places in the b file back to the a-file.I would recommend instead having two read-poitns on the b file and then write back to the A. The extra copy is slowing you down too. There's no need to copy from a-to b if you can jsut swap them...
Context
StackExchange Code Review Q#66369, answer score: 4
Revisions (0)
No revisions yet.