Copying into two different folders on a machine simultaneously
Problem
I am trying to copy files from machineB and machineC into machineA, as I am running my shell script on machineA. If the files are not on machineB, then they should be on machineC for sure. I will try copying the files from machineB first, and if they are not on machineB, then I will try copying the same files from machineC.

I am copying the files in parallel using the GNU Parallel library, and it is working fine. I am currently copying two files in parallel.

Earlier, I was copying the PRIMARY_PARTITION files into the PRIMARY folder using GNU Parallel, and once that was done, I was copying the SECONDARY_PARTITION files into the SECONDARY folder, again using GNU Parallel, so it is sequential as of now with regard to the PRIMARY and SECONDARY folders. Now I have decided to copy files into the PRIMARY and SECONDARY folders simultaneously.

#!/bin/bash
export PRIMARY=/test01/primary
export SECONDARY=/test02/secondary
readonly FILERS_LOCATION=(machineB machineC)
export FILERS_LOCATION_1=${FILERS_LOCATION[0]}
export FILERS_LOCATION_2=${FILERS_LOCATION[1]}
PRIMARY_PARTITION=(550 274 2 546 278) # this will have more file numbers
SECONDARY_PARTITION=(1643 1103 1372 1096 1369 1568) # this will have more file numbers
export dir3=/testing/snapshot/20140103
do_Copy() {
    el=$1
    PRIMSEC=$2
    scp david@$FILERS_LOCATION_1:$dir3/new_weekly_2014_"$el"_200003_5.data $PRIMSEC/. || scp david@$FILERS_LOCATION_2:$dir3/new_weekly_2014_"$el"_200003_5.data $PRIMSEC/.
}
export -f do_Copy
parallel --retries 10 -j 5 do_Copy {} $PRIMARY ::: "${PRIMARY_PARTITION[@]}" &
parallel --retries 10 -j 5 do_Copy {} $SECONDARY ::: "${SECONDARY_PARTITION[@]}" &
wait
echo "All files copied."

The script is working fine for me, but I am trying to see if there is any better way of doing the same thing using GNU Parallel instead of using &.

Solution
"I am trying to see if there is any better way of doing the same thing using GNU Parallel instead of using &."

Essentially you have two different commands and two different parameter lists:
- Copy to PRIMARY, these files: 550 274 2 546 278
- Copy to SECONDARY, these files: 1643 1103 1372 1096 1369 1568

These are two file-sets, two distinct operations. I don't think there's an easy way to do this with a single parallel process. A not-so-easy way can be to organize the inputs in pairs, like this:

parallel -N2 do_Copy {1} {2} ::: 550 $PRIMARY 274 $PRIMARY 1643 $SECONDARY # and so on

This way it will work with one command.
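As an aside, the effect of -N2 is to consume the ::: arguments two at a time, so each job sees {1} = file number and {2} = destination folder. The echo command below is my own illustration, not part of the original answer, and it is guarded so it is a no-op where GNU Parallel is not installed:

```shell
#!/bin/bash
# Illustration only: -N2 groups the argument list in pairs, so each job
# sees {1}=<file number> and {2}=<destination folder>.
# Guarded so this sketch does nothing where GNU Parallel is absent.
if command -v parallel >/dev/null; then
    parallel -kN2 echo "number={1} dest={2}" ::: 550 /test01/primary 274 /test01/primary
fi
```

The -k flag keeps the output in argument order, which makes the pairing easy to see.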
But constructing the argument list dynamically from your list of filenames will be tricky and potentially error-prone. The original solution is simple and easier to understand, which are preferable properties.
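If you did want to go down that road, the pairing could be generated from the two arrays rather than written by hand. This loop-based helper is my own sketch, not part of the original answer:

```shell
#!/bin/bash
# Sketch: build the (file number, destination folder) pairs for a single
# parallel -N2 invocation from the two partition arrays.
PRIMARY=/test01/primary
SECONDARY=/test02/secondary
PRIMARY_PARTITION=(550 274 2 546 278)
SECONDARY_PARTITION=(1643 1103 1372 1096 1369 1568)

args=()
for el in "${PRIMARY_PARTITION[@]}"; do
    args+=("$el" "$PRIMARY")
done
for el in "${SECONDARY_PARTITION[@]}"; do
    args+=("$el" "$SECONDARY")
done

# A single invocation would then look like:
# parallel --retries 10 -N2 do_Copy {1} {2} ::: "${args[@]}"
```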
Exporting variables

You don't need to export PRIMARY and SECONDARY. You only need to export FILERS_LOCATION_1 and FILERS_LOCATION_2, because these will be used by do_Copy, which will be called by parallel (and which also needs to be exported, as you correctly did). Notice that the read-only FILERS_LOCATION array is pointless. You could just as well set the FILERS_LOCATION_* variables directly:

export FILERS_LOCATION_1=machineB
export FILERS_LOCATION_2=machineC
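The underlying reason is that parallel runs each job in a child shell, and a child shell only sees exported variables and functions. A minimal demonstration of my own (the greet function is a made-up stand-in for do_Copy, and bash -c stands in for the child shell that parallel would spawn):

```shell
#!/bin/bash
# Child shells only inherit what is exported; GNU Parallel runs jobs in
# child shells, which is why the exports are needed at all.
export GREETING=hello            # exported variable: visible in children
greet() { echo "$GREETING $1"; }
export -f greet                  # functions must be exported explicitly
bash -c 'greet world'            # a child shell, as parallel would spawn
```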
Improving do_Copy

In this method, you have two scp commands that are almost duplicates, differing only in the remote host. To avoid the duplication, you could just loop over the possible hosts, and break out of the loop as soon as a copy is successful:
do_Copy() {
    el=$1
    PRIMSEC=$2
    for host in $FILERS_LOCATION_1 $FILERS_LOCATION_2; do
        scp david@$host:$dir3/new_weekly_2014_"$el"_200003_5.data $PRIMSEC/. && break
    done
}

If the local files might already be exactly the same as the remote files, then you'll be better off using rsync -u instead of scp, to avoid unnecessary file transfers.
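For illustration, here is a hedged sketch of how do_Copy might look with rsync -u (the --update flag skips files whose local copy is not older than the remote one). The hosts, paths, and filename pattern are the question's own; I have not run this against real machines:

```shell
#!/bin/bash
# Hypothetical rsync-based variant of do_Copy: -u skips the transfer when
# the local file is already up to date, so retries and re-runs of the
# script become cheap no-ops instead of full copies.
do_Copy() {
    el=$1
    PRIMSEC=$2
    for host in $FILERS_LOCATION_1 $FILERS_LOCATION_2; do
        rsync -u david@$host:$dir3/new_weekly_2014_"$el"_200003_5.data "$PRIMSEC"/ && break
    done
}
export -f do_Copy
```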
Context
StackExchange Code Review Q#51168, answer score: 2