Recently I came across an interesting problem. I needed to download a raw copy of an HDD image (about 180 GB) stored on a remote server connected via a 1 Mbps link. The network connection dropped frequently, so the requirement was to reestablish the connection automatically, without my intervention.
Definition of Terms
Server - a remote computer with the IP address 172.17.100.5/16 which contains a raw copy of the HDD image - the file /root/ubuntu.iso.
Client - a local computer that downloads the raw copy of the HDD image from the server.
Below is the how-to that helped me accomplish the task. I hope it might be useful to you.
1. Create Multiple Archive Files
The idea is to create a compressed archive file and split it into multiple sequential chunks in order to make the transfer less dependent on network outages over the unreliable link.
$ tar cvf - ubuntu.iso | gzip -9 - | split -b 10M -d - ./disk/ubuntu.tar.gz.
The tar command creates a tar archive from the file ubuntu.iso and sends it to standard output instead of to a file. The gzip command compresses everything from standard input using the best compression ratio (parameter -9) and sends the result to standard output. The split command reads from standard input and splits the single large archive into multiple sequential 10M pieces with numbered suffixes (parameter -d). The chunks are saved into the directory disk.
We put the tar pipeline into a script pack.sh, and the secure copy command scp copies the script to the remote server's /root directory.
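The article does not show pack.sh itself, so here is a minimal sketch of what it can look like (creating it with a heredoc is just one convenient way; the file names and the disk output directory follow the example above):

```shell
# Create pack.sh -- a thin wrapper around the tar | gzip | split pipeline above.
cat > pack.sh <<'EOF'
#!/bin/bash
set -e
mkdir -p ./disk                 # the chunks are written here
# stream the archive through gzip and split it into numbered 10M pieces
tar cvf - ubuntu.iso | gzip -9 - | split -b 10M -d - ./disk/ubuntu.tar.gz.
EOF
```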
$ scp -rv pack.sh firstname.lastname@example.org:/root
Log in to the server over SSH and start the script with the command below. The nohup command ensures that the script keeps running in the background even if the SSH session is dropped.
# nohup bash ./pack.sh &
2. Generate Private and Public RSA Key and Copy Public Key to Server
First we generate a public and private key pair on the client with the ssh-keygen command.
$ ssh-keygen -t rsa -P ""
-t type of key to create
-P passphrase (blank).
The command generates a public key id_rsa.pub and a private key id_rsa and saves both keys into the directory ~/.ssh. Let's copy our public key to the remote server with the ssh-copy-id command.
$ ssh-copy-id -i ~/.ssh/id_rsa.pub email@example.com
-i path to a public key on a client
Now we should be able to connect to the remote server with ssh using public key authentication (without entering a password).
3. Copy Files with Rsync
Rsync is a command for synchronizing and copying directories both locally and remotely. We will use it to download our archive chunks. Copying files with rsync is preferable to copying chunks with scp: when the copying is restarted (for example, to download the remaining files after a network outage), scp overwrites the files that have already been copied to the client.
Rsync works differently. When a file is only partially downloaded because of a network outage, running rsync with the parameter --partial ensures that the partial file is kept on the disk. The parameter --append ensures that rsync downloads only the rest of the file after the network connection is restored.
Here is a script copy.sh that we are going to run on the client. The script keeps re-running the rsync command while its return value is non-zero, i.e. until the copy succeeds. The parameters used are:
--append append data onto shorter files
-e specify the remote shell to use (ssh)
--partial keep partially transferred files
--progress show progress during file transfer
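The script itself can look like the sketch below (again created with a heredoc; the remote path /root/disk/ and the host from the earlier scp example are my assumptions, so adjust them to your setup):

```shell
# Create copy.sh -- keeps running rsync until it returns 0 (success).
cat > copy.sh <<'EOF'
#!/bin/bash
mkdir -p ./files
RET=1
while [ $RET -ne 0 ]; do
    # the remote shell glob matches all numbered chunks on the server
    rsync --append --partial --progress -e ssh \
        'firstname.lastname@example.org:/root/disk/ubuntu.tar.gz.*' ./files/
    RET=$?
    if [ $RET -ne 0 ]; then
        sleep 10   # short pause before retrying after a failure
    fi
done
EOF
```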
4. Merge and Extract Downloaded Files
The last step merges the chunks located in the directory files on the client using the cat command. The output of cat is piped to the tar command, which reads data from standard input, decompresses it, and extracts the archive. As a result, the file ubuntu.iso is created.
$ cat ./files/ubuntu.tar.gz.* | tar zxvf -
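Before trusting the whole procedure with the real 180 GB image, it is worth verifying the split-and-merge round trip locally on a small test file (the file names below are only for this demo):

```shell
#!/bin/bash
set -e
cd "$(mktemp -d)"
head -c 300000 /dev/urandom > sample.iso            # small stand-in for the HDD image
# same pipeline as step 1, with 64K chunks so the demo produces several pieces
tar cf - sample.iso | gzip -9 - | split -b 64K -d - sample.tar.gz.
cat sample.tar.gz.* | tar zxf - -O > restored.iso   # merge the chunks and extract
cmp sample.iso restored.iso && echo "round trip OK"
```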