Friday, December 20, 2013

How-to: Create a gzipped tarball via SSH

I recently needed to remotely backup a server and store the data on another system. However my complication was that there wasn't enough space on the server to create the backup then transfer it to the remote system. I thought about using scp but that creates a lot of overhead as each file transfer is essentially a new SSH session. The crux was that I needed to keep the data secure but transfer within a short amount of time. I thought about using netcat, but that would mean the data would be going across the wire unencrypted even if I compressed it using gzip.

I ultimately decided that I needed SSH, tar and gzip. I popped up the manual page for tar and found that I can send the output to stdout (standard out). I immediately knew this was the best solution for the problem; I tinkered around a bit and developed a command that would allow me to use SSH, tar and gzip. The way it works is you send the tar command through SSH, tell tar to output via stdout and redirect the output from SSH to a gzipped tarball. The only downside is I had to temporarily enable root access via SSH; this is required if you're going to archive the whole file system. You can see how I accomplished the remote backup using the command below:

$ ssh root@remote.host.com 'tar -czvf - / --exclude=/dev --exclude=/proc --exclude=/var/run --exclude=/tmp --exclude=/sys --exclude=/usr/lib' > my-server-backup.tar.gz

I'll break down the command for you:

$ ssh root@remote.host.com 'command'

With SSH you can run commands remotely and the output will be displayed in your local terminal. The tar command:

 tar -czvf - / --exclude=/dev --exclude=/proc --exclude=/var/run --exclude=/tmp --exclude=/sys --exclude=/usr/lib

We invoke tar on the remote machine, telling it to create a gzipped tarball and list the files as it compresses them (czvf). Normally where we would specify the archive's file name we place a hyphen (-) which tells tar to output the archive to stdout (standard out, our screen). We want to backup the whole file system so we tell tar to start at the root of the directory tree (/) but we don't need libraries, device descriptors, process or temp files; thus we exclude specific directories with the exclude flag (--exclude=/dir/to/leave/behind). The last part of the command is important:

 > my-server-backup.tar.gz

We redirect the output to a file on the system receiving the archive; this is what the greater-than symbol does. To store the output into a file we specify a file name (my-server-backup.tar.gz); the ".tar.gz" extension identifies the file as a gzipped-tarball. To look at it without the confusion of the tar command:

 $ ssh root@remote.host.com 'command' > output.file

You can essentially perform the same function with any other command, so long as it supports outputting the data via stdout. At any rate, if you were to run the tar command all you (the user) would see is the file listing, like such:

$ ssh sysadmin@my.server.com 'tar -czvf - Downloads/' > sysadmin-downloads.tar.gz
Downloads/
Downloads/file-a.txt
Downloads/file-b.txt
Downloads/file-c.txt

In the above example, I made a backup of the Downloads directory that resides in the sysadmin user's home directory. You can even check to make sure the file is of the appropriate format using the file command:

$ file sysadmin-downloads.tar.gz
sysadmin-downloads.tar.gz: gzip compressed data, from Unix, last modified: Fri Dec 20 17:53:28 2013

There you have it! A remote, compressed backup streamed to your server or workstation over SSH. Until next time, my friends ...