tar

tar(1) is the GNU tape archiver. It takes several files or directories and combines them into one large file. This allows you to compress an entire directory tree, which is impossible with gzip or bzip2 alone, since each of those compresses only a single file at a time. tar has many command line options, which are explained in its man page. This section covers only the most common uses of tar.

The most common use for tar is to decompress and unarchive a package that you've downloaded from a web site or FTP site. Most such packages come with a .tar.gz extension. A file like this is commonly known as a “tarball”. It means that several files were archived using tar and then compressed using gzip. You might also see a .tar.Z file. This is much the same thing, except that the archive was compressed with the older compress(1) program; it is usually encountered on older Unix systems.

Alternatively, you might find a .tar.bz2 file somewhere. Kernel source is distributed as such because it is a smaller download. As you might have guessed, this is several files archived with tar and then bzipped.

You can get to all the files in this archive by making use of tar and some command line arguments. Unarchiving a tarball makes use of the -z flag, which means to first run the file through gunzip and decompress it. The most common way to decompress a tarball is like so:

   $ tar -xvzf hejaz.tar.gz

That's quite a few options. So what do they all mean? The -x means to extract. This is important, as it tells tar exactly what to do with the input file. In this case, we'll be splitting it back up into all the files that it came from. -v means to be verbose. This will list all the files that are being unarchived. It is perfectly acceptable to leave this option off, if somewhat boring. Alternatively, you could use -vv to be very verbose and list even more information about each file being unarchived. The -z option tells tar to run hejaz.tar.gz through gunzip first. And finally, the -f option tells tar that the next string on the command line is the file to operate on.
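
As a rough illustration of the options just described, the following sketch builds a small throwaway tarball and then unpacks it again. The file names inside the hejaz directory are hypothetical stand-ins for a downloaded package, and the script uses the -c option (covered later in this section) only to have something to extract.

```shell
#!/bin/sh
# Minimal sketch: build a throwaway tarball, then unpack it with -xvzf.
# The directory and file names below are hypothetical stand-ins.
set -e

workdir=$(mktemp -d)              # scratch directory for the demonstration
trap 'rm -rf "$workdir"' EXIT     # remove the scratch directory on exit
cd "$workdir"

mkdir hejaz
echo "hello" > hejaz/README
echo "world" > hejaz/NOTES
tar -czf hejaz.tar.gz hejaz       # -c (create) is covered later in this section
rm -r hejaz                       # keep only the tarball

# -x extract, -v name each file as it is unpacked,
# -z run the archive through gunzip first, -f the archive name follows.
tar -xvzf hejaz.tar.gz
```

With -v in place, tar names the hejaz directory and each file in it as it unpacks them; dropping the v gives the same result silently.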

There are a few other ways to write this same command. On older systems lacking a decent copy of GNU tar, you might see it written like so:

   $ gzip -dc hejaz.tar.gz | tar -xvf -

This command line decompresses the file and sends the output to tar. The -d option tells gzip to decompress, and -c tells it to write the result to standard output instead of to a file. The pipe then sends that stream to tar for unarchiving. The “-” after -f means to read the archive from standard input. tar unarchives the stream of data that it gets from gzip and writes the resulting files to the disk.
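
The pipeline form can be tried out with a sketch like the one below. It assumes nothing beyond gzip and tar being installed; the sample file name is made up for the demonstration.

```shell
#!/bin/sh
# Minimal sketch: decompress with gzip and hand the stream to tar over a pipe.
set -e

workdir=$(mktemp -d)              # scratch directory for the demonstration
trap 'rm -rf "$workdir"' EXIT     # remove the scratch directory on exit
cd "$workdir"

mkdir hejaz
echo "sample" > hejaz/file.txt    # hypothetical archive contents
tar -cf hejaz.tar hejaz           # plain, uncompressed archive
gzip hejaz.tar                    # replaces hejaz.tar with hejaz.tar.gz
rm -r hejaz

# gzip -d decompresses and -c writes to standard output;
# tar -f - reads the archive from standard input.
gzip -dc hejaz.tar.gz | tar -xvf -
```

Because gzip writes to standard output here, hejaz.tar.gz itself is left untouched on disk after the extraction.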

Another way to write the first command line is to leave off the dash before the options, like so:

   $ tar xvzf hejaz.tar.gz

You might also encounter a bzipped archive. The version of tar that comes with Slackware Linux can handle these the same as gzipped archives. Instead of the -z command line option, you'd use -y. (The letter used for bzip2 differs between versions of tar; recent GNU tar releases use -j.)

   $ tar -xvyf foo.tar.bz2
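
Since the option letter for bzip2 is not the same in every version of tar, a pipeline that does not depend on it can be handy. The sketch below assumes the bzip2 program is installed (it exits quietly if not) and uses a made-up file in place of real source code.

```shell
#!/bin/sh
# Minimal sketch: unpack a .tar.bz2 without relying on tar's bzip2 option.
set -e

if ! command -v bzip2 >/dev/null 2>&1; then
    echo "bzip2 is not installed; nothing to demonstrate" >&2
    exit 0
fi

workdir=$(mktemp -d)              # scratch directory for the demonstration
trap 'rm -rf "$workdir"' EXIT     # remove the scratch directory on exit
cd "$workdir"

mkdir foo
echo "all:" > foo/Makefile        # hypothetical archive contents
tar -cf foo.tar foo
bzip2 foo.tar                     # replaces foo.tar with foo.tar.bz2
rm -r foo

# bzip2 -d decompresses and -c writes to standard output;
# tar -f - reads the archive from standard input.
bzip2 -dc foo.tar.bz2 | tar -xvf -
```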

It is important to note that tar will place the unarchived files in the current directory. So, if you had an archive in /tmp that you wanted to decompress into your home directory, there are two options. First, the archive could be moved into your home directory and then run through tar. Or, you could change to your home directory and specify the path to the archive file on the command line:

   $ tar -xvzf /tmp/bar.tar.gz

The contents of the archive would be dumped into your home directory, and the original compressed archive file would still be in /tmp.
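
To see this behavior without touching your real /tmp or home directory, a sketch along these lines may help. The two directories it creates are stand-ins chosen for the demonstration, as is the sample file.

```shell
#!/bin/sh
# Minimal sketch: extract an archive that lives in one directory
# while the current directory is somewhere else.
set -e

scratch=$(mktemp -d)              # scratch directory for the demonstration
trap 'rm -rf "$scratch"' EXIT     # remove the scratch directory on exit
tmp_dir="$scratch/tmp"            # stands in for /tmp
home_dir="$scratch/home"          # stands in for your home directory
mkdir "$tmp_dir" "$home_dir"

# Build a sample archive in the stand-in for /tmp.
mkdir "$tmp_dir/bar"
echo "data" > "$tmp_dir/bar/file.txt"
(cd "$tmp_dir" && tar -czf bar.tar.gz bar && rm -r bar)

# Extract from within the "home" directory, naming the archive by its path.
cd "$home_dir"
tar -xvzf "$tmp_dir/bar.tar.gz"
```

Afterwards the bar directory exists under the stand-in home directory only, while bar.tar.gz remains where it was.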

The second most common operation with tar is making your own archives. Making an archive is no more complicated than unarchiving other files; it just takes a different set of command line options.

To create a compressed tar archive of all the files in the current directory (including any subdirectories and their files), you would use tar like so:

   $ tar -cvzf archive.tar.gz .

In this command line, the -c option tells tar to create an archive, while the -z option runs the resulting archive file through gzip to compress it. archive.tar.gz is the file that you want to create. You can call it anything you want, and if you include a full path name, it will put the archive in that directory. Here is an example of that:

   $ tar -cvzf /tmp/archive.tar.gz .

The archive would then go into /tmp. You can also name all the files and directories that you want included in the archive by listing them at the end of the command. In this case, the . is the directory to include in the archive. It could easily be replaced with a list of files, directories, or whatever else you want to archive.
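
Both ideas, writing the archive to a chosen path and naming specific items to include, can be combined as in the sketch below. The file names are hypothetical, and the final check uses tar's -t option, which lists an archive's contents without extracting them and is not otherwise covered in this section.

```shell
#!/bin/sh
# Minimal sketch: archive a chosen file and directory to an explicit path,
# then list what ended up inside the archive.
set -e

workdir=$(mktemp -d)              # scratch directory for the demonstration
trap 'rm -rf "$workdir"' EXIT     # remove the scratch directory on exit
cd "$workdir"

mkdir docs out
echo "one"  > notes.txt           # will be archived
echo "two"  > docs/guide.txt      # will be archived along with docs/
echo "skip" > unrelated.txt       # not named below, so left out

# -c create, -v name each file as it is added, -z compress with gzip,
# -f the archive name follows; the items to archive come last.
tar -cvzf "$workdir/out/archive.tar.gz" notes.txt docs

# -t lists the archive's contents instead of extracting them.
listing=$(tar -tzf "$workdir/out/archive.tar.gz")
echo "$listing"
```

Only the items named on the command line appear in the listing; unrelated.txt is left out because it was never mentioned.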