How to create a compressed archive (tar and gzip) of a folder in Linux

Creating a compressed archive of folders and files has multiple uses, such as backing up and storing files or transferring files over a network more easily and quickly. A compressed archive combines two separate processes: archiving, which bundles multiple (usually related) files into a single file, and compression, which saves space.

You can choose just the compressing or the archiving process depending on your requirements. If you have a single file, then archiving is redundant and compression alone will suffice. If you have a large number of small files, then archiving is useful, but compression may or may not save much disk space.
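As a minimal sketch of the two processes used separately, the following compresses a single file with gzip and bundles a few small files with tar without compressing them. The file and folder names are hypothetical and everything is created in a temporary directory.

```shell
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A single file: compression alone is enough.
seq 1 1000 > report.txt
gzip -c report.txt > report.txt.gz   # -c writes to stdout, keeping the original

# Several small files: archiving alone bundles them into one file.
mkdir notes
for i in 1 2 3; do
  echo "note $i" > "notes/note$i.txt"
done
tar -cf notes.tar notes              # no compression option, so a plain .tar

ls -l report.txt report.txt.gz notes.tar
```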

There are several utilities in Linux that allow you to create compressed archives. The most popular are zip and tar. You can use either of them to create a compressed archive of one or more folders along with the files within them.

Using tar

tar is an archiving utility while gzip is a compressing utility. The two can be used in conjunction to create a compressed archive of a folder.

bash$ tar -cvzf alldocs.tar.gz /home/saloon/documents/ /home/saloon/docs/ /home/saloon/imgdocs/

You can specify multiple files and folders as arguments, and all of them are bundled into a single compressed archive. The options used in the above command are:

  • -c or --create: create a new archive file
  • -f or --file: use the archive file name given immediately after this option; when options are bundled together, put f last so that the file name follows it directly
  • -v or --verbose: show activity and progress while creating the archive
  • -z or --gzip: filter the archive through gzip for compression

Unarchiving or extracting the compressed archive can also be done with the tar command. Specify the --extract or -x option instead of -c to indicate that you want to extract the contents.

bash$ tar -xvzf alldocs.tar.gz
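Before extracting, it can be useful to list the contents of an archive with -t (--list), and to extract into a different directory with -C (--directory). The sketch below shows both; the folder names documents and restore are hypothetical and are created in a temporary directory.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p documents restore
echo "sample text" > documents/a.txt

# Create the compressed archive; the file name directly follows -f.
tar -czf alldocs.tar.gz documents

# List the contents without extracting anything.
tar -tzf alldocs.tar.gz

# Extract into the restore directory instead of the current one.
tar -xzf alldocs.tar.gz -C restore
ls -R restore
```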

The tar command can be used with various compression algorithms instead of the gzip mentioned above. The commonly used compression options are:

  • -j or --bzip2: Use bzip2 for compression. This usually results in a file with extension .tar.bz2 or .tb2
  • -J or --xz: Use xz for compression. The file extension usually used is .tar.xz or .txz
  • --lzip: lzip is used for compression. The extension commonly used is .tar.lz
  • --lzop: lzop is used for compression. The extensions commonly used are .tar.lzo or .tzo
  • -Z or --compress: the archive is filtered through compress. The file is usually created with an extension .tar.Z or .tZ
  • -z or --gzip: gzip is used for compression. The extensions commonly used are .tar.gz, .tgz or .taz

When extracting a compressed archive, you can pass the matching option to uncompress it as well, although GNU tar detects the compression format on its own when it reads an archive file. When creating an archive, tar provides an option called --auto-compress or -a that uses the suffix of the archive name to determine which compression program should be used. Use this option together with -c, and make sure the archive name has a correct and well-known extension.
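The following sketch illustrates suffix-based selection, assuming GNU tar (bsdtar also accepts -a). The folder name docs is hypothetical, and only gzip is used so that no extra compressors need to be installed.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir docs out
seq 1 500 > docs/numbers.txt

# -a picks the compressor from the archive suffix; .tar.gz selects gzip.
tar -caf docs.tar.gz docs

# Confirm that the result really is gzip data.
gzip -t docs.tar.gz && echo "docs.tar.gz is valid gzip"

# No compression option on extraction: tar works out the format itself.
tar -xf docs.tar.gz -C out
```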

Each of the above compression algorithms has its own advantages and disadvantages, typically trading compression speed against the size of the result. You can choose the one that is most suitable for your needs.
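One rough way to compare them is to compress the same tar file with each tool and look at the resulting sizes. This is only a sketch on hypothetical sample data; results vary greatly with the kind of files involved, and any tool that is not installed is skipped. The output files are named after the tool purely for this demo and do not follow the usual extensions.

```shell
# Work in a throwaway directory with some repetitive sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
seq 1 20000 > data/numbers.txt

# Build one uncompressed archive to feed to each compressor.
tar -cf data.tar data
printf '%-6s %s bytes\n' "tar" "$(wc -c < data.tar)"

for tool in gzip bzip2 xz; do
  if command -v "$tool" >/dev/null 2>&1; then
    # -c writes the compressed stream to stdout and keeps data.tar intact.
    "$tool" -c data.tar > "data.tar.$tool"
    printf '%-6s %s bytes\n' "$tool" "$(wc -c < "data.tar.$tool")"
  else
    echo "$tool is not installed, skipping"
  fi
done
```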

Using Zip

While using tar and gzip works well on Unix and Linux systems, you might have difficulty with MS Windows. Zip is a much more popular alternative on Windows operating systems and is also supported on Linux.

Zip also offers different compression levels that trade speed for size. Generally, the faster the compression, the lower the compression ratio.

bash$ zip -9 -r alldocs.zip /home/saloon/documents/

The options used in the above example allow you to compress at different levels and recursively.

  • -r or --recurse-paths: Recursively archives the folder, including all files and sub-directories
  • -#: where # denotes an integer between 0 and 9. It denotes the compression speed. 0 denotes no compression, while 1 indicates the fastest compression speed but a smaller compression ratio. 9 is the slowest but achieves the maximum compression.
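To see the effect of the compression level, the sketch below zips the same hypothetical folder at levels 0, 1 and 9 and lists the resulting sizes. It assumes the zip command is installed and does nothing otherwise; -q only suppresses the per-file progress output.

```shell
# Work in a throwaway directory with some repetitive sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir documents
seq 1 20000 > documents/numbers.txt

if command -v zip >/dev/null 2>&1; then
  zip -q -0 -r stored.zip documents   # no compression, files are only stored
  zip -q -1 -r fast.zip documents     # fastest, lowest compression
  zip -q -9 -r small.zip documents    # slowest, highest compression
  ls -l stored.zip fast.zip small.zip
else
  echo "zip is not installed"
fi
```

On highly repetitive data like this the difference between levels 1 and 9 may be small; the gap to level 0 is usually the most visible.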

To unzip a compressed archive, use the unzip command. Though unzip has various options, the simplest way to unzip is to use it without any options.

bash$ unzip alldocs.zip

This will extract all the files to the current directory. In order to extract the files into a specific folder, you will use the -d option.

bash$ unzip alldocs.zip -d /home/saloon/unzipfolder/

Another popular option you have in Linux to create a compressed archive is Roshal Archive or rar.