Linux.com

GZ or BZ2?

Link to this post 12 Sep 10

Hi. I'd like to adopt a default compression algorithm for my files. It has to be the most linux friendly, have the highest compression rate and security. I immediately thought of tar.gz and tar.bz2, but I don't know their main differences. Which one should I choose?

Link to this post 13 Sep 10

A simple comparison review of Linux compression tools is at http://wj32.wordpress.com/2008/04/27/comparison-of-compression-programs-on-linux-2625tar/ . You also have xz (http://tukaani.org/xz/) and 7zip, which is not installed in most base installations.

Here is what I got when I tested various compression algorithms.

File: linux-2.6.36-rc4
Uncompressed Size: 454M

tar.gz
Time: 25.546s
New Size: 88M
Compression Percent: 80.6%

7zip
Time: 4m 0.327s
New Size: 61M
Compression Percent: 86.6%

xz
Time: 6m 6.009s
New Size: 59M
Compression Percent: 87.0%

tar.bz2
Time: 1m 41.552s
New Size: 52M
Compression Percent: 88.5%

From what I have heard, xz is supposed to be much better than bz2 or gz, but it may just be that my archive contents limited the compression. In the end you will have to create compressed archives with the various algorithms on a realistic sample and decide which best fits your speed, compression and system load needs.
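To try this kind of comparison on your own data, here is a minimal Python sketch using only the standard library's gzip, bz2 and lzma (xz) modules. It is not what was run for the numbers above: it compresses a byte string in memory at each module's default level, and the function name `compare` and the sample data are just illustrative choices. The "percent" value uses the same definition as "Compression Percent" in the results (space saved relative to the original size).

```python
import bz2
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Compress data with gzip, bz2 and xz; return time, size and percent saved."""
    codecs = {
        "gz": gzip.compress,
        "bz2": bz2.compress,
        "xz": lzma.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)  # default compression level of each module
        elapsed = time.perf_counter() - start
        results[name] = {
            "seconds": elapsed,
            "size": len(packed),
            # Space saved, e.g. 454M -> 88M gives 100 * (1 - 88/454) = 80.6
            "percent": 100.0 * (1 - len(packed) / len(data)),
        }
    return results


if __name__ == "__main__":
    # Hypothetical sample; replace with the bytes of a realistic file of yours.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
    for name, stats in compare(sample).items():
        print(f"{name}: {stats['size']} bytes, "
              f"{stats['percent']:.1f}% saved, {stats['seconds']:.3f}s")
```

Results on highly repetitive sample text like this will look very different from results on a kernel source tree, so feed it data that resembles what you actually plan to archive.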

Link to this post 14 Sep 10

Scifer wrote:

It has to be the most linux friendly, have the highest compression rate and security.
What exactly do you refer to when you mention security? Do you want to password encrypt these archives, or are you looking for a format that is somewhat more resistant to corruption?

I know that the rar format can add a defined amount of redundancy to an archive, thus making the archive more robust when it comes to corruption; the catch is that the rar format isn't open. I don't think that GNU tar has the same feature built in, but third-party utilities might provide something similar.

When it comes to efficiency, all tests I have seen so far indicate that the compression ratio of bz2 is superior to that of its older counterpart gz, but that efficiency boost comes at the price of requiring more CPU time. I don't have anything to contribute regarding other formats (xz, 7z, etc); see Matthew's excellent answer for some raw numbers on this issue.

Link to this post 14 Sep 10

mfillpot wrote:

A simple comparison review of linux compression tools is at wj32.wordpress.com/2008/04/27/comparison...ms-on-linux-2625tar/ , you also have xz (tukaani.org/xz/) and 7zip which is not installed in most base installations.

Your help is greatly appreciated. What I understood from Tel's comment on the wj32 review is that bz2 has a higher compression ratio while gz has a higher decompression speed, which is exactly what I suspected. Considering that these are the two most commonly used formats on Linux, I'm picking them as my default compression formats.
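For anyone wanting to script this choice, here is a small sketch of creating both archive types with Python's standard tarfile module. The helper name `make_archive` and the paths are made up for illustration; the "w:gz" and "w:bz2" modes are the standard tarfile write modes for the two formats.

```python
import os
import tarfile
import tempfile


def make_archive(source_dir: str, dest_path: str, fmt: str) -> int:
    """Pack source_dir into a tar archive compressed with 'gz' or 'bz2'.

    Returns the size of the resulting archive in bytes.
    """
    if fmt not in ("gz", "bz2"):
        raise ValueError("fmt must be 'gz' or 'bz2'")
    # Store paths relative to the directory name, not the full source path.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(dest_path, f"w:{fmt}") as archive:
        archive.add(source_dir, arcname=top_level)
    return os.path.getsize(dest_path)


if __name__ == "__main__":
    # Hypothetical demo data in a temporary directory.
    with tempfile.TemporaryDirectory() as work:
        source = os.path.join(work, "docs")
        os.mkdir(source)
        with open(os.path.join(source, "notes.txt"), "w") as handle:
            handle.write("some text to compress\n" * 1000)
        for fmt in ("gz", "bz2"):
            target = os.path.join(work, f"docs.tar.{fmt}")
            print(fmt, make_archive(source, target, fmt), "bytes")
```

Note that neither format provides encryption or recovery data on its own; this only covers the compression side.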

Link to this post 14 Sep 10

jabirali wrote:

What exactly do you refer to when you mention security? Do you want to password encrypt these archives, or are you looking for a format that is somewhat more resistant to corruption?

By security I meant encryption, redundancy and recovery features.

Link to this post 14 Sep 10

jabirali wrote:

the catch is that the rar format isn't open.

So it's off my list.
