Parallel Compression
From DuncanWiki
Now that multicore processors are commonplace, parallelized software has become much more relevant.
Why Should I Care?
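The basic idea is that if you have a two-core processor (an Intel Core 2 Duo, for example), a parallelized job can finish up to twice as fast. An Intel Core 2 Quad can be up to 4 times as fast, and so on. These are best-case figures; real speedups are usually somewhat lower because of disk I/O and the parts of a job that cannot be split up.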
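To put numbers on that, here is a small sketch of the best-case arithmetic. It assumes a job that splits perfectly across cores, which real workloads rarely do, and the function name ideal_savings is just something made up for this example.

```shell
#!/bin/sh
# Best-case figures for a job that parallelizes perfectly.
# ideal_savings N prints the percentage of wall-clock time saved on N cores,
# i.e. (1 - 1/N) * 100.
ideal_savings() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / n) * 100 }'
}

for cores in 2 4 8; do
    echo "$cores cores: up to ${cores}x faster, $(ideal_savings "$cores")% less time"
done
```

Treat the results as ceilings, not as something you should expect to measure.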
Compression is incredibly CPU intensive and very time consuming, especially if you use something like bzip2, which nets you great gains over gzip on certain types of files but with severe CPU overhead. In my day-to-day work we sometimes come across systems with a filesystem full of logs, oftentimes hundreds of gigabytes. bzip2 compresses those ~33% better than gzip, which gives us much more room to keep compressed logfiles. Compressing them can take hours, if not days. A parallel compression program can cut that time by 75% on most of our systems, and sometimes by as much as 87.5% on our 8-core systems.
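If you want to see how much bzip2 gains over gzip on your own logs before committing hours of CPU time, something like the following sketch can help. The helper name compressed_size is invented for this example, and the script only assumes that gzip and bzip2 accept -c to write to standard output. It never modifies the input file.

```shell
#!/bin/sh
# compressed_size TOOL FILE
# Prints the size in bytes of FILE after compression with TOOL (gzip or bzip2),
# or "n/a" if TOOL is not installed. FILE itself is left untouched.
compressed_size() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" -c "$2" | wc -c | tr -d ' '
    else
        echo "n/a"
    fi
}

# Hypothetical usage: pass a sample log file as the first argument.
if [ -n "${1:-}" ] && [ -f "$1" ]; then
    echo "original: $(wc -c < "$1" | tr -d ' ') bytes"
    echo "gzip:     $(compressed_size gzip "$1") bytes"
    echo "bzip2:    $(compressed_size bzip2 "$1") bytes"
fi
```

Run it against a representative sample, since the gain depends heavily on the type of file.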
Parallel Compression Software
Parallel gzip (pigz)
- Homepage : http://www.zlib.net/pigz
- Author : Mark Adler
- RPM spec file : my .spec file is now a part of the official tarball
- In 2.1.6 the spec file that comes with the tarball has MS-DOS carriage returns, which break rpmbuild. I'm not sure why they were added. I've emailed Mark, and the next release will no longer have the improper carriage returns.
- If 2.1.6 is still the current release, you'll want to do the following:
  cd /tmp
  wget http://www.zlib.net/pigz/pigz-2.1.6.tar.gz
  tar xvfz pigz-2.1.6.tar.gz
  dos2unix pigz-2.1.6/pigz.spec
  tar cvfz pigz-2.1.6.tar.gz pigz-2.1.6
  rpmbuild -ta pigz-2.1.6.tar.gz
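If dos2unix isn't installed, or you just want to confirm whether a given spec file has the carriage return problem at all, the check and the fix can be sketched with standard tools. The function names has_cr and strip_cr are made up for this example, and strip_cr only approximates what dos2unix does for a plain CRLF text file.

```shell
#!/bin/sh
# has_cr FILE: succeeds (exit 0) if FILE contains at least one carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# strip_cr FILE: rewrites FILE in place with every carriage return removed.
# Writing back with cat keeps the original file's permissions.
strip_cr() {
    tmp=$(mktemp) && tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}

# Hypothetical usage against the unpacked tarball from the steps above:
#   has_cr pigz-2.1.6/pigz.spec && strip_cr pigz-2.1.6/pigz.spec
```

Checking first also tells you whether a newer tarball still needs the workaround.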
Parallel bzip2 (pbzip2)
- Homepage : http://compression.ca/pbzip2
- Author : Jeff Gilchrist
- RPM spec file : None needed; pbzip2 is a part of Fedora. You can recompile the Fedora SRPM for CentOS.
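Once pbzip2 is installed, compressing a directory full of logs might look something like the sketch below. The function name compress_logs and the example path are placeholders, the core count comes from getconf, and the script falls back to plain bzip2 when pbzip2 isn't available. Like bzip2, pbzip2 replaces each input file with a .bz2 file, so try it on a copy first.

```shell
#!/bin/sh
# compress_logs DIR
# Compresses every *.log file under DIR in place (each becomes NAME.log.bz2).
# Uses pbzip2 on all online cores when available, otherwise plain bzip2.
compress_logs() {
    dir=$1
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if command -v pbzip2 >/dev/null 2>&1; then
        # -p# sets the number of processors pbzip2 will use.
        find "$dir" -type f -name '*.log' -exec pbzip2 -p"$cores" {} +
    elif command -v bzip2 >/dev/null 2>&1; then
        find "$dir" -type f -name '*.log' -exec bzip2 {} +
    else
        echo "neither pbzip2 nor bzip2 is installed" >&2
        return 1
    fi
}

# Hypothetical usage; /var/log/myapp is a placeholder path:
#   compress_logs /var/log/myapp
```

The files pbzip2 writes can be decompressed with regular bzip2, so the machines reading the logs later don't need pbzip2 installed.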

