Backup compression format: pbzip2

Hello
It would be handy to have the pbzip2 compression format in the Virtualmin Virtual Servers module.
http://compression.ca/pbzip2/

Regards
Pawel

Status: Closed (fixed)

Comments

Currently Virtualmin supports gzip, zip and bzip2 .. what advantage does pbzip2 have? I would prefer not to support a new backup format unless it is way better than the alternatives.

The advantages of pbzip2 are described on the project webpage, whose address I gave above. In summary:

"PBZIP2 is a parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines...
It provides near-linear speedup when used on true multi-processor machines and 5-10% speedup on Hyperthreaded machines. The output is fully compatible with the regular bzip2 data so any files created with pbzip2 can be uncompressed by bzip2 and vice-versa."

Many servers these days are SMP machines, so pbzip2 can significantly decrease backup time.

Regards
Pawel

So pbzip2 has the same backup format as bzip2?

Jamie, everything is written on the project's page. Shall I copy the answers for you?
Yes, the format is the same, but the speedup is almost N times, where N is the number of CPU cores.
There is another version by Intel: http://software.intel.com/en-us/articles/a-parallel-bzip2/ - it differs from the one distributed with Debian, but it shows that the idea of parallelizing bzip2 is sound.

Ok, in that case it would be pretty easy to support in Virtualmin.

Actually, if you wanted you could use pbzip2 right now .. just remove bzip2 from your system, and replace /usr/bin/bzip2 and bunzip2 with links to the parallel versions. I would be interested to know if that works for you - it would be a good test if they are really compatible at the command-line level.
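A compatibility test along those lines could be sketched as below. This is only a sketch: the function name roundtrip_ok is made up for illustration, and it assumes that both commands accept -c (write to stdout) and that the decompressor accepts -d, as bzip2 does.

```shell
# Hypothetical helper (not part of Virtualmin): compress a file with one
# command, decompress it with another, and compare with the original.
# Usage: roundtrip_ok COMPRESSOR DECOMPRESSOR FILE
roundtrip_ok() {
    compressor=$1
    decompressor=$2
    input=$3
    workdir=$(mktemp -d) || return 2

    # Compress and decompress through stdin/stdout, the way a backup
    # that streams tar output through a compressor would.
    if "$compressor" -c < "$input" > "$workdir/data.bz2" &&
       "$decompressor" -dc < "$workdir/data.bz2" > "$workdir/data.out" &&
       cmp -s "$input" "$workdir/data.out"; then
        status=0
    else
        status=1
    fi

    rm -rf "$workdir"
    return $status
}
```

For example, roundtrip_ok pbzip2 bzip2 some-sample-file checks pbzip2 compression against bzip2 decompression, and swapping the first two arguments checks the other direction.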

Linking pbzip2 as bzip2 doesn't work - backups in Virtualmin aren't compressed. I think if you try it on any other system you'll get the same result.

What do you mean by no compression is being done? Is the resulting backup file in bz2 format, just not compressed?

I took these steps:
1. Installed pbzip2
2. Linked it as bzip2
3. Changed the compression format in VM to bzip2
4. Ran the scheduled backup. With gzip it took about 1h30 and produced 7 GB; with this trick it took 9m30 and produced 14 kB. I didn't debug this - I had neither the time nor the inclination. I just switched back to gzip to make backups work.

So unfortunately I can't help you more than giving a (good) idea.

I played around with pbzip2 some more, and it seems to work fine for me ... so I have added an option to Virtualmin to use it, for inclusion in the 3.81 release. You will be able to turn this on at the same page where you select the bzip2 backup format.

Automatically closed -- issue fixed for 2 weeks with no activity.

I had previously installed pbzip2 with the command: apt-get install pbzip2
Now I've updated VM to 3.81. There is an option to choose between bzip2 and pbzip2 (I selected the latter). After "rechecking and refreshing configuration" an error was shown:
"The command pbunzip2 needed to create or restore backups is not installed."
1. Is pbUNzip2 really needed to CREATE backups?! According to http://compression.ca/pbzip2/ there is no pbunzip2 for Debian 5.x - just pbzip2 with the -d option for decompressing files.
2. With the above settings no backup is made (the backup file size is 0 B).

Is this a bug in VM 3.81 or in my system?

Regards
Pawel

That's a bug that will be fixed in Virtualmin 3.82. The pbunzip2 command does exist, but only on some systems. Virtualmin incorrectly assumes that it is always available.

The workaround is to run:

ln -s /usr/bin/pbzip2 /usr/bin/pbunzip2

Virtualmin 3.82 will fix this properly by using pbzip2 -d when pbunzip2 is missing.
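As a rough illustration of that fallback (not Virtualmin's actual code; the function name pick_pbunzip2 is invented for this sketch), the choice could look like this in shell:

```shell
# Hypothetical sketch of the fallback described above.
# Prints the command line to use for decompressing pbzip2 backups.
pick_pbunzip2() {
    if command -v pbunzip2 >/dev/null 2>&1; then
        # Some systems ship a separate pbunzip2 command.
        echo "pbunzip2"
    elif command -v pbzip2 >/dev/null 2>&1; then
        # pbzip2 -d decompresses, so it can stand in for pbunzip2.
        echo "pbzip2 -d"
    else
        echo "neither pbunzip2 nor pbzip2 was found" >&2
        return 1
    fi
}
```

On a system with only pbzip2 installed this prints "pbzip2 -d", so the symlink is no longer required.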

After creating this link and refreshing the configuration the error is gone.
BUT compression with pbzip2 still doesn't work - the backup takes just a few seconds and the file size is 0 B.
I think this is another bug on Debian systems.

Can you manually compress with pbzip2 from the command line? Try something like :

tar cf - /etc | pbzip2 -c >/tmp/etc.tar.bz2

That command gives an error:
Invalid command line! Aborting...

What does pbzip2 --help output?

I've installed pbzip2 by apt-get from standard Debian repositories

# pbzip2 --help
Parallel BZIP2 v1.0.2 - by: Jeff Gilchrist [http://compression.ca]
[July 25, 2007] (uses libbzip2 by Julian Seward)

Invalid command line! Aborting...

Usage: pbzip2 [-1 .. -9] [-b#cdfklp#qrtV]
-b# : where # is the file block size in 100k (default 9 = 900k)
-c : output to standard out (stdout)
-d : decompress file
-f : force, overwrite existing output file
-k : keep input file, don't delete
-l : load average determines max number processors to use
-p# : where # is the number of processors (default: autodetect)
-r : read entire input file into RAM and split between processors
-t : test compressed file integrity
-v : verbose mode
-V : display version info for pbzip2 then exit
-1 .. -9 : set BWT block size to 100k .. 900k (default 900k)

Example: pbzip2 -b15vk myfile.tar
Example: pbzip2 -p4 -r -5 myfile.tar second*.txt
Example: pbzip2 -d myfile.tar.bz2

So according to the help page the -c option is supported ... but pbzip2 fails when it is used??

I can give you access to my system via a Support Ticket so you can check whatever you need to know.

Sure, that would be great ..

Ok, I looked into this some more. It seems that the version of pbzip2 on Ubuntu 8.04 is only 1.0.2, which does not support compression to STDOUT ... and Virtualmin needs this ability (which gzip and bzip2 have) to make backups.

The fix is to download, compile and install a newer version of pbzip2. Virtualmin 3.82 will add extra validation to prevent this broken version of pbzip2 from being selected.
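If you want to check an installed pbzip2 yourself, something like the sketch below should do. It is not the validation Virtualmin 3.82 will use; the function name can_compress_to_stdout is invented here, and the check assumes only that a working compressor reads stdin, accepts -c, and emits a stream starting with the bzip2 magic bytes "BZh".

```shell
# Hypothetical check (not Virtualmin's actual validation): does COMMAND
# compress data piped to its stdin and write a bzip2 stream to stdout?
# Usage: can_compress_to_stdout COMMAND
can_compress_to_stdout() {
    out=$(mktemp) || return 2

    # Feed a small sample through the compressor as a backup pipe would,
    # then require a zero exit status and the "BZh" bzip2 header.
    if printf 'sample data\n' | "$1" -c > "$out" 2>/dev/null &&
       [ "$(head -c 3 "$out")" = "BZh" ]; then
        status=0
    else
        status=1
    fi

    rm -f "$out"
    return $status
}
```

Running can_compress_to_stdout pbzip2 || echo "this pbzip2 cannot compress to stdout" would flag a build that fails the way the tar pipe above did.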

Automatically closed -- issue fixed for 2 weeks with no activity.