OpenVZ precreated OS templates are tarballs of pre-installed Linux distributions. While there are other ways to create a container, the easiest one is to take such a tarball and extract its contents. This is what takes 99.9% of vzctl create command execution time.
To save some space and improve download speeds, those tarballs are compressed with the good ol' gzip tool. For example, the CentOS 6 template tar.gz is about 200 MB in size, while the uncompressed tar would be about 550 MB. But why don't we use more efficient compression tools, such as bzip2 or xz? Say, the same CentOS 6 tarball, compressed by xz, is as lightweight as 120 MB! Here are the numbers again:
centos-6-x86.tar.gz: 203M
centos-6-x86.tar.xz: 122M
centos-6-x86.tar: 554M
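To put those sizes in perspective, here is a small sketch that computes the compression ratios and the gzip-to-xz saving from the three numbers above. The dictionary and function names are made up for illustration; only the sizes come from the listing.

```python
# Sizes in MB, copied from the listing above.
SIZES_MB = {
    "centos-6-x86.tar": 554,
    "centos-6-x86.tar.gz": 203,
    "centos-6-x86.tar.xz": 122,
}


def ratio(compressed_mb, original_mb):
    """Return the compressed size as a percentage of the original size."""
    return 100.0 * compressed_mb / original_mb


raw = SIZES_MB["centos-6-x86.tar"]
gz = SIZES_MB["centos-6-x86.tar.gz"]
xz = SIZES_MB["centos-6-x86.tar.xz"]

print(f"gzip: {ratio(gz, raw):.1f}% of the plain tar")
print(f"xz:   {ratio(xz, raw):.1f}% of the plain tar")
print(f"xz saves {gz - xz} MB over gzip ({100 - ratio(xz, gz):.1f}% smaller)")
```

In other words, gzip brings the template down to a bit over a third of its original size, xz to a bit over a fifth, and the difference between the two is about 80 MB.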
So why don't we switch to xz, which apparently looks way better? Well, there are other criteria to optimize for besides file size and download speed. In fact, the main optimization target is container creation speed! I just ran a quick non-scientific test on my notebook to prove my point, measuring the time it takes to run tar xf on a tarball:
time tar xf tar.gz: ~7 seconds
time tar xf tar.xz: ~13 seconds
See, it takes about twice as long if we switch to xz! Note that this ratio doesn't change much when I switch from a fast SSD to a (relatively slow) rotating hard disk drive:
time tar xf tar.gz: ~8 seconds
time tar xf tar.xz: ~16 seconds
Note that while I call it non-scientific, I still ran each test at least three times, with proper syncs, rms, and cache drops in between.
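For anyone who wants to repeat a similar measurement, the following is a rough sketch of such a harness, not the procedure actually used for the numbers above. It assumes a few things: it builds a small synthetic tarball instead of a real template, it extracts with Python's tarfile module rather than tar xf, and it only syncs and removes the extracted tree between runs (dropping the page cache needs root, so it is left out). All function names are hypothetical, and since the filler data is highly compressible, the absolute timings will not match a real template.

```python
import io
import os
import shutil
import tarfile
import tempfile
import time


def make_tarball(path, mode, n_files=200, size=4096):
    """Write a synthetic tarball; mode is 'w:gz' or 'w:xz'."""
    line = b"OpenVZ template filler line\n"
    payload = (line * (size // len(line) + 1))[:size]
    with tarfile.open(path, mode) as tar:
        for i in range(n_files):
            info = tarfile.TarInfo(name=f"rootfs/file{i:04d}.txt")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def time_extract(path, runs=3):
    """Extract the tarball `runs` times into fresh directories; return timings."""
    timings = []
    for _ in range(runs):
        dest = tempfile.mkdtemp()
        try:
            # Flush pending writes from the previous run before timing.
            if hasattr(os, "sync"):
                os.sync()
            start = time.perf_counter()
            with tarfile.open(path, "r:*") as tar:
                tar.extractall(dest)
            timings.append(time.perf_counter() - start)
        finally:
            # The "rm" step: remove the extracted tree between runs.
            shutil.rmtree(dest)
    return timings


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    try:
        for suffix, mode in (("tar.gz", "w:gz"), ("tar.xz", "w:xz")):
            path = os.path.join(workdir, f"template.{suffix}")
            make_tarball(path, mode)
            best = min(time_extract(path))
            print(f"{suffix}: best of 3 = {best:.3f} s")
    finally:
        shutil.rmtree(workdir)
```

To measure a real template, point time_extract at an existing .tar.gz or .tar.xz file instead of calling make_tarball.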
Now, do we want to trade a doubling of container creation time for saving 80 MB of disk space? We sure don't!
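The trade-off can be made a bit more concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration: it assumes the template is downloaded once and extracted once per container, takes the roughly 6 extra seconds per extraction from the SSD test and the 81 MB size difference from the listing, and treats the download bandwidth as a free parameter that you have to supply. The function name is hypothetical.

```python
def net_seconds_saved_by_gzip(containers, bandwidth_mb_s,
                              extra_extract_s=6.0, extra_download_mb=81.0):
    """Return how many seconds gzip saves overall compared to xz.

    A positive result means gzip is ahead; a negative one means xz is ahead.
    Assumes one download of the template and one extraction per container.
    """
    extra_extract_time_xz = containers * extra_extract_s
    extra_download_time_gz = extra_download_mb / bandwidth_mb_s
    return extra_extract_time_xz - extra_download_time_gz


# Bandwidth of 10 MB/s is an assumed value, not a measurement.
for n in (1, 2, 10):
    saved = net_seconds_saved_by_gzip(n, bandwidth_mb_s=10.0)
    print(f"{n} container(s) at 10 MB/s: gzip saves {saved:.1f} s")
```

Under these assumptions, xz comes out slightly ahead only if you create a single container from a freshly downloaded template; from the second container onward, the extra extraction time outweighs the one-time download saving.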


Comments
Fedora switched to xz for their rpm packages some time ago although I'm not sure if RHEL6 is using it or not. I'm guessing RHEL7 will though... so compatibility isn't an issue there. I believe the vzctl package has xz as a dependency... so again, I don't think compatibility is much of an issue.
If you don't have the OS Template in question already downloaded when you are creating a new container, downloading an additional +80MB can slow it down too. Of course, once it is downloaded, that doesn't come into play anymore. I don't really care if it takes 30 seconds or a minute to create a container. If it becomes a concern, then I can always convert my .xz files to .gz on my own host.
When you are talking about much larger files in the multiple GB range, saving a few hundred megabytes on an .xz vs. a .gz seems like a no brainer to me.
As you probably know, kernel.org dropped .bz2 in favor of .xz but they still provide .gz. Ideally you could offer both and let the consumer decide... and then see how that plays out. I'd guess that people will generally pick the smaller download size if given a choice... which may reinforce the idea that humans are dumb and prefer short-term beneficial solutions over long-term ones.