It has been almost two years since we wrote about effective live migration with the ploop write tracker. It's time to write some more about it, since we have managed to make ploop live migration even more efficient by means of some pretty simple optimizations. But let's not jump to the conclusion yet; it's a long and interesting story to tell.
As you know, live migration is not quite live, although it looks that way to a user. There is a short time period, usually a few seconds, during which the container being migrated is frozen. This time (shown if the -t or -v option to vzmigrate --live is used) is what needs to be optimized, making it as short as possible. In order to do that, one needs to dig into the details of what happens while a container is frozen.
Typical timings obtained via vzmigrate -t --live look like this. We ran a few iterations migrating a container back and forth between two OpenVZ instances (running inside Parallels VMs on the same physical machine), so there are a few columns on the right side.
The first suspect to look at is "undump + resume", which basically shows the timing of the vzctl restore command. Why is it so slow? Apparently, ploop mount takes a noticeable amount of time. Let's dig deeper into that process.
First, implement timestamps in ploop messages, raise the log level, and see what is going on. Apparently, adding deltas is not instant, taking anywhere from 0.1 second to almost a second. After some more experiments and thinking, it becomes obvious that since the ploop kernel driver works with data in delta files directly, bypassing the page cache, it needs to force those files to be written to the disk, and this costly operation happens while the container is frozen. Is it possible to do it earlier? Sure, we just need to force-write the deltas we have just copied before suspending the container. Easy: just call fsync(), or better yet fdatasync(), since we don't really care about metadata being written.
Unfortunately, there is no command line tool to do fsync or fdatasync, so we had to write one and call it from vzmigrate. Is it any better now? Yes indeed, delta adding times went down from tenths to hundredths of a second.
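In case you are wondering what such a tool looks like: it is really just a thin wrapper around the syscall. Here is a minimal sketch of the idea (not the actual code shipped with vzmigrate):

/* fdsync.c -- force file data of the given files to disk.
 * A minimal illustration; the real tool may differ. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++) {
		int fd = open(argv[i], O_RDONLY);

		if (fd < 0 || fdatasync(fd) < 0) {
			perror(argv[i]);
			return 1;
		}
		close(fd);
	}
	return 0;
}

Run it on every delta file you have just copied, right before suspending the container.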
Except for the top delta, of course, which we migrate using ploop copy. Surely, we can't fsync it before suspending the container, because we keep copying it afterwards. Oh wait... actually we can! By adding an fsync before CT suspend, we force the data to be written to disk, so the second fsync (which happens after everything is copied) will take less time. This time is shown as "Pcopy after suspend".
The problem is that ploop copy consists of two sides -- the sending one and the receiving one -- which communicate over a pipe (with ssh as a transport). It's the sending side which runs the command to freeze the container, and it's the receiving side which should do the fsync, so we need to pass some sort of "do the fsync" command. Better yet, do it without breaking the existing protocol, so nothing bad will happen in case there is an older version of ploop on the receiving side.
The "do the fsync" command ended up being a data block of 4 bytes, you can see the patch here. Older version will write these 4 bytes to disk, which is unnecessary but OK do to, and newer version will recognize it as a need to do fsync.
The other problem is that the sending side should wait for the fsync to finish in order to proceed with CT suspend. Unfortunately, there is no way to solve this one with a one-way pipe, so the sending side just waits for a few seconds. Ugly as it is, this is the best way possible (let us know if you can think of something better).
To summarize, what we have added is a couple of fsyncs (it's actually fdatasync() since it is faster), and here are the results:
As you see, both "pcopy after suspend" and "undump + resume" times decreased, shaving off about a second, which gives us about a 25% improvement. Now, take into account that the tests were done on otherwise idle nodes with mostly idle containers; we suspect that the benefit will be more apparent under I/O load. Checking whether this statement is true will be your homework for today!
vzctl 4.6 build hit download.openvz.org (and its many mirrors around the world) last week. Let's see what's in store, shall we?
First and foremost, I/O limits, but the feature is already described in great detail. What I want to add is that the feature was sponsored by GleSYS Internet Services AB, and is one of the first results of the OpenVZ partnership program in action. This program is a great opportunity for you to help keep the project up and running, and also experience our expert support service just in case you'd need it. Think of it as a two-way support option. Anyway, I digress. What was it about? Oh yes, vzctl.
Second, improvements to UBC settings in VSwap mode. Previously, if you set RAM and swap, all other UBC parameters not set explicitly were set to unlimited. Now they are just left unset (meaning that the default in-kernel setting is used, whatever it is). Plus, in addition to physpages and swappages, vzctl sets lockedpages and oomguarpages (to RAM), and vmguarpages (to RAM+swap).
Plus, there is a new parameter, vm_overcommit, and it works in the following way: if set, it is used as a multiplier of ram+swap to set privvmpages. In layman's terms, this is the ratio of real memory (ram+swap) to virtual memory (privvmpages). Again, physpages limits RAM, and physpages+swappages limits the real memory used by a container. On the other hand, privvmpages limits the memory allocated by a container. While it depends on the application, generally not all allocated memory is used -- sometimes allocated memory is 5 or 10 times more than used memory. What vm_overcommit gives is a way to set this gap. For example, the command
vzctl set $CTID --ram 2G --swap 4G --vm_overcommit 3 --save
tells OpenVZ to limit container $CTID to 2 GB of RAM, 4 GB of swap, and (2+4)*3 = 18 GB of virtual memory. That means this container can allocate up to 18 GB of memory, but can't use more than 6 GB. So, vm_overcommit is just a way to set privvmpages implicitly, as a function of physpages and swappages. Oh, and if you are lost in all those *pages, we have extensive documentation at http://openvz.org/UBC.
A first version of the vzoversell utility has been added. This is a proposed vzmemcheck replacement for VSwap mode. Currently it just sums up the RAM and swap limits of all VSwap containers and compares the total to the RAM and swap available on the host. Surely you can oversell RAM (as long as you have enough swap), but the sum of all RAM+swap limits should not exceed RAM+swap on the node, and the main purpose of this utility is to check that constraint.
vztmpl-dl got a new --list-orphans option. It lists all local templates that are not available from the download server(s) (and therefore can't be updated by vztmpl-dl). Oh, by the way, since vzctl 4.5 you can use vztmpl-dl --update-all to refresh all templates (i.e. download an updated template, if a newer version is available from a download server). For more details, see vztmpl-dl(8) man page.
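So a typical template maintenance session now boils down to two commands: the first refreshes everything that has a newer version on the download server, the second tells you which local templates you are on your own with.

# vztmpl-dl --update-all
# vztmpl-dl --list-orphans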
vzubc got some love, too. It now skips unlimited UBCs by default, in order to improve the signal-to-noise ratio. If you want the old behavior (i.e. all UBCs), use the -v flag.
Surely, there's a bunch of other fixes and improvements; please read the changelog if you want to know it all. One thing in particular is worth mentioning: a hack for vzctl console. As you might know, in OpenVZ a container's console is sort of eternal, meaning you can attach to it before a container is even started, and it keeps its state even if you detach from it. That creates a minor problem, though -- if someone runs, say, vim in the console, then detaches and reattaches, vim does not redraw anything and the console shows nothing. To work around it, one needs to press Ctrl-L (it is also recognized by bash and other software). But it's a bit inconvenient to do that every time after reattaching. Curiously, this is not required if the terminal size has changed (i.e. you detach from the console, change your xterm size, then run vzctl console again), because in this case vim notices the change and redraws accordingly. So what vzctl now does after reattach is tell the underlying terminal its size twice -- first a wrong size (with the number of rows incremented), then the right size (that of the terminal vzctl is running in). This forces vim (or whatever is running on the container console) to redraw.
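If you are curious how that nudge looks in code, here is a rough sketch using the TIOCSWINSZ ioctl (an illustration of the trick, not the literal vzctl code):

#include <sys/ioctl.h>
#include <unistd.h>

/* Force a redraw of whatever runs on the console by reporting a
 * slightly wrong window size first and then the real one. Each size
 * change delivers a SIGWINCH, so the application (vim, bash, ...)
 * re-reads the size and repaints the screen. */
static void force_redraw(int console_fd)
{
	struct winsize ws;

	/* the size of the terminal vzctl itself is running in */
	if (ioctl(STDIN_FILENO, TIOCGWINSZ, &ws) < 0)
		return;

	ws.ws_row++;				/* deliberately wrong size */
	ioctl(console_fd, TIOCSWINSZ, &ws);
	ws.ws_row--;				/* the right size */
	ioctl(console_fd, TIOCSWINSZ, &ws);
}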
Finally, the new vzctl (as well as the other utilities) is now in our Debian wheezy repo at http://download.openvz.org/debian, so Debian users are now on par with those using RPM-based distros, and can have the latest and greatest stuff as soon as it comes out -- same as we did with kernels some time ago.
Today we are releasing a somewhat small but very important OpenVZ feature: per-container disk I/O bandwidth and IOPS limiting.
OpenVZ has had an I/O priority feature for a while, which lets one set a per-container I/O priority -- a number from 0 to 7. It works in such a way that if two similar containers with similar I/O patterns but different I/O priorities are run on the same system, the container with a priority of 0 (lowest) will get about 2-3 times less I/O speed than the container with a priority of 7 (highest). This works for some scenarios, but not all.
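(For reference, the priority itself is set with the --ioprio option, e.g. vzctl set 777 --ioprio 6 --save.)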
So, I/O bandwidth limiting was introduced in Parallels Cloud Server, and as of today it is available in OpenVZ as well. Using the feature is very easy: you set a limit for a container (in megabytes per second), and watch it obeying the limit. For example, here I set a limit of 3 MB/s and then check it from inside the container:
root@host# vzctl set 777 --iolimit 3M --save
UB limits were set successfully
Setting iolimit: 3145728 bytes/sec
CT configuration saved to /etc/vz/conf/777.conf
root@host# vzctl enter 777
root@CT:/# cat /dev/urandom | pv -c - >/bigfile3
39.1MB 0:00:10 [ 3MB/s] [ <=> ]
^C
If you run it yourself, you'll notice a spike of speed at the beginning before it goes down to the limit. This is the so-called burstable limit at work; it allows a container to exceed its limit (up to 3x) for a short time.
In the above example we tested writes. Reads work the same way, except when the data being read are in fact coming from the page cache (such as when you are reading a file you have just written). In that case, no actual I/O is performed, and therefore no limiting takes place.
The second feature is an I/O operations per second, or just IOPS, limit. For more info on what IOPS is, go read the linked Wikipedia article -- all I can say here is that for traditional rotating disks the hardware capabilities are pretty limited (75 to 150 IOPS is a good guess, or 200 if you have high-end server-class HDDs), while for SSDs this is much less of a problem. The IOPS limit is set in the same way as iolimit (vzctl set $CTID --iopslimit NN --save), although measuring its impact is more tricky.
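For example, to cap a container at roughly what a single consumer HDD can deliver:

vzctl set 777 --iopslimit 150 --save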
A shiny new vzctl 4.4 was released just today. Let's take a look at its new features.
As you know, vzctl has been able to download OS templates automatically for quite some time now, when vzctl create --ostemplate is used with a template which is not available locally. Now, we have just moved this script to the standard /usr/sbin place and added a corresponding vztmpl-dl(8) man page. Note you can use the script to update your existing templates as well.
The next few features are targeted at making OpenVZ more hassle-free. Specifically, this release adds a post-install script to configure some system aspects, such as changing some parameters in /etc/sysctl.conf and disabling SELinux. This is something that had to be done manually before, so it was described in the OpenVZ installation guide. Now, it's just one less manual step, and one less paragraph in the Quick installation guide.
Another "make it easier" feature is automatic namespace propagation from the host to the container. Before vzctl 4.4 there was a need to set a nameserver for each container, in order for DNS to work inside a container. So, the usual case was to check your host's /etc/resolv.conf, find out what are nameservers, and set those using something like vzctl set $CTID --nameserver x.x.x.x --nameserver y.y.y.y --save. Now, a special value of "inherit" can be used instead of a real nameserver IP address to instruct vzctl get IPs from host's /etc/resolv.conf and apply them to a container. Same applies to --searchdomain option / SEARCHDOMAIN config parameter.
Now, since defaults for most container parameters can be set in the global OpenVZ configuration file /etc/vz/vz.conf, if it contains a line like NAMESERVER=inherit, this becomes the default for all containers not having a nameserver set explicitly. Yes, we added this line to /etc/vz/vz.conf with this release, meaning all containers with non-configured nameservers will automatically get those from the host. If you don't like this feature, remove the NAMESERVER= line from /etc/vz/vz.conf.
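In other words, you can either set it per container:

vzctl set $CTID --nameserver inherit --save

or rely on the global default, which is now a single line in /etc/vz/vz.conf:

NAMESERVER=inherit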
Another small new feature is ploop-related. When you start (or mount) a ploop-based container, fsck for its inner filesystem is executed. This mimics the way a real server works -- it runs fsck on boot. Now, there is a 1/30 or so probability that fsck will actually do a filesystem check (it does that every Nth mount, where N is about 30 and can be changed with tune2fs). For a large container, fsck can be a long operation, so when we start containers on boot from the /etc/init.d/vz initscript, we skip such a check so as not to delay container start-up. This is implemented as a new --skip-fsck option to vzctl start.
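(If you ever want the same behavior from your own scripts, it is just vzctl start $CTID --skip-fsck.)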
Thanks to our user and contributor Mario Kleinsasser, vzmigrate is now able to migrate containers between boxes with different VE_ROOT/VE_PRIVATE values. For example, if one server runs Debian with /var/lib/vz and the other is CentOS with /vz, vzmigrate is smart enough to notice that and do the proper conversion. Thank you, Mario!
Another vzmigrate enhancement is the -f/--nodeps option, which can be used to disable some pre-migration checks. For example, in the case of live migration, destination CPU capabilities (such as SSE3) are cross-checked against those of the source server, and if some caps are missing, migration is not performed. In fact, not too many applications are optimized to use all CPU capabilities, so there is a reasonable chance that live migration can still be done. The --nodeps option is exactly for such cases -- i.e. you can use it if you know what you are doing.
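Assuming the usual vzmigrate syntax (options, then the destination host, then the container ID), such a forced live migration would look something like:

vzmigrate --online --nodeps dst.example.com 101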
This is more or less it regarding new features. Oh, it makes sense to note that the default OS template is now centos-6-x86, and the NEIGHBOR_DEVS parameter is commented out by default, because this increases the chances that container networking will work "as is".
Fixes? There are a few -- to vzmigrate, vzlist, vzctl convert, vzctl working on top of the upstream kernel (including some fixes for CRIU-based checkpointing), and the build system. Documentation (that is, the man pages) is updated to reflect all the new options and changes.
A list of contributors to this vzctl release is quite impressive, too -- more than 10 people.
As always, if you find a bug in vzctl, please report it to bugzilla.openvz.org.
We have updated vzctl, ploop and vzquota recently (I wrote about vzctl here). Some changes in packaging are tricky, so let me explain why and give some hints.
For users of RHEL5-based (i.e. ovzkernel-2.6.18-028stabXXX) and earlier kernels
Since ploop is only supported in the RHEL6-based kernel, we have removed the ploop dependency from vzctl-4.0 (the ploop library is loaded dynamically when needed and if available). Since you have an earlier vzctl version installed, you also have ploop installed. Now you can remove it, at the same time upgrading to vzctl-4.0. That "at the same time" part is done via yum shell:
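The transaction might look roughly like this (the package names are illustrative -- check rpm -qa | grep ploop to see what is actually installed on your system):

# yum shell
> remove ploop ploop-lib
> update vzctl
> run
> exit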
That should fix it. In the meantime, think about upgrading your systems to a RHEL6-based kernel, which is better in terms of performance, features, and speed of development.
For RHEL6-based kernel users (i.e. vzkernel-2.6.32-042stabXXX)
The new ploop library (1.5) requires a very recent RHEL6-based kernel (version 2.6.32-042stab061.1 or later) and is not supposed to work with earlier kernels. To protect ploop from earlier kernels, its packaging says "Conflicts: vzkernel < 2.6.32-042stab061.1", which usually prevents ploop 1.5 installation on systems having those kernels.
To fix this conflict, make sure you run the latest kernel, and then remove the old ones:
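One possible sequence is sketched below (the old-kernel version string is just a placeholder here):

# yum update vzkernel
# reboot
... boot into the new kernel ...
# rpm -q vzkernel
# yum remove vzkernel-<old-version>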
The OpenVZ project turned 7 years old last month. It's hard to believe the number, but looking back, we've done a lot of things together with you, our users.
One of the main project goals was (and still is) to get containers support included upstream, i.e. into the vanilla Linux kernel. In practice, the OpenVZ kernel is a fork of the Linux kernel, and we don't like it that way, for a number of reasons. The main ones are:
We want everyone to benefit from containers, not just those using the OpenVZ kernel. Yes to world domination!
We'd like to concentrate on new features, improvements and bug fixes, rather than forward porting our changes to the next kernel.
So, we were (and still are) working hard to bring in-kernel containers support upstream, and many key pieces are already there in the kernel -- for example, PID and network namespaces, cgroups and the memory controller. This is the functionality that the lxc tool and the libvirt library are using. We also use the features we merged upstream, so with every new kernel branch we have to port less, and the size of our patch set decreases.
CRIU
One such feature targeted for upstream is checkpoint/restore, the ability to save a running container's state and then restore it. The main use of this feature is live migration, but there are other usage scenarios as well. While the feature has been present in the OpenVZ kernel since April 2006, it was never accepted into the upstream Linux kernel (nor was the other implementation proposed by Oren Laadan).
For the last year we have been working on the CRIU project, which aims to reimplement most of the checkpoint/restore functionality in userspace, with bits of kernel support where required. As of now, most of the additional kernel patches needed for CRIU are already there in kernel 3.6, and a few more patches are on their way to 3.7 or 3.8. Speaking of the CRIU tools, they are currently at version 0.2, released on the 20th of September, which already has limited support for checkpointing and restoring an upstream container. Check criu.org for more details, and give it a try. Note that this project is not only for containers -- you can checkpoint any process tree -- it's just that a container works better because it is clearly separated from the rest of the system.
One of the most important things about CRIU is that we are NOT developing it behind closed doors. As usual, we have a wiki and git, but the most important thing is that every patch goes through the public mailing list, so everyone can join the fun.
vzctl for upstream kernel
We have also released vzctl 4.0 recently (on the 25th of September). As you can tell from the number, it is a major release, and the main feature is support for non-OpenVZ kernels. Yes, it's true -- now you can have a feeling of OpenVZ without installing the OpenVZ kernel. Any recent 3.x kernel should work.
As with the OpenVZ kernel, you can use the ready-made container images we have for OpenVZ (so-called "OS templates") or employ your own. You can create, start, stop, and delete containers, and set various resource parameters such as RAM and CPU limits. Networking (aside from route-based) is also supported -- you can either move a network interface from the host system to inside the container (--netdev_add), or use a bridged setup (--netif_add). I personally run this stuff on my Fedora 17 desktop using the stock F17 kernel -- it just works!
Having said all that, surely the OpenVZ kernel is in much better shape when it comes to containers support -- it has more features (such as live container snapshots and live migration), better resource management capabilities, and overall is more stable and secure. But the fact that the kernel is now optional makes the whole thing more appealing (or so I hope).
You can find information on how to set up and start using vzctl at the vzctl for upstream kernel wiki page. The page also lists known limitations and pointers to other resources. I definitely recommend you give it a try and share your experience! As usual, any bugs found are to be reported to OpenVZ bugzilla.
Managing user beancounters is not for the faint of heart, I must say. One has to read through a lot of documentation and understand all this stuff pretty well. Despite the fact that we have made a great improvement recently, a feature called VSwap, many people still rely on traditional beancounters.
This post is about a utility I initially wrote for myself in May 2011. Since then I have added it to vzctl; it has been available since vzctl 3.0.27. Simply speaking, it is just a replacement for cat /proc/user_beancounters, and of course it can do much more than cat.
Here's a brief list of vzubc features:
Shows human-readable held, maxheld, barrier, limit, and fail counter values for every beancounter, fitting into a standard 80-column terminal (unlike /proc/user_beancounters on an x86_64 system)
Values that are in pages (such as physpages) are converted to bytes
Long values are then converted to kilo-, mega-, gigabytes etc., similar to the -h flag of ls or df
For held and maxheld, it shows how close the value is to the barrier and the limit, in per cent
Can be used both inside CT and on HN
User can specify CTIDs or CT names to output info about specific containers only
Optional top-like autoupdate mode (internally using "watch" utility)
Optional "relative failcnt" mode (show increase in UBC fail counters since the last run
Optional quiet mode (only shows "worth looking at" UBCs, i.e. ones close to their limits and/or with failcnt)
Now, this is what the default vzubc output for a particular VE (with a non-VSwap configuration) looks like:
As you can see, it shows all beancounters in human-readable form. Zero or unlimited values are shown as a dash. It also shows the ratio of held and maxheld to barrier and limit, in per cent.
Now, let's try to explore functionality available via command-line switches.
First of all, -q or --quiet enables quiet mode, which only shows beancounters with fails and those with held/maxheld values close to the limits. If vzubc -q produces empty output, that probably means you don't have to worry about anything related to UBC. There are two built-in thresholds for quiet mode, one for the barrier and one for the limit. They can be changed to your liking using the -qh and -qm options.
Next, -c or --color enables the use of colors to highlight interesting fields. In this mode "warnings" are shown in yellow, and "errors" in red. Here a warning means a parameter close to its limit (the same thresholds are used as for the quiet mode), and an error means a non-zero fail counter.
The following screenshot demonstrates the effect of the -c and -q options. I have run a forkbomb inside a CT to create a resource shortage:
Now, -r or --relative is the ultimate answer to the frequently asked "how do I reset failcnt?" question. Basically, it saves the current failcnt value during the first run, and shows its delta (rather than the absolute value) during subsequent runs. This is how it works:
There is also the -i or --incremental flag to activate a mode in which an additional column shows the difference in the held value from the previous run, so you can see the change in usage. This option also affects quiet mode: all lines with changed held values are shown.
Here's a screenshot demonstrating the "color, relative, incremental, quiet" mode for vzubc:
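(The exact invocation behind that screenshot is simply vzubc -c -r -i -q, optionally followed by a CT ID.)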
Finally, you can use -w or --watch to enable an a la top mode to monitor beancounters. It's not as powerful as top; in fact it just uses the watch(1) tool to run vzubc every 2 seconds and that's it. Please note that this mode is not compatible with --color, and you have to press Ctrl-C to quit. Oh, and since I am not a big fan of animated gifs, there will be no screenshot.
The vzubc(8) man page gives you a more formal description of vzubc, including some minor options I haven't described here. Enjoy.
During my last holiday on the sunny seaside of hospitable Turkey, at night, instead of quenching my thirst or taking a rest after a long and tedious day at the beach, I was sitting in a hotel lobby, where they have free Wi-Fi, trying to make live migration of a container on a ploop device work. I succeeded, with about 20 commits to ploop and another 15 to vzctl, so now I'd like to share my findings and tell the story.
Let's start from the basics and see how migration (i.e. moving a container from one OpenVZ server to another) is implemented. It's vzmigrate, a shell script which does the following (simplified for clarity):
1. Checks that the destination server is available via ssh without entering a password, that there is no container with the same ID on it, and so on.
2. Runs rsync of /vz/private/$CTID to the destination server.
3. Stops the container.
4. Runs a second rsync of /vz/private/$CTID to the destination.
5. Starts the container on the destination.
6. Removes it locally.
Obviously, two rsync runs are needed: the first one moves most of the data while the container is still up and running, and the second one moves the changes made between the first rsync run and the container stop.
Now, if we need live migration (option --online to vzmigrate), then instead of CT stop we do vzctl checkpoint, and instead of start we do vzctl restore. As a result, a container moves to another system without your users noticing (processes are not stopped, just frozen for a few seconds; TCP connections are migrated; IP addresses do not change, etc. -- no cheating, just a little bit of magic).
So this is the way it has worked for years, making users happy and singing in the rain. One fine day, though, ploop was introduced, and it was soon discovered that live migration does not work for ploop-based containers. I found a few reasons why (for example, one can't use rsync --sparse for copying ploop images, because the in-kernel ploop driver can't work with files having holes). But the main thing I found is the proper way of migrating a ploop image: not with rsync, but with ploop copy.
ploop copy is a mechanism for efficiently copying a ploop image with the help of a built-in ploop kernel driver feature called the write tracker. One ploop copy process reads blocks of data from the ploop image and sends them to stdout (prepending each block with a short header consisting of a magic label, the block position and its size). The other ploop copy process receives this data from stdin and writes it down to disk. If you connect these two processes via a pipe, and add ssh $DEST in between, you are all set.
You could say the cat utility can do almost the same thing. Right. The difference is that before starting to read and send data, ploop copy asks the kernel to turn on the write tracker, and the kernel starts to memorize the list of data blocks that get modified (written to). Then, after all the blocks are sent, ploop copy politely expresses interest in this list, and sends the blocks from the list again, while the kernel is creating another list. The process repeats a few times, and the list becomes shorter and shorter. After a few iterations (either the list is empty, or it is not getting shorter, or we just decide that we have already done enough iterations) ploop copy executes an external command which should stop any disk activity for this ploop device. This command is either vzctl stop for offline migration or vzctl checkpoint for live migration; obviously, a stopped or frozen container will not write anything to disk. After that, ploop copy asks the kernel for the list of modified blocks again, transfers the blocks listed, and finally asks the kernel for this list once more. If this time the list is not empty, something is very wrong: the stopping command hasn't done what it should, and we fail. Otherwise, all is good and ploop copy sends a marker telling that the transfer is over. So this is how the sending process works.
The receiving ploop copy process is trivial -- it just reads the blocks from stdin and writes them to the file (at the specified position). If you want to see the code of both the sending and receiving sides, look no further.
All right, in the migration procedure described above ploop copy is used in place of the second rsync run (steps 3 and 4). I'd like to note that this is more efficient, because rsync has to figure out which files have changed and where, while ploop copy just asks the kernel. Also, because the "ask and send" process is iterative, the container is stopped or frozen as late as possible, and even if the container is actively writing data to disk, the period for which it is stopped is minimal.
Just out of pure curiosity I performed a quick non-scientific test, having "od -x /dev/urandom > file" running inside a container and live migrating it back and forth. The ploop copy time after the freeze was a bit over 1 second, and the total frozen time a bit less than 3 seconds. Similar numbers can be obtained from the traditional simfs+rsync migration, but only if the container is not doing any significant I/O. Then I tried to migrate a similar container on simfs with the same command running inside, and the frozen time increased to 13-16 seconds. I don't want to say these measurements are to be trusted; I just ran the test without any precautions, with OpenVZ instances running inside Parallels VMs, and with the physical server busy with something else...
Oh, the last thing. All this functionality is already included into latest tools releases: ploop 1.3 and vzctl 3.3.
Are you ready for the next cool feature? Please welcome CT console.
Available in RHEL6-based kernel since 042stab048.1, this feature is pretty simple to use. Use vzctl attach CTID to attach to this container's console, and you will be able to see all the messages CT init is writing to console, or run getty on it, or anything else.
Please note that the console is persistent, i.e. it is available even if a container is not running. That way, you can run vzctl attach and then (in another terminal) vzctl start. That also means that if a container is stopped, vzctl attach is still there.
Press Esc . to detach from the console.
The feature (vzctl git commit) will be available in up-coming vzctl-3.0.31. I have just made a nightly build of vzctl (version 3.0.30.2-18.git.a1f523f) available so you can test this. Check http://wiki.openvz.org/Download/vzctl/nightly for information of how to get a nightly build.
Update: the feature is renamed to vzctl console.
Update: note that you need a VSwap-enabled (i.e. currently RHEL6-based) kernel for the stuff below to work, see http://wiki.openvz.org/VSwap
VSwap is an excellent feature, simplifying container resource management a lot. Now it's time to also simplify the command line interface, i.e. vzctl. Here is what we did recently (take a look at vzctl git repo if you want to review the actual changes):
1. We no longer require the kmemsize, dcachesize and lockedpages parameters to be set to non-unlimited values (this is one of the enhancements in the recent kernels we talked about earlier). Therefore, setting these parameters to fractions of CT RAM (physpages) has been removed from the configuration files and vzsplit output.
2. There is no longer a need to specify all the UBC parameters in a per-container configuration file. If you leave some parameters unset, the kernel will use default values (usually unlimited). So, the VSwap example configs are now much smaller, with as many as 19 parameters removed from them.
3. vzctl set now supports two new options: --ram and --swap. These are just convenient short aliases for --physpages and --swappages, the differences being that you only need to specify one value (the limit) and that the argument is in bytes rather than pages.
So, to configure a container named MyCT to have 1.5GB of RAM and 3GB of swap space available, all you need to do is just run this command: vzctl set MyCT --ram 1.5G --swap 3G --save
4. This is not VSwap-related, but nevertheless worth sharing. Let's illustrate it by a copy-paste from a terminal:

# vzctl create 123 --ostemplate centos-4-x86_64
Creating container private area (centos-4-x86_64)
Found centos-4-x86_64.tar.gz at http://download.openvz.org/template/precreated//centos-4-x86_64.tar.gz
Downloading...
--2011-11-29 18:54:08--  http://download.openvz.org/template/precreated//centos-4-x86_64.tar.gz
Resolving download.openvz.org... 64.131.90.11
Connecting to download.openvz.org|64.131.90.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 171979832 (164M) [application/x-gzip]
Saving to: `/vz/template/cache/centos-4-x86_64.tar.gz'

100%[======================================>] 171,979,832  588K/s   in 4m 27s

Success!
Performing postcreate actions
Saved parameters for CT 123
Container private area was created
All this will be available in vzctl-3.0.30, which is to be released soon (next week? who knows). If you can't wait and want to test this stuff, there is a nightly build of vzctl (version 3.0.29.3-27.git.0535fe1) available from http://wiki.openvz.org/Download/vzctl/nightly. Please give it a try and tell me what you think.