From those proposals, one can clearly see what our plans are regarding mainline integration. In a few words: dcache management, memory and CPU cgroup controller improvements, container enter, improved /proc virtualization, checkpoint/restart [mostly] in userspace (which I blogged about recently), and making vzctl work with mainline kernel containers. Oh, and the interesting loopback-like block device to hold a container filesystem (a.k.a. ploop).
Quite a lot of interesting stuff, what do you think?
We have had checkpoint/restart (CPT) and live migration in OpenVZ for ages (well, OK, since 2007 or so), allowing containers to be freely moved between physical servers without any service interruption. It is a great feature which is valued by our users. The problem is we can't merge it upstream, i.e. into the vanilla kernel.
Various people from our team worked on that, and they all gave up. Then, Oren Laadan was trying very hard to merge his CPT implementation -- unfortunately it didn't work out very well either. The thing is, checkpointing is a complex thing, and the patch implementing it is very intrusive.
Recently, our kernel team leader Pavel Emelyanov got a new idea of moving most of the checkpointing complexity out of the kernel and into user space, thus minimizing the amount of in-kernel changes needed. In about two weeks he wrote a working prototype. So far the reaction is mostly positive, and he's going to submit a second RFC version for review to lkml.
For more details, read the lwn.net article. After all, while I am sitting next to Pavel, Mr. Corbet's ability to explain complex stuff in simple terms is way better than mine.
I liked ioping and worked on it a bit, too, just for fun. Among some other minor stuff I have added a man page and spec file, so it is now available as an RPM package.
Official project site: http://code.google.com/p/ioping/
My RPM packages and stuff: http://kir.sacred.ru/ioping/
Make sure you use the latest vzctl (3.0.26.2), otherwise vzctl enter won't work with Ubuntu 11.04 containers. As usual, report all bugs to http://bugzilla.openvz.org/
New templates:
* Debian 6.0
* SUSE 11.4
Moved from beta:
* SUSE 11.2, 11.3
* Fedora 14
Moved to unsupported (because they are no longer supported by upstream vendors):
* Fedora 12 (EOL Dec 02 2010)
* SUSE 11.1 (EOL Jan 14 2011)
The update will appear in a few days, probably as soon as next Tuesday or so.
And yes, we are still waiting for CentOS 6 to be ready...
Update: it took me much more time than expected; the updates were finally released on 28 Apr 2011
http://wiki.openvz.org/Download/template/precreated
This is to announce that from now on devel@ is a separate list, not mirroring containers@ or anything. From now on, if the topic is openvz-specific, like a patch to OpenVZ, please use devel@. If the topic is about containers (as appearing in mainline), use containers@.
Let me explain. Initially, when we started moving the OpenVZ project forward, we wanted to discuss all the things about containers on a mailing list, and therefore I created devel@. Later, when other parties joined, it was decided to create the containers at osdl.org mailing list (remember, OSDL later became the Linux Foundation). At that time I was worried that the discussions would split, and decided to just subscribe our devel@ to containers@, so that devel@ became a superset of containers@ (i.e. every message posted to containers@ would appear on devel@, but not vice versa).
Of course it ended up being a big mess. Better late than never, the mess is no more!
Update: comments disabled due to spam
Back in 2006 I whined about a bug in sysvinit we found. Until today I thought it was never fixed upstream.
Tonight I found out that it's actually fixed in sysvinit (2.87dsf), released in Jul 2009, according to its changelog:
* Adjust init to terminate argv0 with one 0 rather than two so that
process name can be one character longer. Patch by Kir Kolyshkin.
Unfortunately it wrongly credits me as the patch author. The actual author is Dmitry Mishin, as seen in OpenVZ bug #60; I just submitted it.
I was working a bit on vzctl today (my target was bug #1757, which is still a work in progress) and ... I am not sure how, but I ended up declaring most functions in src/vzlist.c as static. I thought it wouldn't have any practical value -- I was wrong!
In C, if you declare a function as static, its visibility is limited to the translation unit (roughly, the source file) in which it is defined. In other words, you can only call a static function by name from another function in the same file.
Now, in vzctl sources vzlist.c is only linked to one binary -- vzlist, and therefore I thought it doesn't make much sense to declare functions as static. Nevertheless I did it (see git commit).
Next thing I got is a set of compiler warnings! OK, all right, let's take a look...
First set of warnings is self-explanatory. See:
vzlist.c:825:14: warning: ‘parse_var’ defined but not used
vzlist.c:1075:14: warning: ‘remove_sp’ defined but not used
vzlist.c:1357:12: warning: ‘get_stop_quota_stats’ defined but not used
Easy! In some ancient time, these functions were used, now the code has changed and no one needs these three, but they were not removed for some reason (probably just forgotten). Solution: remove the dead code (see git commit).
Second set of warnings looks similar:
vzlist.c:400:1: warning: ‘dcachesize_m_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_l_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_b_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_f_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_s_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_h_sort_fn’ defined but not used
Hmm... all these *_sort_fn are sort functions generated by means of a few #define statements, and they are used when vzlist needs to sort its output by some column or parameter (vzlist -s). It is very strange that these are not used, because they should be. Let's take a closer look... zOMG! it's a bug!
Apparently, someone was using the copy-paste technique* and forgot to change the names of the functions. The bug is, when you ask vzlist to sort its output by, say, dcachesize failcounter values, it sorts it by dcachesize held values instead, because the wrong sort function is used. Such bugs are hard to notice manually, and there are no autotests for vzlist.
* Yes, some parts of vzlist are a copy-pasted mess; I am slowly working on untangling it. For example, see my previous cleanup patches (committed back in June 2010):
src/vzlist.c: streamline a few macros
vzlist: put similar print_ functions in a macro
vzlist.c: simplify last_field logic
Moral: sometimes declaring functions as static actually helps!
PS if you see mistakes in this blog post, patches to it are welcome. It's 1am here and I am a bit sleepy.
The thing is, this kernel is named after Ilya Repin, a leading Russian painter and sculptor of the Peredvizhniki artistic school. One of his best paintings is called "Unexpected Return", and I happened to enjoy the original in the Tretyakov Gallery here in Moscow a couple of weeks ago. So here it is: the unexpected return of the 2.6.27 kernel. It took Ilya 4 years to finish the painting; it took Pavel 6 months to release the fix. Better late than never, that is.
Please enjoy: Ilya Repin. Unexpected return. 1884-1888.
I have added vswap configuration samples to vzctl git. Basically, you set physpages and swappages and leave every other beancounter at unlimited. For example, this is what ve-vswap-256m.conf-sample looks like:
( cat ve-vswap-256m.conf-sample )
As you can see, physpages (i.e. RAM size) is set to 256 megabytes, while swappages (i.e. swap size) is set to 512 megabytes; all the other beancounters are unlimited. Wow, it's never been easier to configure your containers!
Now, we can use this stuff with the RHEL6-based kernel. This is what we see from inside the container:
[root@localhost ~]# vzctl enter 103
entered into CT 103
[root@localhost /]# free
             total       used       free     shared    buffers     cached
Mem:        262144      23936     238208          0          0      10968
-/+ buffers/cache:      12968     249176
Swap:       524288          0     524288
( cat /proc/user_beancounters; cat /proc/meminfo )