We have finally filed a number of proposals for the upcoming Containers and CGroups miniconf, to be held during the Linux Plumbers Conference, 7 to 9 September 2011 in Santa Rosa, CA.

From those proposals, one can clearly see what our plans are regarding mainline integration. In a few words: dcache management, memory and CPU cgroup controller improvements, container enter, improved /proc virtualization, checkpoint/restart [mostly] in userspace (which I blogged about recently), and making vzctl work with mainline kernel containers. Oh, and the interesting loopback-like block device to hold a container filesystem (a.k.a. ploop).

Quite a lot of interesting stuff, what do you think?

Checkpoint/restart (mostly) in user space

There is a good article at lwn.net about one of our latest developments.

We have had checkpoint/restart (CPT) and live migration in OpenVZ for ages (well, OK, since 2007 or so), allowing containers to be freely moved between physical servers without any service interruption. It is a great feature which is valued by our users. The problem is we can't merge it upstream, i.e. into the vanilla kernel.

Various people from our team worked on that, and they all gave up. Then Oren Laadan tried very hard to merge his CPT implementation -- unfortunately it didn't work out very well either. The thing is, checkpointing is complex, and the patch implementing it is very intrusive.

Recently, our kernel team leader Pavel Emelyanov got a new idea: move most of the checkpointing complexity out of the kernel and into user space, thus minimizing the amount of in-kernel changes needed. In about two weeks he wrote a working prototype. So far the reaction is mostly positive, and he's going to submit a second RFC version to lkml for review.

For more details, read the lwn.net article. After all, while I am sitting next to Pavel, Mr. Corbet's ability to explain complex stuff in simple terms is way better than mine.

ioping

My colleague koct9i, whose daily job is developing and fixing the OpenVZ kernel, was feeling bored last weekend, and to entertain himself he wrote a small utility called ioping. The idea is simple and straightforward: to write a utility similar to ping, which shows I/O latency in the same way ping shows network latency. The idea is very simple, but I haven't seen anything like it. (Actually, the tool was written to help investigate OpenVZ bug #1880.)

I liked ioping and worked on it a bit, too, just for fun. Among some other minor stuff I have added a man page and a spec file, so it is now available as an RPM package.

Official project site: http://code.google.com/p/ioping/

My RPM packages and stuff: http://kir.sacred.ru/ioping/

Ubuntu 11.04 template

In addition to a bunch of template updates released last week, last night we released Ubuntu 11.04 templates for OpenVZ, for both x86 and x86_64 architectures. They are currently in beta and therefore available from http://wiki.openvz.org/Download/template/precreated/beta (which is actually just a wiki page with pretty links to http://download.openvz.org/template/precreated/beta/).

Make sure you use the latest vzctl (3.0.26.2), otherwise vzctl enter won't work with Ubuntu 11.04 containers. As usual, report all bugs to http://bugzilla.openvz.org/

template update

Good news everyone, I am in the process of updating all the precreated templates. From now on I will probably do full updates quarterly.

New templates:
* Debian 6.0
* SUSE 11.4

Moved from beta:
* SUSE 11.2, 11.3
* Fedora 14

Moved to unsupported (because they are no longer supported by upstream vendors):
* Fedora 12 (EOL Dec 02 2010)
* SUSE 11.1 (EOL Jan 14 2011)

The update will appear in a few days, probably as soon as next Tuesday or so.

And yes, we are still waiting for CentOS 6 to be ready...

Update: it took me much more time than expected; the updates were finally released on 28 Apr 2011
http://wiki.openvz.org/Download/template/precreated

devel@ mailing list mess is no more

OK, I must admit it was a very bad idea of mine to subscribe our devel at openvz dot org mailing list to the containers at linux-foundation dot com mailing list.

This is to announce that from now on devel@ is a separate list, not mirroring containers@ or anything. From now on, if the topic is openvz-specific, like a patch to OpenVZ, please use devel@. If the topic is about containers (as appearing in mainline), use containers@.

Let me explain. Initially, when we started moving the OpenVZ project forward, we wanted to discuss all things containers on a mailing list, and therefore I created devel@. Later, when other parties joined, it was decided to create the containers at osdl.org mailing list (remember, OSDL later became the Linux Foundation). At that time I was worried that the discussions would split, and decided to just subscribe our devel@ to containers@, so that devel@ became a superset of containers@ (i.e. every message posted to containers@ would appear on devel@, but not vice versa).

Of course it ended up being a big mess. Better late than never: the mess is no more!

Update: comments disabled due to spam

back to 2006, or openvz bug #60

Some software bugs, while being simple and stupid, have an interesting and long-lasting life. Here is the story of one such very simple bug with a lifespan of about 5 years (or more? I don't know when it was introduced). The bug isn't worth looking at otherwise, so I'll try to be short; more info is available from the links.

OK, back in 2006 I whined about a bug in sysvinit we found. Until today I thought it was never fixed upstream.

Tonight I found out that it was actually fixed in sysvinit (2.87dsf), released in Jul 2009, according to its changelog:

 * Adjust init to terminate argv0 with one 0 rather than two so that
    process name can be one character longer.  Patch by Kir Kolyshkin.

Unfortunately it wrongly credits me as the patch author. The actual author is Dmitry Mishin, as seen in OpenVZ bug #60; I just submitted it.


on static function declaration

If you are a seasoned C programmer, skip this post entirely (or try to find bugs in it). If you know C but don't consider yourself an expert, please read on -- it might be helpful.

I was working a bit on vzctl today (my target was bug #1757, which is still a work in progress) and ... I am not sure how, but I ended up declaring most functions in src/vzlist.c as static. I thought it wouldn't have any practical value -- I was wrong!

In C, if you declare a function as static, its visibility is limited to the translation unit (roughly, the source file) in which it is defined. In other words, you can only refer to a static function by name from another function in the same file.

Now, in the vzctl sources vzlist.c is linked into only one binary -- vzlist, and therefore I thought it didn't make much sense to declare its functions as static. Nevertheless I did it (see git commit).

Next thing I got is a set of compiler warnings! OK, all right, let's take a look...

First set of warnings is self-explanatory. See:
vzlist.c:825:14: warning: ‘parse_var’ defined but not used
vzlist.c:1075:14: warning: ‘remove_sp’ defined but not used
vzlist.c:1357:12: warning: ‘get_stop_quota_stats’ defined but not used


Easy! In some ancient time, these functions were used, now the code has changed and no one needs these three, but they were not removed for some reason (probably just forgotten). Solution: remove the dead code (see git commit).

Second set of warnings looks similar:
vzlist.c:400:1: warning: ‘dcachesize_m_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_l_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_b_sort_fn’ defined but not used
vzlist.c:400:1: warning: ‘dcachesize_f_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_s_sort_fn’ defined but not used
vzlist.c:411:1: warning: ‘diskinodes_h_sort_fn’ defined but not used


Hmm... all these *_sort_fn are sort functions generated by means of a few #define statements, and they are used when vzlist needs to sort its output by some column or parameter (vzlist -s). It is very strange that these are not used, because they should be. Let's take a closer look... zOMG! it's a bug!

Apparently, someone was using the copy-paste technique* and forgot to change the names of the functions. The bug is: when you ask vzlist to sort its output by, say, dcachesize failcounter values, it sorts by dcachesize held values instead, because the wrong sort function is used. Such bugs are hard to notice manually, and there are no autotests for vzlist.

* Yes, some parts of vzlist are a copy-pasted mess, and I am slowly working on untangling it. For example, see my previous cleanup patches (committed back in June 2010):
src/vzlist.c: streamline a few macros
vzlist: put similar print_ functions in a macro
vzlist.c: simplify last_field logic

Moral: sometimes declaring functions as static actually helps!

PS if you see mistakes in this blog post, patches to it are welcome. It's 1am here and I am a bit sleepy.


You probably thought we had abandoned the 2.6.27 kernel branch. Well, we ourselves thought we had (although it was not yet officially announced). Then, all of a sudden, kernel 2.6.27-repin.1 is released, rebased to the latest upstream kernel (2.6.27.57) and fixing OpenVZ bug #1593.

The thing is, this kernel is named after Ilya Repin, a leading Russian painter and sculptor of the Peredvizhniki artistic school. One of his best paintings is called "Unexpected Return", and I happened to enjoy the original in the Tretyakov Gallery here in Moscow a couple of weeks ago. So here it is: the unexpected return of the 2.6.27 kernel. It took Ilya 4 years to finish the painting; it took Pavel 6 months to release the fix. Better late than never, that is.

Please enjoy: Ilya Repin. Unexpected return. 1884-1888.

news from the VSwap front

I have added vswap configuration samples to vzctl git. Basically, you set physpages and swappages and leave every other beancounter at unlimited. For example, this is what ve-vswap-256m.conf-sample looks like:

cat ve-vswap-256m.conf-sample

As you can see, physpages (i.e. RAM size) is set to 256 megabytes, while swappages (i.e. swap size) is set to 512 megabytes, and all the other beancounters are unlimited. Wow, it's never been easier to configure your containers!

Now, we can use this stuff with the RHEL6-based kernel. This is what we see from inside the container:

[root@localhost ~]# vzctl enter 103
entered into CT 103
[root@localhost /]# free
             total       used       free     shared    buffers     cached
Mem:        262144      23936     238208          0          0      10968
-/+ buffers/cache:      12968     249176
Swap:       524288          0     524288

cat /proc/user_beancounters; cat /proc/meminfo
