Entries by tag: mainstream

Checkpoint/restart (mostly) in user space

There is a good article at lwn.net about one of our latest developments.

We have had checkpoint/restart (CPT) and live migration in OpenVZ for ages (well, OK, since 2007 or so), allowing containers to be freely moved between physical servers without any service interruption. It is a great feature, and our users value it. The problem is that we can't merge it upstream, i.e. into the vanilla kernel.

Various people from our team worked on that, and they all gave up. Then Oren Laadan tried very hard to merge his CPT implementation; unfortunately, it didn't work out very well either. The thing is, checkpointing is complex, and the patch implementing it is very intrusive.

Recently, our kernel team leader Pavel Emelyanov came up with a new idea: move most of the checkpointing complexity out of the kernel and into user space, thus minimizing the amount of in-kernel changes needed. In about two weeks he wrote a working prototype. So far the reaction has been mostly positive, and he's going to submit a second RFC version to lkml for review.

For more details, read the lwn.net article. After all, while I am sitting next to Pavel, Mr. Corbet's ability to explain complex stuff in simple terms is way better than mine.

Quite frequently, people ask me whether OpenVZ is secure enough. They think that because everything in OpenVZ runs under one single kernel (as opposed to, say, Xen or VMware, where each partition runs its own kernel), this single kernel is a single point of failure (SPOF).

The answer is: yes, the OpenVZ stable kernel is secure enough to be used for production workloads and in hostile environments. Why? The long answer involves a comparison of different virtualization techniques and their SPOFs, a description of the OpenVZ architecture, the "denied by default" principle, the fact that it's practically proven on thousands of servers, etc. The short answer is: because we care.

Security is quite a complex field. It's not enough to write secure code once, or to secure your system once. In the real world, security comes from constant care. In other words, it’s not enough for a sysadmin to use a good, secure operating system without caring about security daily.

The Linux kernel is quite secure. Still, new problems are found and resolved from time to time, by those people who care. Most of them are security experts (like Solar Designer), others just work on Linux.

A few days ago, Red Hat released an update for the RHEL4 kernel (RHSA-2007-0014). Let me quote: "Red Hat would like to thank Dmitriy Monakhov and Konstantin Khorenko for reporting issues fixed in this erratum."

Both Dmitriy and Konstantin work in our Virtuozzo/OpenVZ team. Dmitriy works in the Quality Assurance department (which I wrote about before), making sure our kernels are rock-solid (by trying hard to break them, that is). Konstantin works in our kernel support team, mostly fixing the causes of kernel oopses. Besides that, as you can see, they both care about security (as does everybody else in our team). They find bugs (including security bugs), report and fix them, and send the results to major distribution vendors (and it's not the first time Red Hat has acknowledged our developers) as well as to mainstream Linux (again, I wrote about that before).

And this is how Linux wins: with all the parties contributing to everybody's benefit.

Good news for all of us on the virtualization front!

The latest prepatch for the stable Linux kernel tree, 2.6.19-rc1, now includes some pieces of OS-level virtualization from OpenVZ, IBM, and Eric Biederman. Those patches have been sitting in the -mm (Andrew Morton’s) tree for some time already, and now, during the “2.6.19 merge window,” Andrew has submitted them to Linus Torvalds. So they are now part of “vanilla” Linux and will ship in the final 2.6.19 kernel when it is released.

So, what exactly went into the Linux kernel? Essentially, three sets of patches implementing three features needed for any OS-level virtualization solution: IPC virtualization, utsname virtualization, and preparations for pid namespaces.
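
To make the three features a bit more tangible, here is a minimal sketch that asks the kernel which utsname, IPC and pid namespaces a process lives in. It assumes a modern Linux kernel that exposes namespaces as symlinks under /proc/<pid>/ns/; that interface appeared well after 2.6.19 and is not part of the patches described here. The function name `namespace_ids` is made up for illustration.

```python
import os

# Namespace kinds mentioned in the post. The /proc/<pid>/ns/ paths are an
# assumption about a modern kernel, not something the 2.6.19 patches provide.
NAMESPACE_KINDS = ("uts", "ipc", "pid")


def namespace_ids(pid="self"):
    """Return {kind: identifier string, or None if it cannot be read}."""
    ids = {}
    for kind in NAMESPACE_KINDS:
        path = f"/proc/{pid}/ns/{kind}"
        try:
            # On kernels that support it, the symlink target names the
            # namespace type and an inode-like number.
            ids[kind] = os.readlink(path)
        except OSError:
            # No /proc, an old kernel, or insufficient permissions.
            ids[kind] = None
    return ids


if __name__ == "__main__":
    for kind, ident in namespace_ids().items():
        print(f"{kind}: {ident or 'not available'}")
```

Two processes that report the same identifier for a given kind share that namespace; a process inside a container would typically report different ones than a process on the host.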

I am really happy that it is community work and a community process (like I said before). We see different parties bringing in code and expertise, reviewing each other's code, making suggestions, exchanging ideas and improving things, to everybody's benefit!

These are just the first steps. Much more is needed to have full OS-level virtualization in the mainstream Linux kernel. Don’t worry, we are already working on that. A few days ago Kirill sent another iteration (v5) of the beancounter patchset for further review and possible inclusion. Beancounters can be used to implement per-VE limits and guarantees for certain resources such as memory.

OpenVZ contributions to the kernel

I just found out an interesting fact: of the 20 patches that are supposed to be included in the next stable 2.6.17.y kernel (where y=10), 5 were written by OpenVZ developers.

What could it mean, besides the fact that the OpenVZ team is a valuable contributor to the mainstream kernel? It also means we care a lot about the stability and security of OpenVZ, and we do a lot of testing and QA, which is good not only for the OpenVZ kernel but for the mainstream kernel as well.

In a broader sense, this is a nice example of how collaborative open source development works. A nice example of the “everybody wins” strategy. Indeed, in open source everybody wins.


Update: our kernel team found a bug in this blog post. Looks like the bug belongs to the infamous "off-by-one" category. :) There are actually 5 patches from OpenVZ developers, not 6; it's just that Greg sent one patch twice, and thus I counted it twice. Fixed.
