Entries by tag: andrew morton

One of the goals of the OpenVZ project is to integrate container functionality into the mainstream Linux kernel. As you know, most new kernel code goes through Andrew Morton, the right hand of Linus Torvalds.

I just came across the video of Andrew speaking at the LinuxWorld Expo 2007. Among other topics, he talks about what is going to be in the kernel in a year or so. It is quite interesting to see what he thinks of containers -- to see that part, scroll to 40:58.

Update: here's the transcription of the relevant part, provided by dowdle.

The one prediction I am prepared to make is that over the next 1 to 2 years there'll be quite a lot of focus in the core of the Linux kernel on the project which has many names. Some people call it containerization, others will call it operating system virtualization, other people will call it resource management. It's a whole cloud of different features which have different applications.

It can be used for machine partitioning, to partition workloads amongst one machine, otherwise known as workload management.

Server consolidation. Well, you have a whole bunch of servers which are 30 percent loaded -- move all those things onto the one machine without having to tread on each other's toes.

Resource management. A number of people in the high end numerical computing want this; numerical computing area want resource management. Other people who are running world famous web search engines also want resource management in their kernel. In fact, the major, central piece of the whole containerization framework is from an engineer at Google. It's in my tree at present and I'm hoping to get it in at 2.6.24. It's just a framework for containerization. A whole lot of other stuff is going to plug in underneath it, which is under development at present.

So an example of resource management is you might have a particular group of processes, [and] you want to not let it use more than 200 MB of physical memory, and a certain amount of disk bandwidth, network bandwidth, a certain amount of CPU -- so you can just have this little blob and give it maximum amount of resources it can consume, let it run without letting it trash everything else which is running on the machine. So that is a resource management application. People also need this feature for high availability... and I'm still not really sure I understand why.

Also the OpenVZ product, which comes out of the development team in Russia -- that's a mature project that is mainly for web server virtualization, having lots and lots of different instances of the web server on one machine, not have one excessively taking resources away from another. They've been working very hard and very patiently, and with great accommodation on this project. I hope slowly we'll start moving significant parts of the OpenVZ product into the Linux kernel in a way in which it's acceptable to all the other stake holders, so that those guys don't end up carrying such a patch burden.

one kernel bug story among 305

A few days ago one of the OpenVZ kernel team members, Pavel Emelyanov, posted a one-line patch to fix a bug in the Linux kernel. He received the following reply from Andrew Morton, one of the upstream kernel maintainers:


I'm curious. For the past few months, people@openvz.org have discovered (and fixed) an ongoing stream of obscure but serious and quite long-standing bugs.

How are you discovering these bugs?


Andrew added later:


hm, OK, I was visualising some mysterious Russian bugfinding machine or something.

Don't stop ;)


So, here is the story behind that bug.

A few months ago, in the course of OpenVZ kernel testing, our QA (Quality Assurance) team found a strange issue. Every container (VE) in OpenVZ has a set of resource usage counters (and limits) called beancounters. All the usage counters should be zero when a VE is stopped, since by then all of its resources have been released. The issue was that a resource called kmemsize (kernel memory used on behalf of a given VE) had a usage counter of 78 bytes after the VE was stopped -- which effectively means 78 bytes of kernel memory were lost (or leaked, as programmers say).
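The check described above (every usage counter should read zero once a VE is stopped) can be sketched in a few lines. This is only an illustration, not the QA team's actual tooling: it assumes the column layout of /proc/user_beancounters (uid, resource, held, maxheld, barrier, limit, failcnt), and the sample text, including every number except the 78 bytes from the story, is made up.

```python
def nonzero_held(text):
    """Return {(uid, resource): held} for every counter whose 'held' value is not zero."""
    leaks = {}
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # The first row of each VE starts with "<uid>:"; later rows omit it.
        if fields[0].endswith(":"):
            uid = int(fields[0][:-1])
            fields = fields[1:]
        resource, held = fields[0], int(fields[1])
        if held != 0:
            leaks[(uid, resource)] = held
    return leaks


# Invented sample in the assumed layout; VE 101 is supposed to be stopped.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize             78    1246758    2457600    2621440          0
            lockedpages           0          0         32         32          0
            privvmpages           0      11234      49152      53575          0
"""

if __name__ == "__main__":
    for (uid, resource), held in nonzero_held(SAMPLE).items():
        print(f"VE {uid}: {resource} still holds {held} bytes")
```

Any entry this reports for a stopped VE points at something that was charged and never uncharged.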

Who cares about 78 bytes, especially on a server with 16 gigabytes (17,179,869,184 bytes) of RAM? We do. Pavel checked the beancounters debug information, which showed that one struct user object had leaked. He then tried to reproduce the leak, but with no luck.

Bugs that cannot be reproduced are tough. The only option left was to audit the kernel source code. That involved finding all the places where a struct user object is referenced, and checking the code for correctness (the term "correctness" in this context means that every object that is allocated must later be released). It took him 4 hours to do the audit, and he found one place where the reference to an object might be lost (which means it could not later be released). It's the same as if you lend a book to a friend and later forget whom you gave it to -- you have lost the reference and you can't get the book back.
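To make "losing a reference" concrete, here is a toy model of a reference-counted object. It is a sketch under loud assumptions: UserStruct, buggy_path and fixed_path are hypothetical names, and the early-return pattern is just one common way such leaks happen. The post does not say what the real code path looked like.

```python
class UserStruct:
    """Toy stand-in for a reference-counted object (not the real kernel struct)."""

    def __init__(self):
        self.refs = 1          # the creator holds the first reference
        self.released = False

    def get(self):
        """Take an extra reference."""
        self.refs += 1
        return self

    def put(self):
        """Drop one reference; the object is released when the count hits zero."""
        self.refs -= 1
        if self.refs == 0:
            self.released = True


def buggy_path(obj, fail):
    ref = obj.get()
    if fail:
        return False           # early return: the reference taken above is never dropped
    ref.put()
    return True


def fixed_path(obj, fail):
    ref = obj.get()
    try:
        return not fail
    finally:
        ref.put()              # dropped on every exit path


if __name__ == "__main__":
    leaked = UserStruct()
    buggy_path(leaked, fail=True)
    leaked.put()               # creator drops its own reference
    print("buggy path released:", leaked.released)

    ok = UserStruct()
    fixed_path(ok, fail=True)
    ok.put()
    print("fixed path released:", ok.released)
```

In the buggy variant the count never reaches zero, so the object stays allocated forever. That is exactly the book you can no longer ask back.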

In this case, once the problem was found, fixing it was pretty straightforward. Pavel wrote a fix and some demo code to trigger the bug, tested the fix and sent it to the Linux kernel mailing list.

Why is this particular incident so important?
* It was OpenVZ code (beancounters) that detected the leak in the first place. The bug is very hard to trigger (unless you know how), and the leak is small enough that it might otherwise never have been discovered.
* It demonstrates the OpenVZ developers' dedicated attitude. They never dismiss real bugs as "works for me" or "invalid"; they work to find the root cause and fix the problem.
* This bug is in fact a security issue. An ordinary user (actually two users are needed in this case) could exploit the bug and eat all the kernel memory, thus bringing the whole system down. Worse scenarios are possible as well.
* Incidentally, OpenVZ is protected from this security issue, because the kmemsize beancounter (which helped to find it) limits kernel memory usage per Virtual Environment.
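
The last point can be illustrated with a toy counter. This is a simplified sketch, not OpenVZ code: the Beancounter class and leak_until_refused helper are hypothetical, a single limit stands in for the real barrier/limit pair, and the limit value is arbitrary. Only the 78-byte leak size comes from the story.

```python
class Beancounter:
    """Toy per-VE resource counter with a hard limit (hypothetical model)."""

    def __init__(self, limit):
        self.limit = limit
        self.held = 0
        self.failcnt = 0

    def charge(self, size):
        """Account for an allocation; refuse it if it would exceed the limit."""
        if self.held + size > self.limit:
            self.failcnt += 1
            return False
        self.held += size
        return True

    def uncharge(self, size):
        """Account for a release."""
        self.held -= size


def leak_until_refused(bc, leak_size, max_attempts):
    """Charge repeatedly without ever uncharging; return how many leaks succeeded."""
    leaked = 0
    for _ in range(max_attempts):
        if not bc.charge(leak_size):
            break
        leaked += 1
    return leaked


if __name__ == "__main__":
    kmemsize = Beancounter(limit=1000)   # arbitrary illustrative limit, in bytes
    count = leak_until_refused(kmemsize, leak_size=78, max_attempts=1_000_000)
    print(count, kmemsize.held, kmemsize.failcnt)
```

However many times the leak is triggered inside one VE, the damage stops at that VE's own limit instead of exhausting kernel memory for the whole machine.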

Most important of all, this is just one out of 305 kernel patches by our team which were accepted into the mainstream Linux kernel during a one-year period. Almost one patch a day, excluding weekends and holidays. And we are not going to stop! :-)

Recently, I had the opportunity to present at a session of the Gelato Itanium Conference and Expo in San Jose. It was a good fit because they had a special track on virtualization, and OpenVZ (and the Virtuozzo product) is the only stable virtualization technology available now for Itanium servers.

Once again, I was able to talk with Andrew Morton (a kernel hacker, the right hand of Linus Torvalds) and was encouraged by the prospects for OS virtualization and OpenVZ in the Linux kernel. That is something we would really like to see and have been working towards. This article summarizes Andrew’s remarks, noting that “OpenVZ already has thousands of systems out there” and that “as far as containerization standard in mainline goes, ‘most of the stakeholders are playing together quite nicely’”.

Yes, we are, and we’ll keep at it so we can realize our goal.
