Last week Kirill Korotaev and I visited Ottawa to take part in the Linux Kernel Summit and the Linux Symposium. It was our first time at these events, so we were in a good mood despite the 16-hour flight from Moscow to Ottawa and the 8-hour timezone change. We went to those events mostly to discuss containers and their integration into mainstream.
Containers (VEs, VPSs), or kernel-level virtualization technology (implemented in OpenVZ), were discussed very widely during both events. The topic was presented by three parties:
- IBM (ex-Meiosys guys)
- Eric Biederman
The overall feeling among the kernel people is: containers are a good feature to have in the Linux kernel, let’s merge them into mainstream. But since several different implementations of the technology are available, and several groups are working on them, the mainstream code should be the result of a consensus between all those implementations.
So, let me describe what all those groups are aiming for:
- Eric Biederman wants to have so-called namespaces in the kernel. Namespaces are basically the building blocks of containers. For example, a user namespace makes it possible to have the same root user in different containers; a network namespace gives a container a separate network interface; a process namespace gives it an isolated set of processes. All the namespaces combined together create a container. But, as Eric states, the ability to use only selected namespaces rather than all of them gives endless possibilities to a user.
- IBM people want application containers, and for them the main purpose of such containers is live migration. The difference between an application container and a “full” (system) container is the set of features: for example, an application container might lack /proc virtualization, devices, pseudo-terminals (needed to run ssh, for example) etc. So, an application container might be seen as a subset of a system container.
- OpenVZ wants system containers that resemble the real system as much as possible. In other words, we want to preserve existing kernel APIs as much as possible inside a container, so all of the existing Linux distributions and applications should run fine inside a container without any modifications. Of course, the goal is not 100% achievable: for example, we do not want the container to be able to set the system time.
- Linux-VServer wants just about the same as OpenVZ. The only difference is that their implementations of various components are different, and their containers resemble a real system a bit less closely (for example, in networking).
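To make the "building blocks" idea a bit more concrete, here is a toy Python sketch that treats a container as nothing more than a set of features. It is only a model of the descriptions above, not real kernel code: the feature names, the `make_container` helper, and the assumption that an application container consists of exactly the three namespaces mentioned are all made up for illustration.

```python
# Toy model: a container is just a set of feature names.
# All names below are hypothetical labels for the features described above.
NAMESPACES = frozenset({"user", "network", "process"})
SYSTEM_ONLY = frozenset({"proc_virtualization", "devices", "pseudo_terminals"})
KNOWN_FEATURES = NAMESPACES | SYSTEM_ONLY


def make_container(*features):
    """Build a container from any selection of known features."""
    unknown = set(features) - KNOWN_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return frozenset(features)


# Assumption: an application container has the namespaces but none of the
# system-only features; a system container has everything.
app_container = make_container(*NAMESPACES)
system_container = make_container(*NAMESPACES, *SYSTEM_ONLY)

# Eric's point: one may also pick just a single namespace.
network_only = make_container("network")

# An application container is a strict subset of a system container.
print(app_container < system_container)  # True
```

In this picture, moving from an application container to a system container just means adding the missing features to the set. Whether that is equally simple in the kernel is exactly the open question.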
So, at first glance it’s really hard to find a consensus. Say, Eric’s approach of having distinct namespaces faces the fact that all the namespaces are heavily interdependent. For example, processes belong to a user, so the process namespace depends on the user namespace, and you can hardly find a namespace which is independent of all the others.
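The interdependency problem can be pictured as a small dependency graph. The Python sketch below is only an illustration: the `required_namespaces` helper and the `DEPENDS_ON` table are hypothetical, and the table contains just the one dependency mentioned above (process on user). A real graph would have many more edges.

```python
def required_namespaces(selected, depends_on):
    """Return the selected namespaces plus everything they depend on."""
    needed = set()
    stack = list(selected)
    while stack:
        ns = stack.pop()
        if ns in needed:
            continue  # already handled, also protects against cycles
        needed.add(ns)
        stack.extend(depends_on.get(ns, ()))
    return needed


# Hypothetical table; only the process -> user edge comes from the text.
DEPENDS_ON = {"process": {"user"}}

# Asking for the process namespace alone pulls in the user namespace too.
print(sorted(required_namespaces({"process"}, DEPENDS_ON)))  # ['process', 'user']
```

The more edges such a graph has, the less freedom there is to pick namespaces one by one, which is the concern with fully independent namespaces.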
IBM’s application containers are closer to reality, and actually they might be a first step towards a full container implementation in mainstream. How hard it is to move from application containers to system containers is not yet clear at this point, though. For example, if we do not care about /proc virtualization from the beginning, it might be a real pain to add it later. On the other hand, IBM might be quite happy with full containers, since those do everything IBM wants.
To conclude: this is not going to be an easy task, but it’s doable. The fact that we met in person and discussed all that stuff, and that all the other kernel developers are all for us, helps a lot. Sooner or later, we will be there.