I've been bugging LWN to cover OpenVZ for a while now, especially when the Checkpointing and Live Migration features were released. They cover it in the weekly edition released today. The only problem is that only subscribers can see the weekly edition during the week it is published; non-subscribers have to wait a week. Check it out when you can. If you have an LWN subscription, you can check it out here:
As might be expected, the checkpointing code is on the long and complicated side. The checkpoint process starts by putting the target process(es) on hold, in a manner similar to what the software suspend code does. Then it comes down to a long series of routines which serialize and write out every data structure and bit of memory associated with a virtual environment. The obvious things are saved: process memory, open files, etc. But the code must also save the full state of each TCP socket (including the backlog of sk_buff structures waiting to be processed), connection tracking information, signal handling status, SYSV IPC information, file descriptors obtained via Unix-domain sockets, asynchronous I/O operations, memory mappings, filesystem namespaces, data in tmpfs files, tty settings, file locks, epoll() file descriptors, accounting information, and more.
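To make the freeze-then-serialize flow in that excerpt a bit more concrete, here is a toy sketch in Python. It is not the OpenVZ code and does not touch the kernel at all: `ToyProcess`, `ToyEnvironment`, `checkpoint()` and `restore()` are names I made up for illustration, and the "state" is just a dictionary of memory and a list of open file paths. The real thing has to handle sockets, IPC, ttys, file locks and everything else listed above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ToyProcess:
    """Hypothetical stand-in for a process inside a virtual environment."""
    pid: int
    memory: dict = field(default_factory=dict)      # pretend process memory
    open_files: list = field(default_factory=list)  # pretend open file paths
    frozen: bool = False


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a virtual environment (a group of processes)."""
    name: str
    processes: list = field(default_factory=list)


def checkpoint(env):
    """Freeze every process, then serialize the whole environment to bytes."""
    # Step 1: put the target processes on hold so their state stops changing.
    for proc in env.processes:
        proc.frozen = True
    # Step 2: walk every data structure and write it out (JSON here for simplicity).
    return json.dumps(asdict(env), sort_keys=True).encode("utf-8")


def restore(image):
    """Rebuild the environment from a checkpoint image and resume its processes."""
    data = json.loads(image.decode("utf-8"))
    procs = [ToyProcess(**p) for p in data["processes"]]
    env = ToyEnvironment(name=data["name"], processes=procs)
    # Resume: the restored processes are allowed to run again.
    for proc in env.processes:
        proc.frozen = False
    return env


if __name__ == "__main__":
    ve = ToyEnvironment("ve101", [ToyProcess(1, {"heap": "abc"}, ["/etc/passwd"])])
    image = checkpoint(ve)
    moved = restore(image)  # in a live migration this would happen on another host
    print(moved.name, len(moved.processes), moved.processes[0].frozen)
```

The point of the sketch is only the ordering: nothing is written out until every process is on hold, and the restored copy is resumed only after all of its state has been rebuilt.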