Monday, 7 April 2008

Linux sysadmin'ing tip of the day (again)

On the theme of clocks, virtual machines and just not getting it.

So you've got a spankin' new server, one with humongous disk space, lots of RAM and CPUs in the double digits. What's the first thing you think? "We can put all sorts of machines here and there'll still be space for a fool-around Linux box to try new things".

Yes, the joys of virtualization. In this case, a CentOS machine running Xen. Installation was simple enough, and after the first hour I already had a virtualized Linux and was starting to install a Windows machine to serve as a development playground for everyone. By the second day I had already forgotten about the Xen installation, since it was fulfilling its purpose as the host for the many (okay, just 3...) virtual machines installed on top.

Fast forward some weeks, and here I am trying to figure out why the bloody server is one hour ahead of time. ntpd is running, I can see the messages telling me that "yes, the time was a bit ahead, and we've got it right now", but still the time is one hour ahead and it is not right. Damned time zones, I think! And so (naively, as is clear to me now) I set the time zone from Europe/Lisbon to GMT in an attempt to make the system think it's one hour behind. And it works.

Until today.

After committing a couple of files to the server, and before packing up for home, I check the integration server to see if everything is okay and the build is on its way to a green icon. Nope, still green from before. Last build time... yesterday. Strange. Picking through the logs, I find out the server's not picking up the latest change in the source. And that's when I notice that the commit emails fired from Subversion come out one hour ahead. Again. Back I go, logging in to the server and trying to think up why the hell the extra hour kept coming back.

Perhaps ntpd isn't working properly? Was it syncing to a bad server? Was the daemon not running correctly?

A peek in the logs shows ntpd trying to sync the time, time and time again, without success and with no hint as to why it isn't working. Okay, let's try and set the date by hand. Good, it works. And now it's back to the wrong time. Huh?! Why... did the date... change by itself? Hardware clock?

Oh, wait... hardware clock... in a Xen environment?

Yes, ladies and gentlemen, the answer was right before our silly little noses. The hardware clock, in a Xen virtualized environment, is managed by the host (dom0 in Xen parlance), and unless specified otherwise in an obscure flag it stays that way, not allowing changes from the client environments (the guests, or domU).

And so, all that was needed was for the host machine to have the correct time, and all was well in the land. Setting the correct date and time on the host machine also sets the correct date and time on the clients.
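
For the curious, here is a minimal sketch of how a client could check whether it follows the host's clock. It assumes the obscure flag is the `/proc/sys/xen/independent_wallclock` entry that paravirtualized Xen guest kernels of this era expose (1 means the guest keeps its own clock, 0 means it follows dom0); that path is my assumption, so check it on your own kernel. The function name `wallclock_status` is made up for this example.

```python
from pathlib import Path

# Assumed location of the flag on paravirtualized Xen guests;
# it may not exist on other kernels or hypervisors.
WALLCLOCK_FLAG = Path("/proc/sys/xen/independent_wallclock")


def wallclock_status(flag_path=WALLCLOCK_FLAG):
    """Report how this machine's wall clock is managed.

    Returns "independent" if the flag file contains 1,
    "follows-host" for any other readable value,
    and "unavailable" if the file cannot be read at all.
    """
    try:
        value = Path(flag_path).read_text().strip()
    except OSError:
        # Not a Xen guest, or the kernel does not expose the flag.
        return "unavailable"
    return "independent" if value == "1" else "follows-host"


if __name__ == "__main__":
    print(wallclock_status())
```

If this prints `follows-host`, setting the date by hand or running ntpd inside the client will not stick, which matches what I saw; fixing the time on the host is the way to go.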

