Xen kernel hangs randomly
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Xen | Confirmed | Critical | | |
xen-3.1 (Ubuntu) | Invalid | Undecided | Unassigned | |
Bug Description
Binary package hint: linux-source-2.6.22
I have 3 Xen machines that are hanging randomly. They freeze up completely, with nothing on the console (no panic messages or anything), but I sometimes see a clocksource message in the logs (caught only with remote syslogging). The message appears within 5 seconds of the hang (the clustering software reports lost machines within 5 seconds):
xen01 kernel: clocksource/0: Time went backwards: delta=-
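For anyone collecting the same evidence from a remote syslog server, here is a minimal Python sketch that picks these warnings out of log lines. The pattern is an assumption based only on the single message quoted above; the delta value is cut off in this report, so it is treated as optional, and the names `PATTERN` and `find_clock_warnings` are made up for illustration.

```python
import re

# Assumed message shape, taken from the one line quoted in this report.
# The number after "clocksource/" is captured as-is (presumably a CPU index).
# The delta value is truncated in the report, so its digits are optional here.
PATTERN = re.compile(
    r"(?P<host>\S+) kernel: clocksource/(?P<idx>\d+): "
    r"Time went backwards: delta=(?P<delta>-?\d*)"
)


def find_clock_warnings(lines):
    """Return a (host, idx, delta_text) tuple for each matching log line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(
                (match.group("host"), int(match.group("idx")), match.group("delta"))
            )
    return hits
```

Feeding it the lines of the remote syslog file (for example `find_clock_warnings(open(path))`, with `path` pointing at wherever the remote logs are stored) gives a list that can be compared against the times the cluster software reported a lost machine.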
The hang doesn't appear to be linked to CPU, RAM or disk activity, as I graph that and haven't noticed anything abnormal around the time of the hangs. In fact, one machine has no xen guests running and no services in use in dom0 and still hung.
I *can* reproduce the hang consistently, though, by generating network traffic with iperf, usually within 20 or 30 minutes of sustained traffic. Again, the clocksource message often appears just before the hang. I've not yet reproduced it using burnP6 or userspace memtest.
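A rough sketch of how the sustained traffic could be scripted, for anyone trying to reproduce this. The exact iperf options used are not recorded in this report, so the command below is an assumption: it uses the standard iperf (version 2) client flags `-c`, `-t` and `-i`, expects `iperf -s` to be running on the other end, and the function names and the 30-minute default are illustrative only.

```python
import subprocess


def iperf_client_cmd(server, seconds=1800, interval=60):
    """Build an iperf (version 2) client command line.

    -c: server to connect to
    -t: test duration in seconds (1800 = 30 minutes, an assumed default)
    -i: seconds between periodic bandwidth reports
    """
    return ["iperf", "-c", server, "-t", str(seconds), "-i", str(interval)]


def run_load(server, seconds=1800):
    """Run the iperf client and return its exit code.

    Blocks until iperf exits. If the machine under test hangs, expect iperf
    to stall or fail rather than finish cleanly.
    """
    return subprocess.run(iperf_client_cmd(server, seconds), check=False).returncode
```

Calling `run_load("xen01")` from a second machine would push traffic at the host named in the log line above; which direction the traffic needs to flow to trigger the hang is not something this report establishes.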
This is with linux-image-
I still have the problem after replacing the hypervisor with the official 32bit PAE and 64bit Xensource versions.
This kernel is running on a Feisty system, with backported Xen 3.1 packages from Gutsy.
The hardware is all PowerEdge 1950, Intel. dom0 is given 2048M (of 16G) and 1 cpu (of 8 possible cores) and therefore has switched to UP mode.
Network cards vary between machines. Some machines have entirely "Intel Corporation 82571EB Gigabit Ethernet Controller", some entirely "Broadcom Corporation NetXtreme II BCM5708".
Bug #146924 is possibly related.
Changed in ubuntu-xen:
status: Unknown → Confirmed
Changed in ubuntu-xen:
importance: Unknown → Critical
I have not been able to reproduce this on the same hardware with the official Xensource 64bit hypervisor and the Gutsy AMD64 architecture Xen kernel.
This is probably unimportant, but the 32bit Xen userspace tools don't work for me with the 64bit kernel, so it was running in SMP mode with 8 cpus.