Stefan, the kernel version in the Amazon Linux AMI that Matt pointed at is 3.2.21-1.32.6.amzn1.x86_64, so it is closely comparable to the affected Ubuntu kernel (yes, there are source differences, but they at least have a merge base of 3.2.21, so they share significant lineage).
I was able to reproduce the deadlock on the latest Ubuntu 12.04 PV AMI (ami-3d4ff254), running linux-image-3.2.0-31-virtual (3.2.0-31.50). It didn't take very long until the thing was totally frozen (less than a GiB of space on /mnt consumed).
I'm trying something new now. I've built the same Ubuntu kernel from git (Ubuntu-3.2.0-31.50-0-g0d9657d), but instead of using the Ubuntu kernel config, I grabbed the kernel config used in the Amazon Linux AMI Matt mentioned (ami-aecd60c7). So far it hasn't keeled over (at 13GiB right now, after running for about 1 hour 20 minutes).
I'm looking through the config diff right now, from Ubuntu config -> Amazon Linux config. They have significant differences, but mostly in the drivers selected. These are some highlights that stand out to me:
CONFIG_DEFAULT_IOSCHED="noop" instead of "deadline"
CONFIG_HZ=1000 instead of CONFIG_HZ=250
No CONFIG_IOSCHED_{DEADLINE,CFQ}
No CONFIG_COMPACTION
No CONFIG_CLEANCACHE
No CONFIG_CFS_BANDWIDTH
No CONFIG_SCHED_AUTOGROUP
No CONFIG_CGROUP_MEM_RES_CTLR*
No CONFIG_XEN_SELFBALLOONING
No hugepage-related options (HUGETLBFS, TRANSPARENT_HUGEPAGE, etc)
Numerous device drivers not relevant to a VM are disabled (CONFIG_DVB_*, CONFIG_VIDEO_*, CONFIG_SND_*, etc), though these code paths are largely not exercised in the guest regardless of whether they're built, and I'd find it difficult to believe one of these causes the deadlock.
I suspect the faulting path is hit by one of the above config options. I'll continue my investigation.
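For anyone who wants to repeat the config comparison, here is a minimal sketch of how two kernel config files can be diffed option by option. The function names (parse_kconfig, diff_kconfig) are made up for illustration, and the sample strings are tiny stand-ins built from the highlights listed above (the "=y" on the Ubuntu side is my assumption), not the real config files.

```python
import re

# "CONFIG_FOO=value" lines and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map each CONFIG_ symbol to its value; unset symbols map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def diff_kconfig(old, new):
    """Return {symbol: (old_value, new_value)} for symbols that differ.

    A symbol that is absent and one that is "not set" both read as None,
    so they are treated as equal.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Illustrative stand-ins only, not the actual config files.
ubuntu_sample = (
    'CONFIG_HZ=250\n'
    'CONFIG_DEFAULT_IOSCHED="deadline"\n'
    'CONFIG_COMPACTION=y\n'
)
amazon_sample = (
    'CONFIG_HZ=1000\n'
    'CONFIG_DEFAULT_IOSCHED="noop"\n'
    '# CONFIG_COMPACTION is not set\n'
)

for name, (before, after) in diff_kconfig(
    parse_kconfig(ubuntu_sample), parse_kconfig(amazon_sample)
).items():
    print(f"{name}: {before} -> {after}")
```

Filtering the result down to non-driver symbols should give a short list of candidates like the one above, which can then be toggled one at a time on top of the Amazon config.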