Comment 2 for bug 785124

Thomas Schweikle (tps) wrote :

After really digging deep into this issue, I found what caused the kernel panic:
it was creating an additional LVM volume within the pool I had created. Since this pool holds volumes for the host system as well as for virtual machines, this is nothing uncommon. I am wondering why this action took about 24h to lead to a crash.

Here is what was done before:
1. Created an additional LVM volume, to be assigned to a virtual machine later on.
2. Copied data onto this volume (netcat + dd from another server reachable over the network); see the sketch after this list.
3. Shut down the VM in question (named "afs").
4. Assigned the new volume to it and removed the old file-based volume.
5. Started the VM again.
6. The VM started OK, no problems so far.
7. About 8h later: restarted libvirtd for other reasons.
8. The system crashed trying to access this volume and write data to it. The crash wasn't immediate: the VM slowed down until it was inaccessible, and after restarting the VM the whole system crashed.
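
For reference, the transfer in step 2 was done roughly like this (a sketch only; the volume group, volume name, image path and port are placeholders, and netcat option syntax differs between versions):

    # on the receiving host: create the new volume and write the incoming stream to it
    lvcreate -L 20G -n lv_afs vg0
    nc -l -p 7000 | dd of=/dev/vg0/lv_afs bs=1M

    # on the sending host: stream the old file-based image over the network
    dd if=/var/lib/libvirt/images/afs.img bs=1M | nc receiving-host 7000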

Here is what was not done when assigning the volume to the virtual machine:
Normally libvirtd creates two files in /etc/apparmor.d/libvirt, named "libvirt-{uuid-of-virtual-machine}" and "libvirt-{uuid-of-virtual-machine}.files". The first includes the second. After creating these files AppArmor is reloaded.
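
For illustration, the generated pair looks roughly like this (a sketch: the uuid is abbreviated, the listed disk paths are examples from my setup, and the exact contents vary by release):

    # /etc/apparmor.d/libvirt/libvirt-<uuid>
    #include <tunables/global>

    profile libvirt-<uuid> flags=(attach_disconnected) {
      #include <abstractions/libvirt-qemu>
      #include <libvirt/libvirt-<uuid>.files>
    }

    # /etc/apparmor.d/libvirt/libvirt-<uuid>.files
      "/var/lib/libvirt/images/afs.img" rw,
      "/dev/vg0/lv_afs" rw,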

These files were not created, and AppArmor was not reloaded after assigning the volume to the VM. But the starting process still had the necessary rights to access and write the volume.

For reasons I could not determine (does libvirtd reload AppArmor when it is restarted?), AppArmor was reloaded about 8h later. The moment this happened the VM lost access to its volume and crashed. That alone wasn't really fatal, since this was only one VM within a whole bunch of them, and it would not have had any further consequences had the VM not been restarted. But this crash led to an LVM crash, which turned out fatal for the running host: it stopped working with a kernel panic. This should not happen at all!
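
To check whether restarting libvirtd triggers an AppArmor reload, one could compare the loaded profiles before and after the restart (a sketch; the service name was "libvirt-bin" on Ubuntu of that era, and the profile path is an example):

    aa-status | grep libvirt-      # list the per-VM profiles currently loaded
    service libvirt-bin restart    # restart the daemon
    aa-status | grep libvirt-      # see whether the profile set changed

    # a single profile can also be reloaded by hand:
    apparmor_parser -r /etc/apparmor.d/libvirt/libvirt-<uuid>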