Comment 7 for bug 298906

ddom (dominikus-nold) wrote :

I can confirm this bug as well. It first appeared on my 2.6.27-11 kernel (x86_64, SMP) running Ubuntu 8.10.

The strange thing is that the system ran fine until a few days ago. It has two SATA disks in RAID 1. md0 starts correctly, but md1 is stopped, as the kernel message reports, before /scripts/local-top/cryptroot runs. After spending some time working out at which point the problem occurs, I found that adding a delay did not really fix it on this system, although it did fix it on another system with a single IDE disk in a RAID 1 setup. On that IDE system I previously had to enter the passphrase exactly 4 times; after adding a loop that checks for LUKS encryption up to 3 times, it works.
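The retry loop mentioned above can be sketched roughly as follows. This is not the code from my patch, only a minimal illustration: the function name wait_for_luks, the CRYPTSETUP variable and the one-second default delay are assumptions made for this example. It relies on "cryptsetup isLuks", which returns success once the device carries a readable LUKS header.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the actual patch.
# CRYPTSETUP can be overridden for testing (assumption for this sketch).
CRYPTSETUP="${CRYPTSETUP:-cryptsetup}"

# wait_for_luks DEVICE [TRIES] [DELAY]
# Check up to TRIES times (default 3) whether DEVICE has a LUKS header,
# sleeping DELAY seconds (default 1) after each failed check.
wait_for_luks() {
    dev=$1
    tries=${2:-3}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # isLuks exits 0 only if the device is readable and is a LUKS device
        if "$CRYPTSETUP" isLuks "$dev" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

It could be called as "wait_for_luks /dev/md1 3 1" before asking for the passphrase, where /dev/md1 is only an example device name.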

The SATA system still failed, so I added a function that greps the output of modprobe -l and loads modules such as raid1 if they are not found in that output. Then, and this was the most important part, I ran mdadm --assemble --scan once and added a sleep of a few seconds (3 in my case). That caused md1 to start even though it had been stopped earlier by some other process that I have not yet identified.
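For illustration, the workaround described above could look roughly like this. It is a sketch under assumptions, not the attached patch: the names load_if_missing and reassemble_arrays and the variables MODPROBE, MDADM, MODULES_FILE and SETTLE_SECS are invented for this example, and instead of grepping modprobe -l it checks /proc/modules to see whether a module is already loaded.

```shell
#!/bin/sh
# Hypothetical sketch of the workaround, not the actual patch.
# All variables below are assumptions and can be overridden for testing.
MODPROBE="${MODPROBE:-modprobe}"
MDADM="${MDADM:-mdadm}"
MODULES_FILE="${MODULES_FILE:-/proc/modules}"
SETTLE_SECS="${SETTLE_SECS:-3}"

# load_if_missing MODULE
# Load MODULE unless the loaded-modules list already contains it.
load_if_missing() {
    if grep -q "^$1 " "$MODULES_FILE" 2>/dev/null; then
        return 0
    fi
    "$MODPROBE" -q "$1"
}

# reassemble_arrays
# Make sure the RAID personality is loaded, assemble all arrays once,
# then give the kernel a few seconds before the next boot step.
reassemble_arrays() {
    for mod in raid1; do
        load_if_missing "$mod" || true
    done
    # mdadm returns non-zero if nothing was left to assemble; ignore that
    "$MDADM" --assemble --scan || true
    sleep "$SETTLE_SECS"
}
```

In this sketch reassemble_arrays would be called once in the cryptroot script, just before the LUKS device is opened.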

I attach my patch here for information and for easier understanding. It may not be the "best" way to work around this problem, but it works for now. It was tested on 2.6.27-11 on x86_64 with SMP (AMD Turion X2 Ultra) and on 2.6.27-12 on x86 without SMP (VIA Nehemiah). I had also experimented a bit beforehand, so the patch may contain changes that are not relevant (e.g. I tried changing the way the key is passed to cryptsetup in order to locate the point of failure).

Another side note: ever since boot started failing on my 2-disk SATA system, the kernel has been showing messages like "s-ata-1: soft reset failed" (for s-ata 1, 2, 3 and 4). I noticed this when I temporarily left usplash in the cryptroot script for debugging purposes.

It would be a great help if someone could identify the problem and determine whether it is related to an IDE/SATA controller problem, where the device is reset and the RAID device is then stopped (although it had been started before) just before cryptroot runs, or to something related to the mdadm daemon.