preseed Lucid with RAID and LVM fails to boot

Bug #591909 reported by Luis Mondesi
This bug affects 5 people
Affects                     Status     Importance  Assigned to  Milestone
mdadm                       New        Undecided   Unassigned
partman-auto-raid (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Binary package hint: partman-auto-raid

The system fails to boot after preseeding Lucid (10.04) to create RAID1 + LVM partitions:

1. I cannot preseed away the prompt asking to confirm that the partitions should be committed, even with d-i partman-md/confirm boolean true.
2. After reboot, /dev/md1p1 is assigned to /dev/sda and /dev/sdb, even though I explicitly assigned /dev/sda1 + /dev/sdb1 to /dev/md0 and /dev/sda2 + /dev/sdb2 to /dev/md1.

Note: I manually removed all the partitions from the disks (fdisk /dev/sda ... o ... w) and used mdadm --zero-superblock to remove all the array information.

After rebooting, the system dumps me to a shell where I can manually "fix" things. fdisk -l /dev/sda and /dev/sdb show the right

Attached is a copy of the preseed file I'm using.

Here are the Debian instructions: http://svn.debian.org/wsvn/d-i/trunk/installer/doc/devel/partman-auto-raid-recipe.txt

I opened this Forum post looking for answers, but at this point, after much testing, this looks more like a bug:

http://ubuntuforums.org/showthread.php?t=1504045&highlight=lvm+raid+preseed

Hopefully this is a trivial thing to fix.

Hardware:
- Sun Workstation Ultra20 m2
- 2 Seagate 500G disks

Tags: lvm preseed raid1
Luis Mondesi (lemsx1) wrote :
Changed in partman-auto-raid (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
Pasi Sjöholm (pasi-sjoholm) wrote :

For 1, try:

d-i partman-md/confirm_nooverwrite boolean true
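
For reference, a minimal sketch of how the confirmation answers might sit together in a preseed file. The attached preseed file is not shown in this report, so this is not a copy of it. The partman-lvm and partman lines are an assumption about what a RAID1 + LVM recipe also needs.

```text
# Confirm writing the RAID configuration without prompting.
d-i partman-md/confirm boolean true
d-i partman-md/confirm_nooverwrite boolean true
# Assumed companions for the LVM step and the final partition-table write.
d-i partman-lvm/confirm boolean true
d-i partman-lvm/confirm_nooverwrite boolean true
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true
```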

Luis Mondesi (lemsx1) wrote :

Pasi,

Thanks. I'm sure I had this option before. I tried that and it indeed allowed the preseed to continue.

When the install is done, I still see the weird /dev/md1p1 partitions and I get dropped to the initramfs shell to fix it manually. This sucks.

Luis Mondesi (lemsx1) wrote :

I believe this bug boils down to the disk being partitioned incorrectly by d-i. Take a look at this:

root@zod[~]# cfdisk /dev/sdb
FATAL ERROR: Bad primary partition 1: Partition ends in the final partial cylind
                          Press any key to exit cfdisk

root@zod[~]# fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000aba37

   Device Boot Start End Blocks Id System
/dev/sdb1 * 1 32 248832 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2 32 60802 488136704 fd Linux raid autodetect

As you can see here, sdb1 starts at cylinder 1 and ends in cylinder 32, while sdb2 also starts in cylinder 32 and ends at 60802, one past the 60801 cylinders the disk reports. The partitions therefore appear to overlap and to run past the last full cylinder.

This confuses mdadm when re-assembling the RAID1 array, because the first partition (sda1), the second partition (sda2) and the full disk (sda) all appear to have the same superblock!
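
The geometry in the fdisk output above can be checked with a little shell arithmetic. This is only a sketch using the numbers fdisk printed for /dev/sdb. It shows that the disk holds 60801 full cylinders plus a partial one, which is consistent with cfdisk complaining about a partition ending in the final partial cylinder. It does not by itself explain what mdadm does.

```shell
# Geometry reported by fdisk for /dev/sdb in this report.
heads=255
sectors_per_track=63
sector_bytes=512
disk_bytes=500107862016

# Bytes in one cylinder: heads * sectors per track * bytes per sector.
cyl_bytes=$((heads * sectors_per_track * sector_bytes))

# Whole cylinders that fit on the disk, and what is left over at the end.
full_cylinders=$((disk_bytes / cyl_bytes))
leftover_sectors=$(( (disk_bytes % cyl_bytes) / sector_bytes ))

echo "bytes per cylinder: $cyl_bytes"
echo "full cylinders:     $full_cylinders"
echo "leftover sectors:   $leftover_sectors"
```

A partition listed as ending in cylinder 60802 therefore ends inside the leftover sectors, not on a cylinder boundary.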

If you preseed a system without RAID, it boots normally and the system somehow manages to work, even though the partitions have the same problem mentioned above.

Example:
root@zod[~]# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00004c34

   Device Boot Start End Blocks Id System
/dev/sda1 * 1 32 248832 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 32 60802 488135681 5 Extended
/dev/sda5 32 59328 476296192 83 Linux
/dev/sda6 59328 60802 11838464 82 Linux swap / Solaris

root@zod[~]# cfdisk /dev/sda
FATAL ERROR: Bad primary partition 1: Partition ends in the final partial cylind

If I partition the disk using cfdisk (after running fdisk /dev/sdb and using "o" to clear the partition table so I get a clean MBR), then the partitions have the right start and end cylinders:

root@zod[~]# fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xd276e505

   Device Boot Start End Blocks Id System
/dev/sdb1 1 31 248976 83 Linux
/dev/sdb2 32 60801 488135025 fd Linux raid autodetect

This is a very serious bug, and it's amazing that nobody has reported it before. Perhaps I missed existing reports because I was searching for a raid+lvm preseed solution rather than for "debian installer wrong partition cylinder".

Luis Mondesi (lemsx1) wrote :

And sure enough, this bug report has the right solution in comment #3 :
https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/574232/comments/3

"Probably related to Bug #561573 and Bug #551965

Appending partman/alignment=cylinder as kernel boot parameter solved the problem for me."

Luis Mondesi (lemsx1) wrote :

Finally! The solution for #2 is to do:

partman/alignment=cylinder

After adding this boot parameter (in pxelinux.cfg/default), mdadm did the right thing and the system booted just fine.
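
A minimal sketch of what such an entry in pxelinux.cfg/default could look like. The label, the kernel and initrd paths, and the preseed URL are placeholders, not values taken from this report; only partman/alignment=cylinder comes from the fix described here.

```text
# pxelinux.cfg/default (label, paths and URL below are placeholders)
LABEL lucid-raid-lvm
  KERNEL ubuntu-installer/amd64/linux
  APPEND initrd=ubuntu-installer/amd64/initrd.gz auto=true priority=critical url=http://example.com/preseed.cfg partman/alignment=cylinder --
```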

Note that this was tested on 2 different x86_64 systems:

1. Sun workstation Ultra20 m2
2. HP xw4600

Both have 2 SATA disks of the same size and same manufacturer.

I'm glad that this is finally resolved!

Luis Mondesi (lemsx1) wrote :

Attached is a working configuration for LVM + RAID partitioning.

Note that extended partitions must be used for the LVM physical device (changing this to primary and using sd[ab]2 failed miserably). Also note that you need to set cylinder alignment (partman/alignment=cylinder), either from a boot parameter (preferred) or from the preseed file (not tested by me, but it should work).
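
For the preseed-file route, a sketch of what the line might look like. It is untested here as well, and the "select" type is an assumption about how the partman/alignment template is declared.

```text
# Assumed preseed equivalent of the partman/alignment=cylinder boot parameter.
d-i partman/alignment select cylinder
```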

Enjoy

p.s. this bug should be closed after you (Ubuntu devs) decide whether to enable cylinder alignment by default in d-i (doubt it) or to fix mdadm so it doesn't get confused by partitions whose cylinders appear to overlap (likely). This also means that this bug should affect mdadm as well.

Luis Mondesi (lemsx1) wrote :

OMG posted an older version... sorry about this.

Gionn (giovanni.toraldo) wrote :

I have the same problem using the standard netinstall image via PXE. I manually created md0 for /boot and md1 for /.

After the first reboot, /boot wasn't mounted by Ubuntu, and md1 was assembled from /dev/sda and /dev/sdb instead of /dev/sda2 and /dev/sdb2.
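
One way to spot this symptom quickly is to scan /proc/mdstat for arrays whose members are whole disks rather than partitions. The helper below is a hypothetical sketch, not part of mdadm. It assumes sd-style device names, where a trailing digit means a partition, so it would misjudge names such as nvme0n1.

```shell
# Read /proc/mdstat-style text on stdin and print "<array> <member>" for every
# member that looks like a whole disk (sda) rather than a partition (sda2).
whole_disk_members() {
  awk '
    /^md[0-9]+ :/ {
      for (i = 3; i <= NF; i++) {
        if ($i ~ /\[[0-9]+\]/) {        # member fields look like sda2[0]
          dev = $i
          sub(/\[.*/, "", dev)          # strip the [role] suffix
          if (dev !~ /[0-9]$/) print $1, dev
        }
      }
    }'
}

# On a live system: whole_disk_members < /proc/mdstat
```

Any output means an array was assembled from a whole disk; mdadm --detail /dev/md1 lists the members of a single array in more detail.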

This should be a high-importance blocking bug, because RAID + LVM is standard on server hardware.

Colin Watson (cjwatson)
Changed in partman-auto-raid (Ubuntu):
assignee: Colin Watson (cjwatson) → nobody
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in partman-auto-raid (Ubuntu):
status: New → Confirmed