[Trusty] fails to boot with kernels later than v3.11: systemd-udevd[133]: conflicting device node

Bug #1358491 reported by TJ
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

I have a lab server named 'caddy' that is used for data recovery and forensics of disk drives. It has hot-swap drive bays for various disk interface types. Amongst others it has a Promise FasTrak TX2000 IDE 'fake' RAID controller.

It was upgraded from Saucy to Trusty. After the upgrade the server fails to boot using kernel version 3.13.0-24-generic during early udev whilst still in the initrd. Errors of the form:

[ 6.989549] systemd-udevd[137]: inotify_add_watch(7, /dev/sdi2, 10) failed: No such file or directory
...
[ 7.092733] systemd-udevd[133]: conflicting device node '/dev/mapper/pdc_ecjaiecgch1' found, link to '/dev/dm-2' will not be created

are reported for some devices, usually the Promise 'fake' RAID devices.

The system hangs at that point without ever dropping to a busybox shell.

Booting the earlier Saucy kernel, version 3.11.0-12-generic, allows the server to start successfully.

After some research, this appears to be due to an incompatibility between systemd-udevd and device-mapper and/or dmraid-activate. A similar Fedora bug report has this comment by Kay Sievers:

https://bugzilla.redhat.com/show_bug.cgi?id=867593#c11

"Device-mapper seems to mknod() things in /dev, which just can not work
correctly today.

There is nothing udev can fix here, it will never touch any device
node, which should not exist in the first place, that is in the way."
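
The distinction in that comment can be checked on an affected system by looking at what kind of object each /dev/mapper entry is. The following is a minimal sketch, not something taken from either bug report: 'node_kind' is a hypothetical helper name, and the assumption is that entries managed by udev appear as symlinks to dm-N nodes, while entries created directly with mknod() appear as real block device nodes.

```shell
#!/bin/sh
# node_kind PATH: print how PATH exists in the filesystem.
# (hypothetical helper name; uses only POSIX test operators)
node_kind() {
    if [ -L "$1" ]; then
        echo "symlink"        # assumed to be what udev would create
    elif [ -b "$1" ]; then
        echo "block-node"     # a real device node, e.g. from mknod()
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Report every entry under /dev/mapper, skipping the unexpanded glob
# when the directory is empty or absent.
for entry in /dev/mapper/*; do
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    printf '%s: %s\n' "$entry" "$(node_kind "$entry")"
done
```

An entry reported as block-node would be the kind of pre-existing node the udev message complains about. /dev/mapper/control is a character device and is reported as other.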

I've tried breaking into the initrd, but unless the break is at 'top', udevd starts and the system hits this problem.

Serial console logs of the failed and successful boot attempts are attached.

TJ (tj) wrote :

Due to bug 560246 "Launchpad requires the REFERER header on form submission..." apport-collect cannot send the system diagnostic data.

TJ (tj) wrote :

Looking closer, the reason 3.11.x starts is that there are no dmraid device-mapper nodes created at all. I'm modifying the /init script so I can break in early, and re-run it to break-points of my choice, to do more diagnosis.

Changed in linux (Ubuntu):
assignee: nobody → TJ (tj)
status: Triaged → In Progress
TJ (tj) wrote :

The 'fake' RAID controller has two disks attached. Each is configured as a separate RAID-0 stripe array, because the controller doesn't support JBOD pass-through.

The boot fails because the root logical volume VG_CADDY/rootfs never becomes available: the partition that contains the VG, "/dev/mapper/pdc_ecjaiecgch4", is not created on the fake-RAID device:

(initramfs) cat /proc/version
Linux version 3.13.0-24-generic (buildd@panlong) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014

(initramfs) cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/VG_CADDY-rootfs ro debug bootdegraded=true libata.force=noncq console=tty0 console=ttyS0,115200n8 netconsole=@10.254.1.3/eth0,@10.254.1.51/ --debug --verbose break=top nomdmonddf nomdmonisw

(initramfs) dmesg | grep -C2 sdb
[ 97.610562] scsi 11:0:1:0: Direct-Access ATA ST380011A 3.06 PQ: 0 ANSI: 5
[ 97.610762] sd 11:0:1:0: [sdb] 156299375 512-byte logical blocks: (80.0 GB/74.5 GiB)
[ 97.610786] sd 11:0:1:0: Attached scsi generic sg2 type 0
[ 97.610893] sd 11:0:1:0: [sdb] Write Protect is off
[ 97.610895] sd 11:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[ 97.610942] sd 11:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 97.675153] sdb: sdb1 sdb2 sdb3 sdb4
[ 97.675669] sd 11:0:1:0: [sdb] Attached SCSI disk
--
<27>[ 100.196769] systemd-udevd[176]: inotify_add_watch(7, /dev/sdb3, 10) failed: No such file or directory
<27>[ 100.196845] systemd-udevd[168]: inotify_add_watch(7, /dev/sdb1, 10) failed: No such file or directory
<27>[ 100.197798] systemd-udevd[177]: inotify_add_watch(7, /dev/sdb4, 10) failed: No such file or directory
<27>[ 100.198351] systemd-udevd[175]: inotify_add_watch(7, /dev/sdb2, 10) failed: No such file or directory
<27>[ 100.198375] systemd-udevd[179]: inotify_add_watch(7, /dev/sdc2, 10) failed: No such file or directory
<27>[ 100.198911] systemd-udevd[180]: inotify_add_watch(7, /dev/sdc3, 10) failed: No such file or directory

(initramfs) dmesg | grep 'too small'
[ 100.346794] device-mapper: table: 252:13: dm-0 too small for target: start=13574144, len=142725198, dev_size=156298401

# fake RAID raw device
(initramfs) kpartx -l /dev/sdb
sdb1 : 0 2014 /dev/sdb 34
sdb2 : 0 1024000 /dev/sdb 2048
sdb3 : 0 12023809 /dev/sdb 1288192
sdb4 : 0 142725198 /dev/sdb 13574144

# same device through the device-mapper viewpoint
(initramfs) kpartx -l /dev/mapper/pdc_ecjaiecgch
Alternate GPT is invalid, using primary GPT.
pdc_ecjaiecgch1 : 0 2014 /dev/mapper/pdc_ecjaiecgch 34
pdc_ecjaiecgch2 : 0 1024000 /dev/mapper/pdc_ecjaiecgch 2048
pdc_ecjaiecgch3 : 0 12023809 /dev/mapper/pdc_ecjaiecgch 1288192
pdc_ecjaiecgch4 : 0 142725198 /dev/mapper/pdc_ecjaiecgch 13574144

I've moved the VG to another disk on a plain PATA controller for now. I'll investigate more when I have time.
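
The numbers in the transcript above are enough to check the 'too small' message by hand. The sketch below only redoes that arithmetic. It assumes that dev_size in the kernel message is the size, in 512-byte sectors, of the device the fourth partition is being mapped onto, and that this device is the fake-RAID volume rather than the raw disk; neither point is stated explicitly in the logs.

```shell
#!/bin/sh
# Values copied from the dmesg and kpartx output above (512-byte sectors).
start=13574144        # first sector of partition 4
len=142725198         # length of partition 4
dm_size=156298401     # dev_size reported in the "too small" message
raw_size=156299375    # logical blocks reported for sdb

end=$((start + len))
echo "partition 4 ends at sector $end"
echo "overrun past dev_size: $((end - dm_size)) sectors"
echo "room left on raw sdb:  $((raw_size - end)) sectors"
```

With these values the partition ends at sector 156299342, which fits on the raw disk with 33 sectors to spare but is 941 sectors past the reported dev_size. That would be consistent with the fake-RAID volume exposing slightly less space than the raw disk, and with the 'Alternate GPT is invalid' warning from kpartx, although this is an inference from the numbers rather than a confirmed diagnosis.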

blausand (blausand) wrote :

I have the very same problem, and it freaks me out because I cannot return to the much older kernel that would boot.
*I kindly request some guidance on a temporary manual fix.*
The mapper ID (isw_****) seems to change from time to time. I don't know if that stems from the Ubuntu (Studio 14.04 LTS) installation or, possibly, from Intel's RAID rebuild code that is launched via the Windows OS, e.g. after power loss.
_____
blausand@Utopon ~> cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.2.0-67-generic root=UUID=57907805-da9f-4d85-9c49-71807a97c4c0 ro nosplash
blausand@Utopon ~> cat /proc/version
Linux version 3.2.0-67-generic (buildd@brownie) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #101-Ubuntu SMP Tue Jul 15 17:46:11 UTC 2014

TJ (tj) wrote :

blausand, I'm afraid I've not spent much time on this issue since I developed the work-around of moving the rootfs. As it only affected the forensics server when hot-swap disks with dmraid metadata were attached, I've not revisited it so far.

Michael Smith (michael-c-smith) wrote :

TJ, I have the same problem on my home system, an HP Z600 with hardware RAID.

Looking at the results from 4 different reboots today:

$ grep "conflicting device node" </var/log/syslog
Mar 7 09:36:22 z600 kernel: [ 14.744525] systemd-udevd[738]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p4' found, link to '/dev/dm-4' will not be created
Mar 7 09:36:22 z600 kernel: [ 14.923887] systemd-udevd[733]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p1' found, link to '/dev/dm-1' will not be created
Mar 7 09:36:22 z600 kernel: [ 14.934934] systemd-udevd[741]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p2' found, link to '/dev/dm-2' will not be created
Mar 7 09:36:22 z600 kernel: [ 15.412385] systemd-udevd[733]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p3' found, link to '/dev/dm-3' will not be created
Mar 7 20:34:03 z600 kernel: [ 14.711869] systemd-udevd[742]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p1' found, link to '/dev/dm-1' will not be created
Mar 7 20:34:03 z600 kernel: [ 14.737949] systemd-udevd[741]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p2' found, link to '/dev/dm-2' will not be created
Mar 7 20:34:03 z600 kernel: [ 14.775406] systemd-udevd[742]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p4' found, link to '/dev/dm-4' will not be created
Mar 7 20:34:03 z600 kernel: [ 14.775510] systemd-udevd[731]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p1' found, link to '/dev/dm-1' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.036880] systemd-udevd[741]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p3' found, link to '/dev/dm-3' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.275910] systemd-udevd[733]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p2' found, link to '/dev/dm-2' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.276606] systemd-udevd[730]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p5' found, link to '/dev/dm-5' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.461964] systemd-udevd[730]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p5' found, link to '/dev/dm-5' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.831112] systemd-udevd[731]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p6' found, link to '/dev/dm-6' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.875357] systemd-udevd[730]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p7' found, link to '/dev/dm-7' will not be created
Mar 7 20:34:03 z600 kernel: [ 15.945102] systemd-udevd[733]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p8' found, link to '/dev/dm-8' will not be created
Mar 7 21:03:39 z600 kernel: [ 18.100972] systemd-udevd[737]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p2' found, link to '/dev/dm-2' will not be created
Mar 7 21:03:39 z600 kernel: [ 18.140472] systemd-udevd[745]: conflicting device node '/dev/mapper/isw_ciijhaceja_Volume0p5' found, link to '/dev/dm-5' will not be created
Mar 7 21:03:39 z600 kernel: [ 18.677290] systemd-udevd[...
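
In the excerpt above, some nodes are reported more than once within a single boot (for example Volume0p1 at 14.711869 and 14.775510). A tally per node makes that easier to see. The following is a minimal sketch: 'count_conflicts' is a hypothetical helper name, and it assumes the message format shown in the log lines above.

```shell
#!/bin/sh
# count_conflicts: read log lines on stdin and print how often each
# device node is named in a "conflicting device node" message.
# (hypothetical helper; lines that do not match are ignored)
count_conflicts() {
    sed -n "s/.*conflicting device node '\([^']*\)' found.*/\1/p" |
        sort | uniq -c
}

# Typical use, assuming the messages are logged to /var/log/syslog:
#   count_conflicts < /var/log/syslog
```

The count covers the whole input, so a log spanning several reboots would need to be split per boot first to compare one boot with another.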

jim (jimmouris) wrote :

Has anyone found a solution? I have the same problem here: "conflicting device node '/dev/mapper/1sw_bchgjehgcf_biosraid6' found, link to '/dev/dm-6' will not be created"

TJ (tj)
Changed in linux (Ubuntu):
assignee: TJ (tj) → nobody
status: In Progress → Confirmed
Robert (sheph) wrote :

I worked around this by replacing the long name /dev/isw_<verylongname>/p1 with /dev/dm-1 in /etc/fstab.

Interestingly, this is the second RAID in the system. The first one mounts with no problem.
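
For anyone wanting to try the same kind of work-around, a first step is to find which fstab lines refer to device-mapper names. The following is a minimal sketch only: 'mapper_fstab_entries' is a hypothetical helper name, and it assumes the entries in question are written as /dev/mapper/... paths, which may not match every setup.

```shell
#!/bin/sh
# mapper_fstab_entries [FILE]: print the device and mount point of every
# entry in an fstab-format file whose device is a /dev/mapper/ path.
# FILE defaults to /etc/fstab. (hypothetical helper name)
mapper_fstab_entries() {
    awk '$1 ~ /^\/dev\/mapper\// { print $1, $2 }' "${1:-/etc/fstab}"
}

# Show the candidates on this system, if an fstab is present.
if [ -r /etc/fstab ]; then
    mapper_fstab_entries
fi
```

Note that the kernel assigns dm-N numbers as devices are activated, so /dev/dm-1 is not guaranteed to refer to the same volume on every boot; it is worth checking after a reboot that the mount still points at the intended partition.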
