/dev/disk/by-path not properly populated for (e)SATA port multiplier disks

Bug #1611945 reported by Chris Siebenmann on 2016-08-10
This bug affects 4 people
Affects: systemd (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

We have a just-installed Ubuntu 16.04 LTS machine with a number of disks behind port-multiplier eSATA ports, all of them driven by a SiI 3124 controller (sata_sil24 kernel driver). Our machine sees all disks on all channels; however, under 16.04 only one disk from each channel shows up in /dev/disk/by-path/ (all disks show up in /dev/disk/by-id and /dev/disk/by-uuid). For our usage this is a severe defect because we rotate disks in and out of the external enclosure and rely on mounting specific slots in the external enclosure through /dev/disk/by-path.

This did not happen in Ubuntu 12.04 LTS, the release that this machine was previously running.

According to 'udevadm info --export-db' and 'udevadm test-builtin path_id' and so on, systemd's udev stuff is assigning all drives behind the same port the same disk/by-path data (ID_PATH et al). In 'udevadm info /sys/block/sdX', the 'P:' and 'E: DEVPATH=' values show a difference in the target portion of PCI path, eg:

  P: /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda
  P: /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:1:0/0:1:0:0/block/sdb

However the 'S: disk/by-path', 'E: DEVLINKS=', and 'E: ID_PATH' portions do not. For both devices above, we see:

  S: disk/by-path/pci-0000:02:00.0-ata-1
  E: ID_PATH=pci-0000:02:00.0-ata-1

Naturally only one device can have a /dev/disk/by-path/pci-0000:02:00.0-ata-1 symlink, so instead of four disks per channel in /dev/disk/by-path we see one.
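
To make the collision concrete, here is a minimal Python sketch (not systemd's actual code) that mimics the behaviour described above. It builds a by-path style name from the PCI address and the ataN component only, then shows how also using the channel number from the targetH:C:I component would tell the two disks apart. The function names and the ".<channel>" suffix are assumptions made for illustration, and taking the port number from the ataN directory name is a simplification.

```python
import re

# The two sysfs paths quoted in the report above.
PATHS = [
    "/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda",
    "/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:1:0/0:1:0:0/block/sdb",
]

# Matches ".../<pci address>/ataN/hostH/targetH:C:I/..."
_ATA_RE = re.compile(
    r"/(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"
    r"/ata(?P<port>\d+)/host\d+/target\d+:(?P<channel>\d+):\d+/"
)


def _match(devpath):
    m = _ATA_RE.search(devpath)
    if m is None:
        raise ValueError("not an ATA-style sysfs path: %s" % devpath)
    return m


def id_path_port_only(devpath):
    """Name built from the PCI address and ATA port only (the colliding form)."""
    m = _match(devpath)
    # Simplification: the port number is taken from the ataN directory name.
    return "pci-%s-ata-%s" % (m.group("pci"), m.group("port"))


def id_path_with_channel(devpath):
    """Hypothetical variant that appends the channel from targetH:C:I."""
    m = _match(devpath)
    return "pci-%s-ata-%s.%s" % (m.group("pci"), m.group("port"), m.group("channel"))


if __name__ == "__main__":
    for path in PATHS:
        print(path.rsplit("/", 1)[-1], id_path_port_only(path), id_path_with_channel(path))
```

With the port-only form both sda and sdb map to the same name, so only one symlink can exist; with the channel included each disk gets its own name.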

Ubuntu release: 16.04

Package versions from 'apt-cache policy udev systemd':
udev:
  Installed: 229-4ubuntu7
systemd:
  Installed: 229-4ubuntu7

'journalctl -b' shows that during boot systemd logs some 'appeared twice with different sysfs paths' messages, eg:

Aug 10 13:34:21 verdandi systemd[1]: dev-disk-by\x2dpath-pci\x2d0000:02:00.0\x2data\x2d1\x2dpart1.device: Dev dev-disk-by\x2dpath-pci\x2d0000:02:00.0\x2data\x2d1\x2dpart1.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:3:0/0:3:0:0/block/sdd/sdd1 and /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1

However it doesn't seem to be reporting this for all port-multiplier drives and their partitions.
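
For anyone wanting to check which devices are affected on their own machine, the following Python sketch pulls the unit name and the two sysfs paths out of such journal lines and groups the colliding device nodes by unit. The regular expression is based only on the message format quoted above, and the function names are made up for this example.

```python
import re

# Example line taken from the journal output quoted above.
SAMPLE_LINE = (
    r"Aug 10 13:34:21 verdandi systemd[1]: "
    r"dev-disk-by\x2dpath-pci\x2d0000:02:00.0\x2data\x2d1\x2dpart1.device: "
    r"Dev dev-disk-by\x2dpath-pci\x2d0000:02:00.0\x2data\x2d1\x2dpart1.device "
    r"appeared twice with different sysfs paths "
    r"/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:3:0/0:3:0:0/block/sdd/sdd1 "
    r"and "
    r"/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1"
)

_TWICE_RE = re.compile(
    r"Dev (?P<unit>\S+) appeared twice with different sysfs paths "
    r"(?P<first>/sys/\S+) and (?P<second>/sys/\S+)"
)


def parse_appeared_twice(line):
    """Return (unit, first_path, second_path), or None if the line does not match."""
    m = _TWICE_RE.search(line)
    if m is None:
        return None
    return m.group("unit"), m.group("first"), m.group("second")


def colliding_devices(lines):
    """Map each unit name to the set of device node names that claimed it."""
    result = {}
    for line in lines:
        parsed = parse_appeared_twice(line)
        if parsed is None:
            continue
        unit, first, second = parsed
        names = result.setdefault(unit, set())
        # The last path component is the kernel device name, e.g. "sdd1".
        names.add(first.rsplit("/", 1)[-1])
        names.add(second.rsplit("/", 1)[-1])
    return result


if __name__ == "__main__":
    for unit, names in colliding_devices([SAMPLE_LINE]).items():
        print(unit, sorted(names))
```

Feeding it the saved output of 'journalctl -b' line by line should list every by-path name that more than one partition tried to claim, at least for the collisions systemd actually logged.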

If it would be useful I can attach full 'udevadm info --export-db' output or the like.

Sitsofe Wheeler (sitsofe) wrote :

Chris:
Could you attach the output of
sudo udevadm test /sys/class/block/sda
?

Sitsofe Wheeler (sitsofe) wrote :

I guess it would also be good to have the same output for a different disk in the same enclosure...

Sitsofe Wheeler (sitsofe) wrote :

A quick search digs up that the path is created by this:
https://github.com/systemd/systemd/blob/09541e49ebd17b41482e447dd8194942f39788c0/src/udev/udev-builtin-path_id.c#L349 . The code in that region was added in https://github.com/systemd/systemd/commit/ba86822db70d9ffd02ad78cd02b237ff8c569c7a but doesn't account for the fact that there may be multiple targets (each serving up a different ATA port 1)... Prior to that it doesn't look like there was anything to build ATA paths at all? Additionally, if my analysis is correct then this affects HEAD systemd-udev too.

I guess it would be good to know what the paths looked like back in 12.04 (perhaps they were SCSI based?) and whether you get any paths of that style on a system like 14.04 or 16.04.

Chris Siebenmann (cks) wrote :

Here is the full 'udevadm test' output for two disks on the same port multiplier channel. I can do a disk on a different channel as well if you want.

On 12.04, the sysfs path of the same disk slot is /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/host8/target8:0:0/8:0:0:0/block/sdk (the sdN numbering is inconsistent from boot to boot, which is why we want /dev/disk/by-path names for all of them). We don't have any 14.04 hosts with a port multiplier, so I don't know when the kernel started putting the hostN directory in the /ataN/ directory and thus triggering udev's special ATA disk handling.

('udevadm test /sys/class/block/sdk' on the 12.04 machine says that udev sees this as an ATA and SATA disk; ID_ATA and ID_ATA_SATA are both 1 and ID_BUS is ata. But it winds up with ID_PATH=pci-0000:02:00.0-scsi-2:0:0:0, instead of an ata variant.)
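
The layout difference between the two releases can be checked mechanically. This is a small Python sketch, with hypothetical function names, that reports which directory sits directly above hostN in a sysfs path and whether that directory is an ataN one. It only inspects the path string and does not reproduce anything udev itself does.

```python
import re

# Sysfs paths quoted in this report for the 12.04 and 16.04 installs.
PATH_1204 = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/host8/target8:0:0/8:0:0:0/block/sdk"
PATH_1604 = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/ata1/host0/target0:0:0/0:0:0:0/block/sda"


def host_parent(devpath):
    """Return the path component directly above hostN, or None if there is none."""
    parts = devpath.strip("/").split("/")
    for index, part in enumerate(parts):
        if re.fullmatch(r"host\d+", part):
            return parts[index - 1] if index > 0 else None
    return None


def host_is_under_ata(devpath):
    """True when hostN is nested inside an ataN directory."""
    parent = host_parent(devpath)
    return parent is not None and re.fullmatch(r"ata\d+", parent) is not None


if __name__ == "__main__":
    for label, path in (("12.04", PATH_1204), ("16.04", PATH_1604)):
        print(label, host_parent(path), host_is_under_ata(path))
```

On the 12.04 path the parent of hostN is the PCI device, while on the 16.04 path it is ata1, which is the nesting described above.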

I agree with your analysis that this affects systemd HEAD. As far as I can see HEAD has nothing in handle_scsi_ata() that would give different names to multiple disks behind the same ATA port. Sadly we have no systems that are running a recent enough systemd that I can report it to them, based on their reporting policies. Is there a bootable live CD of the in-progress next Ubuntu version? That might have a recent enough systemd that I could boot it on the system in question, verify that its systemd isn't generating the right /dev/disk/by-path results, and report it upstream.

Sitsofe Wheeler (sitsofe) wrote :

I thought everything in /sys was built by udev rules? If so any path changes could be down to changes there rather than in the kernel.

http://cdimage.ubuntu.com/daily-live/20160811/ is likely a new yakkety build but use at your own risk etc.

Sitsofe Wheeler (sitsofe) wrote :

I should also note 14.04 live CDs are also available: http://releases.ubuntu.com/14.04/ .

Chris Siebenmann (cks) wrote :

In 16.04 (and I think everywhere), /sys is sysfs, so its contents are generated by the kernel, device drivers, and so on. Udev looks at sysfs in order to determine device information (eg ATA port number) that it uses to create everything else. How hardware is represented in sysfs can change over kernel versions, as we see here.

Chris Siebenmann (cks) wrote :

I've confirmed this behavior on the yakkety live build you linked to above, with systemd 231 according to its dpkg output. I gathered as much data about it as I could think of (and I can go back for more if necessary). Would you rather I pass the data to you here for you to file an upstream bug with, or should I go file one directly against upstream?

Feel free to file the bug directly upstream at https://github.com/systemd/systemd/issues/new. This is not ubuntu specific in any way.

(In the future, the same general rule applies: the "two releases" rule is intended to let us avoid dealing with long-fixed bugs and versions of systemd that we're no longer actively working on. But if the code is obviously unchanged between some distro version and upstream, just file the bug and say so in the bug.)

Chris Siebenmann (cks) wrote :

Thanks for your encouragement. I've now filed this as an issue with upstream systemd as https://github.com/systemd/systemd/issues/3943 .

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Sitsofe Wheeler (sitsofe) wrote :

@zbyszek-in: would you be willing to take patches that upstream refuse? There's a fix for this via https://github.com/sitsofe/systemd/commit/ee26c33ede684138ba9fdc7f286bfa402860aff3 but upstream have a clear "no more changes will ever be made to systemd provided storage udev rules" rule : https://github.com/systemd/systemd/issues/3943#issuecomment-240982482 . I can attach the patch here if required...

Sitsofe Wheeler (sitsofe) wrote :

Attach patch to solve PMP attached device persistent naming

The attachment "ee26c33ede684138ba9fdc7f286bfa402860aff3.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Dimitri John Ledkov (xnox) wrote :

I am wondering if this should be merged as a distro patch, or not.

WinEunuchs2Unix (ricklee518) wrote :

Just to confirm this year old bug is still around. Ubuntu 16.04.3, Kernel 4.14.4, NVMe Gen 3.0 x 4 M.2 SSD + Legacy 1 TB spinner, 3 NTFS-3G mounts: /mnt/c/, /mnt/d, /mnt/e defined in /etc/fstab.

3 Errors, with 2 info lines in between, reported by `journalctl -b`:
====================================================================

Dec 13 05:52:20 alien systemd[1]: dev-disk-by\x2dpartlabel-Microsoft\x5cx20reserved\x5cx20partition.device: Dev dev-disk-by\x2dpartlabel-Microsoft\x5cx20reserved\x5cx20partition.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:1d.0/0000:3e:00.0/nvme/nvme0/nvme0n1/nvme0n1p3 and /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/target1:0:0/1:0:0:0/block/sda/sda2

Dec 13 05:52:20 alien systemd[1]: dev-disk-by\x2dpartlabel-Basic\x5cx20data\x5cx20partition.device: Dev dev-disk-by\x2dpartlabel-Basic\x5cx20data\x5cx20partition.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:1d.0/0000:3e:00.0/nvme/nvme0/nvme0n1/nvme0n1p4 and /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/target1:0:0/1:0:0:0/block/sda/sda3

Dec 13 05:52:20 alien systemd[1]: Found device HGST_HTS721010A9E630 HGST_Win10.

Dec 13 05:52:20 alien systemd[1]: Mounting /mnt/d...

Dec 13 05:52:20 alien systemd[1]: dev-disk-by\x2dpartlabel-EFI\x5cx20system\x5cx20partition.device: Dev dev-disk-by\x2dpartlabel-EFI\x5cx20system\x5cx20partition.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:1d.0/0000:3e:00.0/nvme/nvme0/nvme0n1/nvme0n1p2 and /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/target1:0:0/1:0:0:0/block/sda/sda1
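
The unit names in these messages are escaped device paths, which makes them hard to read. The sketch below decodes one back into the /dev path it stands for. It follows the general systemd path-escaping convention ("-" stands for "/", and "\xXX" is a hex-escaped character) and is only an approximation written for this thread, not systemd's own implementation; the function name is made up.

```python
import re


def unit_to_path(unit):
    """Turn an escaped .device unit name back into a /dev/... path.

    Assumes plain ASCII escapes, which is all that appears in this report.
    """
    suffix = ".device"
    name = unit[:-len(suffix)] if unit.endswith(suffix) else unit
    # Unescaped dashes separate path components.
    name = name.replace("-", "/")
    # Decode \xXX escapes in a single pass, so a decoded backslash is not
    # itself treated as the start of another escape.
    name = re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )
    return "/" + name


if __name__ == "__main__":
    print(unit_to_path(r"dev-disk-by\x2dpartlabel-EFI\x5cx20system\x5cx20partition.device"))
    print(unit_to_path(r"dev-disk-by\x2dpath-pci\x2d0000:02:00.0\x2data\x2d1\x2dpart1.device"))
```

Decoded this way, the three names in this comment are /dev/disk/by-partlabel links, whereas the one in the original report is a /dev/disk/by-path link.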

Partition Information from `lsblk`
=================================
NAME        FSTYPE LABEL            MOUNTPOINT   SIZE MODEL
sda                                            931.5G HGST HTS721010A9
├─sda4      ntfs   WINRETOOLS                    450M
├─sda2                                           128M
├─sda5      ntfs   Image                        11.4G
├─sda3      ntfs   HGST_Win10       /mnt/d       919G
└─sda1      vfat   ESP                           500M
nvme0n1                                          477G Samsung SSD 960 PRO 512GB
├─nvme0n1p5 ext4   NVMe_Ubuntu_16.0 /           44.6G
├─nvme0n1p3                                       16M
├─nvme0n1p1 ntfs                                 450M
├─nvme0n1p6 swap   Linux Swap       [SWAP]       7.9G
├─nvme0n1p4 ntfs   NVMe_Win10       /mnt/c     414.9G
├─nvme0n1p2 vfat                    /boot/efi     99M
└─nvme0n1p7 ntfs   Shared_WSL+Linux /mnt/e         9G

Norman Henderson (norm-audrey) wrote :

Ladies and Gentlemen, The technical stuff is way over my head but I am getting the same syslog errors and the same inconsistent device paths on an HP Proliant ML110 G7 with Ubuntu 16.04.3 kernel 4.4.0-98-generic.

It seems clear that no-one is taking ownership of this to fix it in an actual update that ordinary people like me can install in the normal course of system updates. The nature of open source software I guess.

However could someone please let me know:
 - is this just an annoying message that won't be fixed, or are there operational implications?
 - if there are implications, are they serious?
 - if they are serious, could you explain (or point me at a resource that explains) in detail, how to install the patch provided. I've never done that before.

Thank you in advance!

I'm not sure of the ramifications of these error messages but I can
confirm they are still there.

rich painter (painterengr) wrote :

I have had this same problem for some time, and it persists on the latest update too.
All of the discs affected are used in a zfs zpool.

uname -a
Linux PEI-Server 4.15.0-45-generic #48~16.04.1-Ubuntu SMP Tue Jan 29 18:03:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

cat /sys/module/zfs/version
0.7.5-1ubuntu16.4

I have a total of 6 discs split equally across 2 eSATA port multipliers. All are part of a single raidz2 pool.

Zillions of these errors:
Mar 3 22:04:48 PEI-Server systemd[1]: dev-disk-by\x2dpath-pci\x2d0000:07:00.0\x2data\x2d1\x2dpart1.device: Dev dev-disk-by\x2dpath-pci\x2d0000:07:00.0\x2data\x2d1\x2dpart1.device appeared twice with different sysfs paths /sys/devices/pci0000:00/0000:00:1c.2/0000:07:00.0/ata3/host2/target2:0:0/2:0:0:0/block/sdd/sdd1 and /sys/devices/pci0000:00/0000:00:1c.2/0000:07:00.0/ata3/host2/target2:1:0/2:1:0:0/block/sde/sde1

I have lots of detailed data if anyone wants to see it.

It would REALLY be nice if a fix trickled down soon...

thanks
rich

thulle (thulle) wrote :

I saw the discussion in zfs-discuss and read a bit for fun.

To answer some old questions, from what I understood from the linked resources:
This bug/deficiency seems to make it impossible to see all drives behind a port multiplier through the links in /dev/disk/by-path/ (other links are unaffected), which makes the drives unaddressable by slot.

This can be fixed by patching systemd. Why this hasn't been done seems to be summarized here:
https://github.com/systemd/systemd/issues/3943#issuecomment-404996399

The patch submitted here seems to be outdated; a later one can be found here:
https://github.com/sitsofe/systemd/commit/ee26c33ede684138ba9fdc7f286bfa402860aff3

IMHO, the easiest (but not easy) way to add this patch in Ubuntu on your own would be to create a PPA and rebuild systemd from Ubuntu with the patch added. Note that you'll be responsible for keeping your systemd package up to date on your own, and that changes can be made upstream that break the patch.

It could be added in the official ubuntu package of systemd, but I'm guessing they don't want to add a patch that adds a naming scheme for drives that later might change when the proper fix mentioned last in the first link is implemented.
