zfsroot fails: FileNotFoundError - [Errno 2] No such file or directory: '/dev/disk/by-id'

Bug #1760879 reported by Andreas Hasenack on 2018-04-03
This bug affects 1 person
Affects  Importance  Assigned to
MAAS     Undecided   Unassigned
curtin   High        Unassigned

Bug Description

curtin 18.1-1-g45564eef-0ubuntu1
maas 2.4.0~beta1-6799-g391e5f16d-0ubuntu1

Selected zfsroot for vda-part1, mounted on /, and got this deployment error:

curtin: Installation started. (18.1-1-g45564eef-0ubuntu1)

third party drivers not installed or necessary.

An error occured handling 'vda-part1_format_zfsroot_pool': FileNotFoundError - [Errno 2] No such file or directory: '/dev/disk/by-id'

[Errno 2] No such file or directory: '/dev/disk/by-id'

curtin: Installation failed with exception: Unexpected error while running command.

Command: ['curtin', 'block-meta', 'custom']

Exit code: 3

Reason: -

Stdout: An error occured handling 'vda-part1_format_zfsroot_pool': FileNotFoundError - [Errno 2] No such file or directory: '/dev/disk/by-id'

        [Errno 2] No such file or directory: '/dev/disk/by-id'

Stderr: ''


Andres Rodriguez (andreserl) wrote :

Hey Andreas,

Can you please attach the curtin config file sent to curtin?

maas <user> machines get-curtin-config <system_id>

Changed in maas:
status: New → Incomplete
Andreas Hasenack (ahasenack) wrote :
Changed in maas:
status: Incomplete → New
Andreas Hasenack (ahasenack) wrote :

This is what the storage looked like when I hit deploy.

Andres Rodriguez (andreserl) wrote :

As agreed with the server team, passing zfsroot as a filesystem should allow it to work exactly the same way as any normal filesystem would.

So from the MAAS perspective, I believe the configuration that was sent is correct. Since curtin does the storage, I'm opening a task against curtin.

Changed in maas:
status: New → Incomplete
milestone: none → 2.4.0beta2
Ryan Harper (raharper) wrote :

I think curtin can help here, but zfsroot requires GPT partitions, this is msdos.
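The GPT requirement mentioned above can be checked against a storage config before deploying. The following is only a minimal sketch, not curtin code: it assumes the config is a list of dicts using the `type`, `ptable`, `device`, `volume` and `fstype` keys, and the helper name and the example ids are hypothetical.

```python
def find_non_gpt_zfsroot(storage_config):
    """Return ids of zfsroot format entries whose backing disk is not GPT."""
    by_id = {item["id"]: item for item in storage_config}
    problems = []
    for item in storage_config:
        if item.get("type") != "format" or item.get("fstype") != "zfsroot":
            continue
        volume = by_id.get(item.get("volume"), {})
        # A partition names its parent disk in "device"; if the volume has
        # no "device" key, treat the volume itself as the disk.
        disk = by_id.get(volume.get("device"), volume)
        if disk.get("ptable") != "gpt":
            problems.append(item["id"])
    return problems


# Hypothetical config resembling the reported layout (msdos table on vda).
example = [
    {"id": "vda", "type": "disk", "ptable": "msdos", "path": "/dev/vda"},
    {"id": "vda-part1", "type": "partition", "device": "vda", "number": 1},
    {"id": "vda-part1_format", "type": "format", "fstype": "zfsroot",
     "volume": "vda-part1"},
]

if __name__ == "__main__":
    print(find_non_gpt_zfsroot(example))  # ['vda-part1_format']
```

An empty list means every zfsroot entry in the config sits on a disk declared with a GPT partition table.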

Can you get the curtin install log?

If you can ssh into the ephemeral node after failure, you can run curtin collect-logs

Changed in curtin:
importance: Undecided → High
status: New → Incomplete
Andreas Hasenack (ahasenack) wrote :

I'm having a hard time doing that:
ubuntu@46-f4:~$ sudo su -
root@46-f4:~# curtin collect-logs

Command 'curtin' not found, but can be installed as:

apt install curtin

root@46-f4:~# Connection to 10.0.5.191 closed by remote host.
Connection to 10.0.5.191 closed.

Before I could check what was going on, the system was shut down. I need to do something in MAAS to prevent that.

Andreas Hasenack (ahasenack) wrote :

This is what I managed to grab by ssh'ing in and tailing /var/log/cloud-init*

If you could share a tip to prevent the shutdown, that would help as I would have more time and would be able to apt install what's missing.

Andreas Hasenack (ahasenack) wrote :

This is what I have in /dev/disk:
ubuntu@46-f4:~$ sudo su -
root@46-f4:~# ls -la /dev/disk
total 0
drwxr-xr-x 4 root root 80 Apr 3 17:25 .
drwxr-xr-x 17 root root 3600 Apr 3 17:25 ..
drwxr-xr-x 2 root root 60 Apr 3 17:25 by-partuuid
drwxr-xr-x 2 root root 160 Apr 3 17:25 by-path
root@46-f4:~# ls -la /dev/disk/by-partuuid/
total 0
drwxr-xr-x 2 root root 60 Apr 3 17:25 .
drwxr-xr-x 4 root root 80 Apr 3 17:25 ..
lrwxrwxrwx 1 root root 10 Apr 3 17:25 9ef12100-01 -> ../../vda1
root@46-f4:~# ls -la /dev/disk/by-path/
total 0
drwxr-xr-x 2 root root 160 Apr 3 17:25 .
drwxr-xr-x 4 root root 80 Apr 3 17:25 ..
lrwxrwxrwx 1 root root 9 Apr 3 17:25 pci-0000:00:06.0 -> ../../vda
lrwxrwxrwx 1 root root 10 Apr 3 17:25 pci-0000:00:06.0-part1 -> ../../vda1
lrwxrwxrwx 1 root root 9 Apr 3 17:25 pci-0000:00:08.0 -> ../../vdb
lrwxrwxrwx 1 root root 9 Apr 3 17:25 virtio-pci-0000:00:06.0 -> ../../vda
lrwxrwxrwx 1 root root 10 Apr 3 17:25 virtio-pci-0000:00:06.0-part1 -> ../../vda1
lrwxrwxrwx 1 root root 9 Apr 3 17:25 virtio-pci-0000:00:08.0 -> ../../vdb
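The same check can be scripted. This is a small sketch, not part of curtin: the function name and the set of expected directory names are assumptions chosen for illustration. It reports which persistent-name directories exist under /dev/disk and what each one contains.

```python
import os

# Directory names this sketch looks for; an assumption, not an exhaustive list.
EXPECTED = ("by-id", "by-path", "by-uuid", "by-partuuid")


def persistent_name_dirs(dev_disk="/dev/disk"):
    """Map each expected by-* directory to its sorted entries, or None if absent."""
    result = {}
    for name in EXPECTED:
        path = os.path.join(dev_disk, name)
        result[name] = sorted(os.listdir(path)) if os.path.isdir(path) else None
    return result


if __name__ == "__main__":
    for name, entries in persistent_name_dirs().items():
        print(name, "MISSING" if entries is None else entries)
```

On the node shown in the listing above, this would report by-id and by-uuid as missing, since only by-partuuid and by-path are present.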

Changed in curtin:
status: Incomplete → New

Ryan Harper (raharper) wrote :

OK,

The issue is that we want to follow ZFS best practices, which means
specifying disks by-id. You don't have those links here because your
disks don't have serial numbers.

That said, I think we could WARN and accept it, though it does mean
the user is at risk if the drive names change.
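The "WARN and accept it" idea can be sketched roughly as follows. This is not the code from the upstream commit; the function name, logger name and warning text are hypothetical. It prefers a /dev/disk/by-id link when one points at the device, and otherwise falls back to the kernel name with a warning instead of raising FileNotFoundError.

```python
import logging
import os

LOG = logging.getLogger("zfsroot-sketch")  # hypothetical logger name


def prefer_by_id(devpath, by_id_dir="/dev/disk/by-id"):
    """Return a by-id link that resolves to devpath, else devpath itself."""
    target = os.path.realpath(devpath)
    try:
        names = sorted(os.listdir(by_id_dir))
    except FileNotFoundError:
        # No by-id directory at all, e.g. disks without serial numbers.
        names = []
    for name in names:
        link = os.path.join(by_id_dir, name)
        if os.path.realpath(link) == target:
            return link
    LOG.warning("no by-id link for %s; using the kernel name, which may "
                "change between boots", devpath)
    return devpath


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(prefer_by_id("/dev/vda1"))
```

The trade-off is the one described above: the fallback lets the install proceed, but a pool created with a kernel name such as /dev/vda1 is exposed to device renaming.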


Ryan Harper (raharper) wrote :

As for collect-logs: on an ephemeral node, you can find curtin here:

/curtin/bin/curtin collect-logs


Andreas Hasenack (ahasenack) wrote :

After adding serial numbers, I can confirm that the deployment works.

ubuntu@46-f4:~$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 4.12G 15.1G 176K /
rpool/ROOT 4.12G 15.1G 176K none
rpool/ROOT/zfsroot 4.12G 15.1G 4.12G /
ubuntu@46-f4:~$ mount -t zfs
rpool/ROOT/zfsroot on / type zfs (rw,relatime,xattr,noacl)
ubuntu@46-f4:~$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 4.12G 15.1G 176K /
rpool/ROOT 4.12G 15.1G 176K none
rpool/ROOT/zfsroot 4.12G 15.1G 4.12G /
ubuntu@46-f4:~$ zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
rpool 19.9G 4.12G 15.8G - - 20% 1.00x ONLINE -
ubuntu@46-f4:~$ zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using a legacy on-disk format. The pool can
 still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
 pool will no longer be accessible on software that does not support
 feature flags.
  scan: none requested
config:

 NAME STATE READ WRITE CKSUM
 rpool ONLINE 0 0 0
   virtio-111-part1 ONLINE 0 0 0

errors: No known data errors

ubuntu@46-f4:~$ ll /dev/disk/by-id/
total 0
drwxr-xr-x 2 root root 100 Apr 3 18:14 ./
drwxr-xr-x 8 root root 160 Apr 3 18:14 ../
lrwxrwxrwx 1 root root 9 Apr 3 18:14 virtio-111 -> ../../vda
lrwxrwxrwx 1 root root 10 Apr 3 18:14 virtio-111-part1 -> ../../vda1
lrwxrwxrwx 1 root root 9 Apr 3 18:14 virtio-222 -> ../../vdb

I'll leave it up to you guys to decide if this should be a supported scenario or not.

Ryan Harper (raharper) on 2018-04-04
Changed in curtin:
status: New → In Progress
Changed in maas:
milestone: 2.4.0beta2 → 2.4.0beta3
Ryan Harper (raharper) wrote :

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/curtin/commit/?id=05de7810

Changed in curtin:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.4.0beta3 → 2.4.0beta4
Changed in maas:
milestone: 2.4.0beta4 → 2.4.x

This bug is believed to be fixed in curtin in version 18.1-17-gae48e86f-0ubuntu1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in curtin:
status: Fix Committed → Fix Released
yinxingpan (yinxingpan) wrote :

On 2.4.0beta2 I got the same error when deploying CentOS 7 in UEFI mode:

error: local disk .....,
error: no such device: /efi/ubuntu/grubx64.efi.
error: File not found.

Press any key to continue .......

I don't know why. Everything seems OK and no error is reported, but the machine cannot boot.

Why does a CentOS 7 UEFI deployment hit this error (ubuntu/grubx64.efi)?

When I switched to RHEL in UEFI mode, the error went away and the machine booted fine.

I would appreciate any advice.

yinxingpan (yinxingpan) wrote :

I pressed e, changed /efi/ubuntu/grubx64.efi to /efi/centos/grubx64.efi, and pressed F10. It can boot then,

but it cannot find /dev/disk/by-uuid/.

I checked the devices, and the UUID device does not exist.

Both HP and Lenovo machines show this error.
