subiquity fails to handle a large burst of udev events

Bug #2009141 reported by bugproxy
This bug affects 4 people
Affects                    Status         Importance   Assigned to             Milestone
Ubuntu on IBM z Systems    Fix Released   High         Skipper Bug Screeners
subiquity (Ubuntu)         Fix Released   High         Dan Bungert

Bug Description

Subiquity, when faced with a large burst of udev events, may fail to process them in a timely fashion, which can result in a traceback from pyudev related to filled event buffers. For a reproducer, please see the "many-partitions" attachment.

Related pyudev issue: https://github.com/pyudev/pyudev/issues/194

There also appears to be a udev-related issue on Mantic that makes this issue more likely to be seen: LP: #2037569
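
The traceback further down shows `_udev_event` calling `receive_device()`, which appears to read one event per callback, so a burst can outpace the reader and fill the monitor's socket buffer. As a minimal sketch of one possible mitigation, the reader could drain everything that is pending on each wakeup using `poll(timeout=0)`, which pyudev's `Monitor` documents as non-blocking. `drain_events`, `FakeMonitor`, and the `limit` parameter are hypothetical names used for illustration only, and this is not necessarily the change that resolved this bug.

```python
from collections import deque


class FakeMonitor:
    """Hypothetical stand-in for pyudev.Monitor so the sketch runs without udev."""

    def __init__(self, events):
        self._queue = deque(events)

    def poll(self, timeout=None):
        # Mirrors the documented pyudev contract: return the next device,
        # or None when nothing is pending within the timeout.
        return self._queue.popleft() if self._queue else None


def drain_events(monitor, limit=1000):
    """Collect pending events without blocking, at most `limit` per call."""
    events = []
    while len(events) < limit:
        device = monitor.poll(timeout=0)
        if device is None:
            break
        events.append(device)
    return events


monitor = FakeMonitor(["sda", "sdb", "sdc"])
first_batch = drain_events(monitor, limit=2)   # bounded batch
second_batch = drain_events(monitor)           # whatever is left
```

With a real monitor, pyudev also provides `Monitor.set_receive_buffer_size()` to enlarge the socket buffer. That only raises the threshold at which a burst overflows, so it would complement draining rather than replace it.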

---

Original bug description:

Bug Description:
Installation of Ubuntu 22.04 on s390x failed with an unknown error just after having successfully activated a zfcp HBA with Fibre-Channel-attached SCSI disks.

I do see 0.0.100d successfully being online and having paths attached:
                   zfcp-host
                   0.0.100d online ?
                   0x500173800cef0111:0x0000000000000000 sg16
                   0x500173800cef0111:0x0001000000000000 sdp sg17
                   0x500173800cef0111:0x0002000000000000 sdq sg18
[host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt lines 41-41/9373 byte 238696/1146881 21%]

But immediately after that, the installer reports an error:

                   An error occurred during installation

+------------------------------------------------------------------------+
| subiquity/Early/apply_autoinstall_config
| subiquity/Reporting/apply_autoinstall_config
| subiquity/Error/apply_autoinstall_config
| subiquity/Userdata/apply_autoinstall_config
| subiquity/Package/apply_autoinstall_config
| subiquity/Debconf/apply_autoinstall_config
| subiquity/Kernel/apply_autoinstall_config
| subiquity/Late/apply_autoinstall_config
+------------------------------------------------------------------------+
| Sorry, an unknown error occurred.                                      |
| Information is being collected from the system that will help the     |
| developers diagnose the report.                                        |

It looks like there was some ascii art progress bar while data was collected and then the dialog updated to:

| [ View full report ]                                                   |
| If you want to help improve the installer, you can send an error      |
| report.                                                                |
| [ Send to Canonical ]                                                  |
| [ Close report ]                                                       |

Next, "View full report" was selected:

ProblemType: Bug
Architecture: s390x
CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
CurrentDmesg:
 [ 0.093997] Linux version 5.15.0-43-generic (buildd@bos02-s390x-005) (gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #46-Ubuntu SMP Tue Jul 12 12:40:17 UTC 2022 (Ubuntu 5.15.0-43.46-generic 5.15.39)
 [ 0.094001] setup: Linux is running as a z/VM guest operating system in 64-bit mode

 [ 0.360759] Kernel command line: ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25 https://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04.1-live-server-s39

 [ 2677.003438] zfcp 0.0.100d: qdio: ZFCP on SC 10 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
 [ 2677.032397] scsi host0: zfcp
 [ 2677.060025] scsi 0:0:0:0: RAID IBM 2810XIV-LUN-0 0000 PQ: 0 ANSI: 5
 [ 2677.061097] scsi 0:0:0:0: alua: disable for non-disk devices
 [ 2677.061134] scsi 0:0:0:0: Attached scsi generic sg0 type 12
 [ 2677.062988] scsi 0:0:0:1: Direct-Access IBM 2810XIV 0000 PQ: 0 ANSI: 5
 [ 2677.064628] scsi 0:0:0:1: alua: supports implicit TPGS
 [ 2677.064632] scsi 0:0:0:1: alua: device naa.6001738cfc900cef0000000000013596 port group 0 rel port 301
 [ 2677.064679] sd 0:0:0:1: Attached scsi generic sg1 type 0

==> zfcp could successfully set 0.0.100d online and automatic LUN scan worked.

 [ 2677.513028] sd 0:0:16:1074020357: [sdac] Attached SCSI disk
 [ 2677.522641] sd 0:0:16:1074020356: [sdab] Attached SCSI disk
 [ 2677.522677] sd 0:0:15:1074020357: [sdaa] Attached SCSI disk
 [ 2677.522693] sd 0:0:15:1074020356: [sdz] Attached SCSI disk
 [ 2678.660373] device-mapper: multipath service-time: version 0.3.0 loaded
Date: Wed Jan 18 21:04:03 2023
DistroRelease: Ubuntu 22.04
ExecutablePath: /snap/subiquity/3699/lib/python3.8/site-packages/subiquity/cmd/server.py
InstallerServerLog:
 2023-01-18 20:23:57,901 INFO subiquity:112 Starting Subiquity server revision 3699

 2023-01-18 21:02:53,381 DEBUG root:39 start: subiquity/Zdev/GET:
 2023-01-18 21:02:53,381 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
 2023-01-18 21:02:53,405 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
 2023-01-18 21:02:53,406 DEBUG root:39 finish: subiquity/Zdev/GET: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
 2023-01-18 21:02:53,414 INFO aiohttp.access:233 [18/Jan/2023:21:02:53 +0000] "GET /zdev HTTP/1.1" 200 2059 "-" "Python/3.8 aiohttp/3.6.2"
 2023-01-18 21:03:06,492 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-01-18 21:03:06,534 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
 2023-01-18 21:03:06,534 DEBUG probert.network:585 event for addr_change: CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
 2023-01-18 21:03:06,534 DEBUG probert.network:717 addr_change CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
 2023-01-18 21:03:06,534 DEBUG root:39 start: subiquity/Network/_send_update: CHANGE ence0f
 2023-01-18 21:03:06,534 DEBUG subiquity.server.controllers.network:354 dev_info ence0f {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
 2023-01-18 21:03:06,535 DEBUG root:39 finish: subiquity/Network/_send_update: SUCCESS: CHANGE ence0f
 2023-01-18 21:03:19,283 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-01-18 21:03:19,314 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
 2023-01-18 21:03:19,314 DEBUG probert.network:585 event for addr_change: CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
 2023-01-18 21:03:19,314 DEBUG probert.network:717 addr_change CHANGE {'ifindex': 2, 'flags': 768, 'family': 10, 'scope': 0, 'local': b'2002:90b:e006:116:76:54ff:fe00:5/64'}
 2023-01-18 21:03:19,314 DEBUG root:39 start: subiquity/Network/_send_update: CHANGE ence0f
 2023-01-18 21:03:19,314 DEBUG subiquity.server.controllers.network:354 dev_info ence0f {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
 2023-01-18 21:03:19,314 DEBUG root:39 finish: subiquity/Network/_send_update: SUCCESS: CHANGE ence0f
 2023-01-18 21:03:59,037 DEBUG root:39 start: subiquity/Zdev/chzdev_POST:
 2023-01-18 21:03:59,038 DEBUG subiquitycore.utils:92 arun_command called: ['chzdev', '--enable', '0.0.100d']
 2023-01-18 21:03:59,357 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-01-18 21:03:59,373 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 1
 2023-01-18 21:03:59,373 DEBUG subiquity.server.controller.filesystem:495 waiting 0.1 to let udev event queue settle
... <<< some time with repeated udevadm settle >>> ...
 2023-01-18 21:04:02,984 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-01-18 21:04:02,986 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 1
 2023-01-18 21:04:02,986 DEBUG subiquity.server.controller.filesystem:495 waiting 0.1 to let udev event queue settle
 2023-01-18 21:04:03,029 DEBUG subiquitycore.utils:101 arun_command ['chzdev', '--enable', '0.0.100d'] exited with code 0
 2023-01-18 21:04:03,029 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
 2023-01-18 21:04:03,047 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
 2023-01-18 21:04:03,052 DEBUG root:39 finish: subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
 2023-01-18 21:04:03,052 INFO aiohttp.access:233 [18/Jan/2023:21:03:59 +0000] "POST /zdev/chzdev?action=%22enable%22&zdev=%7B%22id%22:+%220.0.100d%22,+%22type%22:+%22zfcp-host%22,+%22on%22:+false,+%22exists%22:+true,+%22pers%22:+false,+%22auto%22:+false,+%22failed%22:+false,+%22names%22:+%22%22%7D HTTP/1.1" 200 7956 "-" "Python/3.8 aiohttp/3.6.2"
 2023-01-18 21:04:03,087 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-01-18 21:04:03,114 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
 2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
 2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
 Traceback (most recent call last):
   File "/snap/subiquity/3699/usr/lib/python3.8/asyncio/events.py", line 81, in _run
     self._context.run(self._callback, *self._args)
   File "/snap/subiquity/3699/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 506, in _udev_event
     action, dev = self._monitor.receive_device()
   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 397, in receive_device
     device = self.poll()
   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 357, in poll
     if eintr_retry_call(poll.Poll.for_events((self, 'r')).poll, timeout):
   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_util.py", line 163, in eintr_retry_call
     return func(*args, **kwargs)
   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 97, in poll
     return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 112, in _parse_events
     raise IOError('Error while polling fd: {0!r}'.format(fd))
 OSError: Error while polling fd: 20
 2023-01-18 21:04:03,116 DEBUG subiquitycore.common.errorreport:384 generating crash report
 2023-01-18 21:04:03,116 INFO subiquitycore.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1674075843.116781473.unknown.crash
 2023-01-18 21:04:03,117 INFO root:39 start: subiquity/ErrorReporter/1674075843.116781473.unknown/add_info:
 2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-01-18 21:04:03,117 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-01-18 21:04:03,118 INFO root:39 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-01-18 21:04:03,118 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
InstallerServerLogInfo:

InterpreterPath: /snap/subiquity/3699/usr/bin/python3.8

UdevDb:
...
 P: /devices/css0/0.0.0010/0.0.100d
 L: 0
 E: DEVPATH=/devices/css0/0.0.0010/0.0.100d
 E: SUBSYSTEM=ccw
 E: DRIVER=zfcp
 E: CU_TYPE=1731
 E: CU_MODEL=03
 E: DEV_TYPE=1732
 E: DEV_MODEL=03
 E: MODALIAS=ccw:t1731m03dt1732dm03
...
 P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-0/target0:0:0/0:0:0:1/block/sda
 N: sda
 L: 0
 S: disk/by-path/ccw-0.0.100d-fc-0x500173800cef0131-lun-1
 S: disk/by-id/scsi-0IBM_2810XIV_host=ilabg13_tuc3_fcp_32G
 S: disk/by-id/scsi-36001738cfc900cef0000000000013596
 S: disk/by-id/wwn-0x6001738cfc900cef0000000000013596
 S: disk/by-uuid/da904d37-e306-4223-8fbb-9670357ca708
 S: disk/by-id/scsi-SIBM_2810XIV_6000CEF0000000000013596
 S: disk/by-id/scsi-1IBM_2810XIV_6000CEF0000000000013596
 ...
 E: DM_MULTIPATH_DEVICE_PATH=1
...
 P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-17/target0:0:17/0:0:17:1/block/sdae
 N: sdae
 L: 0
 S: disk/by-id/scsi-1IBM_FlashSystem-984026c6702d26c6-0000-007d-000294
 S: disk/by-id/scsi-SIBM_FlashSystem-9840_26c6702d26c6-0000-007d-000294
 S: disk/by-path/ccw-0.0.100d-fc-0x500507605e8b7271-lun-1
 S: disk/by-uuid/97c7452a-2f45-4f7b-9127-d63b4a34e1e4
 S: disk/by-id/scsi-36005076b19bcb49b180000007d000294
 S: disk/by-id/wwn-0x6005076b19bcb49b180000007d000294
 E: DEVPATH=/devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-17/target0:0:17/0:0:17:1/block/sdae
 E: SUBSYSTEM=block
 E: DEVNAME=/dev/sdae
 E: DEVTYPE=disk
 E: DISKSEQ=43
 E: MAJOR=65
 E: MINOR=224
 E: USEC_INITIALIZED=2677514875
 E: DM_MULTIPATH_DEVICE_PATH=1
...
 P: /devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-9/target0:0:9/0:0:9:1/block/sdo
 N: sdo
 L: 0
 S: disk/by-path/ccw-0.0.100d-fc-0x500507680b2541ba-lun-1
 S: disk/by-id/wwn-0x6005076400818089b000000000000157
 S: disk/by-id/scsi-36005076400818089b000000000000157
 S: disk/by-id/scsi-SIBM_2145_010020c0226cXX00
 S: disk/by-uuid/0fbe15bc-49fb-4a09-b9f2-269129fa9913
 E: DEVPATH=/devices/css0/0.0.0010/0.0.100d/host0/rport-0:0-9/target0:0:9/0:0:9:1/block/sdo
 E: SUBSYSTEM=block
 E: DEVNAME=/dev/sdo
 E: DEVTYPE=disk
 E: DISKSEQ=27
 E: MAJOR=8
 E: MINOR=224
 E: USEC_INITIALIZED=2677616364
 E: DM_MULTIPATH_DEVICE_PATH=1

==> Probably the last discovered SCSI LUN.

 P: /devices/virtual/block/dm-0
 N: dm-0
 L: 50
 S: disk/by-id/scsi-360050762198c1fc2180000000b000132
 S: disk/by-id/wwn-0x60050762198c1fc2180000000b000132
 S: disk/by-id/dm-name-mpatha
 S: disk/by-id/dm-uuid-mpath-360050762198c1fc2180000000b000132
 S: mapper/mpatha
 E: DEVPATH=/devices/virtual/block/dm-0
 E: SUBSYSTEM=block
 E: DEVNAME=/dev/dm-0
 E: DEVTYPE=disk
 E: DISKSEQ=44
 E: MAJOR=253
 E: MINOR=0
 E: USEC_INITIALIZED=2678671661
 E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
 E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
 E: DM_ACTIVATION=1
 E: DM_NAME=mpatha
 E: DM_UUID=mpath-360050762198c1fc2180000000b000132
 E: DM_SUSPENDED=0
 E: DM_UDEV_RULES=1
 E: DM_UDEV_RULES_VSN=2
 E: MPATH_SBIN_PATH=/sbin
 E: MPATH_DEVICE_READY=1

==> Multipath devices also got assembled.

Canonical, why did the installer get an error?

Does the installer really have a busy(!) waiting loop calling udevadm settle with zero timeout?
But even if so, with the number of discovered devices and the settle finally returning with success errorlevel 0, it should just work?
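
The log excerpt above suggests a polling pattern: run `udevadm settle -t 0`, and on a non-zero exit code wait 0.1 seconds before trying again. A minimal sketch of that pattern follows. It is reconstructed from the log lines only and is not taken from subiquity's source; `wait_for_settle`, `run_settle`, `max_attempts`, and the fake runner are hypothetical names, and the command runner is injected so the sketch runs without udev.

```python
import asyncio


async def wait_for_settle(run_settle, delay=0.1, max_attempts=50):
    """Poll until run_settle() returns exit code 0; return the attempts used.

    run_settle is an async callable returning the exit code of
    `udevadm settle -t 0` (0 means the udev event queue is empty).
    """
    for attempt in range(1, max_attempts + 1):
        if await run_settle() == 0:
            return attempt
        # Sleeping yields to the event loop, so this is periodic polling
        # rather than a CPU-bound busy wait.
        await asyncio.sleep(delay)
    raise TimeoutError("udev event queue did not settle")


async def real_settle():
    """Assumed real runner: executes udevadm and returns its exit code."""
    proc = await asyncio.create_subprocess_exec("udevadm", "settle", "-t", "0")
    return await proc.wait()


def make_fake_settle(exit_codes):
    """Build a stand-in runner that replays the given exit codes."""
    codes = iter(exit_codes)

    async def fake_settle():
        return next(codes)

    return fake_settle


# Two "queue busy" results followed by success, as in the log excerpt.
attempts = asyncio.run(wait_for_settle(make_fake_settle([1, 1, 0]), delay=0.01))
```

A successful settle only says that udev's own queue is empty. The installer's pyudev monitor reads from a separate socket, so presumably it can still have overflowed during the burst even though settle eventually returned 0.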

Revision history for this message
bugproxy (bugproxy) wrote : host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-201751 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

Please can you check if there is a crash file in /var/crash and if so share this
and ideally the entire /var/log folder (or at least /var/log/installer)? Thx.
We don't have a 32Gbit FCP adapter to test with.

affects: linux (Ubuntu) → subiquity (Ubuntu)
Changed in subiquity (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → nobody
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-03-03 12:55 EDT-------
(In reply to comment #11)
> Please can you check if there is a crash file in /var/crash and if so share
> this
> and ideally the entire /var/log folder (or at least /var/log/installer)? Thx.
> We don't have a 32Gbit FCP adapter to test with.

In response to request for /var/crash and /var/log folder files.

It is the subiquity installer that exits with an error.

The failure occurs right after 0.0.100d comes online, i.e.
Sorry, an unknown error occurred

There are no /var/crash or /var/log folders or files at this point, since the subiquity installer exited with an error.

Revision history for this message
bugproxy (bugproxy) wrote : subiquity ssh terminal text log from the script tool

------- Comment on attachment From <email address hidden> 2023-03-03 13:04 EDT-------

not sure the existing attachment of the installer console log was mirrored, so I attach it again

Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

Yes, I understood that the installer is crashing here
and that the ssh connection might have been dropped.
But one can usually reconnect via ssh,
go via the top-right menu to the installer shell (or I think via Alt-F2),
and navigate to the folders /var/crash and /var/log and even pack and scp them - that was the hope.

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-03-13 08:25 EDT-------
(In reply to comment #11)
> We don't have a 32Gbit FCP adapter to test with.

I doubt this is related to a particular hardware as the HBAs have the same programming interface to the zfcp device driver.

(In reply to comment #15)
> Yes, I understood that the installer is crashing here
> and that the ssh connection might have been dropped.

I'm not sure the ssh connection dropped, as I see the installer TUI and installer debug data after the reported error, so I suppose the connection remained.

> But one can usually reconnect via ssh,
> go via the top-right menu to the installer shell

>(or I think via Alt-F2),

There are no Linux Virtual Terminals on s390, so I guess that key combo won't work.

> and navigate to the folders /var/crash and /var/log and even pack and scp
> them - that was the hope.

(In reply to comment #6)
> Installation of Ubuntu 22.04 on s390x failed with an unknown error just
> after having successfully activated a zfcp HBA with Fibre-Channel-attached
> SCSI disks.
>
> I do see 0.0.100d successfully being online and having paths attached:

> [host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt lines 41-41/9373
> byte 238696/1146881 21%]

The attachment should already contain a lot of debug data.

It was collected with the script tool recording the ssh session during the installation attempt. Unfortunately, there is no timing information, so "scriptreplay" does not work. The best approach I could come up with is using "less -R" to at least render the ANSI color escape sequences instead of cluttering the output with escaped escape sequences. With my terminal resized to 167 columns (found by trial and error), it is somewhat readable, though with a column offset, and it is very slow to render with massive CPU consumption because one very long line is continuously re-wrapped by the less pager.

> But immediately after that, the installer reports an error:
>
> An error occurred during installation

> ??????????????????????????????????????????????????????????????????????????
> ? Sorry, an unknown error occurred. ?
> ? Information is being collected from the system that will help the ?
> ? developers diagnose the report. |

> ? [ View full report ] ?
> ? If you want to help improve the installer, you can send an error ?
> ? report. ?
> ? [ Send to Canonical ] ?
> ? [ Close report ] |
>
> Next "View full report" was selected?

The attached file contains the installer debug data starting at line 42.

I hope this is the same information as would be in "Send to Canonical", which is not necessarily an option on s390 potentially not having a sufficient internet connection for security reasons.

> ProblemType: Bug
> Architecture: s390x
> CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
> CurrentDmesg:

> Date: Wed Jan 18 21:04:03 2023
> DistroRelease:...

Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

Hi, yes, I also do not really believe that it's related to the 32Gbit adapters themselves.

Ah, ok, your ssh sessions stayed up, I see.
(Key shortcuts like Alt-F2 work fine in the installer ssh shell, but they are not needed in this case since you got dropped to the installer shell anyway.)

Well, since the installer is cross-platform, the "Send to Canonical" option exists as usual,
knowing that the network is not always in place (esp. on s390x)...

Yes, /var/crash/1674075843.116781473.unknown.crash can provide additional information.

The Launchpad bug got marked as affecting 'subiquity' (the installer) and subiquity developers got subscribed to it.

With this data I now noticed that the installer package version is pretty old (and outdated):
SnapRevision: 3699
SnapVersion: 22.07.2
(current is: subiquity - 4383 - 23.02.1)
as well as the kernel version:
Uname: Linux 5.15.0-43-generic s390x
(current is 5.15.0-60-generic)
which makes me think that an outdated ISO image was used (maybe 22.04.1 ?)

Well, please note that a new 22.04 "point-release" makes all previous ones obsolete
AND includes updated installers (and kernel).
So the latest supported image is the 22.04.2:
https://cdimage.ubuntu.com/releases/22.04/release/ubuntu-22.04.2-live-server-s390x.iso

So I recommend giving it another try with this up-to-date image (updated kernel and updated installer).

Changed in subiquity (Ubuntu):
importance: Undecided → High
Changed in ubuntu-z-systems:
importance: Undecided → High
Revision history for this message
Dan Bungert (dbungert) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

A retest with 22.04.2 is a good idea.
If that fails similarly, I would appreciate a tarball of the contents of /var/log/installer.

Changed in subiquity (Ubuntu):
status: New → Incomplete
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : ubuntu-22.04.2_installer_04052023_1330_mst_1.txt

------- Comment on attachment From <email address hidden> 2023-04-05 18:23 EDT-------

Issue still occurs on ubuntu-22.04.2_installer
Linux script file ubuntu-22.04.2_installer_04052023_1330_mst_1.txt with output from ssh installer@<ip address>

The error occurs after selecting zfcp-host 0.0.100d and choosing Enable.

An OSError crash report was saved to /var/crash/1680726785.177204132.unknown.crash

Revision history for this message
bugproxy (bugproxy) wrote : View_full_report_1680726785.177204132.unknown.crash.txt

------- Comment on attachment From <email address hidden> 2023-04-05 18:28 EDT-------

View full report output copied to file View_full_report_1680726785.177204132.unknown.crash.txt

The next "ssh installer@<ip address>" then returns "OSError: [Errno 28] No space left on device",
i.e.

Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-60-generic s390x)

 * Documentation: https://help.ubuntu.com
 * Management: https://landscape.canonical.com
 * Support: https://ubuntu.com/advantage

 * Introducing Expanded Security Maintenance for Applications.
   Receive updates to over 25,000 software packages with your
   Ubuntu Pro subscription. Free for personal use.

     https://ubuntu.com/pro

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status

The list of available updates is more than a week old.
To check for new updates run: sudo apt update

Last login: Wed Apr 5 21:44:58 2023 from 9.11.56.94
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 126, in main
    logger.info("Starting Subiquity revision {}".format(version))
Message: 'Starting Subiquity revision 4383'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 126, in main
    logger.info("Starting Subiquity revision {}".format(version))
Message: 'Starting Subiquity revision 4383'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28]...

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Incomplete → Triaged
Changed in subiquity (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

Thx for attaching the logs and the crash report, we'll investigate ...

What I'm wondering about are the "OSError: [Errno 28] No space left on device" messages. Is there an issue with the FCP/SCSI LUN (its size), or with the options for writing to it?

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-06 11:49 EDT-------
The installer reported
zfcp-host
0.0.100d --> Enable action by user
0.0.110d
0.0.120d
0.0.130d

The installer screen then shows "Updating..." after the user enables the zfcp-host.

Installer screen sequence:
0.0.100d -> Enable -> "Updating..." ---> list of SCSI LUNs reported ---> "An error occurred during installation"
--
The output displayed a list of SCSI LUNs on zfcp-host 0.0.100d just before showing the "An error occurred during installation" screen.

i.e.
zfcp-host

0x500173800cef0131:0x0000000000000000 sg5
0x500173800cef0131:0x0001000000000000
0x500173800cef013
0000000000 sdg sg7
0x500507605ebff1f1:0x0000000000000000 sde sg4
0x500507630600d6d3:0x4001404300000000 sda sg0
0x500507630600d6d3:0x4004402f00

0x500507680b2541ba:0x0000000000000000 sdl sg12
0x50050768101702e1:0x0000000000000000 sdh sg8
0x50050768101702e1:0x0001000000000000 sdi sg9
0x5005076810170cc9:0x0000000000000000 sdj sg10
0.0.110d ?
0.0.120d ?
0.0.130d

The log file shows chzdev and lszdev being issued for 0.0.100d

i.e.
2023-04-05 20:33:00,208 DEBUG subiquitycore.utils:92 arun_command called: ['chzdev', '--enable', '0.0.100d']
2023-04-05 20:33:01,062 DEBUG subiquitycore.utils:101 arun_command ['chzdev', '--enable', '0.0.100d'] exited with code 0
2023-04-05 20:33:01,062 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
2023-04-05 20:33:01,129 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
2023-04-05 20:33:01,133 DEBUG root:37 finish: subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
2023-04-05 20:33:01,134 INFO aiohttp.access:233 [05/Apr/2023:20:33:00 +0000] "POST /zdev/chzdev?action=%22enable%22&zdev=%7B%22id%22:+%220.0.100d%22,+%22type%22:+%22zfcp-host%22,+%22on%22:+false,+%22exists%22:+true,+%22pers%22:+false,+%22auto%22:+false,+%22failed%22:+false,+%22names%22:+%22%22%7D HTTP/1.1" 200 4435 "-" "Python/3.8 aiohttp/3.6.2"

Later reports 2023-04-05 20:33:05,174 ERROR subiquity.server.server:424 top level error

i.e.

2023-04-05 20:33:05,174 ERROR subiquity.server.server:424 top level error
Traceback (most recent call last):
File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 81, in _run
self._context.run(self._callback, *self._args)
File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 682, in _udev_event
action, dev = self._monitor.receive_device()
File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 400, in receive_device
device = self.poll()
File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 358, in poll
if eintr_retry_call(poll.Poll.for_events((self, "r")).poll, timeout):
File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_uti...
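
The traceback above originates in a callback that reads one event per invocation via receive_device(). One general way to cope with a burst is to drain everything that is already queued each time the monitor becomes readable. The following is only a minimal sketch under assumptions, not subiquity's actual fix: `drain_events` and `max_batch` are hypothetical names, and `poll` stands for any callable that returns the next queued event or `None` (with pyudev this could plausibly be `lambda: monitor.poll(timeout=0)`).

```python
from collections import deque
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def drain_events(poll: Callable[[], Optional[T]], max_batch: int = 1000) -> Iterator[T]:
    """Yield queued events until poll() returns None or max_batch is reached.

    max_batch (hypothetical) bounds the work done per wake-up so that one
    huge burst cannot starve the rest of the event loop.
    """
    for _ in range(max_batch):
        event = poll()
        if event is None:
            return
        yield event


if __name__ == "__main__":
    # Stand-in for a monitor that already has three events queued.
    queue = deque(["add sdp", "add sdq", "change sdp"])
    print(list(drain_events(lambda: queue.popleft() if queue else None)))
```

Capping the batch size is a design choice: any remaining events stay queued for the next wake-up instead of blocking other work.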

Revision history for this message
bugproxy (bugproxy) wrote : subiquity ssh terminal text log from the script tool

------- Comment on attachment From <email address hidden> 2023-03-03 13:04 EDT-------

not sure the existing attachment of the installer console log was mirrored, so I attach it again

Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

I tried to re-create this (or at least a situation that is similar) on our system, but I have to admit that I do not have the exact same hardware.
But I have a PR/SM system (not a DPM box) with FICON Express 16S (no 32s, but the driver is the same for both) and a DS8000 disk storage sub-system. I tried that on a z/VM guest that has one 64Gbit LUN attached via two HBAs each with two paths. (DASD ECKD devices were also available, but not used at all.)
With that I could successfully complete an Ubuntu Server 22.04.2 installation.
Please see the attached doc for the relevant storage related installer screens and at the end the lszdev output from the installer shell.

Unfortunately I don't have an XIV storage system (since I know that it behaves slightly different compared to a DS8k).

And you already retried on 22.04.2 with the same result you've reported.

And you did a (standard) interactive install (and no 'autoinstall'), right?

May I also ask if your system is a DPM system?
And if so, could you please check whether auto-configuration data is set, e.g. with 'lszdev --auto-conf'?

I have also faced a special situation where a disk had been used previously and still had an LVM config on it that the installer tried to read but struggled with, or where a very old LVM existed whose metadata had (meanwhile) become incompatible.
To rule out any issues like this, I also recommend manually wiping the disk:
Start the installer (which fortunately is an Ubuntu live system), enable the FCP LUN (either in the UI or in an installer shell, which can be reached via the help menu or Control-Z, respectively 'F2') and wipe the disks like:
# ls -la /dev/mapper/
control mpatha mpatha-part1
# wipefs -a -f /dev/mapper/mpatha-part1
# wipefs -a -f /dev/mapper/mpatha
( # or go via the scsi device:
# wipefs -a -f /dev/sda
# wipefs -a -f /dev/sda1 )
Afterwards the installer needs to be restarted from scratch (Load task).

Revision history for this message
bugproxy (bugproxy) wrote : host_installer_shell_cmds_04062023_1.txt

------- Comment on attachment From <email address hidden> 2023-04-06 17:43 EDT-------

Reference attached file host_installer_shell_cmds_04062023_1.txt on DMP details, from the shell for command executed in shell

Note: after chzdev --enable, 1969 files were generated in /var/crash, filling up /
i.e.
ls -l /var/crash
chzdev --enable 0.0.100d
lszdev
ls -l | grep unknown | wc -l
1969

i.e.
  45590 Apr 6 12:17 1680808641.142762184.unknown.crash.gz
     90 Apr 6 12:17 1680808641.142762184.unknown.meta.gz
  45751 Apr 6 12:17 1680808641.190968275.unknown.crash.gz
     77 Apr 6 12:17 1680808641.190968275.unknown.meta.gz
.
.
.
 142575 Apr 6 12:18 1680808725.929860353.unknown.crash.gz
     77 Apr 6 12:18 1680808725.929860353.unknown.meta.gz
 162995 Apr 6 12:18 1680808726.069142342.unknown.crash.gz
     77 Apr 6 12:18 1680808726.069142342.unknown.meta.gz
 141831 Apr 6 12:18 1680808726.164782763.unknown.crash.gz
     77 Apr 6 12:18 1680808726.164782763.unknown.meta.gz
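
The `ls -l | grep unknown | wc -l` count above does not show how much space those files take. A minimal sketch that reports both the count and the total size follows; `summarize_crash_files` and its default glob pattern are hypothetical helpers written for this listing, not part of subiquity or apport.

```python
from pathlib import Path


def summarize_crash_files(crash_dir: str, pattern: str = "*.unknown.*"):
    """Return (file_count, total_bytes) for files in crash_dir matching pattern."""
    files = [p for p in Path(crash_dir).glob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # /var/crash is the directory listed in this comment.
    count, total = summarize_crash_files("/var/crash")
    print(f"{count} files, {total / 1024:.1f} KiB")
```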

Includes
lszdev --auto-conf
multipath -l
lszfcp
fdisk -l /dev/mapper/xxxxx
ls -l /var/crash | wc -l <=============== Quantity 1969 files generated filling up /

root@ubuntu-server:~# df -T
Filesystem Type 1K-blocks Used Available Use% Mounted on
tmpfs tmpfs 3294672 305604 2989068 10% /run
/dev/loop0 iso9660 1149188 1149188 0 100% /cdrom
/cow overlay 16473348 16473348 0 100% /
overlay overlay 292992 292992 0 100% /media/filesystem
tmpfs tmpfs 16473348 0 16473348 0% /dev/shm
tmpfs tmpfs 5120 0 5120 0% /run/lock
tmpfs tmpfs 16473348 0 16473348 0% /tmp
tmpfs tmpfs 3294668 4 3294664 1% /run/user/1000
overlay overlay 292992 292992 0 100% /tmp/tmpcsrrjbgt/root.dir

In response to
> try manually wiping out the disk that have old LVM

Some of the luns are older OS Boot luns similar to lun mpathb (Boot lun for ubuntu 20.04).

Wiping these luns would not be an option as they are needed for ongoing tests and/or support

Will use a work-around to unmap such luns from the host during a new OS install so a new lun (i.e. for 20.04.2) can be installed

Another OS Boot lun is 20.04.1 on zfcp-host 120d/130d [not enabled for latest run]

Boot lun for ubuntu 20.04
lun mpathb (360050762198c1fc2180000000b000132) = (Boot lun for ubuntu 20.04)

mpathb (360050762198c1fc2180000000b000132) dm-1 IBM,FlashSystem-9840
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  |- 0:0:1:0 sdc 8:32 active undef running
  `- 0:0:4:0 sdi 8:128 active undef running

fdisk -l /dev/mapper/mpathb
Disk /dev/mapper/mpathb: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: D3739269-492B-463E-927C-5441496FB8F3

Device Start End Sectors Size Type
/dev/mapper/mpathb-part1 2048 2099199 2097152 1G Linux filesystem
/dev/mapper/mpathb-part2 2099200 104855551 102756352 ...

Revision history for this message
bugproxy (bugproxy) wrote : var_log_installer_04062023_1.tar.gz

------- Comment on attachment From <email address hidden> 2023-04-06 17:51 EDT-------

For case of enabling zfcp-host 0.0.100d in the shell

chzdev --enable 0.0.100d
lszdev
ls -l | grep unknown | wc -l
1969

Revision history for this message
bugproxy (bugproxy) wrote : var_crash_04062023_a.tar.gz

------- Comment on attachment From <email address hidden> 2023-04-06 18:08 EDT-------

For case of enabling zfcp-host 0.0.100d in the shell

Included 42 of the 1969 files in this attachment

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-07 12:36 EDT-------
Comment on attachment 157094
subiquity ssh terminal text log from the script tool

This attachment is a duplicate of attachment 157095 and seems to cause a hiccup in the IBM-Canonical bridge. As a result, the attachment comment is appended to the Launchpad entry over and over again every 24 hours.
By deleting the attachment, I am trying to fix this problem.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-12 13:24 EDT-------
(In reply to comment #25)
> Thx for attaching the logs and the crash report, we'll investigate ...
>
> What I'm just wondering about are the '"OSError: [Errno 28] No space left on
> device"' messages. Is there something with the FCP/SCSI LUN (size) or
> options to write to it?

Looking at the log, the installer has not made much progress yet. We just successfully(!) probed a few SCSI disks, but haven't configured any partitioning, let alone mount points. I take it that the installer must not write to any real disk at that point in time. So ENOSPC cannot come from zfcp-attached SCSI disks. Let's not get hung up on zfcp or on different FCP-attached storage arrays (DS8000, XIV, FlashSystem, etc.); they all present standard SCSI disks for which the common code Linux kernel driver sd_mod provides regular block devices; nothing special about this at all.

BTW, the kernel boot parameters look odd:

[ 0.440956] Kernel command line: @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

@<email address hidden>, what exact parm file content did you use to boot the installer?
Since the network interface ence0f appears without DPM auto conf being used and I don't see it being configured interactively in the installer, I wonder where its ccwgroup configuration came from. Maybe the parm file had enough leading zeros to get truncated during kernel console output, but maybe there was some boot parameter to group 0.0.0e0f,0.0.0e10,0.0.0e11 for ence0f?

(In reply to comment #29)
> host_installer_shell_cmds_04062023_1.txt
> Note after chzdev --enable, Quantity 1969 files generated in /var/log/crash
> filling up /
> ls -l /var/crash
> ls -l | grep unknown | wc -l
> 1969
...
> 142575 Apr 6 12:18 1680808725.929860353.unknown.crash.gz
> 77 Apr 6 12:18 1680808725.929860353.unknown.meta.gz
> 162995 Apr 6 12:18 1680808726.069142342.unknown.crash.gz
...

> root@ubuntu-server:~# df -T
> Filesystem Type 1K-blocks Used Available Use% Mounted on
> /cow overlay 16473348 16473348 0 100% /
> overlay overlay 292992 292992 0 100% /media/filesystem
> overlay overlay 292992 292992 0 100% /tmp/tmpcsrrjbgt/root.dir

I see ENOSPC also when the installer tries to log something. I assume this must happen towards some space in the ramdisk the installer runs within.
There seem to be a number of (too many?) installer "crash" files under /var/crash likely on the completely filled up overlay-fs.
Those "crash" files are neither created by chzdev nor lszdev.

However, if I read the logs correctly, these debug data files consuming too much space only get generated due to other earlier python tracebacks from subiquity. IOW, ENOSPC (or EMFILE) errors are just misleading follow-on errors.

The very first one of those tracebacks happens on udev settle for the network device (before any zfcp devices):

2023-04-06 19:17:21,023 DEBUG subiquity.server.controllers.filesystem:671 waiting 0.1 to let udev event queue settle
2023-04-06 19:17:21,124 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-...

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-13 20:16 EDT-------
(In reply to comment #34)
@<email address hidden>

> BTW, the kernel boot parameters look odd:
>
> [ 0.440956] Kernel command line:
> @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @@@
>
> @<email address hidden>, what exact parm file content did you use to boot the
> installer?

I started out with the following, but my syntax for the line was not accepted, as I was getting "Unknown kernel command line parameters".
I need to check the proper syntax and sequence for the kernel command line input to put in the file PARMFILE UBUNTU.

PARMFILE UBUNTU
ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

i.e.
[ 0.452018] Kernel command line: ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
[ 0.452037] Unknown kernel command line parameters "ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25", will be passed to user space

So I chose to go with the default PARMFILE UBUNTU, with just the "---" line, and entered the info as requested at the prompts, per the statement "will be passed to user space"
i.e.
Default PARMFILE UBUNTU
---

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-14 07:02 EDT-------
(In reply to comment #35)
> (In reply to comment #34)
> > BTW, the kernel boot parameters look odd:
> >
> > [ 0.440956] Kernel command line:
> > @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > @@@
> >
> > @<email address hidden>, what exact parm file content did you use to boot the
> > installer?
>
> I started out with the following, but my syntax sequence for the line was
> not accepted , as was getting "Unknown kernel command line parameters"

Unfortunately, this kernel message can be misleading.

The Linux "kernel" command line can contain different types of parameters:
(1) actual kernel parameters
(2) arbitrary user space parameters

See also Section "Parameters other than kernel parameters" in https://www.ibm.com/docs/en/linux-on-systems?topic=skp-different-sources-1#ipl_kernparm_conflicts__title__4
and
https://www.ibm.com/docs/en/linux-on-systems?topic=s-kernel-parameter-line-1

Recently, the kernel started to check the syntax for (1). So any parameters from class (2) [before an optional separator "--"] get reported as unknown, but they can still be perfectly used by user space.

You can specify all non-kernel parameters from class (2) after a "--" separator on the kernel command line like this to avoid misleading kernel messages [see also below]:
<all actual kernel parameters> -- <any non-kernel parameters>
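
As a small illustration of this layout, the following sketch splits a command line at the first separator token. It is a simplification under assumptions: `split_cmdline` is a hypothetical helper, it splits on whitespace only (quoted values containing spaces are not handled), and the separator is a parameter because conventions may differ between distributions.

```python
def split_cmdline(cmdline: str, separator: str = "--"):
    """Split a command line into (before, after) token lists at the first separator."""
    tokens = cmdline.split()
    if separator in tokens:
        idx = tokens.index(separator)
        return tokens[:idx], tokens[idx + 1:]
    # No separator present: everything counts as "before".
    return tokens, []


if __name__ == "__main__":
    before, after = split_cmdline("ro quiet -- ip=dhcp url=http://example.invalid/x.iso")
    print("before separator:", before)
    print("after separator: ", after)
```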

> I need to check the proper syntax and sequence for the Kernel command line
> input to put in file PARMFILE UBUNTU
>
> PARMFILE UBUNTU
> ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

The duplicate specification of both a CIDR prefix length and a dotted netmask seems odd:
24:255.255.255.0

I'm not sure, which syntax the Ubuntu installer initrd actually uses.
There is a (user space) dracut cmdline option
https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_network
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<mtu>][:<macaddr>]]
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<dns1>][:<dns2>]]
And there is an actual kernel parameter
https://www.kernel.org/doc/html/latest/admin-guide/nfs/nfsroot.html#kernel-command-line
ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>

Comparing with my last Ubuntu on s390x installation, I would probably use this instead (dropping ":24"), no matter if used as kernel boot parameter or entered interactively (in case the latter understands ip= syntax, which I don't know):

ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
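
To see why the extra ":24" matters, the sketch below maps the colon-separated positions onto the field order of the kernel parameter quoted above. This is an illustration only: `parse_ip_param` is a hypothetical helper, it assumes IPv4 (addresses containing colons would break the split), and whether the Ubuntu installer initrd uses exactly this field order is not confirmed in this thread.

```python
# Field order as in the kernel nfsroot documentation quoted above.
IP_FIELDS = (
    "client_ip", "server_ip", "gw_ip", "netmask", "hostname",
    "device", "autoconf", "dns0_ip", "dns1_ip", "ntp0_ip",
)


def parse_ip_param(value: str) -> dict:
    """Map the positions of an ip= value onto IP_FIELDS (IPv4 only)."""
    if value.startswith("ip="):
        value = value[3:]
    return dict(zip(IP_FIELDS, value.split(":")))


if __name__ == "__main__":
    for line in (
        "ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25",
        "ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25",
    ):
        fields = parse_ip_param(line)
        print(fields["netmask"], fields["hostname"], fields["device"], fields["autoconf"])
```

With the extra field, every later position shifts by one, so under this field order the dotted netmask lands in the hostname slot and the interface name in the autoconf slot.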

FWIW, I had also added an url= pointing to the ISO in my parmfile when I manually installed Ubuntu on s390x the last time, but that might be optional(?):

url=http://bistro/ubuntu/UBUNTUxx.yy/zzz.-live-server-...-s390x.iso

Also, without DPM device auto configuration, I wonder how the network interface ence0f would get configured as prerequisite. As opposed to PCI(e) based network devices such as RoCE adapters, CC...

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-14 08:43 EDT-------
(In reply to comment #36)
> (In reply to comment #35)
> > (In reply to comment #34)

> > I need to check the proper syntax and sequence for the Kernel command line
> > input to put in file PARMFILE UBUNTU
> >
> > PARMFILE UBUNTU
> > ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
>
> I think the duplicate specification of both a CIDR netmask and an IP netmask
> seems odd:
> 24:255.255.255.0

> Comparing with my last Ubuntu on s390x installation, I would probably use
> this instead (dropping ":24"), no matter if used as kernel boot parameter or
> entered interactively (in case the latter understands ip= syntax, which I
> don't know):
>
> ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt.gz seems to show an example with a correct ip syntax, which likely worked:

[ 0.360759] Kernel command line: ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25 https://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04.1-live-server-s39
[ 0.360781] Unknown kernel command line parameters "ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25", will be passed to user space.

[ 1.852616] Run /init as init process
[ 1.852617] with arguments:
[ 1.852618] /init
[ 1.852618] with environment:
[ 1.852619] HOME=/
[ 1.852619] TERM=linux
[ 1.852620] ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

[ 3.638269] qeth: register layer 2 discipline
[ 3.638983] qeth 0.0.0e0f: CHID: ffee CHPID: ee
[ 3.645883] qeth 0.0.0e11: qdio: OSA on SC 2 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
[ 3.675469] qeth 0.0.0e0f: Device is a OSD Express card (level: 0173)
with link type OSD_1000.
[ 3.675884] qeth 0.0.0e0f: The device represents a Bridge Capable Port
[ 3.677706] qeth 0.0.0e0f: MAC address 02:76:54:00:00:05 successfully registered
[ 3.681817] qeth 0.0.0e0f ence0f: renamed from eth0

2023-01-18 20:23:57,906 DEBUG subiquitycore.netplan:109 config for ence0f = {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
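
The netplan log line shows the address in CIDR form (9.11.116.213/24) although the ip= parameter carried a dotted netmask. How subiquity performs this conversion is not shown in the log; as a sketch of the arithmetic only, Python's standard `ipaddress` module can do it (`to_cidr` is a hypothetical helper name):

```python
import ipaddress


def to_cidr(address: str, netmask: str) -> str:
    """Combine an IPv4 address and a dotted netmask into address/prefix notation."""
    return ipaddress.ip_interface(f"{address}/{netmask}").with_prefixlen


if __name__ == "__main__":
    print(to_cidr("9.11.116.213", "255.255.255.0"))
```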

> > So I chose to go with the default PARMFILE UBUNTU, with just the "---" line
> > and enter the info as requested per the the statement "will be passed to
> > user space", as it prompts
>
> Sorry, if I missed it as I'm a bit lost in all the debug data. Where can I
> see this prompt and what was interactively entered?

I think I found an example:

ilabg13_9.11.116.213_ubuntu-22.04.2_setup_install_04032023_9.txt

Two methods available for IP configuration:
* static: for static IP configuration
* dhcp: for automatic IP configuration
static dhcp (default 'dhcp'): static
ip: 9.11.116.213
netmask (default 255.255.255.0):
gateway (default 9.11.116.1):
dns (default 9.11.116.1): 9.11.227.25
vlan id (optional):
http://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04-live-server-s390x.iso (default)
url: http://cdimage.ubuntu.com/releases/22.04.1/release/ubuntu-22.04.2-live-server-s390x.iso
http_proxy (optional):
C...

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-18 15:04 EDT-------
@Canonical, a general comment on usability for users: I recommend updating the reference to "zdev" so that it tells users what type of "zdev" is expected.

A QETH network device (0e0f) and not a zfcp QDIO device (i.e. 0.0.100d,0.0.110d,0.0.120d,0.0.130d)

Even though a network device is implied, it is not clear to users what type of zdev is being asked for

The reference to "zdev" is open to interpretation as to what type of z device is requested
(i.e. QETH network or zfcp QDIO device)

One variation
i.e.
Attempt interactive netboot from a URL?
yes no (default yes): yes
Available qeth devices:
0.0.0db0 0.0.0dc0
zdev to activate (comma separated, optional): 0.0.100d,0.0.110d,0.0.120d,0.0.130d <============ equals zfcp QDIO device

Second variation
Attempt interactive netboot from a URL?
yes no (default yes):
Available qeth devices:
0.0.0db0 0.0.0dc0 0.0.0e0f
zdev to activate (comma separated, optional): 0e0f <============ equals QETH network device
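
One way such a prompt could reduce the ambiguity is to check the entered IDs against the qeth devices it has just listed. The following is a minimal sketch under assumptions, not the installer's actual code: `normalize_bus_id` and `check_selection` are hypothetical helpers, and expanding a short device number such as 0e0f to 0.0.0e0f is an assumption based on the two variations shown above.

```python
def normalize_bus_id(dev_id: str) -> str:
    """Expand a short device number such as '0e0f' to '0.0.0e0f' (assumed convention)."""
    dev_id = dev_id.strip().lower()
    return dev_id if "." in dev_id else f"0.0.{dev_id.zfill(4)}"


def check_selection(entered: str, available: list) -> tuple:
    """Split comma-separated input into (accepted, rejected) bus IDs."""
    known = {normalize_bus_id(d) for d in available}
    accepted, rejected = [], []
    for item in filter(None, (part.strip() for part in entered.split(","))):
        bus_id = normalize_bus_id(item)
        (accepted if bus_id in known else rejected).append(bus_id)
    return accepted, rejected


if __name__ == "__main__":
    qeth = ["0.0.0db0", "0.0.0dc0", "0.0.0e0f"]
    print(check_selection("0e0f", qeth))
    print(check_selection("0.0.100d,0.0.110d", qeth))
```

Rejected IDs could then be reported back to the user before anything is activated.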

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2023-04-19 06:07 EDT-------
(In reply to comment #39)
> @Canonical, A general comment on usability for users. Recommend updating
> reference to "zdev" for users highlighting what type of "zdev" is expected ?
>
> A QETH network device (0e0f) and not a zfcp QDIO device [i.e.
> 0.0.100d,0.0.110d,0.0.120d,0.0.130d)

Regarding terminology: "zfcp QDIO" sounds odd to me as zfcp maintainer. Both qeth [the q stands for qdio] and zfcp are based on qdio.

I would like to avoid starting to specify exclusion lists as it's incomplete and would need to be maintained. Instead, let's just clearly specify the allowed list of zdev types.

Just as a reference, the list of zdev types that the backend tool knows (not necessarily all of which might be supported by subiquity as frontend):

https://www.ibm.com/docs/en/linux-on-systems?topic=sr-persistent-configuration
https://www.ibm.com/docs/en/linux-on-systems?topic=pc-view-configuration

# lszdev --list-types
TYPE DESCRIPTION
dasd FICON-attached Direct Access Storage Devices (DASDs)
dasd-eckd Enhanced Count Key Data (ECKD) DASDs
dasd-fba Fixed Block Architecture (FBA) DASDs
zfcp SCSI-over-Fibre Channel (FCP) devices and SCSI devices
zfcp-host FCP devices
zfcp-lun zfcp-attached SCSI devices
qeth OSA-Express and HiperSockets network devices
ctc Channel-To-Channel (CTC) and CTC-MPC network devices
lcs LAN-Channel-Station (LCS) network devices
generic-ccw Generic Channel-Command-Word (CCW) devices

>
> Even though a network device is implied, it is not clear to users what type
> of zdev is being asked for

@Canonical: I would like to understand if the user dialog really means to imply a network device. (looks like it just passed whatever the user entered to zdev from s390-tools and can thus configure any of the above zdev types; but it also appears during something that seems to imply a network device context)

>
> Reference to "zdev" is open to interpretation as to what type of z device is
> requested
> i.e. QETH network or zfcp QDIO device)
>
> One variation
> i.e.
> Attempt interactive netboot from a URL?
> yes no (default yes): yes

> Available qeth devices:
> 0.0.0db0 0.0.0dc0

Which zdev device types does subiquity support here and could it ever enumerate available (network) devices of another type than qeth, such as lcs or ctc?

If so, maybe a suitable terminology would be
"channel-attached network devices to activate (comma separated, optional):"
That would implicitly exclude PCIe-attached RoCE adapters, which don't need such preconfiguration, and it would also implicitly exclude zfcp and other channel-attached device types such as generic-ccw.

If not, then maybe just
"qeth network device to activate (comma separated, optional):"

> zdev to activate (comma separated, optional):
> 0.0.100d,0.0.110d,0.0.120d,0.0.130d <============ equals zfcp QDIO device
>
>
> Second variation
> Attempt interactive netboot from a URL?
> yes no (default yes):
> Available qeth devices:
> 0.0.0db0 0.0.0dc0 0.0.0e0f
> zdev to activate (comma separated, optional): 0e0f <============ equals QETH
> network device

Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

After being back from a few days of PTO, I'm catching up on this now ...

I redid my installation (on z/VM with FCP disks, but on a non-DPM system);
here are some details:

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Willkommen! Bienvenue! Welcome! Добро пожаловать!┌──────────────────[ Help ]┐
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀│ Help choosing a language │▀
  Use UP, DOWN and ENTER keys to select your langua│ Keyboard shortcuts │
                                                   │ Enter shell │
                [ Asturianu │ View error reports │
                [ Bahasa Indonesia ├──────────────────────────┤
                [ Català │ About this installer │
                [ Deutsch │ Help on SSH access │
                [ English └──────────────────────────┘
                [ English (UK) ▸ ]█
                [ Español ▸ ]█
                [ Français ▸ ]█
                [ Galego ▸ ]█
                [ Hrvatski ▸ ]█
                [ Latviski ▸ ]
                [ Lietuviškai ▸ ]
                [ Magyar ▸ ]
                [ Nederlands ▸ ]
                [ Norsk bokmål ▸ ]
                [ Occitan (aprèp 1500) ▸ ]
                [ Polski ▸ ]
                [ Português ▸ ]▾

...
Installer shell session activated.

This shell session is running inside the installer environment. You
will be returned to the installer when this shell is exited, for
example by typing Control-D or 'exit'.

Be aware that this is an ephemeral environment. Changes to this
environment will not survive a reboot. If the install has started, the
installed system will be mounted at /target.
root@ubuntu-server:/# cat /proc/cmdline
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
root@ubuntu-server:/# uname -a
Linux ubuntu-server 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:30:43 UTC 2023 s390x s390x s390x GNU/Linux
root@ubuntu-server:/# snap list subiquity
Name Version Rev Tracking Publisher Notes
subiquity 23.02.1 4383 latest/stable/… canonical** classic
root@ubuntu-server:/# python3 --version
Python 3.10.6
root@ubuntu-server:/# ls -lad /var/log
drwxrwxr-x 1 root syslog 320 Apr 19 10:18 /var/log
root@ubuntu-server:/# lszdev --online
TYPE ID ON PERS NAMES
qeth 0.0.0600:0.0.0601:0.0.0602 yes no enc600
generic-ccw 0.0.0009 ...

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-19 08:20 EDT-------
(In reply to comment #41)
> You mentioned that wiping out older or additional LUNs is not an option.
> And I think it's not needed, only thought about wiping the LUNs for the OS
> itself and only enable this OS LUN during installation. Any additional LUNs
> can be easily added post-install and should not be enabled at install time
> (here, for testing and to be on the safe side).
> (I think that's easier compared to disabling them on the SAN side ...)

That would require disabling zfcp auto lun scan during installation [zfcp.allow_lun_scan=0]. I intentionally did not suggest this as changing the host-mapping of volumes on the storage is cleaner and does not change the scanning behavior of Linux. That said, it would be an option.

> And make sure the parm file is in the correct encoding (fix length "F 80" or
> variable, "Trunc=80"):
> PARMFILE UBUNTU O1 F 80 Trunc=80 Size=4 Line=0 Col=1 Alt=0

I'm not sure it needs fixed record length under z/VM. I use variable record length for parm files successfully in my z/VM guests.
In contrast to that, the binaries for kernel and initrd must indeed be fixed record length 80.

> The 3 dashes (" --- ") are to separate installer from kernel arguments.
> "<installer> --- <kernel>"

As stated recently, the kernel documentation says the separator is a double dash and kernel parameters go before the separator and user space stuff after it. I'm confused.

> At the early boot stage it's about "interactive netboot" and asks for
> network information only
> and all network devices are qeth (of course except RoCE) - so don't specify
> any other devices here (like HBAs or LUNs or whatever).

good info, I wasn't aware; then it should explicitly state so [see earlier comments from today]

> I believe that the content of the kernel parameter with all the "@" is more
> a representation issue (nevertheless, not very nice though ...), but since it
> works for me on my system - and even with much more kernel args that are
> needed in case of a fully non-interactive "autoinstall".

ok, but this can confuse users (or even init/systemd) so it would be good to find the root cause and fix that as well (with lower prio than the actual installation issue)

> What I noticed in the crash file is the following snippet:
> "
> 2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm',
> 'settle', '-t', '0'] exited with code 0

> 2023-04-06 19:17:21,143 INFO subiquity.common.errorreport:406 saving crash
> report 'unknown error crashed with OSError' to

yes, I've been pointing to this multiple times and it even occurs early for settling after the network interface (IP address) setup and before any zfcp device config

> That could be a problem with asyncio (I remember that there was an issue
> with asyncio in the past) or a race condition.
> I'll ask my installer colleague to have a look at this ...

looking forward

Revision history for this message
Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

> > You mentioned that wiping out older or additional LUNs is not an option.
> > And I think it's not needed, only thought about wiping the LUNs for the OS
> > itself and only enable this OS LUN during installation. Any additional LUNs
> > can be easily added post-install and should not be enabled at install time
> > (here, for testing and to be on the safe side).
> > (I think that's easier compared to disabling them on the SAN side ...)

> That would require disabling zfcp auto lun scan during installation
> [zfcp.allow_lun_scan=0]. I intentionally did not suggest this as
> changing the host-mapping of volumes on the storage is cleaner and does
> not change the scanning behavior of Linux. That said, it would be an
> option.
ok, I see and agree

> > And make sure the parm file is in the correct encoding (fix length "F 80" or
> > variable, "Trunc=80"):
> > PARMFILE UBUNTU O1 F 80 Trunc=80 Size=4 Line=0 Col=1 Alt=0

> I'm not sure it needs fixed record length under z/VM. I use variable record
> length for parm files successfully in my z/VM guests.
> In contrast to that, the binaries for kernel and initrd must indeed be fixed
> record length 80.
It does not need to be fixed 80 (as I wrote, variable is also fine).
I just transfer it similarly to kernel and initrd - and just switch from bin to ascii ...

> > The 3 dashes (" --- ") are to separate installer from kernel arguments.
> > "<installer> --- <kernel>"

> As stated recently, the kernel documentation says the separator is a
> double dash and kernel parameters go before the separator and user space
> stuff after it. I'm confused.
The three dashes are in/for Debian and Ubuntu.

> > At the early boot stage it's about "interactive netboot" and asks for
> > network information only
> > and all network devices are qeth (of course except RoCE) - so don't specify
> > any other devices here (like HBAs or LUNs or whatever).

> good info, I wasn't aware; then it should explicitly state so [see
> earlier comments from today]
OK, I thought it was obvious, since it says netboot (but I may have been blind...) - we'll consider changing the text ...

> > I believe that the content of the kernel parameter with all the "@" is more
> > a representation issue (nevertheless, not very nice though ...), but since it
> > works for me on my system - and even with much more kernel args that are
> > needed in case of a fully non-interactive "autoinstall".

> ok, but this can confuse users (or even init/systemd) so it would be
> good to find the root cause and fix that as well (with lower prio than
> the actual installation issue)
I totally agree. I am pretty sure that this is an upstream issue, maybe introduced with the recent extension of the kernel arg space.
So I need to discuss this with IBM/Boris ...

> > What I noticed in the crash file is the following snippet:
> > "
> > 2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm',
> > 'settle', '-t', '0'] exited with code 0

> > 2023-04-06 19:17:21,143 INFO subiquity.common.errorreport:406 saving crash
> > report 'unknown error crashed with OSError' to

> yes, I've been pointing to this multiple times and it even occurs early
> for settling after t...

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-19 09:30 EDT-------
(In reply to comment #43)
> > > The 3 dashes (" --- ") are to separate installer from kernel arguments.
> > > "<installer> --- <kernel>"
>
> > As stated recently, the kernel documentation says the separator is a
> > double dash and kernel parameters go before the separator and user space
> > stuff after it. I'm confused.

> The three dashes are in/for Debian and Ubuntu.

Are you saying the ubuntu kernel has downstream changes making it different from upstream with regard to kernel cmdline separation?
If so, would you mind sharing some pointer/URL to the change as well as to corresponding documentation?

bugproxy (bugproxy) wrote : sosreport-ilabg13-2023-04-19-zfryhcl.tar.xz

------- Comment on attachment From <email address hidden> 2023-04-19 19:42 EDT-------

Attaching sos report file sosreport-ilabg13-2023-04-19-zfryhcl.tar.xz for reference, captured on config (B) 22.04.1 LTS (Jammy Jellyfish), with the three OS LUNs (50G, 100G, 102G) and other smaller data LUNs, with sizes ranging from 4.0G to 14G, attached.

Current config with LVM configs include the following 3 OS luns

 (A) 20.04.5 LTS (Focal Fossa) 50G Lun 360050762198c1fc2180000000b000132 - Legacy
 (B) 22.04.1 LTS (Jammy Jellyfish) 100G Lun 36005076306ffd6d3000000000000014a - Currently booted
 (C) 22.04.2 target install 102G Lun 36005076306ffd6d30000000000000143 - Test 22.04.2 install

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-19 19:54 EDT-------
(In reply to comment #41)
> You mentioned that

>wiping the LUNs for the OS itself and only enable this OS LUN during installation.

The 2 new OS LUNs (initial 22.04.1 lun and second lun for retest on 22.04.2) were new luns created on the storage and mapped to the host, so had no prior LVM configs.

Most likely the installer is having issues with the older OS 20.04.5 LUN, with an existing LVM config and used for ongoing tests and/or support

Current config with LVM configs include the following 3 OS luns

(A) 20.04.5 LTS (Focal Fossa) 50G Lun 360050762198c1fc2180000000b000132 - Legacy
(B) 22.04.1 LTS (Jammy Jellyfish) 100G Lun 36005076306ffd6d3000000000000014a - Currently booted
(C) 22.04.2 target install 102G Lun 36005076306ffd6d30000000000000143 - Test 22.04.2 install

I will next use the workaround of unmapping all other OS LUNs with LVM configs from this host on the storage, in order to have a single OS LUN discovered at install (i.e. for 20.04.2) and avoid the installer filling up /var/crash with crash files, returning "OSError: [Errno 28] No space left on device" after running out of space.

The expectation is to not have to unmap other OS luns with LVM configs from a host, in order for a single OS lun be discovered for an install to be successful

>Any additional LUNs can be easily added post-install and should not be enabled at install time (here, for testing and to be on the safe side).

The capability to install a new OS to a new LUN, while older OS LUNs (used for testing/support) with LVM configs remain attached, used to work.
Config (B) 22.04.1 was installed, enabling zfcp 120d/130d with Config (A) OS LUN already attached [however, I had to not enable zfcp 100d/110d for that install, otherwise I hit "OSError: [Errno 28] No space left on device"]

Reducing config to having a single OS lun discovered at install, seems to be a take away in functionality

>Any additional LUNs can be easily added post-install and should not be enabled at install time (here, for testing and to be on the safe side).
> (I think that's easier compared to disabling them on the SAN side ...)

This adds an additional step, while also requiring to update the initramfs and reboot, if the updated configuration needs to be persistent across follow-on reboots

Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

At '<email address hidden>' - nah, that's not what I'm saying, things are (historically) a bit more complex:

So there is '--' vs '---', whereas Kernel uses everything until '--' and userspace uses everything.

Back in the early Debian-Installer days, d-i used '--' to separate things (for use in installed & installer system vs use in the installer only). When systemd & kernel had a spat, and kernel started to use '--' separator, Debian was forced to change to something else and started to use '---'. So both '--' and '---' have their use case today.
Notice this example:
debug -- systemd.log_level=info --- systemd.log_level=debug
Which will have all three on for installer system, propagate the first two to the installed system, with kernel itself only considering the first option.
Because of historical usage of '--' and the conflict that Debian had to yield '--' to kernel.
Depending on the age of the docs one may find in the documentation, info might be outdated or wrong, hence confusing.
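
As an illustration of the two separators, here is a minimal Python sketch that splits a command line at the first '--' and at the last '---'. The function name `split_cmdline` and the section labels are made up for this example. It does not reproduce how the kernel, d-i, or subiquity actually parse the command line, and it deliberately says nothing about which section is propagated to the installed system.

```python
def split_cmdline(cmdline):
    """Split a boot command line at the first '--' and the last '---'.

    Illustration only: the section names are hypothetical labels, not
    terms used by the kernel or by any installer.
    """
    tokens = cmdline.split()

    # Everything after the last '---' forms its own section.
    if "---" in tokens:
        last = len(tokens) - 1 - tokens[::-1].index("---")
        head, after_triple = tokens[:last], tokens[last + 1:]
    else:
        head, after_triple = tokens, []

    # Within the remainder, split at the first '--'.
    if "--" in head:
        first = head.index("--")
        before_double, after_double = head[:first], head[first + 1:]
    else:
        before_double, after_double = head, []

    return {
        "before_double_dash": before_double,
        "after_double_dash": after_double,
        "after_triple_dash": after_triple,
    }


parts = split_cmdline("debug -- systemd.log_level=info --- systemd.log_level=debug")
for name, values in parts.items():
    print(name, values)
```

Because the split is done on whole tokens, '--' and '---' are never confused with each other.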

Some more references:

https://docs.kernel.org/admin-guide/kernel-parameters.html
The kernel parses parameters from the kernel command line up to “--“; if it doesn’t recognize a parameter and it doesn’t contain a ‘.’, the parameter gets passed to init: parameters with ‘=’ go into init’s environment, others are passed as command line arguments to init. Everything after “--” is passed as an argument to init.

https://d-i.debian.org/manual/en.s390x/install.en.pdf
A “---” in the boot options has special meaning. Kernel parameters that appear after the last “---” may be copied
into the bootloader configuration for the installed system (if supported by the installer for the bootloader).
The installer will automatically filter out any options (like preconfiguration options) that it recognizes.

https://man7.org/linux/man-pages/man1/init.1.html

(kudos to xnox)

At <email address hidden>:

I vaguely remember that there were changes in the LVM metadata in the past, hence a relatively new installer (22.04.x) might have trouble with a somewhat older 20.04.5 - so it could be ...
(Hence my suggestion to wipe an old LUN with potentially old data - which is in your multi-LUN, multi-OS setup not what you want.)

And yes, the 'No space left' error is due to the number of crash files.

"The expectation is to not have to unmap other OS luns with LVM configs from a host, in order for a single OS lun be discovered for an install to be successful."

Sure, just note that I'm currently trying to find a workaround to unblock you in the first instance ...

"
Reducing config to having a single OS lun discovered at install, seems to be a take away in functionality

>Any additional LUNs can be easily added post-install and should not be enabled at install time (here, for testing and to be on the safe side).
> (I think that's easier compared to disabling them on the SAN side ...)

This adds an additional step, while also requiring to update the initramfs and reboot, if the updated configuration needs to be persistent across follow-on reboots
"

I don't think that is too bad, since it only means shifting the enablement of the LUNs that you don't want to touch and only integrate, from install time to p...

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-20 09:12 EDT-------
(In reply to comment #47)
> At '<email address hidden>' - nah, that's not what I'm saying, things are
> (historically) a bit more complex:
>
> So there is '--' vs '---', whereas Kernel uses everything until '--' and
> userspace uses everything.
>
> Back in the early Debian-Installer days, d-i used '--' to separate things
> (for use in installed & installer system vs use in the installer only). When
> systemd & kernel had a spat, and kernel started to use '--' separator,
> Debian was forced to change to something else and started to use '---'. So
> both '--' and '---' have their use case today.
> Notice this example:
> debug -- systemd.log_level=info --- systemd.log_level=debug
> Which will have all three on for installer system, propagate the first two
> to the installed system, with kernel itself only considering the first
> option.
> Because of historical usage of '--' and the conflict that Debian had to
> yield '--' to kernel.
> Depending on the age of the docs one may find in the documentation, info
> might be outdated or wrong, hence confusing.
>
> Some more references:
>
> https://docs.kernel.org/admin-guide/kernel-parameters.html

yes, I had referenced that earlier

> https://d-i.debian.org/manual/en.s390x/install.en.pdf
> A “---” in the boot options has special meaning. Kernel parameters that
> appear after the last “---” may be copied
> into the bootloader configuration for the installed system (if supported by
> the installer for the bootloader).
> The installer will automatically filter out any options (like
> preconfiguration options) that it recognizes.

Ah, I wasn't aware of that. Very insightful. Thanks, Frank.

Since the kernel has '--' and recently also started to complain about unknown parameters if user space stuff appears before the '--', I was just wondering how the kernel would parse and interpret the triple dash '---', and whether the kernel could trip over it and cause the weird special characters in /proc/cmdline that we saw.

Olivier Gayot (ogayot) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

Hi,

> Does the installer really have a busy(!) waiting loop calling udevadm settle with zero timeout?
> But even if so, with the number of discovered devices and the settle finally returning with success errorlevel 0, it should just work?

The installer wakes up and calls udevadm settle when a udev event matching SUBSYSTEM=block is received. If it feels like an active loop, it probably indicates that so many events are generated in close to no time.

I think the first step would be to understand what these events are, and whether the installer is somehow responsible for the burst.

Would you be able to capture the output of `sudo udevadm monitor --subsystem-match=block` on the affected system? Ideally, we should capture it once when the installer is not running, and once when it is running.

We'd also need more context about the FD error. It might be a harmless edge case (potentially caused by the burst) that we should raise as a warning instead of treating it as an error that results in a crash report. Difficult to say for now.

Thanks,
Olivier

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-04-20 10:30 EDT-------
(In reply to comment #34)
> (In reply to comment #25)
> The very first one of those tracebacks happens on udev settle for the
> network device (before any zfcp devices):

Just re-iterating in case this is not related to block devices.

Olivier Gayot (ogayot) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

> > The very first one of those tracebacks happens on udev settle for the
> > network device (before any zfcp devices):
>
> Just re-iterating in case this is not related to block devices.

I appreciate the heads up, thanks! Indeed, we also call udevadm settle as the result of network events. On the other hand, all the tracebacks that I have seen so far in the logs shared seem to originate from the filesystem code, including this one:

> The very first one of those tracebacks happens on udev settle for the network device (before any zfcp devices):
>
> 2023-04-06 19:17:21,023 DEBUG subiquity.server.controllers.filesystem:671 waiting 0.1 to let udev event queue settle
> 2023-04-06 19:17:21,124 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
> 2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
> 2023-04-06 19:17:21,139 ERROR subiquity.server.server:424 top level error
> Traceback (most recent call last):
> File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 81, in _run
> self._context.run(self._callback, *self._args)
> File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 682, in _udev_event
> action, dev = self._monitor.receive_device()
> File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 400, in receive_device

But I'm happy to be wrong. Have I overlooked something?

I've been able to reproduce the same stack by generating a burst of udev events using `udevadm trigger`.

Based on some reading, it seems that pyudev is mishandling the POLLIN + POLLERR combination; and fails to recover from it. There is an existing upstream discussion: https://github.com/pyudev/pyudev/issues/194

And a patch was merged in radvd for a similar situation: https://github.com/radvd-project/radvd/pull/160
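
A minimal sketch of the idea, assuming a netlink socket whose receive buffer has overflowed: poll() then reports POLLERR (possibly together with POLLIN), and the next read returns the pending error and clears it, so the socket remains usable afterwards. The helper name `classify_revents` is hypothetical; this is neither pyudev's code nor the proposed patch.

```python
import select


def classify_revents(revents):
    """Decide what to do with the event mask returned by poll().

    Returns "read_error" when the socket reports an error that the next
    read would surface as an OSError, "read" when data is available,
    and "idle" otherwise.
    """
    if revents & select.POLLERR:
        # Do not give up on the fd here: reading from the socket
        # retrieves the pending error and clears it, after which
        # polling can resume.
        return "read_error"
    if revents & select.POLLIN:
        return "read"
    return "idle"


print(classify_revents(select.POLLIN | select.POLLERR))
print(classify_revents(select.POLLIN))
```

The point of the sketch is only that POLLERR is treated as "go and read the error" rather than as a fatal condition for the monitor.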

Thanks,
Olivier

Frank Heimes (fheimes)
tags: added: installer
Dan Bungert (dbungert) wrote :

I spent some time today looking at this bug.

I don't have the best s390x hardware access at the moment, but per Olivier's above suggestion I can get similar crashes (OSError: Error while polling fd) on amd64.

To do that, I'm triggering a burst of events:
```
for n in $(seq 1 100); do
    udevadm trigger &
done
```

Olivier's patch to pyudev does appear to help, as with the patch I'm no longer able to see that OSError.

I will do more testing next week. If anyone else would like to test, I have done a test build. To use that, please add subiquity-channel=edge/lp-2009141 to the kernel command line and allow the installer to update itself when prompted.

tags: added: foundations-todo
Changed in subiquity (Ubuntu):
assignee: nobody → Dan Bungert (dbungert)
Frank Heimes (fheimes) wrote :

Thanks Dan for having a look at this and for your suggestions and special build.
I just tried it - updated subiquity
  from: "subiquity 22.02.2+git1752.3540ad07 5095"
  to: "subiquity 22.02.2+git1753.29c75c31 5107"
and on the first attempt (again using the same system) I hit a crash again.

But it looks like things changed a bit, since only one crash file was generated and the log did not explode like before. Probably a good step in the right direction.

I attached the log / crash content here ...

Frank Heimes (fheimes) wrote :

The logs are now either a bit different or at least more readable:
"
 2023-09-18 09:22:35,931 DEBUG subiquitycore.utils:132 arun_command ['chzdev', '--enable', '0.0.162f'] exited with code 0
 2023-09-18 09:22:35,931 DEBUG subiquitycore.utils:76 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
 2023-09-18 09:22:36,143 DEBUG subiquitycore.utils:95 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
 2023-09-18 09:22:36,201 DEBUG root:30 finish: subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.1600", "type": "dasd-eckd", "on": false, "exists": true, "pers":...
 2023-09-18 09:22:36,201 INFO aiohttp.access:206 [18/Sep/2023:09:22:32 +0000] "POST /zdev/chzdev?action=%22enable%22&zdev=%7B%22id%22:+%220.0.162f%22,+%22type%22:+%22dasd-eckd%22,+%22on%22:+false,+%22exists%22:+true,+%22pers%22:+false,+%22auto%22:+false,+%22failed%22:+false,+%22names%22:+%22%22%7D HTTP/1.1" 200 65093 "-" "Python/3.10 aiohttp/3.8.1"
 2023-09-18 09:22:36,202 DEBUG subiquitycore.utils:76 run_command called: ['udevadm', 'settle', '-t', '0']
 2023-09-18 09:22:36,221 DEBUG subiquitycore.utils:95 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
 2023-09-18 09:22:36,222 ERROR subiquity.server.server:415 top level error
 Traceback (most recent call last):
   File "/snap/subiquity/5107/usr/lib/python3.10/asyncio/events.py", line 80, in _run
     self._context.run(self._callback, *self._args)
   File "/snap/subiquity/5107/lib/python3.10/site-packages/subiquity/server/controllers/filesystem.py", line 1496, in _udev_event
     action, dev = self._monitor.receive_device()
   File "/snap/subiquity/5107/usr/lib/python3/dist-packages/pyudev/monitor.py", line 393, in receive_device
     device = self.poll()
   File "/snap/subiquity/5107/usr/lib/python3/dist-packages/pyudev/monitor.py", line 355, in poll
     return self._receive_device()
   File "/snap/subiquity/5107/usr/lib/python3/dist-packages/pyudev/monitor.py", line 294, in _receive_device
     device_p = self._libudev.udev_monitor_receive_device(self)
   File "/snap/subiquity/5107/usr/lib/python3/dist-packages/pyudev/_ctypeslib/_errorcheckers.py", line 103, in check_errno_on_null_pointer_return
     raise exception_from_errno(errnum)
 OSError: [Errno 105] No buffer space available
 2023-09-18 09:22:36,224 DEBUG subiquity.common.errorreport:394 generating crash report
 2023-09-18 09:22:36,225 INFO subiquity.common.errorreport:415 saving crash report 'unknown error crashed with OSError' to /var/crash/1695028956.225001574.unknown.crash
"

Two things I'm wondering about:
Yes I manually enabled "0.0.162f",
a few lines later I see: "subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.1600", "type": "dasd-eckd" - but I haven't touched "0.0.1600" at all.
And the msg "[Errno 105] No buffer space available".

Dan Bungert (dbungert) wrote :

Thanks for the test, Frank.

> And the msg "[Errno 105] No buffer space available".

I did actually see that in my amd64 "udevadm trigger" simulation but was unclear if that was a real problem. Now that we have confirmation, fixing the "No buffer space" problem is next.
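
For illustration, one generic way to survive ENOBUFS (errno 105 on Linux) is to treat it as "some events were dropped" rather than as a fatal error, keep reading, and rescan afterwards. The sketch below assumes two hypothetical callables, `receive` and `on_overflow`; it is not the fix that went into subiquity.

```python
import errno


def drain_events(receive, on_overflow):
    """Call receive() until it returns None and collect the results.

    receive and on_overflow are hypothetical callables: receive returns
    the next event, or None when nothing is pending; on_overflow is
    invoked once at the end if any read failed with ENOBUFS.
    """
    events = []
    overflowed = False
    while True:
        try:
            event = receive()
        except OSError as exc:
            if exc.errno != errno.ENOBUFS:
                raise
            # Events were lost. This assumes the failed read cleared
            # the pending error, so the next read can make progress.
            overflowed = True
            continue
        if event is None:
            break
        events.append(event)
    if overflowed:
        on_overflow()
    return events
```

A caller would typically pass an `on_overflow` that schedules a full re-probe of the devices, since the individual dropped events cannot be recovered.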

Patricia Domingues (patriciasd) wrote :

I hit this issue while trying to install Mantic with a DASD-ECKD disk (image on current: mantic-live-server-s390x.iso 2023-09-21 00:09).
The installer crashes right after I select the disk to be activated.

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
https://iso.qa.ubuntu.com/qatracker/reports/bugs/2009141

tags: added: iso-testing
Dan Bungert (dbungert)
Changed in subiquity (Ubuntu):
status: Triaged → In Progress
Dan Bungert (dbungert) wrote :

I have uploaded a second test build. Like before, if anyone else would like to test, please add subiquity-channel=edge/lp-2009141 to the kernel command line and allow the installer to update itself when prompted.

Olivier Gayot (ogayot) wrote :

I reproduced the issue again today on z/VM (just after activating device ID 400). I started capturing the udev block events:

[...]
UDEV [1259.455193] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.470585] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
UDEV [1259.484184] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.489365] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.490565] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
KERNEL[1259.490592] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.491521] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.506219] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
UDEV [1259.520112] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.525453] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.526714] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
KERNEL[1259.526739] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.527711] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.543922] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
UDEV [1259.557030] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.561572] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.562894] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
KERNEL[1259.562922] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.565224] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda2 (block)
UDEV [1259.580456] change /devices/css0/0.0.0002/0.0.0400/block/dasda/dasda1 (block)
UDEV [1259.615997] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
KERNEL[1259.620799] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)
[...]

Included is a capture over a few seconds with `udevadm monitor --subsystem-match=block --property --udev`
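
To quantify such a loop, the capture can be summarised per device. This is a small sketch, assuming the one-line-per-event format shown above; `count_events` and `LINE_RE` are names made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# UDEV [1259.455193] change /devices/.../block/dasda (block)
LINE_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\w+)\s+(\S+)\s+\((\w+)\)$"
)


def count_events(lines):
    """Count (source, action, devpath) triples in udevadm monitor output."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            source, _timestamp, action, devpath, _subsystem = match.groups()
            counts[(source, action, devpath)] += 1
    return counts


sample = [
    "UDEV [1259.484184] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)",
    "KERNEL[1259.489365] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)",
    "UDEV [1259.520112] change /devices/css0/0.0.0002/0.0.0400/block/dasda (block)",
]
for key, n in count_events(sample).most_common():
    print(n, *key)
```

Lines that do not look like events (such as `[...]` or property lines) are simply skipped.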

Olivier Gayot (ogayot) wrote :

In addition, the journal shows udisks2 complaining:

Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')
Sep 26 08:13:25 ubuntu-server udisksd[1110]: Couldn't find existing drive object for device /sys/devices/css0/0.0.0002/0.0.0400/block/dasda (uevent action 'change', VPD '0X0400')

Frank Heimes (fheimes) wrote :

I think that I bumped into this as well (using the updated 'edge/lp-2009141') doing an LPAR install.
Logs are attached ... (but looks like I had some network issues on top ?! will re-try to be sure ...)

Dan Bungert (dbungert) wrote :

@Frank - yours is indeed failing on network things. Yes there is udev noise in the journal but it is not clear to me that it would have caused failure. Please continue the test so we can separate this network item from the udev one.

Dan Bungert (dbungert) wrote :

Next steps:

* https://github.com/canonical/subiquity/pull/1806 is open and is more or less equivalent to the current edge/lp-2009141 build. I believe this will help, as it drastically reduces the CPU time spent handling udev events.
* The source of the apparent loop of udev events needs to be identified. Does a chzdev outside of subiquity produce a similar list of events?
* A retest on pre-Mantic is still interesting (with or without edge/lp-2009141), as is the edge/lp-2009141 test on Mantic. @Frank are you available to try either or both of these?
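
As background for the first item: a generic way to spend less CPU on a burst of events is to coalesce them, so that the expensive work runs once after the burst goes quiet instead of once per event. The sketch below shows that technique with asyncio. The `Debouncer` class is hypothetical and is not taken from the pull request, whose actual approach may differ.

```python
import asyncio


class Debouncer:
    """Run callback once after events stop arriving for `delay` seconds."""

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self._handle = None

    def notify(self):
        # Each new event cancels the pending call and restarts the timer.
        if self._handle is not None:
            self._handle.cancel()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_later(self.delay, self._fire)

    def _fire(self):
        self._handle = None
        self.callback()


async def demo():
    runs = []
    debouncer = Debouncer(0.01, lambda: runs.append("probe"))
    for _ in range(100):  # simulate a burst of 100 events
        debouncer.notify()
    await asyncio.sleep(0.1)
    return runs


print(asyncio.run(demo()))
```

With this pattern the cost of a burst is bounded by the number of quiet periods rather than by the number of events.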

Frank Heimes (fheimes) wrote : Re: [Bug 2009141] Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

I should be able to try both cases today - will leave a msg when I'm done.

Dan Bungert <email address hidden> wrote on Wed, 27 Sept 2023, 06:51:

> Next steps:
>
> * https://github.com/canonical/subiquity/pull/1806 is open, more or less
> equivalent to the current edge/lp-2009141 build, I believe this will help.
> This drastically reduces CPU spent handling udev events
> * The source of the apparent loop of udev events needs to be identified.
> Does a chzdev outside of subiquity produce a similar list of events?
> * retest on pre-Mantic is still interesting (with or without
> edge/lp-2009141), as is the edge/lp-2009141 test on Mantic. @Frank are
> you available to try either or both of these?

Frank Heimes (fheimes) wrote : Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

So I completed 6 different tests (see attached file) - some with mantic, some with jammy, all with the updated "edge/lp-2009141" installer.

Things definitely work better on jammy (22.04.3), but regressed in mantic.

Have we made changes in the area of udev between jammy and mantic?

(I found LP#2016908, but it's Fix Released.)

One new DASD feature came in with mantic, the 'DASD auto-quiesce support'.
I'll investigate if this could cause issues ...

Frank Heimes (fheimes) wrote :

My observation obviously doesn't fit the initial bug report, since things worked nicely for me on jammy.
But meanwhile jammy was updated quite a lot.
Initial report was with 5.15.0-43, I used 22.04.3 with 5.15.0-84.

I should probably also check lunar, but the GA tests on lunar were fine (done by me and Solutions QA independently).

Frank Heimes (fheimes) wrote :

I did a quick check on lunar.
And things look okay there too, similar to jammy - pretty smooth, I haven't seen any udev showers or loops.

Dan Bungert (dbungert) wrote :

Sample script to partition a disk with many partitions.

summary: - [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown
- error. An error occurred during installation
+ subiquity fails to handle a large burst of udev events
Dan Bungert (dbungert)
description: updated
description: updated
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Triaged → In Progress
Dan Bungert (dbungert) wrote :

The fix here has been committed; mantic daily live-server images starting Sept-28 should have this change.

Changed in subiquity (Ubuntu):
status: In Progress → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Dan Bungert (dbungert)
tags: removed: foundations-todo
Frank Heimes (fheimes) wrote :

Looks like with the latest installer - as it is included in the mantic daily from 2023-09-28 - this seems to be solved (tested so far on z/VM only, will do some testing on LPAR as well).
More details at LP#2037569, comment #3:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2037569/comments/3

Etienne URBAH (eurbah) wrote :

https://bugs.launchpad.net/subiquity/+bug/2038239 is probably a duplicate of this issue.

If this is really the case, then the issue is NOT resolved yet.

Frank Heimes (fheimes) wrote :

Hi Etienne, I can't open that link - is it the correct URL or is the bug marked as private?
And on top - what are the details of the system (Ubuntu release, architecture, etc.) where you are still seeing this issue?

Dan Bungert (dbungert) wrote :

@Frank - I expect bug 2038239 is a duplicate of this one. It was tested with an ubuntu-desktop-installer build that was done before picking up this fix.

Newer subiquity builds do not use pyudev poll().

Traceback from that bug follows:

Traceback (most recent call last):
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/snap/hostname-desktop-installer/1243/bin/subiquity/subiquity/server/controllers/filesystem.py", line 1496, in _udev_event
    action, dev = self._monitor.receive_device()
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3/dist-packages/pyudev/monitor.py", line 393, in receive_device
    device = self.poll()
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3/dist-packages/pyudev/monitor.py", line 354, in poll
    if eintr_retry_call(poll.Poll.for_events((self, 'r')).poll, timeout):
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3/dist-packages/pyudev/_util.py", line 159, in eintr_retry_call
    return func(*args, **kwargs)
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3/dist-packages/pyudev/_os/poll.py", line 94, in poll
    return list(
  File "/snap/hostname-desktop-installer/1243/usr/lib/python3/dist-packages/pyudev/_os/poll.py", line 110, in _parse_events
    raise IOError('Error while polling fd: {0!r}'.format(fd))
OSError: Error while polling fd: 23

Frank Heimes (fheimes) wrote :

Based on my testing today (using the mantic daily from Oct 3rd) I think this is solved.
( For more details see: https://bugs.launchpad.net/ubuntu-z-systems/+bug/2037569/comments/12 )

Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Etienne URBAH (eurbah) wrote :

With https://cdimage.ubuntu.com/daily-live/20231004/mantic-desktop-amd64.iso :

Subiquity snap version is 1245, and this issue seems to have been fixed.

Olivier Gayot (ogayot)
Changed in subiquity (Ubuntu):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin2204
removed: targetmilestone-inin---