Bug #2009141 “subiquity fails to handle a large burst of udev ev...” : Bugs : subiquity package : Ubuntu

bugproxy (bugproxy) wrote on 2023-03-03: host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer

#1

host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer (1.1 MiB, text/plain)

Default Comment by Bridge

tags:	added: architecture-s39064 bugnameltc-201751 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee:	nobody → Skipper Bug Screeners (skipper-screen-team)
affects:	ubuntu → linux (Ubuntu)

Frank Heimes (fheimes) wrote on 2023-03-03: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#2

Please can you check if there is a crash file in /var/crash and if so share this
and ideally the entire /var/log folder (or at least /var/log/installer)? Thx.
We don't have a 32Gbit FCP adapter to test with.

affects:	linux (Ubuntu) → subiquity (Ubuntu)
Changed in subiquity (Ubuntu):
assignee:	Skipper Bug Screeners (skipper-screen-team) → nobody
Changed in ubuntu-z-systems:
assignee:	nobody → Skipper Bug Screeners (skipper-screen-team)

bugproxy (bugproxy) wrote on 2023-03-03: Comment bridged from LTC Bugzilla

#3

------- Comment From <email address hidden> 2023-03-03 12:55 EDT-------
(In reply to comment #11)
> Please can you check if there is a crash file in /var/crash and if so share
> this
> and ideally the entire /var/log folder (or at least /var/log/installer)? Thx.
> We don't have a 32Gbit FCP adapter to test with.

In response to the request for the /var/crash and /var/log files:

It is the subiquity installer that is exiting with an error.

After 0.0.100d comes online, the installer exits with an error,
i.e.
Sorry, an unknown error occurred

There are no /var/crash or /var/log folders or files at this point, since the subiquity installer exited with an error.

bugproxy (bugproxy) wrote on 2023-03-03: subiquity ssh terminal text log from the script tool

#4

subiquity ssh terminal text log from the script tool (68.8 KiB, application/gzip)

------- Comment on attachment From <email address hidden> 2023-03-03 13:04 EDT-------

not sure the existing attachment of the installer console log was mirrored, so I attach it again

Frank Heimes (fheimes) wrote on 2023-03-03: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#5

Yes, I understood that the installer is crashing here
and that the ssh connection might have been dropped.
But one can usually reconnect via ssh,
go via the top-right menu to the installer shell (or I think via Alt-F2),
and navigate to the folders /var/crash and /var/log and even pack and scp them - that was the hope.
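
As an illustration of the "pack" step, here is a minimal Python sketch that bundles the requested folders into one gzip-compressed tarball which could then be copied off with scp. The helper name `pack_debug_dirs` and the archive path are made up for this example, and it assumes a Python interpreter is available in the installer shell.

```python
import os
import tarfile


def pack_debug_dirs(dirs, archive_path):
    """Pack every existing directory in `dirs` into one .tar.gz archive.

    Returns the list of directories that were actually added; missing
    directories are skipped silently.
    """
    added = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for d in dirs:
            if os.path.isdir(d):
                # Store entries without the leading slash, e.g. var/crash/...
                tar.add(d, arcname=d.lstrip("/"))
                added.append(d)
    return added


# Hypothetical usage from the installer shell:
#   pack_debug_dirs(["/var/crash", "/var/log/installer"],
#                   "/tmp/installer-debug.tar.gz")
```

Skipping missing directories keeps the helper usable even when /var/crash does not exist yet.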

bugproxy (bugproxy) wrote on 2023-03-13: Comment bridged from LTC Bugzilla

#7

------- Comment From MAIER@de.ibm.com 2023-03-13 08:25 EDT-------
(In reply to comment #11)
> We don't have a 32Gbit FCP adapter to test with.

I doubt this is related to a particular hardware as the HBAs have the same programming interface to the zfcp device driver.

(In reply to comment #15)
> Yes, I understood that the installer is crashing here
> and that the ssh connection might have been dropped.

I'm not sure the ssh connection dropped, as I see the installer TUI and installer debug data after the reported error, so I suppose the connection remained.

> But one can usually reconnect via ssh,
> go via the top-right menu to the installer shell

>(or I think via Alt-F2),

There are no Linux Virtual Terminals on s390, so I guess that key combo won't work.

> and navigate to the folders /var/crash and /var/log and even pack and scp
> them - that was the hope.

(In reply to comment #6)
> Installation of Ubuntu 22.04 on s390x failed with an unknown error just
> after having successfully activated a zfcp HBA with Fibre-Channel-attached
> SCSI disks.
>
> I do see 0.0.100d successfully being online and having paths attached:

> [host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt lines 41-41/9373
> byte 238696/1146881 21%]

The attachment should already contain a lot of debug data.

It was collected with the script tool, recording the ssh session during the installation attempt. Unfortunately, there is no timing information, so "scriptreplay" does not work. The 'best' thing I could come up with is using "less -R" to at least render the ANSI color escape sequences instead of cluttering the output with escaped escape sequences. With a terminal width of 167 columns, found by trial-and-error resizing, it is somewhat readable for me, though with a column offset, and it is very slow to render with massive CPU consumption because one very long line is continuously re-wrapped by the less pager.
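
An alternative to paging through the raw typescript is to strip the terminal escape sequences first. The following is a minimal sketch, not a complete terminal emulator: it assumes the log mostly contains CSI sequences (colors, cursor movement) and OSC sequences (window titles), and the name `strip_ansi` is chosen only for this example.

```python
import re

# CSI: ESC [ parameter bytes, intermediate bytes, one final byte.
# OSC: ESC ] ... terminated by BEL.
ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07]*\x07"
)


def strip_ansi(text):
    """Return `text` with CSI and OSC escape sequences removed."""
    return ANSI_RE.sub("", text)
```

Cursor-positioning sequences are dropped rather than interpreted, so redrawn TUI screens still appear concatenated; the result is searchable text, not a faithful replay.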

> But immediately after that, the installer reports an error:
>
>                    An error occurred during installation

> +------------------------------------------------------------------------+
> |  Sorry, an unknown error occurred.                                     |
> |  Information is being collected from the system that will help the     |
> |  developers diagnose the report.                                       |

> |                       [ View full report      ]                        |
> |  If you want to help improve the installer, you can send an error      |
> |  report.                                                               |
> |                       [ Send to Canonical     ]                        |
> |                       [ Close report          ]                        |
>
> Next "View full report" was selected?

The attached file contains the installer debug data starting at line 42.

I hope this is the same information as would be in "Send to Canonical", which is not necessarily an option on s390 potentially not having a sufficient internet connection for security reasons.

> ProblemType: Bug
> Architecture: s390x
> CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
> CurrentDmesg:

> Date: Wed Jan 18 21:04:03 2023
> DistroRelease: Ubuntu 22.04
> ExecutablePath:
> /snap/subiquity/3699/lib/python3.8/site-packages/subiquity/cmd/server.py
> InstallerServerLog:
>  2023-01-18 20:23:57,901 INFO subiquity:112 Starting Subiquity server
> revision 3699

>  2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
>  Traceback (most recent call last):
>    File "/snap/subiquity/3699/usr/lib/python3.8/asyncio/events.py", line 81, in _run
>      self._context.run(self._callback, *self._args)
>    File
> "/snap/subiquity/3699/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 506, in _udev_event
>      action, dev = self._monitor.receive_device()
>    File
> "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 397, in receive_device
>      device = self.poll()
>    File
> "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 357, in poll
>      if eintr_retry_call(poll.Poll.for_events((self, 'r')).poll, timeout):
>    File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_util.py", line 163, in eintr_retry_call
>      return func(*args, **kwargs)
>    File
> "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 97, in poll
>      return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
>    File
> "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 112, in _parse_events
>      raise IOError('Error while polling fd: {0!r}'.format(fd))
>  OSError: Error while polling fd: 20
>  2023-01-18 21:04:03,116 DEBUG subiquitycore.common.errorreport:384 generating crash report
>  2023-01-18 21:04:03,116 INFO subiquitycore.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1674075843.116781473.unknown.crash

Ah, I missed that: "/var/crash/1674075843.116781473.unknown.crash"

In contrast to my expectations, /var/crash/ of Ubuntu can contain debug data which are not kdumps.

Frank, does such a file really contain additional information that is not already in the attached installer debug data?

If this installer debug data is not sufficient, I would consider it an independent bug that should be fixed.

@finnegan@us.ibm.com, any chance you could get such additional debug information?

Frank, could we nonetheless involve a subiquity developer to help understand the python traceback in order to get closer to a root cause?
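
For readers following the traceback: the final frame is pyudev's `_parse_events`, which raises when poll() reports a problem with the file descriptor. The sketch below is not pyudev's code. It is a minimal illustration, assuming the failure corresponds to an error flag such as POLLERR in the poll event mask (the report itself does not say which flag was set). `decode_poll_mask` and `check_events` are hypothetical names.

```python
import select

# Poll flags from the standard library, in a fixed order for display.
_FLAGS = {
    "POLLIN": select.POLLIN,
    "POLLOUT": select.POLLOUT,
    "POLLERR": select.POLLERR,
    "POLLHUP": select.POLLHUP,
    "POLLNVAL": select.POLLNVAL,
}


def decode_poll_mask(mask):
    """Return the names of the flags set in a poll() event mask."""
    return [name for name, bit in _FLAGS.items() if mask & bit]


def check_events(events):
    """Take (fd, mask) pairs as returned by select.poll().poll().

    Returns the readable fds, or raises OSError if any fd reports an
    error condition (assumed here to be POLLERR or POLLNVAL).
    """
    readable = []
    for fd, mask in events:
        if mask & (select.POLLERR | select.POLLNVAL):
            raise OSError("Error while polling fd: {0!r}".format(fd))
        if mask & select.POLLIN:
            readable.append(fd)
    return readable
```

A caller that wants to survive such a condition could catch the OSError around the receive call instead of letting it reach the top-level handler; whether that is the right fix for subiquity is not established in this thread.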

> InstallerServerLogInfo:

Package: subiquity 3699
ProbeData:

ProcCpuinfo:

SnapChannel:
SnapRevision: 3699
SnapUpdated: False
SnapVersion: 22.07.2
SourcePackage: subiquity
Title: unknown error crashed with OSError
Traceback:

==> the same as above

> UdevDb:

Uname: Linux 5.15.0-43-generic s390x

> Canonical, why did the installer get an error?
>
> Does the installer really have a busy(!) waiting loop calling udevadm settle
> with zero timeout?

Maybe it's not busy and is instead driven by a poll/select loop (with its own timeout/sleep) in the installer, and is therefore called with zero timeout.
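
To make the distinction concrete, here is a hedged sketch of a non-busy variant: the settle check is run with zero timeout (so it returns immediately) and the loop sleeps between attempts, yielding to the event loop. This is not taken from the subiquity source; `wait_for_settle`, the interval and the retry limit are assumptions for illustration only.

```python
import asyncio


async def wait_for_settle(cmd=("udevadm", "settle", "--timeout=0"),
                          interval=0.1, max_tries=50):
    """Run `cmd` until it exits with status 0.

    Sleeps `interval` seconds between attempts so the event loop can do
    other work. Returns the number of attempts used, or None if
    `max_tries` attempts all failed.
    """
    for attempt in range(1, max_tries + 1):
        proc = await asyncio.create_subprocess_exec(*cmd)
        if await proc.wait() == 0:
            return attempt
        # Not settled yet: give control back instead of spinning.
        await asyncio.sleep(interval)
    return None
```

With `--timeout=0`, udevadm settle only checks whether the event queue is empty and returns immediately, so the waiting happens in `asyncio.sleep` rather than in a blocking call.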

> But even if so, with the number of discovered devices and the settle finally
> returning with success errorlevel 0, it should just work?

Frank Heimes (fheimes) wrote on 2023-03-13: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#9

Hi, yes, I also do not really believe that it's related to the 32Gbit adapters themselves.

Ah, ok, your ssh sessions stayed up, I see.
(The key shortcuts like Alt-F2, work fine in the installer ssh shell, but not needed in this case since you got dropped to the installer shell anyway.)

Well, since the installer is cross-platform, the "Send to Canonical" option exists as usual,
knowing that the network is not always in place (esp. on s390x)...

Yes, /var/crash/1674075843.116781473.unknown.crash can provide additional information.

The Launchpad bug got marked as affecting 'subiquity' (the installer) and subiquity developers got subscribed to it.

With this data I now noticed that the installer package version is pretty old (and outdated):
SnapRevision: 3699
SnapVersion: 22.07.2
(current is: subiquity - 4383 - 23.02.1)
as well as the kernel version:
Uname: Linux 5.15.0-43-generic s390x
(current is 5.15.0-60-generic)
which makes me think that an outdated ISO image was used (maybe 22.04.1 ?)

Please note that a new 22.04 "point release" makes all previous ones obsolete
AND includes an updated installer (and kernel).
So the latest supported image is 22.04.2:
https://cdimage.ubuntu.com/releases/22.04/release/ubuntu-22.04.2-live-server-s390x.iso

So I recommend giving it another try with this up-to-date image (updated kernel and updated installer).

Changed in subiquity (Ubuntu):
importance:	Undecided → High
Changed in ubuntu-z-systems:
importance:	Undecided → High

Dan Bungert (dbungert) wrote on 2023-03-13: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#11

A retest with 22.04.2 is a good idea.
If that fails similarly, I would appreciate a tarball of the contents of /var/log/installer.

Changed in subiquity (Ubuntu):
status:	New → Incomplete

Frank Heimes (fheimes) on 2023-03-29

Changed in ubuntu-z-systems:
status:	New → Incomplete

bugproxy (bugproxy) wrote on 2023-04-05: ubuntu-22.04.2_installer_04052023_1330_mst_1.txt

#18

ubuntu-22.04.2_installer_04052023_1330_mst_1.txt (1.0 MiB, text/plain)

------- Comment on attachment From <email address hidden> 2023-04-05 18:23 EDT-------

The issue still occurs with the ubuntu-22.04.2 installer.
The Linux script file ubuntu-22.04.2_installer_04052023_1330_mst_1.txt contains the output from ssh installer@<ip address>.

The error occurs after selecting zfcp-host 0.0.100d and choosing Enable.

The OSError crash report was saved to /var/crash/1680726785.177204132.unknown.crash

bugproxy (bugproxy) wrote on 2023-04-05: View_full_report_1680726785.177204132.unknown.crash.txt

#19

View_full_report_1680726785.177204132.unknown.crash.txt (365.4 KiB, text/plain)

------- Comment on attachment From finnegan@us.ibm.com 2023-04-05 18:28 EDT-------

The "View full report" output was copied to the file View_full_report_1680726785.177204132.unknown.crash.txt

The next ssh installer@<ip address> then returns OSError: [Errno 28] No space left on device,
i.e.

Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-60-generic s390x)

* Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

* Introducing Expanded Security Maintenance for Applications.
   Receive updates to over 25,000 software packages with your
   Ubuntu Pro subscription. Free for personal use.

https://ubuntu.com/pro

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status

The list of available updates is more than a week old.
To check for new updates run: sudo apt update

Last login: Wed Apr  5 21:44:58 2023 from 9.11.56.94
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 126, in main
    logger.info("Starting Subiquity revision {}".format(version))
Message: 'Starting Subiquity revision 4383'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 126, in main
    logger.info("Starting Subiquity revision {}".format(version))
Message: 'Starting Subiquity revision 4383'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 127, in main
    logger.info("Arguments passed: {}".format(sys.argv))
Message: "Arguments passed: ['/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py']"
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 127, in main
    logger.info("Arguments passed: {}".format(sys.argv))
Message: "Arguments passed: ['/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py']"
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/screen.py", line 131, in is_linux_tty
    r = fcntl.ioctl(sys.stdout.fileno(), KDGKBTYPE, ' ')
OSError: [Errno 25] Inappropriate ioctl for device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 143, in main
    subiquity_interface = SubiquityClient(opts)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/client/client.py", line 125, in __init__
    if is_linux_tty():
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/screen.py", line 133, in is_linux_tty
    log.debug("KDGKBTYPE failed %r", e)
Message: 'KDGKBTYPE failed %r'
Arguments: (OSError(25, 'Inappropriate ioctl for device'),)
--- Logging error ---
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1089, in emit
    self.flush()
  File "/snap/subiquity/4383/usr/lib/python3.8/logging/__init__.py", line 1069, in flush
    self.stream.flush()
OSError: [Errno 28] No space left on device
Call stack:
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 143, in main
    subiquity_interface = SubiquityClient(opts)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/client/client.py", line 131, in __init__
    super().__init__(opts)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/tui.py", line 72, in __init__
    super().__init__(opts)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/core.py", line 73, in __init__
    self.aio_loop = asyncio.get_event_loop()
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 636, in get_event_loop
    self.set_event_loop(self.new_event_loop())
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 656, in new_event_loop
    return self._loop_factory()
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/unix_events.py", line 54, in __init__
    super().__init__(selector)
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/selector_events.py", line 59, in __init__
    logger.debug('Using selector: %s', selector.__class__.__name__)
Message: 'Using selector: %s'
Arguments: ('EpollSelector',)
connecting... \^CTraceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/snap/subiquity/4383/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/__main__.py", line 5, in <module>
    sys.exit(main())
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/cmd/tui.py", line 150, in main
    subiquity_interface.run()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/client/client.py", line 407, in run
    super().run()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/tui.py", line 381, in run
    super().run()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquitycore/core.py", line 132, in run
    self.aio_loop.run_forever()
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
    self._run_once()
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/base_events.py", line 1823, in _run_once
    event_list = self._selector.select(timeout)
  File "/snap/subiquity/4383/usr/lib/python3.8/selectors.py", line 468, in select
    fd_event_list = self._selector.poll(timeout, max_ev)
KeyboardInterrupt

Frank Heimes (fheimes) on 2023-04-06

Changed in ubuntu-z-systems:
status:	Incomplete → Triaged
Changed in subiquity (Ubuntu):
status:	Incomplete → Triaged

Frank Heimes (fheimes) wrote on 2023-04-06: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#21

Thx for attaching the logs and the crash report, we'll investigate ...

What I'm just wondering about are the '"OSError: [Errno 28] No space left on device"' messages. Is there something with the FCP/SCSI LUN (size) or options to write to it?

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-06: Comment bridged from LTC Bugzilla

#23

------- Comment From finnegan@us.ibm.com 2023-04-06 11:49 EDT-------
The installer reported
zfcp-host
0.0.100d   --> Enable action by user
0.0.110d
0.0.120d
0.0.130d

The installer screen then shows ? Updating... ? after the zfcp-host is enabled by the user

Installer Screen reported:
0.0.100d -> Enable -> ? Updating... ? ---> List of scsi luns reported  ---> "An error occurred during installation"
--
The output displayed a list of SCSI LUNs on zfcp-host 0.0.100d just before the "An error occurred during installation" screen appeared

i.e.
zfcp-host

0x500173800cef0131:0x0000000000000000          sg5
0x500173800cef0131:0x0001000000000000
0x500173800cef013
0000000000          sdg sg7
0x500507605ebff1f1:0x0000000000000000          sde sg4
0x500507630600d6d3:0x4001404300000000          sda sg0
0x500507630600d6d3:0x4004402f00

0x500507680b2541ba:0x0000000000000000          sdl sg12
0x50050768101702e1:0x0000000000000000          sdh sg8
0x50050768101702e1:0x0001000000000000          sdi sg9
0x5005076810170cc9:0x0000000000000000          sdj sg10
0.0.110d                                                 ?
0.0.120d                                                 ?
0.0.130d

The log file shows chzdev and lszdev being issued for 0.0.100d

i.e.
2023-04-05 20:33:00,208 DEBUG subiquitycore.utils:92 arun_command called: ['chzdev', '--enable', '0.0.100d']
2023-04-05 20:33:01,062 DEBUG subiquitycore.utils:101 arun_command ['chzdev', '--enable', '0.0.100d'] exited with code 0
2023-04-05 20:33:01,062 DEBUG subiquitycore.utils:64 run_command called: ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names']
2023-04-05 20:33:01,129 DEBUG subiquitycore.utils:77 run_command ['lszdev', '--quiet', '--pairs', '--columns', 'id,type,on,exists,pers,auto,failed,names'] exited with code 0
2023-04-05 20:33:01,133 DEBUG root:37 finish: subiquity/Zdev/chzdev_POST: SUCCESS: 200 [{"id": "0.0.0009", "type": "generic-ccw", "on": true, "exists": true, "pers"...
2023-04-05 20:33:01,134 INFO aiohttp.access:233 [05/Apr/2023:20:33:00 +0000] "POST /zdev/chzdev?action=%22enable%22&zdev=%7B%22id%22:+%220.0.100d%22,+%22type%22:+%22zfcp-host%22,+%22on%22:+false,+%22exists%22:+true,+%22pers%22:+false,+%22auto%22:+false,+%22failed%22:+false,+%22names%22:+%22%22%7D HTTP/1.1" 200 4435 "-" "Python/3.8 aiohttp/3.6.2"

Later reports 2023-04-05 20:33:05,174 ERROR subiquity.server.server:424 top level error

i.e.

2023-04-05 20:33:05,174 ERROR subiquity.server.server:424 top level error
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 682, in _udev_event
    action, dev = self._monitor.receive_device()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 400, in receive_device
    device = self.poll()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 358, in poll
    if eintr_retry_call(poll.Poll.for_events((self, "r")).poll, timeout):
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_util.py", line 164, in eintr_retry_call
    return func(*args, **kwargs)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 94, in poll
    return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 109, in _parse_events
    raise IOError("Error while polling fd: {0!r}".format(fd))
OSError: Error while polling fd: 20
2023-04-05 20:33:05,177 INFO subiquity.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1680726785.177204132.unknown.crash
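
For readers following the traceback: the exception comes from pyudev's poll wrapper, which raises when poll() reports an error condition on the monitor's file descriptor (fd 20 here). Below is a minimal sketch of that kind of check, assuming Linux poll semantics. The helper name classify_poll_events is hypothetical and is not part of pyudev or subiquity. A plausible trigger, in line with this bug's title, is an overrun of the udev netlink socket during a burst of events, but that is an assumption and not something this log proves.

```python
import select


def classify_poll_events(events):
    """Split (fd, mask) pairs, as returned by select.poll().poll(),
    into readable fds and fds that reported an error condition."""
    readable, failed = [], []
    for fd, mask in events:
        # An error flag wins even if data is also pending on the fd.
        if mask & (select.POLLERR | select.POLLNVAL):
            failed.append(fd)
        elif mask & select.POLLIN:
            readable.append(fd)
    return readable, failed


if __name__ == "__main__":
    # Synthetic event list: fd 20 mimics the failing monitor socket.
    sample = [(20, select.POLLERR), (7, select.POLLIN)]
    print(classify_poll_events(sample))
```

A caller that separates the two cases like this could log and recover from a failed fd instead of letting the exception reach the top-level error handler on every wake-up.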

Revision history for this message

Frank Heimes (fheimes) wrote on 2023-04-06: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#25

attempt_to_recreate.txt (20.7 KiB, text/plain)

I tried to re-create this (or at least a similar situation) on our system, but I have to admit that I do not have the exact same hardware.
I do have a PR/SM system (not a DPM box) with FICON Express 16S (no 32S, but the driver is the same for both) and a DS8000 disk storage subsystem. I tried this on a z/VM guest that has one 64Gbit LUN attached via two HBAs, each with two paths. (DASD ECKD devices were also available, but not used at all.)
With that I could successfully complete an Ubuntu Server 22.04.2 installation.
Please see the attached doc for the relevant storage related installer screens and at the end the lszdev output from the installer shell.

Unfortunately I don't have an XIV storage system (I know that it behaves slightly differently compared to a DS8k).

And you already retried on 22.04.2 with the same result you've reported.

And you did a (standard) interactive install (and no 'autoinstall'), right?

May I also ask whether your system is a DPM system?
And if so, could you please check whether auto-configuration data is set, e.g. with 'lszdev --auto-conf'?

I have also run into a special situation where a disk had been used previously and still carried an LVM configuration that the installer tried to read but struggled with, or where a very old LVM setup existed whose metadata had meanwhile become incompatible.
To rule out issues like this, I also recommend manually wiping the disk:
Start the installer (which fortunately is an Ubuntu live system), enable the FCP LUN (either in the UI or in an installer shell, which can be reached via the help menu or with Control-Z or 'F2') and wipe the disks like this:
# ls -la /dev/mapper/
control mpatha mpatha-part1
# wipefs -a -f /dev/mapper/mpatha-part1
# wipefs -a -f /dev/mapper/mpatha
( # or go via the scsi device:
# wipefs -a -f /dev/sda
# wipefs -a -f /dev/sda1 )
Afterwards, the installer needs to be restarted from scratch (Load task).

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-06: host_installer_shell_cmds_04062023_1.txt

#28

host_installer_shell_cmds_04062023_1.txt (37.9 KiB, text/plain)

------- Comment on attachment From finnegan@us.ibm.com 2023-04-06 17:43 EDT-------

Refer to the attached file host_installer_shell_cmds_04062023_1.txt for DPM details, collected from commands executed in the installer shell

Note: after chzdev --enable, 1969 files were generated in /var/crash, filling up /
i.e.
ls -l /var/crash
chzdev --enable 0.0.100d
lszdev
ls -l | grep unknown | wc -l
1969

i.e.
  45590 Apr  6 12:17 1680808641.142762184.unknown.crash.gz
     90 Apr  6 12:17 1680808641.142762184.unknown.meta.gz
  45751 Apr  6 12:17 1680808641.190968275.unknown.crash.gz
     77 Apr  6 12:17 1680808641.190968275.unknown.meta.gz
.
.
.
 142575 Apr  6 12:18 1680808725.929860353.unknown.crash.gz
     77 Apr  6 12:18 1680808725.929860353.unknown.meta.gz
 162995 Apr  6 12:18 1680808726.069142342.unknown.crash.gz
     77 Apr  6 12:18 1680808726.069142342.unknown.meta.gz
 141831 Apr  6 12:18 1680808726.164782763.unknown.crash.gz
     77 Apr  6 12:18 1680808726.164782763.unknown.meta.gz

Includes
lszdev --auto-conf
multipath -l
lszfcp
fdisk -l /dev/mapper/xxxxx
ls -l /var/crash | wc -l   <===============   Quantity 1969 files generated filling up /

root@ubuntu-server:~# df -T
Filesystem     Type    1K-blocks     Used Available Use% Mounted on
tmpfs          tmpfs     3294672   305604   2989068  10% /run
/dev/loop0     iso9660   1149188  1149188         0 100% /cdrom
/cow           overlay  16473348 16473348         0 100% /
overlay        overlay    292992   292992         0 100% /media/filesystem
tmpfs          tmpfs    16473348        0  16473348   0% /dev/shm
tmpfs          tmpfs        5120        0      5120   0% /run/lock
tmpfs          tmpfs    16473348        0  16473348   0% /tmp
tmpfs          tmpfs     3294668        4   3294664   1% /run/user/1000
overlay        overlay    292992   292992         0 100% /tmp/tmpcsrrjbgt/root.dir
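
To see how much of / the reports occupy, the file count above can be paired with a byte total. Here is a small sketch using only the Python standard library. The helper name summarize_crash_dir is illustrative, and its "unknown" filename filter simply mirrors the grep used above.

```python
import os


def summarize_crash_dir(path, marker="unknown"):
    """Return (file_count, total_bytes) for regular files in `path`
    whose name contains `marker`."""
    count = 0
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file() and marker in entry.name:
                count += 1
                total += entry.stat().st_size
    return count, total


if __name__ == "__main__":
    crash_dir = "/var/crash"  # directory used by the installer in this report
    if os.path.isdir(crash_dir):
        files, size = summarize_crash_dir(crash_dir)
        print(f"{files} files, {size / 1024:.0f} KiB")
```

Run from the installer shell, this gives the same count as the ls/grep/wc pipeline plus the space those files consume.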

In response to
> try manually wiping out the disk that have old LVM

Some of the luns are older OS Boot luns similar to lun mpathb (Boot lun for ubuntu 20.04).

Wiping these luns is not an option, as they are needed for ongoing tests and/or support

I will use a work-around and unmap such luns from the host during a new OS install, so that a new lun (i.e. for 20.04.2) can be installed

Another OS Boot lun is 20.04.1 on zfcp-host 120d/130d [not enabled for latest run]

Boot lun for ubuntu 20.04
lun mpathb (360050762198c1fc2180000000b000132) = (Boot lun for ubuntu 20.04)

mpathb (360050762198c1fc2180000000b000132) dm-1 IBM,FlashSystem-9840
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  |- 0:0:1:0 sdc 8:32  active undef running
  `- 0:0:4:0 sdi 8:128 active undef running

fdisk -l /dev/mapper/mpathb
Disk /dev/mapper/mpathb: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: D3739269-492B-463E-927C-5441496FB8F3

Device                     Start       End   Sectors Size Type
/dev/mapper/mpathb-part1    2048   2099199   2097152   1G Linux filesystem
/dev/mapper/mpathb-part2 2099200 104855551 102756352  49G Linux filesystem

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-06: var_log_installer_04062023_1.tar.gz

#30

var_log_installer_04062023_1.tar.gz (42.4 MiB, application/x-gzip)

------- Comment on attachment From <email address hidden> 2023-04-06 17:51 EDT-------

For case of enabling zfcp-host 0.0.100d in the shell

chzdev --enable 0.0.100d
lszdev
ls -l | grep unknown | wc -l
1969

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-06: var_crash_04062023_a.tar.gz

#32

var_crash_04062023_a.tar.gz (957.2 KiB, application/x-gzip)

------- Comment on attachment From <email address hidden> 2023-04-06 18:08 EDT-------

For case of enabling zfcp-host 0.0.100d in the shell

Included 42 of the 1969 files in this attachment

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-07: Comment bridged from LTC Bugzilla

#34

------- Comment From <email address hidden> 2023-04-07 12:36 EDT-------
Comment on attachment 157094
subiquity ssh terminal text log from the script tool

This attachment is a duplicate of attachment 157095 and seems to cause a hiccup in the IBM-Canonical bridge. As a result, the attachment comment is appended to the Launchpad entry over and over again every 24 hours.
By deleting the attachment, I am trying to fix this problem.

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-12:

#35

------- Comment From MAIER@de.ibm.com 2023-04-12 13:24 EDT-------
(In reply to comment #25)
> Thx for attaching the logs and the crash report, we'll investigate ...
>
> What I'm just wondering about are the '"OSError: [Errno 28] No space left on
> device"' messages. Is there something with the FCP/SCSI LUN (size) or
> options to write to it?

Looking at the log, the installer has not made much progress yet. We just successfully(!) probed a few SCSI disks, but haven't configured any partitioning, let alone mount points. I take it that the installer must not write to any real disk at that point in time. So ENOSPC cannot come from zfcp-attached SCSI disks. Let's not get hung up on zfcp or on different FCP-attached storage arrays (DS8000, XIV, FlashSystem, etc.); they all present standard SCSI disks for which the common code Linux kernel driver sd_mod provides regular block devices; nothing special about this at all.

BTW, the kernel boot parameters look odd:

[    0.440956] Kernel command line: @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

@finnegan@us.ibm.com, what exact parm file content did you use to boot the installer?
Since the network interface ence0f appears without DPM auto conf being used and I don't see it being configured interactively in the installer, I wonder where its ccwgroup configuration came from. Maybe the parm file had enough leading zeros to get truncated during kernel console output, but maybe there was some boot parameter to group 0.0.0e0f,0.0.0e10,0.0.0e11 for ence0f?

(In reply to comment #29)
> host_installer_shell_cmds_04062023_1.txt
> Note after chzdev --enable,  Quantity 1969 files generated in /var/log/crash
> filling up /
> ls -l /var/crash
> ls -l | grep unknown | wc -l
> 1969
...
>  142575 Apr  6 12:18 1680808725.929860353.unknown.crash.gz
>      77 Apr  6 12:18 1680808725.929860353.unknown.meta.gz
>  162995 Apr  6 12:18 1680808726.069142342.unknown.crash.gz
...

> root@ubuntu-server:~# df -T
> Filesystem     Type    1K-blocks     Used Available Use% Mounted on
> /cow           overlay  16473348 16473348         0 100% /
> overlay        overlay    292992   292992         0 100% /media/filesystem
> overlay        overlay    292992   292992         0 100% /tmp/tmpcsrrjbgt/root.dir

I see ENOSPC also when the installer tries to log something. I assume this must happen towards some space in the ramdisk the installer runs within.
There seem to be a number of (too many?) installer "crash" files under /var/crash likely on the completely filled up overlay-fs.
Those "crash" files are neither created by chzdev nor lszdev.

However, if I read the logs correctly, these debug data files consuming too much space only get generated due to other earlier python tracebacks from subiquity. IOW, ENOSPC (or EMFILE) errors are just misleading follow-on errors.

The very first one of those tracebacks happens on udev settle for the network device (before any zfcp devices):

2023-04-06 19:17:21,023 DEBUG subiquity.server.controllers.filesystem:671 waiting 0.1 to let udev event queue settle
2023-04-06 19:17:21,124 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
2023-04-06 19:17:21,139 ERROR subiquity.server.server:424 top level error
Traceback (most recent call last):
  File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 682, in _udev_event
    action, dev = self._monitor.receive_device()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 400, in receive_device
    device = self.poll()
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 358, in poll
    if eintr_retry_call(poll.Poll.for_events((self, "r")).poll, timeout):
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_util.py", line 164, in eintr_retry_call
    return func(*args, **kwargs)
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 94, in poll
    return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
  File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 109, in _parse_events
    raise IOError("Error while polling fd: {0!r}".format(fd))
OSError: Error while polling fd: 20
2023-04-06 19:17:21,142 DEBUG subiquity.common.errorreport:384 generating crash report
2023-04-06 19:17:21,143 INFO subiquity.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1680808641.142762184.unknown.crash

This repeats (at 10Hz ?) often enough in the manual udev settle loop and each iteration creates one of those "crash" files of considerable size.
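
As a rough cross-check of the "10Hz ?" estimate, the numbers from the /var/crash listing quoted earlier in this thread can be put together. This is only a sketch under two assumptions: that every report produces two files (.crash.gz and .meta.gz), and that the first and last file names in the listing bound the burst.

```python
# Figures copied from the /var/crash listing quoted earlier in this thread.
first_report = 1680808641.142762184  # timestamp in the first listed file name
last_report = 1680808726.164782763   # timestamp in the last listed file name
files_matching_unknown = 1969        # output of: ls -l | grep unknown | wc -l

# Assumption: each crash report contributes a .crash.gz and a .meta.gz file.
reports = files_matching_unknown // 2
elapsed_seconds = last_report - first_report
rate_hz = reports / elapsed_seconds

print(f"{reports} reports in {elapsed_seconds:.1f} s, about {rate_hz:.1f} per second")
```

Under those assumptions the rate comes out at roughly 11 to 12 reports per second, which is in the same range as the 0.1 s settle interval seen in the log.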

Unfortunately, 1680808641.142762184.unknown.crash.gz and 1680808641.142762184.unknown.meta.gz (the other crash files show the same traceback as above) do _not_ contain any more debug data than we had originally provided (what subiquity printed on the console when asking it to show debug data for the "unknown error").

So coming back to:
(In reply to comment #16)
> > Canonical, why did the installer get an error?
> >
> > Does the installer really have a busy(!) waiting loop calling udevadm settle
> > with zero timeout?
>
> Maybe it's not busy and instead driven by a poll/select loop (with its own
> timeout/sleep) in the installer and therefore called with zero timeout.
>
> > But even if so, with the number of discovered devices and the settle finally
> > returning with success errorlevel 0, it should just work?

Maybe you could share a link to the corresponding source code of subiquity performing the udev settle?

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-14:

#36

------- Comment From <email address hidden> 2023-04-13 20:16 EDT-------
(In reply to comment #34)
@<email address hidden>

> BTW, the kernel boot parameters look odd:
>
> [ 0.440956] Kernel command line:
> @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @@@
>
> @<email address hidden>, what exact parm file content did you use to boot the
> installer?

I started out with the following, but the syntax of my line was not accepted, as I was getting "Unknown kernel command line parameters".
I need to check the proper syntax and sequence for the kernel command line input to put in the file PARMFILE UBUNTU.

PARMFILE UBUNTU
ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

i.e.
[ 0.452018] Kernel command line: ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
[ 0.452037] Unknown kernel command line parameters "ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25", will be passed to user space

So I chose to go with the default PARMFILE UBUNTU, containing just the "---" line, and entered the information when prompted, in line with the statement "will be passed to user space".
i.e.
Default PARMFILE UBUNTU
---

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-14:

#37

------- Comment From MAIER@de.ibm.com 2023-04-14 07:02 EDT-------
(In reply to comment #35)
> (In reply to comment #34)
> > BTW, the kernel boot parameters look odd:
> >
> >  [    0.440956] Kernel command line:
> > @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > @@@
> >
> > @finnegan@us.ibm.com, what exact parm file content did you use to boot the
> > installer?
>
> I started out with the following, but my syntax sequence for the line was
> not accepted , as was getting "Unknown kernel command line parameters"

Unfortunately, this kernel message can be misleading.

The Linux "kernel" command line can contain different types of parameters:
(1) actual kernel parameters
(2) arbitrary user space parameters

See also Section "Parameters other than kernel parameters" in https://www.ibm.com/docs/en/linux-on-systems?topic=skp-different-sources-1#ipl_kernparm_conflicts__title__4
and
https://www.ibm.com/docs/en/linux-on-systems?topic=s-kernel-parameter-line-1

Recently, the kernel started to check the syntax for (1). So any parameters from class (2) [before an optional separator "--"] get reported as unknown, but user space can still use them without any problem.

You can specify all non-kernel parameters from class (2) after a "--" separator on the kernel command line like this to avoid misleading kernel messages [see also below]:
<all actual kernel parameters> -- <any non-kernel parameters>
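
As a minimal sketch of that layout (not taken from the installer or the kernel), the following Python helper assembles a command line from two lists. The function name `build_cmdline` and the sample parameter values are hypothetical and only serve as illustration.

```python
def build_cmdline(kernel_params, user_params):
    """Join kernel parameters and user-space parameters with a "--" separator."""
    parts = list(kernel_params)
    if user_params:
        # Only emit the separator when there is something to put after it.
        parts.append("--")
        parts.extend(user_params)
    return " ".join(parts)


# Hypothetical example values, chosen only for illustration.
line = build_cmdline(["loglevel=7"], ["url=http://example.invalid/install.iso"])
print(line)
```

Keeping the two classes in separate lists makes it harder to place a user-space parameter in front of the separator by accident.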

> I need to check the proper syntax and sequence for the Kernel command line
> input to put in file PARMFILE UBUNTU
>
> PARMFILE UBUNTU
>  ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

Specifying both a CIDR prefix length and a dotted-decimal netmask seems odd:
24:255.255.255.0

I'm not sure which syntax the Ubuntu installer initrd actually uses.
There is a (user space) dracut cmdline option
https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_network
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<mtu>][:<macaddr>]]
ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<client_hostname>:<interface>:{none|off|dhcp|on|any|dhcp6|auto6|ibft}[:[<dns1>][:<dns2>]]
And there is an actual kernel parameter
https://www.kernel.org/doc/html/latest/admin-guide/nfs/nfsroot.html#kernel-command-line
ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>

Comparing with my last Ubuntu on s390x installation, I would probably use this instead (dropping ":24"), no matter if used as kernel boot parameter or entered interactively (in case the latter understands ip= syntax, which I don't know):

ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
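
To make the effect of the extra ":24" visible, here is a small Python sketch that labels the colon-separated values using the field order of the kernel ip= parameter quoted above. It is an assumption that this field order applies. The Ubuntu installer initrd may parse the string differently, and the helper name `label_ip_fields` is hypothetical.

```python
# Field order of the kernel's ip= parameter, as quoted from the nfsroot documentation.
IP_FIELDS = [
    "client-ip", "server-ip", "gw-ip", "netmask", "hostname",
    "device", "autoconf", "dns0-ip", "dns1-ip", "ntp0-ip",
]


def label_ip_fields(param):
    """Map the colon-separated values of an ip= parameter onto the field names.

    Naive split on ':'; this sketch does not handle IPv6 addresses.
    """
    _, _, value = param.partition("=")
    return dict(zip(IP_FIELDS, value.split(":")))


original = "ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25"
corrected = "ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25"

for name, param in (("original", original), ("corrected", corrected)):
    fields = label_ip_fields(param)
    print(name, {k: fields[k] for k in ("netmask", "hostname", "device", "autoconf")})
```

Under this reading, the original string shifts every value after the gateway by one position, so the netmask lands in the hostname slot and the interface name lands in the autoconf slot.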

FWIW, I had also added an url= pointing to the ISO in my parmfile when I manually installed Ubuntu on s390x the last time, but that might be optional(?):

url=http://bistro/ubuntu/UBUNTUxx.yy/zzz.-live-server-...-s390x.iso

Also, without DPM device auto configuration, I wonder how the network interface ence0f would get configured as a prerequisite. As opposed to PCI(e)-based network devices such as RoCE adapters, CCW-based network devices (the prefix enc stands for EtherNet on Ccw bus) need an explicit s390-specific pre-configuration: assembling a ccwgroup (subchannel/device triplet) and setting that ccwgroup online. I installed under z/VM (so definitely without DPM device auto configuration) and only had ip= and url=, so there must be some magic in the Ubuntu installer initrd that automatically performs this prerequisite for an enc... network interface.

> i.e.
> [    0.452018] Kernel command line:
> ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
> [    0.452037] Unknown kernel command line parameters
> "ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.
> 25", will be passed to user space
>
>
> So I chose to go with the default PARMFILE UBUNTU, with just the "---" line
> and enter the info as requested per the statement "will be passed to
> user space", as it prompts

Sorry if I missed it, as I'm a bit lost in all the debug data. Where can I see this prompt, and what was entered interactively?

> i.e.
> Default PARMFILE UBUNTU
>  ---

ubuntu@bistro:~/UBUNTU22.04.1/CD/boot$ xxd parmfile.ubuntu
00000000: 202d 2d2d 200a                            --- .

@Canonical, I would like to understand the rationale behind this default kernel boot parameter.
I'm puzzled by the triple-dash, because the kernel parameter documentation
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
only seems to mention a double-dash to have the kernel end parsing for kernel parameters and pass the remainder to init:
<quote>
The kernel parses parameters from the kernel command line up to "--"; if it doesn't recognize a parameter and it doesn't contain a '.', the parameter gets passed to init: parameters with '=' go into init's environment, others are passed as command line arguments to init. Everything after "--" is passed as an argument to init.
</quote>
Does that mean the triple dash would (accidentally??) lead to "- " or "-" being passed as command line argument to init? What would be the consequences?
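
The quoted rule can be modelled with a few lines of Python. This is a simplified sketch and not the kernel's actual parser: it splits on whitespace only, it assumes the separator is matched as an exact "--" token, and it files parameters containing a '.' together with the recognized kernel parameters. The function name `classify` and the `known_kernel_params` set are hypothetical.

```python
def classify(cmdline, known_kernel_params):
    """Sort command line tokens into (kernel, init environment, init arguments)."""
    kernel, init_env, init_args = [], [], []
    tokens = cmdline.split()
    for i, tok in enumerate(tokens):
        if tok == "--":
            # Everything after the separator goes to init as arguments.
            init_args.extend(tokens[i + 1:])
            break
        name = tok.split("=", 1)[0]
        if name in known_kernel_params or "." in name:
            # Recognized, or dotted and therefore not handed to init.
            kernel.append(tok)
        elif "=" in tok:
            init_env.append(tok)
        else:
            init_args.append(tok)
    return kernel, init_env, init_args


print(classify(" --- ", set()))
print(classify("loglevel=7 ip=x -- a=b", {"loglevel"}))
```

Under these assumptions, a lone "---" is not treated as the separator; it is an unrecognized token without '=' and would be handed to init as a single argument.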

Also, I don't understand how that default could cause a completely unexpected garbled kernel command line with lots of special characters, and whether that could cause any follow-on problems. Subiquity and other user space actually saw the same garbled content via /proc/cmdline, as visible in the logs (let alone what might be passed to the environment or as arguments of init/PID 1):

> >  [    0.440956] Kernel command line:
> > @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > @@@
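
One observation that may help here, offered as a hypothesis rather than a confirmed root cause: the garbled characters are what the default parmfile content would look like if it were EBCDIC-encoded and then displayed as ASCII. In EBCDIC code page 037, a space is 0x40 ('@' in ASCII), a dash is 0x60 (a backtick in ASCII) and a line feed is 0x25 ('%' in ASCII). The sketch below applies Python's cp037 codec to the first bytes of the logged command line. Which EBCDIC code page is actually involved, and where a conversion happens or fails to happen, is not established in this report.

```python
# First bytes of the garbled command line from the log, written as hex:
# '@' = 0x40, backtick = 0x60, '%' = 0x25.
garbled = bytes.fromhex("40 60 60 60 40 25 40 40 40")

# Interpret those bytes as EBCDIC (code page 037 is an assumption).
decoded = garbled.decode("cp037")
print(repr(decoded))

# Content of the default parmfile according to the xxd dump: 20 2d 2d 2d 20 0a
parmfile = bytes.fromhex("20 2d 2d 2d 20 0a")
print(decoded.encode("ascii").startswith(parmfile))
```

If this reading is right, the long run of '@' characters would simply be EBCDIC space padding after the parmfile content.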

bugproxy (bugproxy) wrote on 2023-04-14:

#38

------- Comment From MAIER@de.ibm.com 2023-04-14 08:43 EDT-------
(In reply to comment #36)
> (In reply to comment #35)
> > (In reply to comment #34)

> > I need to check the proper syntax and sequence for the Kernel command line
> > input to put in file PARMFILE UBUNTU
> >
> > PARMFILE UBUNTU
> >  ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25
>
> I think the duplicate specification of both a CIDR netmask and an IP netmask
> seems odd:
> 24:255.255.255.0

> Comparing with my last Ubuntu on s390x installation, I would probably use
> this instead (dropping ":24"), no matter if used as kernel boot parameter or
> entered interactively (in case the latter understands ip= syntax, which I
> don't know):
>
> ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt.gz seems to show an example with a correct ip syntax, which likely worked:

[    0.360759] Kernel command line: ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25       https://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04.1-live-server-s39
[    0.360781] Unknown kernel command line parameters "ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25", will be passed to user space.

[    1.852616] Run /init as init process
[    1.852617]   with arguments:
[    1.852618]     /init
[    1.852618]   with environment:
[    1.852619]     HOME=/
[    1.852619]     TERM=linux
[    1.852620]     ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25

[    3.638269] qeth: register layer 2 discipline
[    3.638983] qeth 0.0.0e0f: CHID: ffee CHPID: ee
[    3.645883] qeth 0.0.0e11: qdio: OSA on SC 2 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
[    3.675469] qeth 0.0.0e0f: Device is a OSD Express card (level: 0173)
with link type OSD_1000.
[    3.675884] qeth 0.0.0e0f: The device represents a Bridge Capable Port
[    3.677706] qeth 0.0.0e0f: MAC address 02:76:54:00:00:05 successfully registered
[    3.681817] qeth 0.0.0e0f ence0f: renamed from eth0

2023-01-18 20:23:57,906 DEBUG subiquitycore.netplan:109 config for ence0f = {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}
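
The netplan entry uses CIDR notation ('9.11.116.213/24'), while the ip= parameter carried the dotted-decimal netmask 255.255.255.0. The two forms are equivalent, which can be checked with Python's standard ipaddress module. This is only an illustration and makes no claim about how subiquity derives the value.

```python
import ipaddress

# Address and netmask as given in the working ip= parameter.
iface = ipaddress.ip_interface("9.11.116.213/255.255.255.0")

print(iface.with_prefixlen)      # address in CIDR notation
print(iface.network.prefixlen)   # prefix length only
```

This also shows why giving both ":24" and ":255.255.255.0" is redundant: either one alone already determines the other.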

> > So I chose to go with the default PARMFILE UBUNTU, with just the "---" line
> > and enter the info as requested per the statement "will be passed to
> > user space", as it prompts
>
> Sorry, if I missed it as I'm a bit lost in all the debug data. Where can I
> see this prompt and what was interactively entered?

I think I found an example:

ilabg13_9.11.116.213_ubuntu-22.04.2_setup_install_04032023_9.txt

Two methods available for IP configuration:
* static: for static IP configuration
* dhcp: for automatic IP configuration
static dhcp (default 'dhcp'): static
ip: 9.11.116.213
netmask (default 255.255.255.0):
gateway (default 9.11.116.1):
dns (default 9.11.116.1): 9.11.227.25
vlan id (optional):
http://cdimage.ubuntu.com/releases/jammy/release/ubuntu-22.04-live-server-s390x.iso (default)
url: http://cdimage.ubuntu.com/releases/22.04.1/release/ubuntu-22.04.2-live-server-s390x.iso
http_proxy (optional):
Configuring networking...
QETH device 0.0.0e0f:0.0.0e10:0.0.0e11 already configured
Begin: Waiting up to 180 secs for ence0f to become available ... done.
Begin: Trying netboot from 0.0.0.0: ... Begin: Trying to download and mount http://cdimage.ubuntu.com/releases/22.04.1/release/ubuntu-22.04.2-live-server-s390x
iso ... Connecting to cdimage.ubuntu.com (91.189.91.124:80)
ubuntu-22.04.2-live-   0% |                                |  429k  0:49:51 ETA
ubuntu-22.04.2-live-   2% |                                | 28.6M  0:01:16 ETA
ubuntu-22.04.2-live-   5% |*                               | 58.7M  0:00:54 ETA
...
It is possible to connect to the installer over the network, which
might allow the use of a more capable terminal and can offer more languages
than can be rendered in the Linux console.

> > i.e.
> > Default PARMFILE UBUNTU
> >  ---
>
> ubuntu@bistro:~/UBUNTU22.04.1/CD/boot$ xxd parmfile.ubuntu
> 00000000: 202d 2d2d 200a                            --- .
>
> @Canonical, I would like to understand the rationale behind this default
> kernel boot parameter.
> I'm puzzled by the triple-dash, because the kernel parameter documentation
> https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
> only seems to mention a double-dash to have the kernel end parsing for
> kernel parameters and pass the remainder to init:
> <quote>
> The kernel parses parameters from the kernel command line up to "--"; if it
> doesn't recognize a parameter and it doesn't contain a '.', the parameter
> gets passed to init: parameters with '=' go into init's environment, others
> are passed as command line arguments to init. Everything after "--" is
> passed as an argument to init.
> </quote>
> Does that mean the triple dash would (accidentally??) lead to "- " or "-"
> being passed as command line argument to init? What would be the
> consequences?
>
>
> Also, I don't understand how that default could cause a completely
> unexpected garbled kernel command line with lots of special characters, and
> whether that could cause any follow-on problems (subiquity and other user
> space actually saw the same garbled content via /proc/cmdline as visible in
> the logs (let alone what might be passed to the environment or as arguments
> of init/PID1)):
>
> > >  [    0.440956] Kernel command line:
> > > @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > > @@@

ubuntu-22.04.2_installer_04052023_1330_mst_1.txt

[    0.441412] Kernel command line: @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[    0.441442] Unknown kernel command line parameters "@```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@", will be passed to user space.

[    2.187026] Run /init as init process
[    2.187027]   with arguments:
[    2.187028]     /init
[    2.187029]     @```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

>>> Ouch. <<<

[    2.187030]   with environment:
[    2.187030]     HOME=/
[    2.187031]     TERM=linux

[  367.843925] qeth: register layer 2 discipline
[  367.844768] qeth 0.0.0e0f: CHID: ffee CHPID: ee
[  367.852075] qeth 0.0.0e11: qdio: OSA on SC 2 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W
[  367.879859] qeth 0.0.0e0f: Device is a OSD Express card (level: 0175)
with link type OSD_1000.
[  367.880259] qeth 0.0.0e0f: The device represents a Bridge Capable Port
[  367.882083] qeth 0.0.0e0f: MAC address 02:76:54:00:00:11 successfully registered
[  367.887464] qeth 0.0.0e0f ence0f: renamed from eth0

2023-04-05 19:35:06,563 DEBUG subiquity:163 Kernel commandline: CommandLineParams(_raw='@```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\n', _tokens={'@```@%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@:^M@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'}, _values={})

2023-04-05 19:35:06,587 DEBUG subiquitycore.netplan:109 config for ence0f = {'addresses': ['9.11.116.213/24'], 'gateway4': '9.11.116.1', 'nameservers': {'addresses': ['9.11.227.25']}}

bugproxy (bugproxy) wrote on 2023-04-18:

#39

------- Comment From <email address hidden> 2023-04-18 15:04 EDT-------
@Canonical, a general comment on usability: I recommend updating the "zdev" prompt so that it tells users what type of zdev is expected.

What is expected is a QETH network device (0e0f), not a zfcp QDIO device (i.e. 0.0.100d,0.0.110d,0.0.120d,0.0.130d).

Even though a network device is implied, it is not clear to users what type of zdev is being asked for.

The reference to "zdev" is open to interpretation as to what type of z device is requested (i.e. a QETH network device or a zfcp QDIO device).

One variation
i.e.
Attempt interactive netboot from a URL?
yes no (default yes): yes
Available qeth devices:
0.0.0db0 0.0.0dc0
zdev to activate (comma separated, optional): 0.0.100d,0.0.110d,0.0.120d,0.0.130d <============ equals zfcp QDIO device

Second variation
Attempt interactive netboot from a URL?
yes no (default yes):
Available qeth devices:
0.0.0db0 0.0.0dc0 0.0.0e0f
zdev to activate (comma separated, optional): 0e0f <============ equals QETH network device

bugproxy (bugproxy) wrote on 2023-04-19:

#40

------- Comment From MAIER@de.ibm.com 2023-04-19 06:07 EDT-------
(In reply to comment #39)
> @Canonical, A general comment on usability for users. Recommend updating
> reference to "zdev" for users highlighting what type of "zdev" is expected ?
>
> A QETH network device (0e0f) and not a zfcp QDIO device  [i.e.
> 0.0.100d,0.0.110d,0.0.120d,0.0.130d)

Regarding terminology: "zfcp QDIO" sounds odd to me as zfcp maintainer. Both qeth [the q stands for qdio] and zfcp are based on qdio.

I would like to avoid specifying exclusion lists, as they would be incomplete and would need to be maintained. Instead, let's just clearly specify the allowed list of zdev types.

Just as a reference, the list of zdev types that the backend tool knows (not necessarily all of which might be supported by subiquity as frontend):

https://www.ibm.com/docs/en/linux-on-systems?topic=sr-persistent-configuration
https://www.ibm.com/docs/en/linux-on-systems?topic=pc-view-configuration

# lszdev --list-types
TYPE        DESCRIPTION
dasd        FICON-attached Direct Access Storage Devices (DASDs)
dasd-eckd   Enhanced Count Key Data (ECKD) DASDs
dasd-fba    Fixed Block Architecture (FBA) DASDs
zfcp        SCSI-over-Fibre Channel (FCP) devices and SCSI devices
zfcp-host   FCP devices
zfcp-lun    zfcp-attached SCSI devices
qeth        OSA-Express and HiperSockets network devices
ctc         Channel-To-Channel (CTC) and CTC-MPC network devices
lcs         LAN-Channel-Station (LCS) network devices
generic-ccw Generic Channel-Command-Word (CCW) devices

>
> Even though a network device is implied, it is not clear to users what type
> of zdev is being asked for

@Canonical: I would like to understand whether the user dialog really means to imply a network device. It looks like it just passes whatever the user entered to zdev from s390-tools and can thus configure any of the above zdev types, but it also appears in a context that seems to imply a network device.

>
> Reference to "zdev" is open to interpretation as to what type of z device is
> requested
> i.e. QETH network or zfcp QDIO device)
>
> One variation
> i.e.
> Attempt interactive netboot from a URL?
> yes no (default yes): yes

> Available qeth devices:
> 0.0.0db0 0.0.0dc0

Which zdev device types does subiquity support here, and could it ever enumerate available (network) devices of a type other than qeth, such as lcs or ctc?

If so, maybe a suitable terminology would be
"channel-attached network devices to activate (comma separated, optional):"
That would implicitly exclude PCIe-attached RoCE adapters, which don't need such preconfiguration, and it would also implicitly exclude zfcp and other channel-attached device types such as generic-ccw.

If not, then maybe just
"qeth network device to activate (comma separated, optional):"

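Independent of the wording, the "allowed list" idea could also be enforced on the input itself. The following Python sketch checks a comma-separated answer against the devices the prompt has just listed. Both helper names are hypothetical, this is not code from subiquity or the installer initrd, and expanding a short ID such as 0e0f to 0.0.0e0f is an assumption based on the second variation above.

```python
def normalize_bus_id(dev):
    """Expand a short device number such as '0e0f' to the full '0.0.0e0f' form."""
    dev = dev.strip().lower()
    return dev if dev.count(".") == 2 else "0.0." + dev.zfill(4)


def check_selection(user_input, available):
    """Split a comma-separated answer into accepted and rejected device IDs."""
    allowed = {normalize_bus_id(d) for d in available}
    accepted, rejected = [], []
    for item in filter(None, (part.strip() for part in user_input.split(","))):
        bus_id = normalize_bus_id(item)
        (accepted if bus_id in allowed else rejected).append(bus_id)
    return accepted, rejected


available = ["0.0.0db0", "0.0.0dc0", "0.0.0e0f"]
print(check_selection("0e0f", available))
print(check_selection("0.0.100d,0.0.110d,0.0.120d,0.0.130d", available))
```

With such a check, the zfcp device numbers from the first variation would be rejected at the prompt, because they are not among the listed qeth devices.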
> zdev to activate (comma separated, optional):
> 0.0.100d,0.0.110d,0.0.120d,0.0.130d <============ equals zfcp QDIO device
>
>
> Second variation
> Attempt interactive netboot from a URL?
> yes no (default yes):
> Available qeth devices:
> 0.0.0db0 0.0.0dc0 0.0.0e0f
> zdev to activate (comma separated, optional): 0e0f <============ equals QETH
> network device

Frank Heimes (fheimes) wrote on 2023-04-19: Re: [UBUNTU 22.04] OS installer exits for zfcp 32G adapter with an unknown error. An error occurred during installation

#41

After being back from a few days of PTO, I'm catching up on this now ...

I redid my installation (on z/VM with FCP disks, but on a non-DPM system);
here are some details:

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Willkommen! Bienvenue! Welcome! Добро пожаловать!┌──────────────────[ Help ]┐
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀│ Help choosing a language │▀
  Use UP, DOWN and ENTER keys to select your langua│ Keyboard shortcuts       │
                                                   │ Enter shell              │
                [ Asturianu                        │ View error reports       │
                [ Bahasa Indonesia                 ├──────────────────────────┤
                [ Català                           │ About this installer     │
                [ Deutsch                          │ Help on SSH access       │
                [ English                          └──────────────────────────┘
                [ English (UK)                              ▸ ]█              
                [ Español                                   ▸ ]█              
                [ Français                                  ▸ ]█              
                [ Galego                                    ▸ ]█              
                [ Hrvatski                                  ▸ ]█              
                [ Latviski                                  ▸ ]               
                [ Lietuviškai                               ▸ ]               
                [ Magyar                                    ▸ ]               
                [ Nederlands                                ▸ ]               
                [ Norsk bokmål                              ▸ ]               
                [ Occitan (aprèp 1500)                      ▸ ]               
                [ Polski                                    ▸ ]               
                [ Português                                 ▸ ]▾

...
Installer shell session activated.

This shell session is running inside the installer environment.  You
will be returned to the installer when this shell is exited, for
example by typing Control-D or 'exit'.

Be aware that this is an ephemeral environment.  Changes to this
environment will not survive a reboot. If the install has started, the
installed system will be mounted at /target.
root@ubuntu-server:/# cat /proc/cmdline 
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
root@ubuntu-server:/# uname -a
Linux ubuntu-server 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:30:43 UTC 2023 s390x s390x s390x GNU/Linux
root@ubuntu-server:/# snap list subiquity
Name       Version  Rev   Tracking         Publisher    Notes
subiquity  23.02.1  4383  latest/stable/…  canonical**  classic
root@ubuntu-server:/# python3 --version
Python 3.10.6
root@ubuntu-server:/# ls -lad /var/log 
drwxrwxr-x 1 root syslog 320 Apr 19 10:18 /var/log
root@ubuntu-server:/# lszdev --online
TYPE         ID                          ON   PERS  NAMES
qeth         0.0.0600:0.0.0601:0.0.0602  yes  no    enc600
generic-ccw  0.0.0009                    yes  no    
root@ubuntu-server:/#

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Zdev setup                                           ┌──────────────[ Help ]┐
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀│ Help on this screen  │▀
  0.0.0400                                             │ Keyboard shortcuts   │
  0.0.0592                                             │ Enter shell          │
                                                       │ View error reports   │
  qeth                                                 ├──────────────────────┤
  0.0.0600:0.0.0601:0.0.0602                     enc600│ About this installer │
  0.0.0603:0.0.0604:0.0.0605                           │ Help on SSH access   │
                                                       └──────────────────────┘
  dasd-eckd                                                                   
  0.0.1607                                                ▸                  █
                                                                             █
  zfcp-host                                                                  █
  0.0.f00b                               online           ▸                  █
  0x50050763060b16b6:0x4026400600000000          sdb sg1                     █
  0x50050763061b16b6:0x4026400600000000          sda sg0                     █
  0.0.f10b                               online           ▸                  █
  0x50050763060b16b6:0x4026400600000000          sdc sg2                     █
  0x50050763061b16b6:0x4026400600000000          sdd sg3                     ▾
                                                                              
                                 [ Continue   ]                               
                                 [ Back       ]                               
                                                                              
root@ubuntu-server:/# lszdev --online
TYPE         ID                                              ON   PERS  NAMES
zfcp-host    0.0.f00b                                        yes  yes   
zfcp-host    0.0.f10b                                        yes  yes   
zfcp-lun     0.0.f00b:0x50050763060b16b6:0x4026400600000000  yes  no    sdb sg1
zfcp-lun     0.0.f00b:0x50050763061b16b6:0x4026400600000000  yes  no    sda sg0
zfcp-lun     0.0.f10b:0x50050763060b16b6:0x4026400600000000  yes  no    sdc sg2
zfcp-lun     0.0.f10b:0x50050763061b16b6:0x4026400600000000  yes  no    sdd sg3
qeth         0.0.0600:0.0.0601:0.0.0602                      yes  no    enc600
generic-ccw  0.0.0009                                        yes  no    
root@ubuntu-server:/# multipath -ll
mpatha (36005076306ffd6b60000000000002606) dm-0 IBM,2107900
size=64G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 0:0:1:1074151462 sdb 8:16 active ready running
  |- 0:0:0:1074151462 sda 8:0  active ready running
  |- 1:0:1:1074151462 sdd 8:48 active ready running
  `- 1:0:0:1074151462 sdc 8:32 active ready running
root@ubuntu-server:/#

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Guided storage configuration                                        [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  Configure a guided storage layout, or create a custom one:                  
                                                                              
  (X)  Use an entire disk                                                    ▴
                                                                             █
       [ 0x6005076306 multipath device 64.000G                            ▾ ]█
         ffd6b6000000                                                        █
         0000002606                                                          █
                                                                             █
       [X]  Set up this disk as an LVM group                                 █
                                                                             █
            [ ]  Encrypt the LVM group with LUKS                             █
                                                                             █
                         Passphrase:                                         █
                                                                             █
                                                                              
                 Confirm passphrase:                                          
                                                                             ▾
                                                                              
                                 [ Done       ]                               
                                 [ Back       ]

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Storage configuration                                               [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  FILE SYSTEM SUMMARY                                                        ▴
                                                                             █
    MOUNT POINT     SIZE    TYPE      DEVICE TYPE                            █
  [ /              30.996G  new ext4  new LVM logical volume            ▸ ]  █
  [ /boot           2.000G  new ext4  new partition of multipath device ▸ ]  █
                                                                             █
                                                                             █
  AVAILABLE DEVICES                                                          █
                                                                             █
    DEVICE                                   TYPE                 SIZE        
  [ ubuntu-vg (new)                          LVM volume group    61.996G  ▸ ] 
    free space                                                   31.000G  ▸   
                                                                              
  [ Create software RAID (md) ▸ ]                                             
  [ Create volume group (LVM) ▸ ]                                             
                                                                             ▾
                                                                              
                                 [ Done       ]                               
                                 [ Reset      ]                               
                                 [ Back       ]                               
                                                                              
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Storage configuration                                               [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  FILE SYSTEM SUMMARY                                                        ▴
                                                                             █

   ┌────────────────────── Confirm destructive action ──────────────────────┐
   │                                                                        │
   │  Selecting Continue below will begin the installation process and      │
   │  result in the loss of data on the disks selected to be formatted.     │
   │                                                                        │
   │  You will not be able to return to this or a previous screen once the  │
   │  installation has started.                                             │
   │                                                                        │
   │  Are you sure you want to continue?                                    │
   │                                                                        │
   │                             [ No         ]                             │
   │                             [ Continue   ]                             │
   │                                                                        │
   └────────────────────────────────────────────────────────────────────────┘

                                 [ Reset      ]                               
                                 [ Back       ]                               
                                                                              
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Install complete!                                                   [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  ┌──────────────────────────────────────────────────────────────────────────┐
  │            configuring multipath                                        ▴│
  │            updating packages on target system                            │
  │            configuring pollinate user-agent on target                    │
  │            updating initramfs configuration                              │
  │            configuring target system bootloader                          │
  │final system configuration                                                │
  │  configuring cloud-init                                                  │
  │  calculating extra packages to install                                   │
  │  installing openssh-server                                               │
  │    curtin command system-install                                         │
  │  downloading and installing security updates                             │
  │    curtin command in-target                                              │
  │  restoring apt configuration                                             │
  │    curtin command in-target                                             █│
  │subiquity/Late/run                                                       ▾│
  └──────────────────────────────────────────────────────────────────────────┘

                               [ View full log ]
                               [ Reboot Now    ]

So all good here.

But a couple of observations / comments:

You mentioned that wiping out older or additional LUNs is not an option.
And I think it's not needed; I only thought about wiping the LUNs for the OS itself and enabling only this OS LUN during installation. Any additional LUNs can easily be added post-install and should not be enabled at install time (here, for testing and to be on the safe side).
(I think that's easier compared to disabling them on the SAN side ...)

If you do a 'normal' install, there is no need to add anything to the parmfile,
and by default the parmfile only contains a single line like this (without quotes):
" --- "
And make sure the parm file is in the correct encoding (fixed length "F 80" or variable, "Trunc=80"):
 PARMFILE UBUNTU   O1  F 80  Trunc=80 Size=4 Line=0 Col=1 Alt=0                                                                           
00000 * * * Top of File * * *
00001  ---
00002 * * * End of File * * *

The 3 dashes (" --- ") are to separate installer from kernel arguments.
"<installer> --- <kernel>"
And if you do such a plain and simple install, you will be asked at the console to specify some basic network related data, incl. the qeth that should be used for the installation.
(I guess I should double-check if this is clearly stated in the docs ...)
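To make the split at " --- " a bit more concrete, here is a minimal Python sketch. It is only an illustration: the function name split_parmfile is made up for this example and is not part of the installer, and it just cuts the line at the first standalone "---" token without interpreting either side.

```python
from typing import List, Tuple


def split_parmfile(line: str) -> Tuple[List[str], List[str]]:
    """Split a parmfile line at the first standalone '---' token.

    Returns (tokens_before, tokens_after). If there is no '---',
    everything is returned as tokens_before.
    """
    tokens = line.split()
    if "---" not in tokens:
        return tokens, []
    idx = tokens.index("---")
    return tokens[:idx], tokens[idx + 1:]
```

For the default parmfile (" --- ") both lists come back empty, which matches a plain interactive install with nothing extra specified.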

If you don't want to specify this data manually at the early boot stage, you can add this to the parmfile (like I guess you tried).
Looking at your parmfile:
"ip=9.11.116.213::9.11.116.1:24:255.255.255.0:ilabg13:ence0f:none:9.11.227.25" 
I think it's not fully correct, since the ":24" is not needed or even wrong there.
It should be like:
"ip=9.11.116.213::9.11.116.1:255.255.255.0:ilabg13:ence0f:none:9.11.227.25 --- " 
(so in your case gw and dns are different systems, right?)
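To show why the extra ":24" matters, here is a small Python sketch that splits an "ip=" argument into positional fields. It assumes the field order from the kernel's nfsroot documentation (client-ip, server-ip, gw-ip, netmask, hostname, device, autoconf, dns0-ip, dns1-ip, ntp0-ip). The names FIELDS and parse_ip_param are made up for this example, and it only handles IPv4-style values because it splits on every colon.

```python
import ipaddress

# Positional fields of "ip=", in the order assumed from the kernel's
# nfsroot documentation.
FIELDS = ("client_ip", "server_ip", "gateway", "netmask", "hostname",
          "device", "autoconf", "dns0", "dns1", "ntp0")
ADDRESS_FIELDS = ("client_ip", "server_ip", "gateway", "netmask",
                  "dns0", "dns1", "ntp0")


def parse_ip_param(param):
    """Map an 'ip=...' string to named fields and sanity-check addresses."""
    if not param.startswith("ip="):
        raise ValueError("expected an 'ip=' parameter")
    values = param[len("ip="):].strip().split(":")
    if len(values) > len(FIELDS):
        raise ValueError("too many fields in ip= parameter")
    parsed = dict(zip(FIELDS, values))
    for key in ADDRESS_FIELDS:
        value = parsed.get(key, "")
        if value:
            # Raises ValueError for things like "24" or "none".
            ipaddress.ip_address(value)
    return parsed
```

With the extra ":24", every later field shifts one position to the right, so "24" lands in the netmask slot and "255.255.255.0" in the hostname slot, and the check fails. Without it, the fields line up (netmask 255.255.255.0, device ence0f, dns0 9.11.227.25).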

But again, for figuring out any install issues, I would go with a standard installation first (that is, specifying this data at the console during the early boot stage).

The early boot stage is about "interactive netboot" and asks for network information only,
and all network devices are qeth (except RoCE, of course) - so don't specify any other devices here (like HBAs, LUNs or whatever).
And the term "zdev" comes from the lszdev and chzdev commands, hence it was also used for the 'zdev' UI installer screen, so in my case:
"
Attempt interactive netboot from a URL?
yes no (default yes): yes
Available qeth devices:
0.0.0600 0.0.0603
zdev to activate (comma separated, optional): 
"
If you would like to see some changes in the terminology that is used here, we are of course open for any ...

I believe that the content of the kernel parameter with all the "@" is more of a representation issue (not very nice, though ...), since it works for me on my system, and even with many more kernel args, as needed for a fully non-interactive "autoinstall".

What I noticed in the crash file is the following snippet:
"
 2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm', 'settle', '-t', '0'] exited with code 0
 2023-04-06 19:17:21,139 ERROR subiquity.server.server:424 top level error
 Traceback (most recent call last):
   File "/snap/subiquity/4383/usr/lib/python3.8/asyncio/events.py", line 81, in _run
     self._context.run(self._callback, *self._args)
   File "/snap/subiquity/4383/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 682, in _udev_event
     action, dev = self._monitor.receive_device()
   File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 400, in receive_device
     device = self.poll()
   File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/monitor.py", line 358, in poll
     if eintr_retry_call(poll.Poll.for_events((self, "r")).poll, timeout):
   File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_util.py", line 164, in eintr_retry_call
     return func(*args, **kwargs)
   File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 94, in poll
     return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
   File "/snap/subiquity/4383/lib/python3.8/site-packages/pyudev/_os/poll.py", line 109, in _parse_events
     raise IOError("Error while polling fd: {0!r}".format(fd))
 OSError: Error while polling fd: 20
 2023-04-06 19:17:21,142 DEBUG subiquity.common.errorreport:384 generating crash report
 2023-04-06 19:17:21,143 INFO subiquity.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1680808641.142762184.unknown.crash
 2023-04-06 19:17:21,143 INFO root:37 start: subiquity/ErrorReporter/1680808641.142762184.unknown/add_info: 
 2023-04-06 19:17:21,143 INFO root:37 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-04-06 19:17:21,144 INFO root:37 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-04-06 19:17:21,144 INFO root:37 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-04-06 19:17:21,144 INFO root:37 finish: subiquity/Meta/status_GET: SUCCESS: 200 {"state": "ERROR", "confirming_tty": "", "error": {"state": "INCOMPLETE", "ba...
 2023-04-06 19:17:21,144 DEBUG subiquitycore.utils:64 run_command called: ['udevadm', 'settle', '-t', '0']
InstallerServerLogInfo:
 2023-04-06 19:04:23,061 INFO subiquity:161 Starting Subiquity server revision 4383
"

That could be a problem with asyncio (I remember that there was an issue with asyncio in the past) or a race condition.
I'll ask my installer colleague to have a look at this ...
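For whoever picks this up, the last frames of the traceback can be illustrated with a small stand-alone Python sketch. This is not pyudev's or subiquity's code; parse_events and guarded_receive are hypothetical names. It only shows how a poll result carrying an error flag turns into an OSError like the one above, and how an event callback could catch it instead of letting it reach the top level. Whether catching it is the right fix here is an open question.

```python
import select


def parse_events(events):
    """Convert (fd, mask) pairs, as returned by poll(), to (fd, 'r') pairs.

    Raises OSError if any descriptor reports an error condition.
    """
    ready = []
    for fd, mask in events:
        if mask & (select.POLLERR | select.POLLNVAL):
            raise OSError("Error while polling fd: {0!r}".format(fd))
        if mask & select.POLLIN:
            ready.append((fd, "r"))
    return ready


def guarded_receive(receive):
    """Call receive() and return its result, or None if polling failed."""
    try:
        return receive()
    except OSError:
        # A real handler would log this and probably re-probe the devices.
        return None
```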

Revision history for this message

bugproxy (bugproxy) wrote on 2023-04-19: Comment bridged from LTC Bugzilla

#42

------- Comment From <email address hidden> 2023-04-19 08:20 EDT-------
(In reply to comment #41)
> You mentioned that wiping out older or addditional LUNs is not an option.
> And I think it's not needed, only thought about wiping the LUNs for the OS
> itself and only enable this OS LUN during installaltion. Any additional LUNs
> can be easily added post-install and should not be enabled at install time
> (here, for testing and to be on the safe side).
> (I think that's easier compared to disabling them on the SAN side ...)

That would require disabling zfcp auto lun scan during installation [zfcp.allow_lun_scan=0]. I intentionally did not suggest this as changing the host-mapping of volumes on the storage is cleaner and does not change the scanning behavior of Linux. That said, it would be an option.

> And make sure the parm file is in the correct encoding (fix length "F 80" or
> variable, "Trunc=80"):
> PARMFILE UBUNTU O1 F 80 Trunc=80 Size=4 Line=0 Col=1 Alt=0

I'm not sure it needs fixed record length under z/VM. I use variable record length for parm files successfully in my z/VM guests.
In contrast to that, the binaries for kernel and initrd must indeed be fixed record length 80.

> The 3 dashes (" --- ") are to separate installer from kernel arguments.
> "<installer> --- <kernel>"

As stated recently, the kernel documentation says the separator is a double dash and kernel parameters go before the separator and user space stuff after it. I'm confused.

> At the early boot stage it's about "interactive netboot" and asks for
> network information only
> and all network devices are qeth (of course except RoCE) - so don't specify
> any other devices here (like HBAs or LUNS)'or whatever).

good info, I wasn't aware; then it should explicitly state so [see earlier comments from today]

> I believe that the content of the kernel parameter with all the "@" is more
> a representaton issue (nevertheless, not very nice though ...), but since it
> works for me on my system - and even with much more kernel args that are
> needed in case of a fully non-interactive "autoinstall".

ok, but this can confuse users (or even init/systemd) so it would be good to find the root cause and fix that as well (with lower prio than the actual installation issue)

> What I noticed in the crash file is the following snippet:
> "
> 2023-04-06 19:17:21,139 DEBUG subiquitycore.utils:77 run_command ['udevadm',
> 'settle', '-t', '0'] exited with code 0

> 2023-04-06 19:17:21,143 INFO subiquity.common.errorreport:406 saving crash
> report 'unknown error crashed with OSError' to

yes, I've been pointing to this multiple times and it even occurs early for settling after the network interface (IP address) setup and before any zfcp device config

> That could be a problem with asyncio (I remember that there was an issue
> with asyncio in the past) or a race condition.
> I'll ask my installer colleague to have a look at this ...

looking forward

------- Comment From MAIER@de.ibm.com 2023-04-19 08:20 EDT-------
(In reply to comment #41)
> You mentioned that wiping out older or addditional LUNs is not an option.
> And I think it's not needed, only thought about wiping the LUNs for the OS
> itself and only enable this OS LUN during installaltion. Any additional LUNs
> can be easily added post-install and should not be enabled at install time
> (here, for testing and to be on the safe side).
> (I think that's easier compared to disabling them on the SAN side ...)

That would require disabling zfcp auto lun scan during installation [zfcp.allow_lun_scan=0]. I intentionally did not suggest this as changing the host-mapping of volumes on the storage is cleaner and does not change the scanning behavior of Linux. That said, it would be an option.

> And make sure the parm file is in the correct encoding (fix length "F 80" or
> variable, "Trunc=80"):
> PARMFILE UBUNTU   O1  F 80  Trunc=80 Size=4 Line=0 Col=1 Alt=0