Comment 7 for bug 2009141

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-03-13 08:25 EDT-------
(In reply to comment #11)
> We don't have a 32Gbit FCP adapter to test with.

I doubt this is related to particular hardware, as all the HBAs present the same programming interface to the zfcp device driver.

(In reply to comment #15)
> Yes, I understood that the installer is crashing here
> and that the ssh connection might have been dropped.

I'm not sure the ssh connection dropped: I see the installer TUI and installer debug data after the reported error, so I suppose the connection stayed up.

> But one can usually reconnect via ssh,
> go via the top-right menu to the installer shell
> (or I think via Alt-F2),

There are no Linux Virtual Terminals on s390, so I guess that key combo won't work.

> and navigate to the folders /var/crash and /var/log and even pack and scp
> them - that was the hope.

(In reply to comment #6)
> Installation of Ubuntu 22.04 on s390x failed with an unknown error just
> after having successfully activated a zfcp HBA with Fibre-Channel-attached
> SCSI disks.
>
> I do see 0.0.100d successfully being online and having paths attached:

> [host.ilabg13_9.11.116.213_ubuntu-22.04_ssh_installer.txt lines 41-41/9373
> byte 238696/1146881 21%]

The attachment should already contain a lot of debug data.

It was collected with the script tool, which recorded the ssh session during the installation attempt. Unfortunately, there is no timing information, so "scriptreplay" does not work. The best approach I could come up with is "less -R", which at least renders the ANSI color escape sequences instead of cluttering the output with escaped escape sequences. After some trial-and-error resizing of my terminal, the output is somewhat readable at 167 columns, though with a column offset, and it is very slow to render with massive CPU consumption because the less pager continuously re-wraps one extremely long line.
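As an alternative to paging with "less -R", the escape sequences could be stripped so that the recording can be searched with ordinary text tools. The following is only a minimal sketch, assuming the recording contains standard CSI and OSC escape sequences; the function name strip_ansi is hypothetical and not part of any tooling mentioned in this bug. Text drawn by the TUI with cursor movement will still appear out of order after stripping.

```python
import re

# CSI sequences (ESC [ ... final byte), e.g. colors and cursor movement.
_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# OSC sequences (ESC ] ... terminated by BEL or ESC \), e.g. window titles.
_OSC = re.compile(r"\x1b\].*?(?:\x07|\x1b\\)", re.DOTALL)


def strip_ansi(text: str) -> str:
    """Return text with CSI and OSC escape sequences removed."""
    return _CSI.sub("", _OSC.sub("", text))
```

The recording would be read with errors="replace" and passed through strip_ansi() before being written to a new file.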

> But immediately after that, the installer reports an error:
>
> An error occurred during installation

> +------------------------------------------------------------------------+
> | Sorry, an unknown error occurred. |
> | Information is being collected from the system that will help the |
> | developers diagnose the report. |

> | [ View full report ] |
> | If you want to help improve the installer, you can send an error |
> | report. |
> | [ Send to Canonical ] |
> | [ Close report ] |
>
> Next "View full report" was selected?

The attached file contains the installer debug data starting at line 42.

I hope this is the same information that "Send to Canonical" would submit. That option is not necessarily available on s390, where systems may lack a sufficient internet connection for security reasons.

> ProblemType: Bug
> Architecture: s390x
> CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
> CurrentDmesg:

> Date: Wed Jan 18 21:04:03 2023
> DistroRelease: Ubuntu 22.04
> ExecutablePath:
> /snap/subiquity/3699/lib/python3.8/site-packages/subiquity/cmd/server.py
> InstallerServerLog:
> 2023-01-18 20:23:57,901 INFO subiquity:112 Starting Subiquity server
> revision 3699

> 2023-01-18 21:04:03,114 ERROR subiquity.server.server:416 top level error
> Traceback (most recent call last):
>   File "/snap/subiquity/3699/usr/lib/python3.8/asyncio/events.py", line 81, in _run
>     self._context.run(self._callback, *self._args)
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 506, in _udev_event
>     action, dev = self._monitor.receive_device()
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 397, in receive_device
>     device = self.poll()
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/monitor.py", line 357, in poll
>     if eintr_retry_call(poll.Poll.for_events((self, 'r')).poll, timeout):
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_util.py", line 163, in eintr_retry_call
>     return func(*args, **kwargs)
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 97, in poll
>     return list(self._parse_events(eintr_retry_call(self._notifier.poll, timeout)))
>   File "/snap/subiquity/3699/lib/python3.8/site-packages/pyudev/_os/poll.py", line 112, in _parse_events
>     raise IOError('Error while polling fd: {0!r}'.format(fd))
> OSError: Error while polling fd: 20
> 2023-01-18 21:04:03,116 DEBUG subiquitycore.common.errorreport:384 generating crash report
> 2023-01-18 21:04:03,116 INFO subiquitycore.common.errorreport:406 saving crash report 'unknown error crashed with OSError' to /var/crash/1674075843.116781473.unknown.crash

Ah, I missed that: "/var/crash/1674075843.116781473.unknown.crash"

Contrary to my expectations, /var/crash/ on Ubuntu can contain debug data other than kdumps.

Frank, does such a file really contain additional information that is not already in the attached installer debug data?
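One way to check this would be to list the field names in the crash file and compare them with the fields visible in the attachment. This is a rough sketch only, assuming the .crash file uses a layout of "Key: value" lines with the continuation lines of multi-line values indented by whitespace (inferred from the fields quoted above, not verified against this particular file); report_keys is a hypothetical helper.

```python
def report_keys(text: str) -> list:
    """Return the top-level field names of a "Key: value" report, in order."""
    keys = []
    for line in text.splitlines():
        # Continuation lines of multi-line values start with whitespace.
        if not line or line[0].isspace():
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key)
    return keys
```

Any key present in the crash file but absent from the recorded session would indicate additional debug data.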

If this installer debug data is not sufficient, I would consider it an independent bug that should be fixed.

@<email address hidden>, any chance you could get such additional debug information?

Frank, could we nonetheless involve a subiquity developer to help understand the python traceback in order to get closer to a root cause?
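As a starting point for reading the traceback: the exception is raised while interpreting the event mask that poll() returned for fd 20. The sketch below only decodes such a mask into the standard poll() flag names. It assumes, without having checked the pyudev source shipped in the snap, that the message "Error while polling fd" corresponds to the POLLERR flag; describe_poll_events is a hypothetical helper, not a pyudev or subiquity function.

```python
import select


def describe_poll_events(mask: int) -> list:
    """Name the poll() condition flags that are set in an event mask."""
    names = []
    for name in ("POLLIN", "POLLPRI", "POLLOUT",
                 "POLLERR", "POLLHUP", "POLLNVAL"):
        if mask & getattr(select, name):
            names.append(name)
    return names
```

Logging the decoded mask just before the exception is raised would show whether the descriptor reported an error condition alone or together with a hang-up.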

> InstallerServerLogInfo:

> Package: subiquity 3699
> ProbeData:

> ProcCpuinfo:

> SnapChannel:
> SnapRevision: 3699
> SnapUpdated: False
> SnapVersion: 22.07.2
> SourcePackage: subiquity
> Title: unknown error crashed with OSError
> Traceback:

==> the same as above

> UdevDb:

> Uname: Linux 5.15.0-43-generic s390x

> Canonical, why did the installer get an error?
>
> Does the installer really have a busy(!) waiting loop calling udevadm settle
> with zero timeout?

Maybe it's not a busy loop but is instead driven by a poll/select loop (with its own timeout/sleep) in the installer, which is why settle is called with a zero timeout.
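To illustrate what such a non-busy loop could look like, here is a minimal sketch. It is not taken from the installer source: the function name wait_until_settled and the interval and max_checks parameters are assumptions made for illustration. It relies only on the documented behaviour that "udevadm settle --timeout=0" returns immediately and exits with 0 if the event queue is empty.

```python
import asyncio


async def wait_until_settled(cmd=("udevadm", "settle", "--timeout=0"),
                             interval=0.1, max_checks=50):
    """Run cmd repeatedly until it exits 0; return the number of checks used.

    Raises TimeoutError if cmd never succeeds within max_checks attempts.
    """
    for attempt in range(1, max_checks + 1):
        proc = await asyncio.create_subprocess_exec(*cmd)
        if await proc.wait() == 0:
            return attempt
        # Yield to the event loop between checks instead of spinning.
        await asyncio.sleep(interval)
    raise TimeoutError("udev event queue did not settle")
```

In a loop like this, each settle call is non-blocking and the waiting happens in the sleep, so repeated zero-timeout calls in the log would not by themselves indicate busy waiting.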

> But even if so, with the number of discovered devices and the settle finally
> returning with success errorlevel 0, it should just work?