Button in subiquity network config dialog says "Continue without network", even if NIC is configured and shown

Bug #1854965 reported by Frank Heimes
Affects                   Status         Importance   Assigned to                  Milestone
Ubuntu on IBM z Systems   Fix Released   High         Canonical Foundations Team
subiquity                 Fix Released   Undecided    Unassigned

Bug Description

This "Continue without network" option/button at the Network connections screen is a bit misleading,
since I see "Continue without network" even if the NIC is configured (and even displayed):

================================================================================
Network connections [ Help ]
================================================================================
  Configure at least one interface this server can use to talk to other
  machines, and which preferably provides sufficient access for updates.
                                                                           ^
    [ enP2p0s0d1 eth - > ]
      disabled autoconfiguration failed
      82:0d:2d:0c:b7:01 / Mellanox Technologies / MT27500/MT27520 Family
     [ConnectX-3/ConnectX-3 Pro Virtual Function]

    [ encc000 eth - > ]
      2a:57:f5:89:64:33 / Unknown Vendor / Unknown Model

    [ encc000.2653 vlan - > ]
      static 10.245.236.14/24
      VLAN 2653 on interface encc000
                                                                           │
    [ Create bond > ] v

                            [ Continue without network ]
                            [ Back ]

I don't see that every time - sometimes I just see a "Done".
And when I just see "Done", I usually notice a flicker beforehand (something refreshing in the background - maybe a subiquity restart?) and I end up back at the initial installer screen (see the two attached screenshots).

If I proceed with "Continue without network" the installation completes,
but after the reboot and an "apt update" I see some packages that can be updated.

Revision history for this message
Frank Heimes (fheimes) wrote :
Changed in ubuntu-z-systems:
importance: Undecided → Medium
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Re: [Bug 1854965] [NEW] Button in subiquity network config dialog says "Continue without network", even if NIC is configured and shown

Which version of subiquity is this? 19.11.1 fixed some bugs in this area.

Revision history for this message
Frank Heimes (fheimes) wrote :

I've used the latest Focal daily-live from Dec 2nd.
(What's a good way to figure out the subiquity version that's in?)

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Re: [Bug 1854965] Re: Button in subiquity network config dialog says "Continue without network", even if NIC is configured and shown

On Wed, 4 Dec 2019 at 12:35, Frank Heimes <email address hidden>
wrote:

> I've used the latest Focal daily-live from Dec 2nd.
>

Hmm.

> (What's a good way to figure out the subiquity version that's in?)
>

It's in "about this installer" in the help menu.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Oh, and if you are getting installer restarts there should be files in
/var/crash. Can you extract them and get them to me somehow?

Revision history for this message
Frank Heimes (fheimes) wrote :

I just checked the version via the help menu, it seems to be:
"This is version 19.11.1 of the installer."

And I see a couple of crash files:
root@ubuntu-server:/var/crash# ls
installer.1575387908.716259003.block_probe_fail.meta
installer.1575387908.724575281.block_probe_fail.crash
installer.1575387908.724575281.block_probe_fail.meta
installer.1575387908.728675842.disk_probe_fail.meta
installer.1575387908.736153364.disk_probe_fail.crash
installer.1575387908.736153364.disk_probe_fail.meta
installer.1575387908.760161877.block_probe_fail.crash
installer.1575387908.760161877.block_probe_fail.meta
installer.1575387908.771270037.disk_probe_fail.crash
installer.1575387908.771270037.disk_probe_fail.meta
installer.1575387912.601643562.block_probe_fail.crash
installer.1575387912.601643562.block_probe_fail.meta
installer.1575387912.611478090.disk_probe_fail.crash
installer.1575387912.611478090.disk_probe_fail.meta
installer.1575388002.636119604.block_probe_fail.crash
installer.1575388002.636119604.block_probe_fail.meta
installer.1575388002.657082558.block_probe_fail.crash
installer.1575388002.657082558.block_probe_fail.meta

But it's a bit difficult to copy these files over.
Initially I thought that the network configuration from the parmfile would be picked up by the installer, so that it would already have network access while running in UI mode, but that wasn't the case.
So I configured the network manually, but it wasn't enabled immediately - I couldn't connect to that LPAR (which was still in install mode), nor was I able to connect (from Exit to shell) to any other system. [I probably should have set up the network from the shell and not the UI, will re-try that later...]
Hence I completed the installation and rebooted - hoping to still find the files.
There is, as usual, the /var/log/installer stuff - which I attached here - but the crash (and meta) files were gone (wouldn't it be good to copy them over to /var/log/installer at the end of an installation?)
Anyway, I hope that the /var/log/installer files are helpful.
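
The suggestion above (keeping the crash and meta files by copying them next to the installer logs before the reboot) could look roughly like the following Python sketch. This is only an illustration and not something subiquity is known to do: the function name `preserve_crash_files` and the `crash` subdirectory are hypothetical, and both paths are plain parameters because the right destination inside an installer environment is an assumption here.

```python
import shutil
from pathlib import Path


def preserve_crash_files(crash_dir, dest_dir):
    """Copy *.crash and *.meta files from crash_dir into dest_dir/crash.

    Hypothetical helper: returns the sorted names of the files it copied,
    or an empty list if crash_dir does not exist.
    """
    src = Path(crash_dir)
    if not src.is_dir():
        return []

    # Assumed layout: keep the reports in their own subdirectory.
    dst = Path(dest_dir) / "crash"
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        # Only regular files with the two suffixes seen in the listing.
        if path.is_file() and path.suffix in {".crash", ".meta"}:
            shutil.copy2(path, dst / path.name)  # copy2 keeps timestamps
            copied.append(path.name)
    return copied
```

Called as `preserve_crash_files("/var/crash", "/var/log/installer")` from the installer shell before rebooting, it would leave the reports under `/var/log/installer/crash`, assuming that directory is the one that survives into the installed system.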

Revision history for this message
Frank Heimes (fheimes) wrote :

I finally managed to copy over the crash files.
At the end of the installation I entered the installer shell and was able to ping another system but ssh did not really work:

root@ubuntu-server:/# ping -c 3 <IP>
PING <IP> (<IP>) 56(84) bytes of data.
64 bytes from <IP>: icmp_seq=1 ttl=0 time=0.231 ms
64 bytes from <IP>: icmp_seq=2 ttl=0 time=0.236 ms
64 bytes from <IP>: icmp_seq=3 ttl=0 time=0.264 ms

--- <IP> ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2042ms
rtt min/avg/max/mdev = 0.231/0.243/0.264/0.014 ms
root@ubuntu-server:/#

root@ubuntu-server:/# scp -r /var/crash ubuntu@10.245.236.15:/home/ubuntu
The authenticity of host '<IP> (<IP>)' can't be established.
ECDSA key fingerprint is SHA256:Ju/VrPcnoieelZXY4gHgG/Zh/KnNgf+AN9nx4nlY1bm1w.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
<no response here anymore - so Ctrl+c >

I think I reapplied netplan (w/o any changes in the netplan yaml) and it worked.

See the crash and meta files attached.
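
For sorting through a /var/crash listing like the one in the earlier comment, a small Python sketch such as the following may help. It assumes the file names follow the pattern `installer.<seconds>.<nanoseconds>.<kind>.<ext>` and that the first number is a Unix timestamp; both are inferred from the listing only, not from subiquity documentation, and the function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def parse_crash_name(name):
    """Split a crash file name into its (assumed) components."""
    parts = name.split(".")
    if len(parts) != 5:
        raise ValueError(f"unexpected crash file name: {name!r}")
    prefix, seconds, nanos, kind, ext = parts
    return {
        "prefix": prefix,
        # Assumption: the first number is seconds since the Unix epoch.
        "time": datetime.fromtimestamp(int(seconds), tz=timezone.utc),
        "nanos": int(nanos),
        "kind": kind,
        "ext": ext,
    }


def count_crash_kinds(names):
    """Count .crash files per kind, ignoring the .meta companions."""
    return Counter(
        parse_crash_name(n)["kind"] for n in names if n.endswith(".crash")
    )


# Three names taken from the listing in this report.
sample = [
    "installer.1575387908.724575281.block_probe_fail.crash",
    "installer.1575387908.736153364.disk_probe_fail.crash",
    "installer.1575387908.760161877.block_probe_fail.crash",
]
print(count_crash_kinds(sample))
print(parse_crash_name(sample[0])["time"].isoformat())
```

Under that timestamp assumption, the earliest files date from 3 Dec 2019 (UTC), which matches the Dec 2nd daily-live image mentioned above.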

Revision history for this message
Frank Heimes (fheimes) wrote :

After extracting some of the crash files and looking at the Dmesg files, it looks like a severe crash is happening:

[ 148.303944] User process fault: interruption code 0001 ilc:1 in libstdc++.so.6.0.21[3ffae130000+9000]
[ 148.303952] CPU: 2 PID: 1972 Comm: python3 Not tainted 5.3.0-18-generic #19-Ubuntu
[ 148.303953] Hardware name: IBM 2964 N63 400 (LPAR)
[ 148.303955] User PSW : 0705200180000000 000003ffae13252a
[ 148.303956] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
[ 148.303958] User GPRS: 00000000000000c8 000003ffae132528 000003ff9c0fef60 000003ff9c0008d0
[ 148.303959] 000003ffae2e3466 000003ff9c0114c0 000000000085f839 000003ff9c03bac0
[ 148.303960] 000003ff9c0127f8 000003ff9c09c5a8 0000000000000001 000003ff9c0127f0
[ 148.303961] 000003ffae3d5518 000003ff9c0127f0 000003ffae2e34ec 000003ffa6fb8d58
[ 148.303969] User Code: 000003ffae132524: ae132500 sigp %r1,%r3,1280(%r2)
                         #000003ffae132528: 0000 illegal
                         >000003ffae13252a: 03ff unknown
                          000003ffae13252c: ae0275a8 sigp %r0,%r2,1448(%r7)
                          000003ffae132530: 0000 illegal
                          000003ffae132532: 03ff unknown
                          000003ffae132534: ae0275c0 sigp %r0,%r2,1472(%r7)
                          000003ffae132538: 0000 illegal
[ 148.303978] Last Breaking-Event-Address:
[ 148.303981] [<000003ffae2e34ea>] 0x3ffae2e34ea

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Medium → High
tags: added: req4focal
Revision history for this message
Frank Heimes (fheimes) wrote :

It looks like this is fixed due to several changes in the installer (19.12.2+git6.b024f284).
Looks like an automatic network config is now always tried, but if it's not possible (as in my case), the network card stays unconfigured. I can then configure it (manually) and activate it afterwards.
Hence this seems to be solved for me - and I don't see the misleading msgs about 'configure w/o network'...

Revision history for this message
Frank Heimes (fheimes) wrote :

I tried this again carefully today and think the originally reported problem is solved.

Subiquity now correctly indicates that the network is not up by showing "Continue without network" (since subiquity seems to try an automatic network configuration by default, which is not possible in my case/environment) and allows me to configure the network (which I did statically). I can then proceed with "Save" and "Done", and the network is properly configured, up and running. So regarding this flow everything is fine and this ticket can be closed.

However, all the network configuration data had already been passed over via the parm-file.
So it would be good and convenient if it were just picked up by subiquity and automatically activated, so that this second static config becomes unnecessary ...

Changed in ubuntu-z-systems:
status: New → Fix Released
Revision history for this message
Frank Heimes (fheimes) wrote :

Btw. later in the installation process I ran into another severe problem (independent from this) that I separately reported in LP 1857042:
https://bugs.launchpad.net/bugs/1857042

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Pretty sure this issue is now fixed so closing from the subiquity side too.

Changed in subiquity:
status: New → Fix Released