subiquity crashes if netplan apply fails

Bug #1868712 reported by Frank Heimes
This bug affects 1 person
Affects                  Status        Importance  Assigned to                 Milestone
Ubuntu on IBM z Systems  Fix Released  High        Canonical Foundations Team
subiquity                Fix Released  Undecided   Unassigned

Bug Description

The (external) network of my s390x system happened to be down, so during a subiquity installation I chose 'Continue without network', expecting to still be able to complete the installation. Instead, subiquity crashed.

netplan apply then failed and crashed the UI:

Traceback:
 Traceback (most recent call last):
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquity/core.py", line 165, in run
     super().run()
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquitycore/core.py", line 680, in run
     self.urwid_loop.run()
   File "/snap/subiquity/1570/usr/lib/python3/dist-packages/urwid/main_loop.py", line 286, in run
     self._run()
   File "/snap/subiquity/1570/usr/lib/python3/dist-packages/urwid/main_loop.py", line 384, in _run
     self.event_loop.run()
   File "/snap/subiquity/1570/usr/lib/python3/dist-packages/urwid/main_loop.py", line 1484, in run
     reraise(*exc_info)
   File "/snap/subiquity/1570/usr/lib/python3/dist-packages/urwid/compat.py", line 58, in reraise
     raise value
   File "/snap/subiquity/1570/usr/lib/python3.6/asyncio/events.py", line 145, in _run
     self._callback(*self._args)
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquitycore/async_helpers.py", line 25, in _done
     fut.result()
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquitycore/controllers/network.py", line 410, in _apply_config
     await arun_command(['netplan', 'apply'], check=True)
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquitycore/utils.py", line 85, in arun_command
     raise subprocess.CalledProcessError(proc.returncode, cmd)
 subprocess.CalledProcessError: Command '['netplan', 'apply']' returned non-zero exit status 1.

Frank Heimes (fheimes) wrote :
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Michael Hudson-Doyle (mwhudson) wrote :

I think this is the requests library misbehaving: it's only supposed to raise subclasses of requests.exceptions.RequestException, but clearly sometimes it doesn't. We can make our except clauses broader, I guess.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Triaged
Dimitri John Ledkov (xnox) wrote :

I guess the reproducer is a static IP configuration with an invalid DNS server.

Michael Hudson-Doyle (mwhudson) wrote :

The tracebacks in the description are not what is crashing the UI; the crash seems to happen because netplan apply is failing. That shouldn't be crashing the UI either!

description: updated
summary: - subiquity crashes if no network connection is in place, even if before
- 'Continue w/o nw' was chosen
+ subiquity crashes if netplan apply fails
Michael Hudson-Doyle (mwhudson) wrote :

Here is the netplan that is causing the failure:

 network:
   ethernets: {}
   version: 2
   vlans:
     encc000.2653:
       id: 2653
       link: encc000
       nameservers:
         addresses:
         - 10.245.236.1

This does indeed look bogus: it has a VLAN but no definition for the physical NIC.
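
The problem described above (a VLAN whose "link" points at an interface that is never defined) can be checked mechanically. The following is only a minimal sketch, not part of netplan or subiquity: the function name find_dangling_vlan_links is hypothetical, the config is assumed to be already parsed into a Python dict, and the assumption that a link may be defined under "ethernets", "bonds" or "bridges" is mine. Whether this is the exact reason netplan apply failed here is not confirmed in this report.

```python
def find_dangling_vlan_links(config):
    """Return {vlan_name: link} for VLANs whose link is not defined.

    Hypothetical helper for illustration only; `config` is a netplan-style
    dict that has already been loaded from YAML.
    """
    network = config.get("network", {})

    # Collect the names of all interfaces a VLAN could plausibly sit on
    # (assumption: ethernets, bonds and bridges are valid link targets).
    defined = set()
    for section in ("ethernets", "bonds", "bridges"):
        defined.update(network.get(section) or {})

    dangling = {}
    for name, vlan in (network.get("vlans") or {}).items():
        link = vlan.get("link")
        if link not in defined:
            dangling[name] = link
    return dangling


# The config quoted in the comment above, written as a dict.
bogus_config = {
    "network": {
        "ethernets": {},
        "version": 2,
        "vlans": {
            "encc000.2653": {
                "id": 2653,
                "link": "encc000",
                "nameservers": {"addresses": ["10.245.236.1"]},
            }
        },
    }
}

print(find_dangling_vlan_links(bogus_config))
# prints {'encc000.2653': 'encc000'}
```

Adding an (even empty) "encc000" entry under "ethernets" makes the function return an empty dict for this config.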

Frank Heimes (fheimes) wrote :

Yes, that looks strange: it lacks the physical base device, and the IP address and gateway (gateway4) are missing as well.

The /etc/netplan/50-cloud-init.yaml files that I mostly saw during my installations looked like this:

# This file is generated from information provided by the datasource. Changes
# to it will not persist across an instance reboot. To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        encc000.2653:
            addresses:
            - 10.245.236.15/24
            gateway4: 10.245.236.1
            match:
                macaddress: 36:5e:5e:ce:a0:a0
            nameservers:
                addresses:
                - 10.245.236.1
            set-name: encc000.2653

They have at least an IP address and a gateway.

However, a fully correct one should be close to this:

01-netcfg.yaml
network:
    version: 2
    renderer: networkd
    ethernets:
        encc000:
            dhcp4: no
            dhcp6: no
    vlans:
        encc000.2653:
            link: encc000
            id: 2653
            addresses:
            - 10.245.236.15/24
            gateway4: 10.245.236.1
            nameservers:
                addresses:
                - 10.245.236.1

Changed in subiquity:
status: New → Fix Committed
Dimitri John Ledkov (xnox) wrote :

I wonder if the initrd generated the wrong .yaml, because I think in the VLAN case it does not emit the "dhcp4: no\n dhcp6: no" lines, but that should be the default.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Triaged → Fix Committed
Michael Hudson-Doyle (mwhudson) wrote :

The point of this bug vs. all the other similar ones was that netplan apply crashing should not crash the UI; that's fixed now. Hopefully all the other network woes are being tracked in other bugs.
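
To illustrate the idea that a failing apply command should be reported rather than allowed to propagate into the UI event loop, here is a minimal sketch. It is not the actual subiquity fix: arun_command below is a simplified stand-in written for this example (the real helper lives in subiquitycore/utils.py and may differ), and apply_config and its on_error callback are hypothetical names.

```python
import asyncio
import subprocess
import sys


async def arun_command(cmd, check=False):
    """Simplified stand-in for an async run-command helper (assumption)."""
    proc = await asyncio.create_subprocess_exec(*cmd)
    await proc.wait()
    if check and proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return proc.returncode


async def apply_config(cmd, on_error):
    """Run `cmd`; hand a failure to `on_error` instead of raising.

    Returns True on success and False if the command exited non-zero.
    """
    try:
        await arun_command(cmd, check=True)
    except subprocess.CalledProcessError as exc:
        # Report the failure (e.g. show an error dialog) and keep running.
        on_error(exc)
        return False
    return True


if __name__ == "__main__":
    # A child process that exits with status 1 stands in for a failing
    # `netplan apply`, so the sketch runs on machines without netplan.
    failing_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
    ok = asyncio.run(apply_config(failing_cmd, lambda exc: print("apply failed:", exc)))
    print("succeeded:", ok)
```

In a real installer the callback would presumably surface the error to the user; a missing binary raises FileNotFoundError rather than CalledProcessError, so a fuller version might catch OSError as well.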

Frank Heimes (fheimes) wrote :

With the image from April 14th I didn't run into this anymore.
So it looks like this is fixed now, thanks.

Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Changed in subiquity:
status: Fix Committed → Fix Released