Have to netplan apply again, after reboot

Bug #1996007 reported by liwbj@cn.ibm.com
This bug affects 1 person
Affects                  Status  Importance  Assigned to  Milestone
Ubuntu on IBM z Systems  New     Undecided   Unassigned
subiquity                New     Undecided   Unassigned

Bug Description

I am trying to install ubuntu-22.04.1-live-server-s390x.iso in our s390 environment.
I am using a VLAN for the network setup, and the installation process looks good.

But after I reboot the OS, I found that we cannot ssh to it and all of the route settings are lost.
I have to run the netplan apply command again to restore the route info.

If you need more info, please let me know.

ubuntu@ubut13:~$ sudo netplan --debug apply
[sudo] password for ubuntu:
** (generate:2388): DEBUG: 02:35:03.714: starting new processing pass
** (generate:2388): DEBUG: 02:35:03.714: enc1000.1300: adding new route
** (generate:2388): DEBUG: 02:35:03.714: We have some netdefs, pass them through a final round of validation
** (generate:2388): DEBUG: 02:35:03.714: enc1000: setting default backend to 1
** (generate:2388): DEBUG: 02:35:03.714: Configuration is valid
** (generate:2388): DEBUG: 02:35:03.714: enc1000.1300: setting default backend to 1
** (generate:2388): DEBUG: 02:35:03.714: Configuration is valid
** (generate:2388): DEBUG: 02:35:03.715: Generating output files..
** (generate:2388): DEBUG: 02:35:03.715: openvswitch: definition enc1000 is not for us (backend 1)
** (generate:2388): DEBUG: 02:35:03.715: NetworkManager: definition enc1000 is not for us (backend 1)
** (generate:2388): DEBUG: 02:35:03.715: openvswitch: definition enc1000.1300 is not for us (backend 1)
** (generate:2388): DEBUG: 02:35:03.715: NetworkManager: definition enc1000.1300 is not for us (backend 1)
DEBUG:netplan generated networkd configuration changed, reloading networkd
DEBUG:enc1000 not found in {}
DEBUG:enc1000.1300 not found in {}
DEBUG:Merged config:
network:
  ethernets:
    enc1000:
      accept-ra: false
      dhcp4: false
      dhcp6: false
  renderer: networkd
  version: 2
  vlans:
    enc1000.1300:
      addresses:
      - 10.20.103.63/24
      id: 1300
      link: enc1000
      nameservers:
        addresses:
        - 10.20.0.2
      routes:
      - to: default
        via: 10.20.103.254

DEBUG:no netplan generated NM configuration exists
DEBUG:enc1000 not found in {}
DEBUG:enc1000.1300 not found in {}
DEBUG:Merged config:
network:
  ethernets:
    enc1000:
      accept-ra: false
      dhcp4: false
      dhcp6: false
  renderer: networkd
  version: 2
  vlans:
    enc1000.1300:
      addresses:
      - 10.20.103.63/24
      id: 1300
      link: enc1000
      nameservers:
        addresses:
        - 10.20.0.2
      routes:
      - to: default
        via: 10.20.103.254

DEBUG:Link changes: {}
DEBUG:netplan triggering .link rules for lo
DEBUG:netplan triggering .link rules for enc1000
DEBUG:netplan triggering .link rules for enc1000.1300
** (process:2386): DEBUG: 02:35:04.072: starting new processing pass
** (process:2386): DEBUG: 02:35:04.072: enc1000.1300: adding new route
** (process:2386): DEBUG: 02:35:04.072: We have some netdefs, pass them through a final round of validation
** (process:2386): DEBUG: 02:35:04.072: enc1000: setting default backend to 1
** (process:2386): DEBUG: 02:35:04.072: Configuration is valid
** (process:2386): DEBUG: 02:35:04.072: enc1000.1300: setting default backend to 1
** (process:2386): DEBUG: 02:35:04.072: Configuration is valid
DEBUG:enc1000 not found in {}
DEBUG:enc1000.1300 not found in {}
DEBUG:Merged config:
network:
  ethernets:
    enc1000:
      accept-ra: false
      dhcp4: false
      dhcp6: false
  renderer: networkd
  version: 2
  vlans:
    enc1000.1300:
      addresses:
      - 10.20.103.63/24
      id: 1300
      link: enc1000
      nameservers:
        addresses:
        - 10.20.0.2
      routes:
      - to: default
        via: 10.20.103.254

liwbj@cn.ibm.com (liwbj)
affects: subiquity → ubuntu-z-systems
Frank Heimes (fheimes)
tags: added: installer s390x
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Revision history for this message
Frank Heimes (fheimes) wrote :

Hi 'liwbj',
I see some advanced configurations like openvswitch, special routes, etc.
Usually the installer is supposed to handle the initial, more basic setup,
and it's recommended to perform further, more advanced configuration after the installation has completed.

So if you follow the steps here:
https://ubuntu.com/server/docs/install/s390x-lpar
especially the basic network configuration for a VLAN environment - search for "Proceed with the interactive network configuration, here in this case in a VLAN environment"
and then just accept the network config in the subiquity network config screen as it is,
your system should come up after the post-install reboot.

Please could you confirm this?

If that's the case, I recommend creating a backup of the existing (basic) netplan yaml file:
/etc/netplan/00-installer-config.yaml
then performing your specific modifications,
verifying your netplan modifications using 'dryrun':
sudo netplan apply --dryrun
and, if no problems are reported, finally applying them by re-running the command, just without 'dryrun':
sudo netplan apply

(And yes, always use the latest point release of an Ubuntu LTS, like you did here by using 'ubuntu-22.04.1-live-server-s390x.iso'.)
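The backup-then-verify workflow above can be sketched as a short shell sequence (a minimal sketch: the config path and flags are the ones quoted in the comment above, but the temp-dir simulation and the backup filename are illustrative choices so the copy step can be tried without root):

```shell
# Simulate /etc/netplan in a temp dir so the backup step can be tried
# without root; on the real system use /etc/netplan and prefix the
# netplan commands with sudo.
tmp=$(mktemp -d)
cfg="$tmp/00-installer-config.yaml"
printf 'network:\n  version: 2\n' > "$cfg"    # stand-in for the installer-written config

cp "$cfg" "$cfg.bak"                          # 1. back up before editing
# ... edit "$cfg" here ...

# 2. on the real system, validate first:  sudo netplan apply --dryrun
# 3. if no problems are reported, apply:  sudo netplan apply
```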

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

First of all, about gateway4: I followed the installation process and set gateway4. But when I applied the network config, the message '`gateway4` has been deprecated, use default routes instead.
See the 'Default routes' section of the documentation for more details.' popped up.

I just followed the standard process, so why does it pop up? I had to modify the netplan file manually.
It's very confusing; I am not sure whether I need to open another defect for this problem, so I am just letting you know first.
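For reference, the deprecation warning asks for the old `gateway4` key to be replaced by an entry under `routes`. A minimal sketch of the migration, reusing the interface name and addresses from the merged config shown earlier in this report:

```yaml
# Deprecated form (what an older config may still contain):
#   enc1000.1300:
#     gateway4: 10.20.103.254
# Replacement using a default route:
network:
  version: 2
  ethernets:
    enc1000: {}
  vlans:
    enc1000.1300:
      id: 1300
      link: enc1000
      addresses:
        - 10.20.103.63/24
      routes:
        - to: default
          via: 10.20.103.254
```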

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

Hi Frank,

I followed the above steps and did not install any packages, just kept it as a simple server.
But I got the same result: after the reboot, the route info was lost.
And sudo netplan apply --dryrun did not report any problems.

Here is some of the run log.

Welcome to Ubuntu 22.04.1 LTS (GNU/Linux 5.15.0-52-generic s390x)
 * Documentation: https://help.ubuntu.com
 * Management: https://landscape.canonical.com
 * Support: https://ubuntu.com/advantage
  System information as of Fri Nov 11 02:53:35 AM UTC 2022
  System load: 0.02783203125 Memory usage: 6% Processes: 538
  Usage of /: 9.0% of 15.60GB Swap usage: 0% Users logged in: 0
41 updates can be applied immediately.
To see these additional updates run: apt list --upgradable
Failed to connect to https://changelogs.ubuntu.com/meta-release-lts. Check your Internet connection or proxy settings
Last login: Fri Nov 11 02:50:21 UTC 2022 from 10.20.92.70 on pts/0
ubuntu@a90ubtu24:~$ ping -c 5 10.20.103.64
ping: connect: Network is unreachable
ubuntu@a90ubtu24:~$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
ubuntu@a90ubtu24:~$ ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enc1000: mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:ca:72:80:04:39 brd ff:ff:ff:ff:ff:ff
3: enc1000.1300@enc1000: mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 02:ca:72:80:04:39 brd ff:ff:ff:ff:ff:ff
    inet 10.20.103.69/24 brd 10.20.103.255 scope global enc1000.1300
       valid_lft forever preferred_lft forever
ubuntu@a90ubtu24:~$ sudo netplan apply --dryrun
[sudo] password for ubuntu:
ubuntu@a90ubtu24:~$ sudo netplan apply
ubuntu@a90ubtu24:~$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default _gateway 0.0.0.0 UG 0 0 0 enc1000.1300
10.20.103.0 0.0.0.0 255.255.255.0 U 0 0 0 enc1000.1300
ubuntu@a90ubtu24:~$ ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enc1000: mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 02:ca:72:80:04:39 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ca:72ff:fe80:439/64 scope link
       valid_lft forever preferred_lft forever
3: enc1000.1300@enc1000: mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:ca:72:80:04:39 brd ff:ff:ff:ff:ff:ff
    inet 10.20.103.69/24 brd 10.20.103.255 scope global enc1000.1300
       valid_lft forever preferred_lft forever
    inet6 fe80::ca:72ff:fe80:439/64 scope link
      ...


Revision history for this message
Frank Heimes (fheimes) wrote (last edit ):

Well, that's interesting, since I'm doing installations nearly on a day-to-day basis, both interactive (using subiquity, which I think is what you do, too) and non-interactive (using autoinstall), and haven't run into this so far; I am also not able to recreate it.

Are you doing the installation in a system that has internet connectivity during install?
In other words, are any potential updates applied during the installation or not?

And which package version of netplan.io is installed?
$ apt-cache policy netplan.io

The interface name tells me that you are using OSA adapters as network devices, and I think you are using a common VLAN id 1300 (none of the reserved ones), so in this regard everything seems to be fine.

Is there anything special about your system: is it a DPM or a PR/SM system? Which hardware generation is it?
Did you use any special kernel options (like those needed for the autoinstall basic network config setup)?

Is it possible to share the content of /var/log/syslog, to try to find more hints about what could have happened?
(And is /var/crash still empty in your case/system?)
And is it an LPAR, z/VM or KVM installation?

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

>Are you doing the installation in a system that has internet connectivity during install?
Yes, I set up a proxy server and made sure the installation process can access the internet.

>In other words, are any potential updates applied during the installation or not?
Not sure about that; how can I confirm it? Do you need the full run log?

ubuntu@a90ubtu24:~$ apt-cache policy netplan.io
netplan.io:
  Installed: 0.104-0ubuntu2.1
  Candidate: 0.104-0ubuntu2.1
  Version table:
 *** 0.104-0ubuntu2.1 500
        500 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main s390x Packages
        100 /var/lib/dpkg/status
     0.104-0ubuntu2 500
        500 http://ports.ubuntu.com/ubuntu-ports jammy/main s390x Packages
ubuntu@a90ubtu24:~$

>I think you are using a common VLAN id 1300 (non of the reserved ones)
Yes, we are using an internal network and a trunk mode adapter.

>Is there anything special in your system - is it a DPM or a PR/SM system?
Yes, we are the DPM solution test team; we are using DPM R5.1.

>Did you used any special kernel options (like needed for autoinstall basic network config setup)?
No, I do not think so.

ubuntu@a90ubtu24:~$ cat /var/log/syslog
Nov 13 00:00:06 a90ubtu24 systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1853 (rsyslogd) on client request.
Nov 13 00:00:06 a90ubtu24 systemd[1]: logrotate.service: Deactivated successfully.
Nov 13 00:00:06 a90ubtu24 systemd[1]: Finished Rotate log files.
Nov 13 00:10:06 a90ubtu24 rsyslogd: [origin software="rsyslogd" swVersion="8.2112.0" x-pid="1853" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Nov 13 00:17:01 a90ubtu24 CRON[72507]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Nov 13 00:27:31 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 91.189.94.4:123 (ntp.ubuntu.com).
Nov 13 00:27:41 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 185.125.190.56:123 (ntp.ubuntu.com).
Nov 13 00:27:52 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 185.125.190.57:123 (ntp.ubuntu.com).
Nov 13 00:28:02 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 185.125.190.58:123 (ntp.ubuntu.com).
Nov 13 00:28:12 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 91.189.91.157:123 (ntp.ubuntu.com).
Nov 13 00:31:06 a90ubtu24 systemd[1]: Starting Message of the Day...
Nov 13 00:36:06 a90ubtu24 systemd[1]: motd-news.service: Deactivated successfully.
Nov 13 00:36:06 a90ubtu24 systemd[1]: Finished Message of the Day.
Nov 13 00:42:06 a90ubtu24 systemd[1]: Starting Ubuntu Advantage Timer for running repeated jobs...
Nov 13 00:42:06 a90ubtu24 systemd[1]: ua-timer.service: Deactivated successfully.
Nov 13 00:42:06 a90ubtu24 systemd[1]: Finished Ubuntu Advantage Timer for running repeated jobs.
Nov 13 01:02:31 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 185.125.190.58:123 (ntp.ubuntu.com).
Nov 13 01:02:41 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 91.189.91.157:123 (ntp.ubuntu.com).
Nov 13 01:02:51 a90ubtu24 systemd-timesyncd[1721]: Timed out waiting for reply from 91.189.94.4:123 (ntp.ubuntu.com).
Nov 13 01:03:01 a90ubtu24 systemd-timesyncd[1721]: Timed ou...

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

Not sure what it is, but I found a job that waits until it times out.
This is part of the installation process log from the console.

[***] A start job is running for Wait unt d is fully seeded (45s / no limit)
[** ] A start job is running for Wait unt d is fully seeded (46s / no limit)
[*  ] A start job is running for Wait unt d is fully seeded (46s / no limit)
...
[*  ] A start job is running for Wait unt d is fully seeded (50s / no limit)

Revision history for this message
Frank Heimes (fheimes) wrote :

Hi liwbj,
that looks all good - your system got and installed the updates during the installation.
netplan package version is correct, and all the log messages are reasonable (from comment #5: some facilities are not reachable, probably because your proxy does not allow them, which is fine; from comment #6: there is a seeding process ongoing that needs to complete, also normal; when it occurs it just takes a while).

I just did an interactive 22.04.1 installation on our z15 (actually a L1III) that is a DPM system (but I don't know which version; where can I find it? I haven't seen it in 'System Information'), and it booted nicely after the install and I could immediately log in from remote afterwards, hence netplan was properly applied.

So I am not able to recreate your case...

Just a few things to double check (and please be careful, because by coincidence my OSA device address is '1300', which is in your case the same as your VLAN Id).

How is your OSA device configured in DPM?
(mine is configured like shown in the attached screenshot).

Have you specified your OSA address at the 'zdev to activate' step (here my '1300'):
--------%<----------------%<----------------%<----------------%<--------
Attempt interactive netboot from a URL?
yes no (default yes):
yes
Available qeth devices:
lszdev: No device was selected!

zdev to activate (comma separated, optional):
1300
[ 114.051390] lcs: Loading LCS driver
A manual update of the initial RAM-disk is required.
QETH device 0.0.1300:0.0.1301:0.0.1302 configured
Note: The initial RAM-disk must be updated for these changes to take effect:
       - QETH device 0.0.1300:0.0.1301:0.0.1302
Two methods available for IP configuration:
  * static: for static IP configuration
  * dhcp: for automatic IP configuration
static dhcp (default 'dhcp'):
--------%<----------------%<----------------%<----------------%<--------
(That is just out of curiosity, since on DPM systems devices are usually auto-enabled.)

Was your OSA device listed at the zdev screen, like this:

--------%<----------------%<----------------%<----------------%<--------
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Zdev setup [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  ID ONLINE NAMES

  qeth
  0.0.1300:0.0.1301:0.0.1302 auto enc1300 ▸

                                 [ Continue ]
                                 [ Back ]

--------%<----------------%<----------------%<----------------%<-------
(Btw. there is nothing you need to do at that screen - since we are on a DPM system.)

And were your interfaces properly listed at the network connections...


Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

This is boot from FTP server and installation log from OSM.

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

This is installation screen shot.

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

This is after installation, directly boot.
At this moment, IP setting is workable.

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

This is after I changed boot to Disk (boot volume).
Now, after the OS has started, I have to run netplan apply to reset the IP.

Hi Frank,
Looks like there is a job timing out here. Could you help me take a look? Thank you.

Revision history for this message
Frank Heimes (fheimes) wrote :

Hello Wei WA,
such messages during boot-up, like "A start job is running for Wait for ...",
appear because a (systemd) service takes some time to complete.
It's probably systemd-networkd, which may just need some time and uses 'systemd-networkd-wait-online' (maybe in combination with cloud-init, but after the post-install reboot cloud-init should be disabled, indicated by the existence of the file '/etc/cloud/cloud-init.disabled').
In case of multiple interfaces, some without connectivity, it may wait up to 2 minutes for a connection; there are different ways to prevent this potential delay ...
I see these messages on my system as well, but there should be no harm (other than some delay), except:
- if things break or crash while being executed
  (in this case you should find a related crash file in /var/crash)
- or if the network changes significantly
  for example you had devices during the installation that were configured,
  but are no longer available (for example in case they were removed from the LPAR config in DPM)
  In this case the yaml needs to be cleaned up, or at least 'optional: true' added
  to the (temporarily) missing interface in the yaml.
- or, sometimes in case ipv6 is enabled in your environment (saw that once),
  there can be an issue getting network config bits fast enough (e.g. in case of 'dhcpv6').
  To figure that out, try to temporarily disable ipv6.
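The 'optional: true' suggestion above looks like this in a netplan yaml (a sketch: 'enc9999' is a hypothetical late-coming interface; the rest mirrors the config from this report):

```yaml
network:
  version: 2
  ethernets:
    enc1000: {}
    enc9999:            # hypothetical interface that may come up late or not at all
      optional: true    # don't let systemd-networkd-wait-online block boot on it
```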

And what I forgot to mention (well, maybe too obvious): systemd-networkd must be active and running:
'sudo systemctl status systemd-networkd'

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

systemd-networkd looks good.

192.168.122.0 is a new network bridge for creating KVM guests; it looks like virt-install created it automatically today. Not sure where it is defined.
192.168.122.0 looks good and comes up after reboot, but 10.20.103.0 does not.

ubuntu@a257seubut:~$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
ubuntu@a257seubut:~$ sudo systemctl status systemd-networkd
[sudo] password for ubuntu:
● systemd-networkd.service - Network Configuration
     Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled; ven>
     Active: active (running) since Fri 2022-11-18 11:11:29 UTC; 3min 3s ago
TriggeredBy: ● systemd-networkd.socket
       Docs: man:systemd-networkd.service(8)
   Main PID: 1404 (systemd-network)
     Status: "Processing requests..."
      Tasks: 1 (limit: 38403)
     Memory: 2.7M
        CPU: 38ms
     CGroup: /system.slice/systemd-networkd.service
               1404 /lib/systemd/systemd-networkd

Nov 18 11:11:29 a257seubut systemd[1]: Starting Network Configuration...
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: lo: Link UP
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: lo: Gained carrier
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: Enumeration completed
Nov 18 11:11:29 a257seubut systemd[1]: Started Network Configuration.
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: enc1000.1300: netdev ready
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: enc1000: Could not bring up >
Nov 18 11:11:29 a257seubut systemd-networkd[1404]: enc1000.1300: Could not brin>
Nov 18 11:13:32 a257seubut systemd-networkd[1404]: virbr0: Link UP
ubuntu@a257seubut:~$ sudo netplan apply
** (generate:2085): WARNING **: 11:15:00.981: `gateway4` has been deprecated, use default routes instead.
See the 'Default routes' section of the documentation for more details.
** (process:2083): WARNING **: 11:15:01.349: `gateway4` has been deprecated, use default routes instead.
See the 'Default routes' section of the documentation for more details.
ubuntu@a257seubut:~$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default _gateway 0.0.0.0 UG 0 0 0 enc1000.1300
10.20.103.0 0.0.0.0 255.255.255.0 U 0 0 0 enc1000.1300
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
ubuntu@a257seubut:~$

ubuntu@a257seubut:/etc/netplan$ cat 00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    enc1000: {}
  version: 2
  vlans:
    enc1000.1300:
      addresses:
      - 10.20.103....


Revision history for this message
Frank Heimes (fheimes) wrote (last edit ):

Ok, I see that you ran:
$ sudo netplan apply
but what is the status afterwards: did it have an effect?
$ ip addr

Do you also see the messages:
"qeth 0.0.1000: A recovery process has been started for the device
...
[ +0.000025] qeth 0.0.1000: Device successfully recovered!"
on this system?

In the end it could be a problem similar to this one: LP#1996006
So check the same as mentioned here:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1996006/comments/17
(especially the device, NIC, the correct port, and if it has a link)
You may also check the routing/switch. (During installation a different network might have been used, e.g. for accessing an install server that is not in your VLAN 1300.)

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

Rebooted again; here is some output.

Revision history for this message
Frank Heimes (fheimes) wrote :

Well, on this system you got an IP address after boot; that is indicated by the log line:
"[ 133.968355] cloud-init[1070]: ci-info: | enc1000.1300 | False | 10.20.103.32 | 255.255.255.0 | global | 82:ca:fd:00"

And the very first command you've executed after you've logged in is "ip addr" that had the output:
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enc1000: mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:ca:fd:00:f9:1e brd ff:ff:ff:ff:ff:ff
3: enc7400: mtu 32768 qdisc noop state DOWN group default qlen 1000
    link/ether 86:ca:fd:00:7e:72 brd ff:ff:ff:ff:ff:ff
4: enc1000.1300@enc1000: mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:ca:fd:00:f9:1e brd ff:ff:ff:ff:ff:ff
    inet 10.20.103.32/24 brd 10.20.103.255 scope global enc1000.1300

Again, under "4:", interface "enc1000.1300@enc1000" has the ip "10.20.103.32".
So this system definitely got an IP address!

I see that you have another interface defined here (enc7400),
and by default systemd-networkd waits until ALL interfaces are properly configured and up (this is done by the service "systemd-networkd-wait-online", which is called once during boot).
In case enc7400 has no link or takes too long, systemd-networkd waits for quite some time (2 min), indicated by messages like "A start job is running for Wait for".

You can avoid such a potential delay by adding "--any" to "systemd-networkd-wait-online".
So edit the service:
sudo systemctl edit systemd-networkd-wait-online.service
and make sure it contains the following three lines:
"
[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online --any
"
(There are two 'ExecStart' lines by intention; the first, empty one clears any previous content.)
Run "sudo systemctl daemon-reload" afterwards to activate the config change.

(Alternatively you may add "optional: true" to the interface that may come up late (or sometimes not at all) in your netplan yaml - if it's configured there.)

Revision history for this message
liwbj@cn.ibm.com (liwbj) wrote :

One thing to add: if we use an access mode adapter, this issue does not happen and the route info is not lost.

Revision history for this message
Frank Heimes (fheimes) wrote :

Hi liwbj, ok, knowing this difference is important.

Having a look at the OSA implementation Guide (Redbook): https://www.redbooks.ibm.com/redbooks/pdfs/sg245948.pdf

Trunk mode
Trunk mode indicates that the switch should allow all VLAN ID tagged packets to pass
through the switch port without altering the VLAN ID. This mode is intended for servers that
are VLAN-capable, so the switch filters and processes all VLAN ID tagged packets. In trunk
mode, the switch is programmed to receive VLAN ID-tagged packets that are inbound to the
switch port.

Access mode
Access mode indicates that the switch should filter on specific VLAN IDs and allow only
packets that match the configured VLAN IDs to pass through the switch port. The VLAN ID is
then removed from the packet before it is sent to the server. That is, VLAN ID filtering is
controlled by the switch. In access mode, the switch is programmed to receive packets
without VLAN ID tags that are inbound to the switch port.

That lets me assume that it's related to the current switch configuration.
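On the netplan side, the trunk/access difference looks roughly like this (a sketch using the names from this report; in trunk mode the host tags the traffic via a vlans entry, while in access mode the switch handles the VLAN and the host configures the plain interface; the access-mode addresses here are only illustrative):

```yaml
# Trunk mode: the host does the VLAN tagging (as in this report)
network:
  version: 2
  ethernets:
    enc1000: {}
  vlans:
    enc1000.1300:
      id: 1300
      link: enc1000
      addresses: [10.20.103.63/24]
      routes:
        - to: default
          via: 10.20.103.254

# Access mode: the switch strips the tag, so the host would configure
# the plain interface instead (addresses illustrative):
#
# network:
#   version: 2
#   ethernets:
#     enc1000:
#       addresses: [10.20.103.63/24]
#       routes:
#         - to: default
#           via: 10.20.103.254
```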

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
assignee: Skipper Bug Screeners (skipper-screen-team) → nobody
Frank Heimes (fheimes)
tags: added: installation
removed: installer