Primary slave on the bond not getting set.

Bug #1817651 reported by Shahaan Ayyub
This bug affects 4 people
Affects              Status        Importance  Assigned to    Milestone
netplan.io (Ubuntu)  Fix Released  Medium      Lukas Märdian
Bionic               Fix Released  Medium      Seyeong Kim
Focal                Fix Released  Medium      Unassigned

Bug Description

[Impact]

The primary slave fails to get set in a netplan bonding configuration.

[Test Case]

0. Create a VM with 3 NICs (ens33, ens38, ens39).
1. Set up netplan as below:
- https://pastebin.ubuntu.com/p/JGqhYXYY6r/
- ens38 and ens39 are virtual NICs, and dummy2 is not.
2. Run netplan apply.
3. An error is shown.

[Where problems could occur]
As this patch changes how bonds are handled, bond configurations may break if the patch introduces a regression.

[Others]

original description

The primary slave fails to get set in netplan bonding configuration:

network:
    version: 2
    ethernets:
        e1p1:
            addresses:
            - x.x.x.x/x
            gateway4: x.x.x.x
            match:
                macaddress: xyz
            mtu: 9000
            nameservers:
                addresses:
                - x.x.x.x
            set-name: e1p1
        p1p1:
            match:
                macaddress: xx
            mtu: 1500
            set-name: p1p1
        p1p2:
            match:
                macaddress: xx
            mtu: 1500
            set-name: p1p2

    bonds:
        bond0:
            mtu: 9000
            interfaces: [p1p1, p1p2]
            parameters:
                mode: active-backup
                mii-monitor-interval: 100
                primary: p1p2

~$ sudo netplan --debug apply
** (generate:7353): DEBUG: 13:22:31.480: Processing input file /etc/netplan/50-cloud-init.yaml..
** (generate:7353): DEBUG: 13:22:31.480: starting new processing pass
** (generate:7353): DEBUG: 13:22:31.480: Processing input file /etc/netplan/60-puppet-netplan.yaml..
** (generate:7353): DEBUG: 13:22:31.480: starting new processing pass
** (generate:7353): DEBUG: 13:22:31.480: recording missing yaml_node_t bond0
** (generate:7353): DEBUG: 13:22:31.480: recording missing yaml_node_t bond0
** (generate:7353): DEBUG: 13:22:31.480: recording missing yaml_node_t bond0
** (generate:7353): DEBUG: 13:22:31.480: recording missing yaml_node_t bond0
** (generate:7353): DEBUG: 13:22:31.480: recording missing yaml_node_t bond0
** (generate:7353): DEBUG: 13:22:31.480: starting new processing pass
Error in network definition /etc/netplan/60-puppet-netplan.yaml line 68 column 17: bond0: bond already has a primary slave: p1p2

What's wrong here?

# apt-cache policy netplan.io
netplan.io:
  Installed: 0.40.1~18.04.4
  Candidate: 0.40.1~18.04.4
  Version table:
 *** 0.40.1~18.04.4 500
        500 http://mirrors.rc.nectar.org.au/ubuntu bionic-security/main amd64 Packages
        500 http://mirrors.rc.nectar.org.au/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.36.1 500
        500 http://mirrors.rc.nectar.org.au/ubuntu bionic/main amd64 Packages

# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"

regards,

Shahaan

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in netplan.io (Ubuntu):
status: New → Confirmed
Nikolas Britton (nbritton) wrote :

I'm having this problem too with Ubuntu 19.10. I was able to work around it by commenting out "primary: enp65s0", but this solution is far from optimal because eno1 is 10 GbE and enp65s0 is 56 GbE. I always want enp65s0 to be the primary, unless it is unavailable.

root@lab02:~# cat /etc/netplan/01-netcfg.yaml
network:
  version: 2
  renderer: networkd

  ethernets:
    eno1: {}
    eno2: {}
    eno3: {}
    eno4: {}
    enp65s0: {}
    dummy1: {}
    dummy2: {}
    dummy3: {}
    dummy4: {}
    dummy5: {}
    dummy6: {}

  bonds:
    bond0:
      interfaces: [eno1, enp65s0]
      parameters:
        #primary: enp65s0
        mode: balance-tlb
        mii-monitor-interval: 100
        transmit-hash-policy: layer3+4
        primary-reselect-policy: better

  vlans:
    vbridge2-vlan10:
      id: 10
      link: vbridge2
      addresses: [10.11.1.2/24]
    vbridge2-vlan20:
      id: 20
      link: vbridge2
      addresses: [10.11.2.2/24]
    vbridge4-vlan10:
      id: 10
      link: vbridge4
      addresses: [10.12.1.2/24]
    vbridge4-vlan20:
      id: 20
      link: vbridge4
      addresses: [10.12.2.2/24]
    vbridge6-vlan10:
      id: 10
      link: vbridge6
      addresses: [10.13.1.2/24]
    vbridge6-vlan20:
      id: 20
      link: vbridge6
      addresses: [10.13.2.2/24]

  bridges:
    internet:
      interfaces: [bond0]
      dhcp4: yes
      nameservers:
        addresses: [8.8.8.8, 1.1.1.1]
    vbridge1:
      interfaces: [dummy1]
      addresses: [10.11.0.2/24]
    vbridge2:
      interfaces: [dummy2]
    vbridge3:
      interfaces: [dummy3]
      addresses: [10.12.0.2/24]
    vbridge4:
      interfaces: [dummy4]
    vbridge5:
      interfaces: [dummy5]
      addresses: [10.13.0.2/24]
    vbridge6:
      interfaces: [dummy6]

root@lab02:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=19.10
DISTRIB_CODENAME=eoan
DISTRIB_DESCRIPTION="Ubuntu 19.10"

root@lab02:~# ethtool eno1
Settings for eno1:
 Supported ports: [ FIBRE ]
 Supported link modes: 1000baseT/Full
                         10000baseT/Full
 Supported pause frame use: Symmetric Receive-only
 Supports auto-negotiation: No
 Supported FEC modes: Not reported
 Advertised link modes: 10000baseT/Full
 Advertised pause frame use: No
 Advertised auto-negotiation: No
 Advertised FEC modes: Not reported
 Speed: 10000Mb/s
 Duplex: Full
 Port: Direct Attach Copper
 PHYAD: 1
 Transceiver: internal
 Auto-negotiation: off
 Supports Wake-on: g
 Wake-on: d
 Current message level: 0x00000000 (0)

 Link detected: yes

root@lab02:~# ethtool enp65s0
Settings for enp65s0:
 Supported ports: [ FIBRE ]
 Supported link modes: 1000baseKX/Full
                         10000baseKX4/Full
                         10000baseKR/Full
                         40000baseCR4/Full
                         40000baseSR4/Full
                         56000baseCR4/Full
                         56000baseSR4/Full
 Supported pause frame use: Symmetric Receive-only
 Supports auto-negotiation: Yes
 Supported FEC modes: Not reported
 Advertised link modes: 1000baseKX/Full
                         10000baseKX4/Full
                         10000baseKR/Full
                         40000baseCR4/Full
                ...


Nikolas Britton (nbritton) wrote :

root@lab02:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: transmit load balancing
Primary Slave: None
Currently Active Slave: eno1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp65s0
MII Status: up
Speed: 56000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: f4:52:14:40:79:31
Slave queue ID: 0

Slave Interface: eno1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0e:1e:4d:05:e0
Slave queue ID: 0
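
The output above shows the symptom of the workaround: with "primary" commented out, the kernel reports "Primary Slave: None". A minimal Python sketch for checking this automatically follows. The function names parse_bond_status and primary_is_set are hypothetical helpers, not part of netplan or any library. The sketch assumes the text has the same layout as the /proc/net/bonding/bond0 output quoted above, and that the kernel may append extra detail after the interface name on the "Primary Slave" line.

```python
def parse_bond_status(text):
    """Return the bond-level 'Primary Slave' and 'Currently Active Slave' fields.

    `text` is expected to look like the /proc/net/bonding/<bond> output
    quoted in this report (an assumption of this sketch).
    """
    status = {}
    for line in text.splitlines():
        if line.startswith("Slave Interface:"):
            # Per-slave sections start here; the bond-level header is over.
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Primary Slave", "Currently Active Slave"):
            status[key.strip()] = value.strip()
    return status


def primary_is_set(text, expected):
    """Return True if the reported primary slave is the interface `expected`."""
    # Only the first word is compared, in case the kernel appends extra
    # detail after the interface name. "None" means no primary is configured.
    fields = parse_bond_status(text).get("Primary Slave", "None").split()
    return bool(fields) and fields[0] == expected
```

Fed the contents of /proc/net/bonding/bond0 shown above, primary_is_set(text, "enp65s0") returns False, because the kernel reports no primary slave at all.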

Lukas Märdian (slyon) wrote :

I can confirm this issue on Focal, netplan 0.99.

It looks like when netplan records missing YAML nodes (vbridge[246] in Nikolas' case), it triggers a second processing pass. In this second pass it tries to assign the already-assigned primary slave a second time and thus fails.
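
The failure mode described in this comment can be illustrated with a small Python model. This is not netplan's actual code (netplan's parser is written in C); the names Bond, set_primary_strict, set_primary_idempotent and run_passes are hypothetical. The sketch assumes that the bond definition survives between processing passes and that each pass re-applies the "primary" setting. The idempotent variant is only a guess at the shape of the fix, based on the patch title quoted later in this report ("Don't fail if same primary slave was set before").

```python
class Bond:
    """Toy stand-in for a parsed bond definition (hypothetical, not netplan's struct)."""

    def __init__(self, name):
        self.name = name
        self.primary = None


def set_primary_strict(bond, iface):
    # Models the behaviour reported in this bug: any second assignment is an
    # error, even if it names the same interface as before.
    if bond.primary is not None:
        raise ValueError(
            f"{bond.name}: bond already has a primary slave: {bond.primary}"
        )
    bond.primary = iface


def set_primary_idempotent(bond, iface):
    # Models the assumed fix: re-assigning the same interface is accepted,
    # and only a conflicting primary is rejected.
    if bond.primary is not None and bond.primary != iface:
        raise ValueError(
            f"{bond.name}: bond already has a primary slave: {bond.primary}"
        )
    bond.primary = iface


def run_passes(set_primary, passes=2):
    # The same Bond object is reused across passes, as it would be when a
    # missing node forces the parser to start another processing pass.
    bond = Bond("bond0")
    for _ in range(passes):
        set_primary(bond, "p1p2")
    return bond.primary
```

In this model, run_passes(set_primary_strict) raises on the second pass with a message of the same shape as the error in the original description, while run_passes(set_primary_idempotent) completes and returns "p1p2".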

Lukas Märdian (slyon)
Changed in netplan.io (Ubuntu):
assignee: nobody → Lukas Märdian (slyon)
status: Confirmed → In Progress
Lukas Märdian (slyon) wrote :

A PPA for testing this fix can be found here:
https://launchpad.net/~slyon/+archive/ubuntu/fix-1817651

Lukas Märdian (slyon) wrote :
Changed in netplan.io (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.100-0ubuntu4

---------------
netplan.io (0.100-0ubuntu4) groovy; urgency=medium

  * debian/tests/cloud-init
    - Improve reboot test to avoid failure on arm64

 -- Lukas Märdian <email address hidden> Mon, 21 Sep 2020 12:23:02 +0200

Changed in netplan.io (Ubuntu):
status: Fix Committed → Fix Released
Seyeong Kim (seyeongkim)
Changed in netplan.io (Ubuntu Focal):
status: New → Fix Released
Seyeong Kim (seyeongkim) wrote :
description: updated
Changed in netplan.io (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Seyeong Kim (seyeongkim)
tags: added: sts
Lukas Märdian (slyon) wrote :

Thank you, Seyeong, for preparing the debdiff for Bionic. It was malformed at line 36, but I fixed that.

The patch was taken from upstream and LGTM, it passed all build tests (incl. the new one from the patch)!

For SRU verification, please also attach the Bionic test logs of our integration test-suite, in addition to the test case you described above.

This way we can verify that we do not regress, according to: https://wiki.ubuntu.com/NetplanUpdates

Logs can be found here, once they are run for version 0.99-0ubuntu3~18.04.4: https://autopkgtest.ubuntu.com/packages/netplan.io (bionic/*)

Mathew Hodson (mhodson) wrote :

Already backported to Focal.
---

netplan.io (0.100-0ubuntu4~20.04.2) focal; urgency=medium

  * Backport netplan.io 0.100-0ubuntu4 to 20.04 (LP: #1894197)
    - Includes fix for OVS/WPA first-time boot issues
  * Drop distro patches, which are included in upstream release
  * Ignore openvswitch-switch Build-Depends on riscv64, due to missing package
    - Failing unit-/integration tests will be ignored on riscv64 as well
  * Skip specific unit-tests on riscv64

 -- Lukas Märdian <email address hidden> Wed, 30 Sep 2020 14:32:36 +0200

Changed in netplan.io (Ubuntu):
importance: Undecided → Medium
Changed in netplan.io (Ubuntu Bionic):
importance: Undecided → Medium
Changed in netplan.io (Ubuntu Focal):
importance: Undecided → Medium
Seyeong Kim (seyeongkim) wrote :

Hey @slyon

That exact version is not listed at https://autopkgtest.ubuntu.com/packages/netplan.io

Does somebody need to upload it there, or do I need to do something for this?

Thanks.

Brian Murray (brian-murray) wrote :

@seyeongkim - nothing will appear at autopkgtest until the package has been accepted into the archive i.e. -proposed in this case.

Changed in netplan.io (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Shahaan, or anyone else affected,

Accepted netplan.io into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/netplan.io/0.99-0ubuntu3~18.04.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Seyeong Kim (seyeongkim) wrote :

Verification is done for Bionic.

# dpkg -l | grep netplan.io
ii netplan.io 0.99-0ubuntu3~18.04.4 amd64 YAML network configuration abstraction for various backends

Test steps:

1. Deploy a Bionic VM.
2. Configure netplan as given in the description.
3. Run netplan apply; the error appears.
4. Upgrade the package from the -proposed repository.
5. Run netplan apply again; there is no error.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Mathew Hodson (mhodson)
tags: removed: bonding verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.99-0ubuntu3~18.04.4

---------------
netplan.io (0.99-0ubuntu3~18.04.4) bionic; urgency=medium

  * d/p/0001-Don-t-fail-if-same-primary-slave-was-set-before-LP-1.patch
    - Fix primary slave on the bond not getting set (LP: #1817651)

 -- Seyeong Kim <email address hidden> Fri, 15 Jan 2021 07:38:20 +0000

Changed in netplan.io (Ubuntu Bionic):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for netplan.io has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
