Race condition between bridge-utils pre-up and udev scripts

Bug #1294172 reported by Paul Donohue on 2014-03-18
This bug affects 1 person
Affects: bridge-utils (Ubuntu)
Status: Triaged
Importance: High
Assigned to: Unassigned

Bug Description

I have the following in my /etc/network/interfaces to configure a GRE-TAP tunnel and attach it to a bridge:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
  address 10.11.12.13
  netmask 255.255.255.0
  gateway 10.11.12.1

auto l2gre0
iface l2gre0 inet manual
  pre-up ip link add name $IFACE type gretap local 10.11.12.13 remote 10.13.14.15
  up ip link set dev $IFACE up
  down ip link set dev $IFACE down
  post-down ip link delete $IFACE

auto sw0
iface sw0 inet manual
  pre-up ip link set dev eth1 promisc on
  bridge_ports eth1 l2gre0
  bridge_stp off
  bridge_fd 0.1
  bridge_maxwait 0
  post-up sysctl -q net.ipv4.conf.sw0.rp_filter=0

I've found that on reboot the bridge comes up fine about half of the time. The other half of the time it comes up half-configured: it has l2gre0 attached but not eth1, and the bridge_stp and bridge_fd settings are not applied. When it fails, "device sw0 already exists; can't create bridge with the same name" shows up in the logs.
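
To tell the two outcomes apart after a reboot, a check along the following lines may help. This is only a sketch: it relies on the kernel listing attached ports under /sys/class/net/<bridge>/brif/, and the SYSFS_ROOT variable and the missing_bridge_ports function name are made up for illustration (SYSFS_ROOT exists only so the check can be pointed at a fake tree).

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override; by default the real sysfs is used.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print each expected port that is NOT attached to the given bridge.
# Usage: missing_bridge_ports <bridge> <port>...
missing_bridge_ports() {
    bridge="$1"
    shift
    for port in "$@"; do
        # Attached ports appear as entries under <bridge>/brif/ in sysfs.
        if [ ! -e "$SYSFS_ROOT/class/net/$bridge/brif/$port" ]; then
            echo "$port"
        fi
    done
}
```

With the configuration above, `missing_bridge_ports sw0 eth1 l2gre0` should print nothing on a good boot and print eth1 on a half-configured one.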

The problem seems to be a race condition between /lib/bridge-utils/ifupdown.sh and /lib/udev/bridge-network-interface ... The sequence of events is the following:
  udev event triggered for eth0
  udev event triggered for eth1 (ignored by bridge-network-interface because /run/network doesn't exist yet)
  udev event triggered for lo
  ifup called for eth0
  ifup called for lo
  ifup called for l2gre0
Then, the following two operations happen in parallel:
  udev event triggered for l2gre0
  ifup called for sw0
If ifup happens to reach the 'brctl addbr' command in /lib/bridge-utils/ifupdown.sh first, everything works fine. However, if the udev trigger happens to reach the 'brctl addbr' command in /lib/udev/bridge-network-interface first, then /lib/bridge-utils/ifupdown.sh will fail, and the bridge will not be properly configured.
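
For illustration, one way a script could tolerate losing this race is to treat "the bridge already exists" as success rather than as a fatal error. The following is a minimal sketch under that assumption, not the actual code of either script; ensure_bridge, bridge_exists, SYSFS_ROOT and BRCTL are hypothetical names, and the two variables exist only so the logic can be exercised without touching real interfaces.

```shell
#!/bin/sh
# Hypothetical overrides; the defaults correspond to a normal system.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
BRCTL="${BRCTL:-brctl}"

# A bridge device has a "bridge" directory in sysfs.
bridge_exists() {
    [ -d "$SYSFS_ROOT/class/net/$1/bridge" ]
}

# Create the bridge unless it already exists. If creation fails because
# another process created it in the meantime, still report success.
ensure_bridge() {
    bridge="$1"
    bridge_exists "$bridge" && return 0
    "$BRCTL" addbr "$bridge" 2>/dev/null && return 0
    # addbr failed: succeed only if the bridge is present now (race lost).
    bridge_exists "$bridge"
}
```

The caller could then carry on adding ports and applying the stp and fd settings regardless of which process created sw0.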

This race condition appears to have been introduced by the fix for Bug #1003656 ... Before that fix, /lib/udev/bridge-network-interface would call 'ifup sw0' instead of just 'brctl addbr sw0', and 'ifup sw0' would ensure that only one of the two processes called 'brctl addbr'.

As for fixing this problem, I can't say that I'm aware of every possible use case here, but it seems to me that /lib/udev/bridge-network-interface simply should not call 'brctl addbr'. According to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626152 this udev trigger was added to ensure that devices are properly added to the bridge if the bridge happens to come up before all of the physical network interfaces have been probed by the kernel. However, if /lib/udev/bridge-network-interface is called before the bridge comes up, shouldn't it simply ignore the event and let /lib/bridge-utils/ifupdown.sh bring the bridge up and add the interface later?
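
To make the suggestion concrete, here is a rough sketch of a udev-side handler that never creates the bridge: if the bridge is not there yet, it does nothing and leaves the port to ifup. This is not the contents of /lib/udev/bridge-network-interface; handle_port_event, SYSFS_ROOT and BRCTL are hypothetical names, and how the real script works out which bridge a port belongs to is not shown.

```shell
#!/bin/sh
# Hypothetical overrides; the defaults correspond to a normal system.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
BRCTL="${BRCTL:-brctl}"

# Handle a hotplug event for a port that is configured as a bridge member.
# Usage: handle_port_event <bridge> <port>
handle_port_event() {
    bridge="$1"
    port="$2"
    if [ ! -d "$SYSFS_ROOT/class/net/$bridge/bridge" ]; then
        # Bridge is not up yet: ignore the event. ifup will attach the
        # port when it creates the bridge itself.
        return 0
    fi
    # Bridge already exists, so the port appeared late: attach it now.
    "$BRCTL" addif "$bridge" "$port"
}
```

This keeps the late-probed interface case from the Debian bug covered, while 'brctl addbr' is only ever run from the ifup side.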

Changed in bridge-utils (Ubuntu):
importance: Undecided → High
Changed in bridge-utils (Ubuntu):
status: New → Triaged