[SRU] update to include stable fixes for OVS 1.4

Reported by dan wendlandt on 2012-07-06
Affects               Importance  Assigned to
openvswitch (Ubuntu)  Undecided   Unassigned
Precise               Medium      Adam Gandelman
Quantal               Undecided   Unassigned

Bug Description

[IMPACT]

 * ovs-vswitchd crashes when processing IPv6 neighbor discovery packets with VLAN headers received in a tunnel configured with key=flow or in_key=flow.

 * This bug blocks us from pushing a change to the OpenStack Quantum project, which uses OVS on Ubuntu precise quite heavily.

[TESTCASE]

 * Install the current openvswitch package from precise (1.4.0) on two machines, then create tunnel ports between them. Next, attach a VM to each of the OVS bridges and add flow table entries that tag traffic entering the bridge with a VLAN ID and then output it to a tunnel port. When IPv6 neighbor discovery packets from the VM enter this port, ovs-vswitchd will crash.
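
The setup in this test case could be sketched roughly as follows for one of the two hosts. This is a dry run that only builds and prints the commands. The bridge name, port names, OpenFlow port numbers, VLAN ID and peer address are all hypothetical placeholders and are not taken from this report; the real OpenFlow port numbers are assigned by OVS and would need to be looked up (for example with `ovs-ofctl show`).

```python
# Dry-run sketch: build (but do not execute) OVS commands for one host.
# Every default below is a hypothetical placeholder, not a value from
# this bug report.

def testcase_commands(bridge="br-test", vm_port="vnet0", tunnel_port="gre0",
                      remote_ip="192.0.2.2", vlan_id=100,
                      vm_ofport=1, tunnel_ofport=2):
    """Return the commands for one host as a list of argument lists."""
    # Flow: tag traffic arriving from the VM with a VLAN ID, then send it
    # out of the tunnel port.
    flow = "in_port=%d,actions=mod_vlan_vid:%d,output:%d" % (
        vm_ofport, vlan_id, tunnel_ofport)
    return [
        ["ovs-vsctl", "add-br", bridge],
        # GRE tunnel to the peer; key=flow is the option named in [IMPACT].
        ["ovs-vsctl", "add-port", bridge, tunnel_port, "--",
         "set", "interface", tunnel_port, "type=gre",
         "options:remote_ip=" + remote_ip, "options:key=flow"],
        # Attach the VM's interface to the bridge.
        ["ovs-vsctl", "add-port", bridge, vm_port],
        ["ovs-ofctl", "add-flow", bridge, flow],
    ]


if __name__ == "__main__":
    for cmd in testcase_commands():
        print(" ".join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try on a machine without OVS installed; on a real test host the printed lines would be reviewed and run by hand on both machines, with the peer address swapped.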

[Regression Potential]

 * None

[Other Info]
 * Patch: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff_plain;h=11f84dce005dd14a374ab2ef5f8c25bcf8285a36;hp=776cf91b42f631b4929fffe8ddd2aa06b40ea24c
 * Mailing list discussion: http://openvswitch.org/pipermail/dev/2012-May/017249.html
 * Link to patch that exposes this issue in openstack Quantum: https://review.openstack.org/#/c/9416/

>> Original bug report <<

Hello,

I'm the lead of the OpenStack Quantum project, which uses OVS on Ubuntu precise quite heavily.

We have a bug fix for an issue in Quantum that necessarily triggers a bug in OVS 1.4.0 that has been fixed in the stable 1.4.2 release. We want to push our fix, but pushing it before precise has the corresponding OVS bugfix would cause many people to experience the OVS bug, which is actually worse than the bug we've fixed in Quantum.

Can we get precise upgraded to OVS 1.4.2?

If this isn't the right venue to request this, please point me to the right place.

Adam Gandelman (gandelman-a) wrote :

Dan--

Thanks for reporting and helping make Ubuntu better! As per the policies surrounding stable release updates [1], at this point it's not really possible to release a new upstream version of OVS into 12.04. The best bet would be to provide some details of the OVS bug, with links to the upstream bug reports and the commits that fixed the issue, and see about back-porting the fix to Precise via an SRU.

Thanks,
Adam

[1] https://wiki.ubuntu.com/StableReleaseUpdates

Changed in openvswitch (Ubuntu):
status: New → Incomplete
James Page (james-page) on 2012-07-06
Changed in openvswitch (Ubuntu Precise):
status: New → Incomplete
Changed in openvswitch (Ubuntu Quantal):
status: Incomplete → Fix Released
Changed in openvswitch (Ubuntu Precise):
milestone: none → ubuntu-12.04.1
importance: Undecided → Medium
dan wendlandt (danwent) wrote :

Hi Adam, James,

Thanks for the quick response.

The upstream change is here: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commit;h=11f84dce005dd14a374ab2ef5f8c25bcf8285a36

In our testing, VMs frequently send out IPv6 neighbor discovery packets (the v6 equivalent of ARP) during boot. When the OVS Quantum plugin is running in tunnel mode (new default), this causes OVS to crash and wipe all flow state from its configuration.

I can put you in touch with the appropriate OVS developer if you like, as we both work at Nicira.

Adam Gandelman (gandelman-a) wrote :

Dan-

I've applied that commit to our current Precise package and uploaded it to a PPA for testing. Please find 'openvswitch - 1.4.0-1ubuntu1.1' in the PPA detailed @ https://launchpad.net/~gandelman-a/+archive/ppa. The patch mostly applied okay. tests/test-odp.c seems to have evolved a bit since 1.4.0, but the changes from that commit still look okay there (to my eye, at least). Please test that package and see if it resolves your issue. If so, we can propose the update for an SRU.

You can find the applied patch @ http://paste.ubuntu.com/1078602

dan wendlandt (danwent) wrote :

Thanks Adam. I had the developer confirm that the patch looks good.

Adding Aaron Rosen from Nicira to the bug, as he will be the one to grab the new PPA and confirm that it fixes the issue. Thanks again for the quick response.

James Page (james-page) on 2012-07-12
Changed in openvswitch (Ubuntu Precise):
status: Incomplete → In Progress
assignee: nobody → Adam Gandelman (gandelman-a)
Aaron Rosen (arosen) wrote :

Hi Adam,

I just tested out your package and unfortunately that one patch does not resolve the issue. Could you repackage this using the 1.4.2 tag (http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commit;h=b9a30fe2ebb1d300178d6ba7628b3dc40ff1512b)?

The 1.4.2 release only contains bug fixes for 1.4.1 and 1.4.0.

Thanks,

Aaron

Aaron Rosen (arosen) wrote :

Hi Adam,

When investigating this a little more closely, I was able to do the following (applying the patch Dan linked on top of the v1.4.0 tag) and it resolved the issue.

git checkout v1.4.0
git cherry-pick 11f84dce005dd14a374ab2ef5f8c25bcf8285a36

Also, what you applied here http://paste.ubuntu.com/1078602 doesn't seem to exactly match the patch that Dan provided here: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff_plain;h=11f84dce005dd14a374ab2ef5f8c25bcf8285a36;hp=776cf91b42f631b4929fffe8ddd2aa06b40ea24c (though the only difference seems to be an include in dpif-netdev.c. Was this intentional?)

Thanks,

Aaron

dan wendlandt (danwent) wrote :

Hi Aaron, I checked with a member of the OVS team and they confirmed that the additional include header was not strictly necessary for the patch.

Aaron Rosen (arosen) wrote :

Hi Adam,

It seems like this patch wasn't actually applied to the package you provided?

I downloaded openvswitch-datapath-source_1.4.0-1ubuntu1.1_all.deb (http://ppa.launchpad.net/gandelman-a/ppa/ubuntu/pool/main/o/openvswitch/openvswitch-datapath-source_1.4.0-1ubuntu1.1_all.deb), navigated to data.tar.gz->./usr/src/openvswitch-datapath.tar.bz2->/modules/openvswitch-datapath/->openvswitch.tar.gz->openvswitch/lib/odp-util.h and found:

#define ODPUTIL_FLOW_KEY_BYTES 144
not
#define ODPUTIL_FLOW_KEY_BYTES 200

I also confirmed the same thing with the package built with debug symbols.
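
A check like the one above could be scripted roughly as follows. This is only a sketch: it assumes the text of an already extracted lib/odp-util.h is passed in as a string, and it uses the two values quoted in this comment (144 without the patch, 200 with it). The function names are hypothetical.

```python
import re


def flow_key_bytes(header_text):
    """Return the integer value of ODPUTIL_FLOW_KEY_BYTES, or None if absent."""
    match = re.search(
        r"^\s*#\s*define\s+ODPUTIL_FLOW_KEY_BYTES\s+(\d+)",
        header_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def looks_patched(header_text, expected=200):
    """True when the header carries the value reported for the patched tree."""
    return flow_key_bytes(header_text) == expected
```

To use it, read the extracted header into a string (for example with `open(path).read()`) and pass that string to `looks_patched`; a result of `False` with a value of 144 would match what was seen in the 1.4.0-1ubuntu1.1 PPA package.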

Thanks,

Aaron

Adam Gandelman (gandelman-a) wrote :

@Aaron-

Apologies, looks like the patch file got left out of my local VCS before package build. I've rebuilt a new package in the same PPA (now version 1.4.0-1ubuntu1.2) that actually includes the originally intended patch. Thanks

Adam

Aaron Rosen (arosen) wrote :

Hi Adam,

I just tested out the new package and it resolves the issue.

Thanks!

Aaron

Aaron Rosen (arosen) wrote :

[IMPACT]

 * ovs-vswitchd crashes when processing IPv6 neighbor discovery packets with VLAN headers received in a tunnel configured with key=flow or in_key=flow.

 * This bug blocks us from pushing a change to the OpenStack Quantum project, which uses OVS on Ubuntu precise quite heavily.

[TESTCASE]

 * Install the current openvswitch package from precise (1.4.0) on two machines, then create tunnel ports between them. Next, attach a VM to each of the OVS bridges and add flow table entries that tag traffic entering the bridge with a VLAN ID and then output it to a tunnel port. When IPv6 neighbor discovery packets from the VM enter this port, ovs-vswitchd will crash.

[Regression Potential]

 * None

[Other Info]

 * Link to patch that exposes this issue in openstack Quantum https://review.openstack.org/#/c/9416/

Aaron Rosen (arosen) wrote :

[IMPACT]

 * ovs-vswitchd crashes when processing IPv6 neighbor discovery packets with VLAN headers received in a tunnel configured with key=flow or in_key=flow.

 * This bug blocks us from pushing a change to the OpenStack Quantum project, which uses OVS on Ubuntu precise quite heavily.

[TESTCASE]

 * Install the current openvswitch package from precise (1.4.0) on two machines, then create tunnel ports between them. Next, attach a VM to each of the OVS bridges and add flow table entries that tag traffic entering the bridge with a VLAN ID and then output it to a tunnel port. When IPv6 neighbor discovery packets from the VM enter this port, ovs-vswitchd will crash.

[Regression Potential]

 * None

[Other Info]
 * Patch: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff_plain;h=11f84dce005dd14a374ab2ef5f8c25bcf8285a36;hp=776cf91b42f631b4929fffe8ddd2aa06b40ea24c

 * Mailing list discussion: http://openvswitch.org/pipermail/dev/2012-May/017249.html
 * Link to patch that exposes this issue in openstack Quantum: https://review.openstack.org/#/c/9416/

summary: - update to include stable fixes for OVS 1.4
+ [SRU] update to include stable fixes for OVS 1.4
James Page (james-page) wrote :

Uploaded to -proposed for SRU team review.

description: updated

Clint Byrum (clint-fewbar) wrote :

Hello dan, or anyone else affected,

Accepted openvswitch into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/openvswitch/1.4.0-1ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in openvswitch (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
dan wendlandt (danwent) wrote :

Great, thanks Clint! Aaron (subscribed to this bug) will do verification on this. Thanks!

Aaron Rosen (arosen) wrote :

Hi Clint,

I followed the instructions here (https://wiki.ubuntu.com/Testing/EnableProposed) but it's giving me 1.4.0-1ubuntu1 not 1.4.0-1ubuntu1.1 . I guess it hasn't been placed there yet?

Thanks,

Aaron

Aaron Rosen (arosen) wrote :

Hi Clint, Same thing this morning.

Also, if I go here: https://launchpad.net/ubuntu/+source/openvswitch/1.4.0-1ubuntu1.1/+build/3682163 I don't see a package built for openvswitch-datapath-dkms.

Thanks,

Aaron

Adam Gandelman (gandelman-a) wrote :

@Aaron,
 I believe there is a PPC build failure that is blocking it. Do the failures in the build log look like anything obvious to you?

https://launchpadlibrarian.net/111175199/buildlog_ubuntu-precise-powerpc.openvswitch_1.4.0-1ubuntu1.1_FAILEDTOBUILD.txt.gz

Ben Pfaff (blp-nicira) wrote :

> I believe there is a PPC build failure that is blocking it. Do the failures in the build log look like anything obvious to you?

It's obviously the bug fixed here:
http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff;h=f79982ba930364f2d3935196dc13b422ba711ef2

I guess we get to re-apply every bug fix from branch-1.4 one at a time as you guys notice them. Sigh.

On Thursday, July 26, 2012 04:18:27 PM you wrote:
> I guess we get to re-apply every bug fix from branch-1.4 one at a time
> as you guys notice them. Sigh.

The criteria for post-release fixes can be found here:

https://wiki.ubuntu.com/StableReleaseUpdates#When

I think any fixes you have that fit within those bounds can be included. You
would know better than us what it makes sense to apply.

Aaron Rosen (arosen) wrote :

Any idea why this happened? The previous package Adam gave me fixed this bug and built correctly.

Adam Gandelman (gandelman-a) wrote :

Aaron, the package posted earlier was built on a PPA, which only targets i386 and amd64. I've just tested Ben's patch against PPC and proposed a new upload to precise-proposed.

Aaron Rosen (arosen) wrote :

Thanks Adam. Will you let me know when it should be on precise-proposed so I can give it a quick test? It doesn't seem to be there yet.

Clint Byrum (clint-fewbar) wrote :

A PPC failure would not block publishing of other arches into precise-proposed. In fact, openvswitch failed on PPC for the release as well. It's a "port" architecture, so it won't block the release or cause removal if it only FTBFS on PPC.

The package update may have been delayed a bit, but I see openvswitch-datapath-dkms in http://archive.ubuntu.com/ubuntu/dists/precise-proposed/universe/binary-amd64/Packages.gz
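
For anyone wanting to repeat that check, a minimal sketch of looking up a package's version in the text of a Packages index might look like this. It assumes the index has already been downloaded and decompressed into a string; the function names are hypothetical, and only the simple "Field: value" lines of each stanza are read.

```python
def package_versions(packages_text):
    """Map package name -> version from the text of a Packages index."""
    versions = {}
    # Stanzas in a Packages index are separated by blank lines.
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; they are skipped here.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            versions[fields["Package"]] = fields["Version"]
    return versions


def has_version(versions, name, wanted):
    """True when package `name` is listed with exactly the version `wanted`."""
    return versions.get(name) == wanted
```

With the decompressed index in `text`, `has_version(package_versions(text), "openvswitch-datapath-dkms", wanted)` would report whether the expected version string `wanted` has reached the pocket. Exact string comparison is a deliberate simplification; it does not implement Debian version ordering.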

Ben, we'd love to know what bugs have been fixed in the 1.4 branch and bring them into the distro as updates if they meet our usual criteria for updates. We often import whole upstream patch-only releases. We just ask that either all bugs fixed be documented so we can review the changes and test that the fix works, or that you document your own regression avoidance procedures so we can make a judgement call on whether that meets Ubuntu users' needs for stability. If it looks good, we can look at doing a micro release exception for openvswitch.

Aaron Rosen (arosen) on 2012-07-31
tags: added: verification-done
removed: verification-needed
Aaron Rosen (arosen) wrote :

Hi, any idea when this will get pushed to Precise?

Clint Byrum (clint-fewbar) wrote :

This should land in precise-updates very soon. There was a bit of confusion caused by the fix to building on PPC that caused this to fall off the SRU team radar. It should be back on track soon.

Clint Byrum (clint-fewbar) wrote :

Ok, after confirming with the release team, this will land in precise-updates some time on Monday. Thanks everyone for testing and getting this fix in!

Aaron Rosen (arosen) wrote :

Hi Clint, It doesn't seem to have landed yet. Hopefully this will get in today?

Thanks,

Aaron

Aaron Rosen (arosen) wrote :

This still hasn't landed. Any ideas why not?

Changed in openvswitch (Ubuntu Precise):
milestone: ubuntu-12.04.1 → precise-updates
Adam Conrad (adconrad) wrote :

This has been released to precise-updates.

Changed in openvswitch (Ubuntu Precise):
status: Fix Committed → Fix Released