Restarting nova-network removes ip packet filters

Bug #1000853 reported by Derek Higgins on 2012-05-17
This bug affects 2 people
Affects                     Status   Importance   Assigned to      Milestone
OpenStack Compute (nova)             High         Russell Bryant
  Essex                              Undecided    Unassigned
nova (Ubuntu)                        Undecided    Unassigned
  Precise                            Undecided    Unassigned

Bug Description

Running Essex on RHEL.

On a running nova setup (all in one), FlatDHCP works successfully until I restart nova-network.
Shortly after restarting nova-network, my instances lose their IP addresses.

Before the nova-network restart, iptables reports the following filters (in addition to others):

Chain nova-network-INPUT (1 references)
 pkts bytes target prot opt in out source destination
   34 15516 ACCEPT udp -- demonetbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67
    0 0 ACCEPT tcp -- demonetbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67
    0 0 ACCEPT udp -- demonetbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53
    0 0 ACCEPT tcp -- demonetbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53

After restarting nova-network, these rules are gone:
Chain nova-network-INPUT (1 references)
 pkts bytes target prot opt in out source destination

Because these filter rules are no longer present, the instances can no longer communicate
with dnsmasq to obtain IP addresses.
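
One way to confirm this state is to check a chain listing for the four expected ACCEPT rules. The sketch below is illustrative only: the function name `missing_dnsmasq_rules` is hypothetical (it is not part of nova), and it assumes the listing text uses the same column layout as the iptables output pasted above.

```python
import itertools


def missing_dnsmasq_rules(listing, bridge):
    """Return the (proto, port) pairs that have no ACCEPT rule on `bridge`.

    `listing` is the text of one chain in the column layout shown in this
    report. Hypothetical helper for illustration, not nova code.
    """
    found = set()
    for line in listing.splitlines():
        fields = line.split()
        # A rule line looks like:
        # pkts bytes ACCEPT udp -- demonetbr0 * src dst udp dpt:67
        if len(fields) < 11 or fields[2] != "ACCEPT" or fields[5] != bridge:
            continue
        proto = fields[3]
        for field in fields[9:]:
            if field.startswith("dpt:"):
                found.add((proto, int(field[4:])))
    # DHCP uses port 67 and DNS uses port 53; both protocols are expected.
    expected = set(itertools.product(("udp", "tcp"), (67, 53)))
    return sorted(expected - found)
```

Fed the "before" listing above with bridge `demonetbr0`, this returns an empty list; fed the empty chain seen after the restart, it returns all four protocol/port pairs.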

As a workaround, the only way I can see to get the filters defined again is to
 manually kill dnsmasq
and
 start up a new instance

This works because the filters are added when nova starts dnsmasq (if it isn't already running) in
nova.network.linux_net.restart_dhcp(),
which only happens when we start an instance.

Perhaps the filters could be added on startup?
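
For reference, the four missing rules follow a simple pattern: one ACCEPT rule per protocol (udp, tcp) and per port (67 for DHCP, 53 for DNS) on the bridge interface. The sketch below builds the corresponding iptables rule specifications as strings. The helper name `dnsmasq_accept_rules` and the exact argument order are assumptions for illustration; this is not nova's actual code.

```python
def dnsmasq_accept_rules(bridge):
    """Build iptables rule specs that let DHCP and DNS traffic reach dnsmasq.

    Hypothetical helper: returns strings only and does not touch iptables.
    """
    rules = []
    for port in (67, 53):
        for proto in ("udp", "tcp"):
            rules.append(
                "-i %s -p %s -m %s --dport %d -j ACCEPT"
                % (bridge, proto, proto, port)
            )
    return rules


if __name__ == "__main__":
    for rule in dnsmasq_accept_rules("demonetbr0"):
        print(rule)
```

Each string could be appended to an `iptables -A <chain>` command; which chain to use depends on the deployment.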


Changed in nova:
status: New → Confirmed
importance: Undecided → High

Fix proposed to branch: master
Review: https://review.openstack.org/7784

Changed in nova:
assignee: nobody → Tom Fifield (fifieldt)
status: Confirmed → In Progress
Tom Fifield (fifieldt) on 2012-05-27
Changed in nova:
assignee: Tom Fifield (fifieldt) → nobody
Thierry Carrez (ttx) on 2012-06-07
Changed in nova:
status: In Progress → Confirmed
Changed in nova:
assignee: nobody → Russell Bryant (russellb)

Fix proposed to branch: master
Review: https://review.openstack.org/8552

Changed in nova:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/8552
Committed: http://github.com/openstack/nova/commit/aa1e71d1b313f80f5581b1422e3f3e5719569e50
Submitter: Jenkins
Branch: master

commit aa1e71d1b313f80f5581b1422e3f3e5719569e50
Author: Russell Bryant <email address hidden>
Date: Thu Jun 14 12:34:08 2012 -0400

    Ensure dnsmasq accept rules are preset at startup.

    Fix bug 1000853.

    This bug reported that after restarting nova-network, the dnsmasq ACCEPT
    iptables rules were no longer present, causing instances to lose their
    IP addresses. This patch updates the restart_dhcp() function in the
    linux_net driver to ensure these rules are present even if dnsmasq was
    already running. Before this was only done when first starting dnsmasq.

    Change-Id: Icfbe6177d4c913c3d7755ca40a71752bcdaa4448
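
The behavioural change described in this commit message can be modelled roughly as follows. This is a simplified sketch, not the actual patch: `FakeFirewall`, `restart_dhcp_before`, and `restart_dhcp_after` are hypothetical names, and the real restart_dhcp() does considerably more than shown here.

```python
class FakeFirewall:
    """In-memory stand-in for an iptables rule manager (hypothetical)."""

    def __init__(self):
        self.rules = set()

    def add_rule(self, rule):
        # Adding the same rule twice is harmless because rules is a set.
        self.rules.add(rule)


def restart_dhcp_before(firewall, bridge, dnsmasq_running):
    """Old behaviour: rules are added only when dnsmasq is first started."""
    if dnsmasq_running:
        return "reloaded"
    firewall.add_rule("accept dhcp/dns on %s" % bridge)
    return "started"


def restart_dhcp_after(firewall, bridge, dnsmasq_running):
    """New behaviour: rules are ensured whether or not dnsmasq is running."""
    firewall.add_rule("accept dhcp/dns on %s" % bridge)
    if dnsmasq_running:
        return "reloaded"
    return "started"
```

In this model, a restarted service that finds dnsmasq already running ends up with an empty rule set under the old flow and with the accept rule present under the new flow.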

Changed in nova:
status: In Progress → Fix Committed
Mark McLoughlin (markmc) wrote :

As a little bit of background on this, these dnsmasq rules were originally added to fix bug #844935

The interesting thing to note is that the issue is Fedora/RHEL specific because the default policy there is DROP, where other distros default to ACCEPT
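
To illustrate why the default policy matters, here is a toy model of chain evaluation, assuming (as the comment above states) that unmatched packets fall through to the default policy. The `verdict` function and the packet dictionaries are hypothetical and greatly simplify how netfilter works.

```python
def verdict(rules, default_policy, packet):
    """Return the target of the first matching rule, else the default policy.

    `rules` is a list of (match_function, target) pairs. Toy model only.
    """
    for matches, target in rules:
        if matches(packet):
            return target
    return default_policy


def is_dhcp(packet):
    return packet["proto"] == "udp" and packet["dport"] == 67


dhcp_request = {"proto": "udp", "dport": 67}

# With the accept rule missing, the outcome depends only on the policy.
print(verdict([], "ACCEPT", dhcp_request))
print(verdict([], "DROP", dhcp_request))
# With the accept rule present, the policy no longer matters for DHCP.
print(verdict([(is_dhcp, "ACCEPT")], "DROP", dhcp_request))
```

Under this model, losing the rules is invisible where the policy is ACCEPT and fatal for DHCP where it is DROP, which matches the distribution-specific behaviour described above.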

Reviewed: https://review.openstack.org/8572
Committed: http://github.com/openstack/nova/commit/bc621bca08d51076bd81f15e29e8b89ea946503a
Submitter: Jenkins
Branch: stable/essex

commit bc621bca08d51076bd81f15e29e8b89ea946503a
Author: Russell Bryant <email address hidden>
Date: Thu Jun 14 12:34:08 2012 -0400

    Ensure dnsmasq accept rules are preset at startup.

    Fix bug 1000853.

    This bug reported that after restarting nova-network, the dnsmasq ACCEPT
    iptables rules were no longer present, causing instances to lose their
    IP addresses. This patch updates the restart_dhcp() function in the
    linux_net driver to ensure these rules are present even if dnsmasq was
    already running. Before this was only done when first starting dnsmasq.

    (cherry picked from commit aa1e71d1b313f80f5581b1422e3f3e5719569e50)

    Change-Id: Icda3364d3a61018b912cea7a4c96b2cbcc1fbdd7

Thierry Carrez (ttx) on 2012-07-04
Changed in nova:
milestone: none → folsom-2
status: Fix Committed → Fix Released
Dave Walker (davewalker) on 2012-08-24
Changed in nova (Ubuntu):
status: New → Fix Released
Changed in nova (Ubuntu Precise):
status: New → Confirmed

Please find the attached test log from the Ubuntu Server Team's CI infrastructure. As part of the verification process for this bug, Nova has been deployed and configured across multiple nodes using precise-proposed as an installation source. After successful bring-up and configuration of the cluster, a number of exercises and smoke tests have been invoked to ensure the updated package did not introduce any regressions. A number of test iterations were carried out to catch any possible transient errors.

Please note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the Jenkins links in the comments of the relevant upstream code-review(s):

Trunk review: https://review.openstack.org/8552
Stable review: https://review.openstack.org/8572

As per the provisional Micro Release Exception granted to this package by the Technical Board, we hope this contributes toward verification of this update.

Adam Gandelman (gandelman-a) wrote :

Test coverage log.

tags: added: verification-done
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1

---------------
nova (2012.1.3+stable-20120827-4d2a4afe-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot, fixes FTBFS in -proposed. (LP: #1041120)
  * Resynchronize with stable/essex (4d2a4afe):
    - [5d63601] Inappropriate exception handling on kvm live/block migration
      (LP: #917615)
    - [ae280ca] Deleted floating ips can cause instance delete to fail
      (LP: #1038266)

nova (2012.1.3+stable-20120824-86fb7362-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot. (LP: #1041120)
  * Dropped, superseded by new snapshot:
    - debian/patches/CVE-2012-3447.patch: [d9577ce]
    - debian/patches/CVE-2012-3371.patch: [25f5bd3]
    - debian/patches/CVE-2012-3360+3361.patch: [b0feaff]
  * Resynchronize with stable/essex (86fb7362):
    - [86fb736] Libvirt driver reports incorrect error when volume-detach fails
      (LP: #1029463)
    - [272b98d] nova delete lxc-instance umounts the wrong rootfs (LP: #971621)
    - [09217ab] Block storage connections are NOT restored on system reboot
      (LP: #1036902)
    - [d9577ce] CVE-2012-3361 not fully addressed (LP: #1031311)
    - [e8ef050] pycrypto is unused and the existing code is potentially insecure
      to use (LP: #1033178)
    - [3b4ac31] cannot umount guestfs (LP: #1013689)
    - [f8255f3] qpid_heartbeat setting in ineffective (LP: #1030430)
    - [413c641] Deallocation of fixed IP occurs before security group refresh
      leading to potential security issue in error / race conditions
      (LP: #1021352)
    - [219c5ca] Race condition in network/deallocate_for_instance() leads to
      security issue (LP: #1021340)
    - [f2bc403] cleanup_file_locks does not remove stale sentinel files
      (LP: #1018586)
    - [4c7d671] Deleting Flavor currently in use by instance creates error
      (LP: #994935)
    - [7e88e39] nova testsuite errors on newer versions of python-boto (e.g.
      2.5.2) (LP: #1027984)
    - [80d3026] NoMoreFloatingIps: Zero floating ips available after repeatedly
      creating and destroying instances over time (LP: #1017418)
    - [4d74631] Launching with source groups under load produces lazy load error
      (LP: #1018721)
    - [08e5128] API 'v1.1/{tenant_id}/os-hosts' does not return a list of hosts
      (LP: #1014925)
    - [801b94a] Restarting nova-compute removes ip packet filters (LP: #1027105)
    - [f6d1f55] instance live migration should create virtual_size disk image
      (LP: #977007)
    - [4b89b4f] [nova][volumes] Exceeding volumes, gigabytes and floating_ips
      quotas returns general uninformative HTTP 500 error (LP: #1021373)
    - [6e873bc] [nova][volumes] Exceeding volumes, gigabytes and floating_ips
      quotas returns general uninformative HTTP 500 error (LP: #1021373)
    - [7b215ed] Use default qemu-img cluster size in libvirt connection driver
    - [d3a87a2] Listing flavors with marker set returns 400 (LP: #956096)
    - [cf6a85a] nova-rootwrap hardcodes paths instead of using
      /sbin:/usr/sbin:/usr/bin:/bin (LP: #1013147)
    - [2efc87c] affinity filters don't work if scheduler_hints is None
      (LP: #1007573)
  ...


Changed in nova (Ubuntu Precise):
status: Confirmed → Fix Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Thierry Carrez (ttx) on 2012-09-27
Changed in nova:
milestone: folsom-2 → 2012.2