radvd >= 2.0 blocks router update processing

Bug #1398779 reported by Ihar Hrachyshka
This bug affects 2 people
Affects   Status         Importance   Assigned to       Milestone
neutron   Fix Released   High         Ihar Hrachyshka
Juno      Fix Released   High         Ihar Hrachyshka

Bug Description

In radvd 2.0+, the daemonization code was rewritten, switching from libdaemon's daemon_fork() to the Linux daemon() call.

If no logging method (-m option) is passed to radvd, the default logging method (L_STDERR_SYSLOG) is used and daemon() is called with (1, 1) arguments. The first argument means no chdir to the root directory (fine); the second means stderr is not closed and is left open for logging (not fine). As a result, the execute() call that spawns radvd, and expects it to daemonize and return an exit code, never completes: it stays blocked on stderr.
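
The blocking can be reproduced without radvd. The sketch below is an illustration only, not neutron's execute() helper: the names CHILD_TEMPLATE and spawn_and_wait are made up, and the child script merely imitates a daemon that forks and then either keeps or drops the inherited stderr. It assumes a POSIX system where os.fork() is available.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a daemon: the original process forks and exits
# immediately, while the forked process stays alive for `lifetime` seconds.
CHILD_TEMPLATE = """
import os, time
if os.fork() > 0:
    os._exit(0)              # original process exits right away
if {close_stderr}:
    devnull = os.open(os.devnull, os.O_RDWR)
    os.dup2(devnull, 2)      # detach stderr from the inherited pipe
time.sleep({lifetime})
"""


def spawn_and_wait(close_stderr, lifetime=2.0):
    """Spawn the fake daemon and return how long the caller was blocked."""
    code = CHILD_TEMPLATE.format(close_stderr=close_stderr, lifetime=lifetime)
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    # communicate() reads stderr until EOF, and EOF arrives only when every
    # process holding the write end of the pipe has closed it.
    proc.communicate()
    return time.monotonic() - start


if __name__ == "__main__":
    print("stderr kept open: blocked for %.1fs" % spawn_and_wait(False))
    print("stderr closed:    blocked for %.1fs" % spawn_and_wait(True))
```

With close_stderr=False the caller should stay blocked for roughly the lifetime of the forked process, which for a real daemon means indefinitely. With close_stderr=True the caller should return almost as soon as the original process exits.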

The fix is to pass a logging method such as -m syslog to radvd, which makes it close stderr and return.
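
As a rough sketch of what such a command line could look like, the helper below builds the radvd argument list. build_radvd_cmd and the example paths are hypothetical and are not taken from the neutron tree; only the radvd options themselves (-C for the config file, -p for the pid file, -m for the logging method, -l for the log file) are real.

```python
def build_radvd_cmd(conf_path, pid_path, log_method="syslog", log_file=None):
    """Return an argv list for radvd with an explicit logging method.

    Hypothetical helper for illustration. Passing -m explicitly avoids
    radvd's default stderr_syslog method, which keeps stderr open.
    """
    cmd = ["radvd", "-C", conf_path, "-p", pid_path, "-m", log_method]
    if log_file is not None:
        # Only meaningful together with "-m logfile".
        cmd += ["-l", log_file]
    return cmd


if __name__ == "__main__":
    # Example paths are placeholders, not neutron's real state paths.
    print(build_radvd_cmd("/tmp/radvd.conf", "/tmp/radvd.pid"))
```

The same helper covers the logfile variant by passing log_method="logfile" together with a log_file path.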

Tags: ipv6
Changed in neutron:
assignee: nobody → Ihar Hrachyshka (ihar-hrachyshka)
tags: added: ipv6 juno-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/138688

Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → High
milestone: none → kilo-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/138688
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=72d41174765540bb7672b545c336fb7aaad075e8
Submitter: Jenkins
Branch: master

commit 72d41174765540bb7672b545c336fb7aaad075e8
Author: Ihar Hrachyshka <email address hidden>
Date: Wed Dec 3 12:44:57 2014 +0100

    radvd: pass -m syslog to avoid thread lock for radvd 2.0+

    Since radvd 2.0, the daemon does not use daemon_fork() function from
    libdaemon, but instead calls Linux daemon() function directly. It also
    passes (1, 1) arguments when logging method (-m) is either stderr (the
    default) or stderr_syslog. The second argument's value = 1 means that
    stderr is not closed and left there for (some) log messages.

    For neutron, it means that corresponding execute() call that spawns
    radvd and expects the invoked process to close stderr does not ever get
    completed. The current thread that spawned radvd is locked waiting for
    radvd to exit, which does not ever occur unless the process crashes or
    receives a signal.

    Since L3 agent gives exclusive access to updates queue for each router
    to one of processing threads only, it means that the thread that got to
    serve a radvd-powered subnet will not proceed and not update any new
    ports or other changes to the router anymore.

    Passing -m syslog makes radvd 2.0+ close stderr and return to execute()
    caller, proceeding with router update processing. The same arguments
    should work for old (pre 2.0) versions of radvd too, so passing them
    unconditionally.

    We could instead use -m logfile and pass appropriate -l <logfile>
    argument to radvd to make it log to a log file located in router's
    namespace storage path. Though that would be not in line with what
    dnsmasq processes currently do for dhcp agent, where we log all messages
    to syslog, so sticking to syslog for radvd for consistency.

    Change-Id: I131db0639bc46d332ed48faa2bbe68a214264062
    Closes-Bug: #1398779
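
The effect on router update processing described in the commit message can be pictured with a toy model. This is a simplified sketch, not the L3 agent's actual queueing code: ToyRouterWorkers and its methods are invented for illustration, and the only property it keeps is that all updates for one router are handled by a single worker.

```python
import queue
import threading


class ToyRouterWorkers:
    """Toy model: each router's updates are drained by exactly one thread."""

    def __init__(self, handler):
        self._handler = handler
        self._queues = {}
        self._lock = threading.Lock()
        self.done = []  # (router_id, update) pairs, in completion order

    def submit(self, router_id, update):
        with self._lock:
            q = self._queues.get(router_id)
            if q is None:
                # First update for this router: create its queue and worker.
                q = self._queues[router_id] = queue.Queue()
                threading.Thread(
                    target=self._drain, args=(router_id, q), daemon=True
                ).start()
        q.put(update)

    def _drain(self, router_id, q):
        while True:
            update = q.get()
            # If the handler never returns (for example, it waits on a
            # process that keeps stderr open), later updates for this
            # router stay in the queue forever.
            self._handler(router_id, update)
            self.done.append((router_id, update))
```

If the handler blocks on one update for a router, every later update for that router waits behind it, while updates for other routers continue to be processed.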

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/141575

Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/juno)

Reviewed: https://review.openstack.org/141575
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6c490a984af690e732f8b96a18493ac1afef892f
Submitter: Jenkins
Branch: stable/juno

commit 6c490a984af690e732f8b96a18493ac1afef892f
Author: Ihar Hrachyshka <email address hidden>
Date: Wed Dec 3 12:44:57 2014 +0100

    radvd: pass -m syslog to avoid thread lock for radvd 2.0+

    Juno changes:
    - tests: we need to construct RouterInfo to get router namespace name.

    Change-Id: I131db0639bc46d332ed48faa2bbe68a214264062
    Closes-Bug: #1398779
    (cherry picked from commit 72d41174765540bb7672b545c336fb7aaad075e8)

tags: added: in-stable-juno
Alan Pevec (apevec)
tags: removed: in-stable-juno juno-backport-potential
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-1 → 2015.1.0