swift unable to start in gate-tempest-dsvm-neutron-large-ops

Bug #1262906 reported by Joe Gordon
This bug affects 3 people
Affects    Status         Importance   Assigned to   Milestone
devstack   Fix Released   Undecided    Sean Dague
neutron    Won't Fix      Undecided    Unassigned
Joe Gordon (jogo)
Changed in neutron:
status: New → Confirmed
Samuel Merritt (torgomatic) wrote :

Based on that stack trace, it's definitely swift attempting to connect to syslog that's causing the error in screen-s-container.txt.

Joe Gordon (jogo) wrote :

This is only happening in gate-tempest-dsvm-neutron-large-ops, which is why it is filed under neutron.

Anita Kuno (anteaya) wrote :

In this log file, http://logs.openstack.org/70/62770/4/check/gate-tempest-dsvm-neutron-large-ops/d5f6251/console.html#_2013-12-19_22_00_00_881
there is a three-second gap in the log. This indicates that the CPU was not available for 3 seconds.

This job ran on hpcloud-az2 (Linux 3.2.0-57-virtual, devstack-precise-hpcloud-az2-899237, 12/19/2013, x86_64, 4 CPU), which may have been having host issues at the time.

It will be interesting to see the result of running this job on a different host.
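
The gap check described above can be sketched in a few lines of Python. This is only an illustration, not a tool used in the gate: it assumes each console line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp followed by ` | `, and the sample lines are made up. The function name `find_gaps` is hypothetical. Note that a gap only shows that nothing was logged for that long; it does not prove on its own that the CPU was unavailable.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(seconds=3)):
    """Return (previous, current) timestamp pairs separated by >= threshold."""
    gaps = []
    previous = None
    for line in lines:
        # Assumed layout: "<timestamp> | <message>"; other lines are skipped.
        stamp = line.split(" | ", 1)[0]
        try:
            current = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


# Made-up sample lines, for illustration only.
sample = [
    "2013-12-19 22:00:00.881 | step one",
    "2013-12-19 22:00:03.902 | step two",
    "2013-12-19 22:00:04.010 | step three",
]

for start, end in find_gaps(sample):
    print(f"gap of {(end - start).total_seconds():.3f}s after {start}")
```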

Joe Gordon (jogo) wrote :

After further investigation, it looks like rsyslog is restarted during nova-manage db_sync, and for some reason it doesn't restart properly when running gate-tempest-dsvm-neutron-large-ops.

Nov 26 00:45:36 devstack-precise-check-rax-iad-737920 rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="3834" x-info="http://www.rsyslog.com"] exiting on signal 15.

Joe Gordon (jogo) wrote :

Marking as critical because this is a gate issue, and it appears to be related to nova (at least indirectly).

Changed in nova:
importance: Undecided → Critical
Davanum Srinivas (DIMS) (dims-v) wrote :

The setup_host method in [1] seems to stop and start rsyslog, and it is called twice!

[1] https://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/devstack-vm-gate-wrap.sh

James E. Blair (corvus) wrote :

Yes, I noticed that and have a patch up to fix that: https://review.openstack.org/#/c/62983/

However, as all of that happens well before devstack starts (and on every job), I don't believe it's the problem.

Joe Gordon (jogo) wrote :

Jim got a hold of a box having this problem and we were able to do a bit more digging.

The nova-manage db_sync was a red herring: the timestamps in console.html and syslog.txt are out of sync.

We set up devstack to not use rsyslog, but swift is still using it. So for now the solution is to make swift not use syslog when we set SYSLOG=False in localrc.
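
As a rough sketch of that idea (not swift's actual logging code), a Python service could pick its log handler based on a flag and on whether the syslog socket accepts connections. The names `syslog_available` and `build_handler` are hypothetical, and `/dev/log` is assumed to be the local syslog socket path.

```python
import logging
import logging.handlers
import socket


def syslog_available(path="/dev/log"):
    """Return True if something accepts connections on the syslog socket."""
    for socktype in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
        sock = socket.socket(socket.AF_UNIX, socktype)
        try:
            sock.connect(path)
            return True
        except OSError:
            continue
        finally:
            sock.close()
    return False


def build_handler(use_syslog, path="/dev/log"):
    """Use syslog only when asked to and reachable; otherwise log to stderr."""
    if use_syslog and syslog_available(path):
        return logging.handlers.SysLogHandler(address=path)
    return logging.StreamHandler()


# Equivalent of SYSLOG=False: never touch the syslog socket.
logger = logging.getLogger("example")
logger.addHandler(build_handler(use_syslog=False))
logger.warning("logging without syslog")
```

With this shape, a stopped or restarting syslog daemon cannot prevent the service from starting, at the cost of losing centralised logs while it is down.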

It looks like there is possibly a strange interaction between openvswitch and syslog that is making syslog not shut down properly:

http://paste.openstack.org/show/55760/

http://paste.openstack.org/show/55767/

Joe Gordon (jogo)
Changed in nova:
importance: Critical → Undecided
no longer affects: nova
Davanum Srinivas (DIMS) (dims-v) wrote :

Strange. I don't see any hits in the last 12 hours!

OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (master)

Fix proposed to branch: master
Review: https://review.openstack.org/81663

Changed in devstack:
assignee: nobody → Sean Dague (sdague)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (master)

Reviewed: https://review.openstack.org/81663
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=ad7e8c63e6891c59eb4387a01d94838f60370930
Submitter: Jenkins
Branch: master

commit ad7e8c63e6891c59eb4387a01d94838f60370930
Author: Sean Dague <email address hidden>
Date: Wed Mar 19 19:13:20 2014 -0400

    move the rsyslogd restart

    it's not clear why swift start is the place where an rsyslogd start
    is happening, we should really only make this change when we actually
    change a file on disk.

    Also, use rsyslogd's -HUP signal directly instead of the system init
    scripts which are typically doing a stop and start, and apparently
    racing under some circumstances.

    Change-Id: I1b9891313d67b1da2ca2582e532b2536a81f9b25
    Closes-Bug: #1262906
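
The two ideas in this commit message (signal the daemon only when a file on disk actually changed, and send it HUP rather than doing a stop and start) can be illustrated with a short Python sketch. This is not the devstack change itself, which lives in shell scripts; the helper names, the pidfile approach, and any paths passed in are assumptions for illustration.

```python
import os
import signal
from pathlib import Path


def write_if_changed(path, content):
    """Write content to path only if it differs; return True if written."""
    target = Path(path)
    if target.exists() and target.read_text() == content:
        return False
    target.write_text(content)
    return True


def send_hup(pidfile):
    """Send SIGHUP to the process whose PID is stored in pidfile."""
    pid = int(Path(pidfile).read_text().strip())
    os.kill(pid, signal.SIGHUP)


def apply_config(path, content, reload):
    """Call reload() only when the config file actually changed on disk."""
    if write_if_changed(path, content):
        reload()
        return True
    return False
```

A caller might pass something like `lambda: send_hup(pidfile)` as `reload`, where `pidfile` points at the daemon's PID file on that system. Because no stop and start happens, there is no window in which the daemon is down and a client's connection attempt can race with it.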

Changed in devstack:
status: In Progress → Fix Released
Cedric Brandily (cbrandily) wrote :

This bug is > 365 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Changed in neutron:
status: Confirmed → Incomplete
Armando Migliaccio (armando-migliaccio) wrote :

This bug is > 240 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Neutron bug closed due to lack of activity, please feel free to reopen if needed.

Changed in neutron:
status: Incomplete → Won't Fix