pacemaker service provider race condition

Bug #1355816 reported by Andrey Epifanov
This bug affects 2 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Committed  High        Bogdan Dobrelya
5.0.x               Fix Committed  High        Bogdan Dobrelya

Bug Description

http://jenkins-product.srt.mirantis.net:8080/view/0_master_swarm/job/master-test-reports/184/

OS - Ubuntu
Mode - HA
Neutron + VLAN

Steps to reproduce:
Deploy OS using FUEL

Previous bug for this was:
https://bugs.launchpad.net/fuel/+bug/1306705

Revision history for this message
Andrey Epifanov (aepifanov) wrote :
no longer affects: fuel
Changed in mos:
milestone: none → 5.1
Revision history for this message
Andrey Epifanov (aepifanov) wrote :

2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/bin/puppet:4
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:91:in `execute'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:137:in `run'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `run'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util.rb:478:in `exit_on_fail'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `run'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application.rb:470:in `plugin_hook'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `run'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:146:in `run_command'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:218:in `main'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:268:in `apply_catalog'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/configurer.rb:192:in `run'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/configurer.rb:124:in `apply_catalog'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util.rb:160:in `benchmark'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/1.8/benchmark.rb:308:in `realtime'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util.rb:161:in `benchmark'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/configurer.rb:125:in `apply_catalog'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:163:in `apply'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/transaction/report.rb:108:in `as_logging_destination'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/util/log.rb:149:in `with_destination'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:164:in `apply'
2014-08-12 07:25:02 ERR (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) /usr/lib/ruby/vendor_ruby/puppet/transaction.rb:108:in `evaluate'...

description: updated
tags: added: library
Changed in fuel:
milestone: none → 5.1
Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

Seeing a similar issue for HA+GRE+Ubuntu on the 425 ISO:

2014-08-12 15:04:41 INFO (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Evaluated in 0.01 seconds
2014-08-12 15:04:41 WARNING (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Skipping because of failed dependencies
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Dependency Service[mysql-service] has failures: true
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Dependency Service[p_rabbitmq-server] has failures: true
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Dependency Service[p_haproxy] has failures: true
2014-08-12 15:04:41 INFO (/Stage[main]/Neutron::Agents::L3/Anchor[neutron-l3-done]) Starting to evaluate the resource
2014-08-12 15:04:41 INFO (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Evaluated in 0.01 seconds
2014-08-12 15:04:41 WARNING (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Skipping because of failed dependencies
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Dependency Service[mysql-service] has failures: true
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Dependency Service[p_rabbitmq-server] has failures: true
2014-08-12 15:04:41 NOTICE (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Dependency Service[p_haproxy] has failures: true
2014-08-12 15:04:41 INFO (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Starting to evaluate the resource

Revision history for this message
Sergey Kolekonov (skolekonov) wrote :

This problem seems to be caused by MySQL:

(/Stage[main]/Galera/Exec[wait-initial-sync]) Failed to call refresh: /usr/bin/mysql -uwsrep_sst -ppassword -Nbe "show status like 'wsrep_local_state_comment'" | /bin/grep -q -e Synced -e Initialized && sleep 10 returned 1 instead of one of [0]

Also on the failed controller:
     p_mysql (ocf::mirantis:mysql-wss): Started (unmanaged) FAILED
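
For readers unfamiliar with the check, the pipeline in the failed Exec succeeds only when the reported wsrep state contains "Synced" or "Initialized". The same test can be sketched in Ruby as below. This is an illustration only: the function name wsrep_ready? and the sample status strings are assumptions, not code from fuel-library.

```ruby
# Hypothetical helper mirroring `grep -q -e Synced -e Initialized`:
# returns true when the status text contains either word (case-sensitive).
def wsrep_ready?(status_output)
  !(status_output =~ /Synced|Initialized/).nil?
end

# Sample inputs (assumed shapes of the `show status` output).
puts wsrep_ready?("wsrep_local_state_comment\tSynced")   # ready
puts wsrep_ready?("wsrep_local_state_comment\tJoined")   # not ready
puts wsrep_ready?("")                                    # no output at all
```

An empty result, for example when the mysql client cannot connect, fails the check in the same way as a non-matching state, which is consistent with the "returned 1" in the log line above.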

Revision history for this message
Nastya Urlapova (aurlapova) wrote :

Guys, please don't track issues about systest except for services cases. The Fuel QA team has a daily duty schedule on CI, so you did double work. Also, this bug description is very poor: we usually investigate the environment before creating an issue, and just a link to the failure isn't enough.

Mike Scherbakov (mihgen)
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
tags: added: ha
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

From logs:
2014-08-12T07:01:47.441656 node-2 ./node-2.test.domain.local/puppet-apply.log:2014-08-12T07:01:47.441656+01:00 err: Could not set 'running' on ensure: undefined method `elements' for nil:NilClass at 49:/etc/puppet/modules/rabbitmq/manifests/service.pp

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

And one more for neutron:
2014-08-12T07:10:59.777465 node-2 ./node-2.test.domain.local/puppet-apply.log:2014-08-12T07:10:59.777465+01:00 err: (/Stage[main]/Neutron::Agents::L3/Service[neutron-l3]) Failed to call refresh: undefined method `attributes' for nil:NilClass

Changed in fuel:
status: New → Confirmed
importance: Undecided → High
Changed in mos:
status: New → Confirmed
importance: Undecided → High
no longer affects: mos
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The root cause is that corosync_service.rb is missing a block_until_ready() call in its dump_cib method. That might result in an empty XML result.
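
A minimal sketch of this kind of race, for illustration only. It is not the actual fuel-library code: FakeCluster, raw_cib, the polling parameters, and both dump_cib_* variants are hypothetical names that stand in for the real provider and cluster.

```ruby
# Hypothetical stand-in for the cluster: the CIB is empty until the
# cluster has been polled more than `ready_after` times.
class FakeCluster
  def initialize(ready_after)
    @ready_after = ready_after
    @polls = 0
  end

  # Counts as one status poll; returns true once the cluster is ready.
  def ready?
    @polls += 1
    @polls > @ready_after
  end

  # Returns the CIB XML, or an empty string while the cluster is not ready.
  def raw_cib
    @polls > @ready_after ? '<cib><configuration/></cib>' : ''
  end
end

# Poll until the cluster reports ready, or give up after `timeout` seconds.
def block_until_ready(cluster, timeout: 5, interval: 0.01)
  deadline = Time.now + timeout
  until cluster.ready?
    raise 'cluster not ready before timeout' if Time.now > deadline
    sleep interval
  end
end

# Racy variant: reads the CIB immediately and may get an empty result.
def dump_cib_racy(cluster)
  cluster.raw_cib
end

# Guarded variant: waits for readiness before reading the CIB.
def dump_cib_guarded(cluster)
  block_until_ready(cluster)
  cluster.raw_cib
end

puts dump_cib_racy(FakeCluster.new(3)).empty?     # the racy read sees no XML
puts dump_cib_guarded(FakeCluster.new(3)).empty?  # the guarded read sees the CIB
```

When the empty result is parsed, lookups on the missing document root return nil, which would match the "undefined method ... for nil:NilClass" errors quoted from the logs.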

Changed in fuel:
status: Confirmed → Triaged
assignee: Fuel Library Team (fuel-library) → Bogdan Dobrelya (bogdando)
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

OK, the problem is a possible race condition in CIB fetching for the pacemaker service provider.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/114221

Changed in fuel:
status: Triaged → In Progress
summary: - deploy_neutron_vlan_ha failed
+ pacemaker service provider race condition
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/114221
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=ab4d09a1855478013789c65f27aba20daf7517cc
Submitter: Jenkins
Branch: master

commit ab4d09a1855478013789c65f27aba20daf7517cc
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Aug 14 15:18:49 2014 +0300

    Fix dump_cib method for corosync service provider

    Corosync_service.rb is missing block_until_ready() call
    in its dump_cib method. That might result in empty xml result

    Closes-bug: #1355816

    Change-Id: I757522b297e0025cee7de3130743e0ef6a6fb77b
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.0)

Fix proposed to branch: stable/5.0
Review: https://review.openstack.org/114915

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.0)

Reviewed: https://review.openstack.org/114915
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=d81b06854125b65292bcae48abce5e4fbf2befb8
Submitter: Jenkins
Branch: stable/5.0

commit d81b06854125b65292bcae48abce5e4fbf2befb8
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Aug 14 15:18:49 2014 +0300

    Fix dump_cib method for corosync service provider

    Corosync_service.rb is missing block_until_ready() call
    in its dump_cib method. That might result in empty xml result.

    Closes-bug: #1355816

    Change-Id: I757522b297e0025cee7de3130743e0ef6a6fb77b
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in fuel:
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed