[BVT] Provisioning of node failed : func do_bootloader failed on attempts to execute Command: blkid -o value -s UUID /dev/mapper/os-swap

Bug #1550227 reported by Tatyanka
Affects: Fuel for OpenStack
Status: Fix Released
Importance: Critical
Assigned to: Dmitry Bilunov

Bug Description

  Deploy ceph HA with RadosGW for objects

    Scenario:

            Create cluster with Neutron
            Add 3 nodes with controller role
            Add 3 nodes with compute and ceph-osd role
            Deploy the cluster
            Check ceph status
            Run OSTF tests
            Check the radosgw daemon is started

Actual:
Provisioning failed:
2016-02-25 18:17:00.721 5920 DEBUG fuel_agent.utils.utils [-] Trying to execute command: chroot /tmp/target grep -v rootfs /proc/mounts
2016-02-25 18:17:00.745 5920 DEBUG fuel_agent.utils.utils [-] Trying to execute command: blkid -o value -s UUID /dev/vda3
2016-02-25 18:17:00.760 5920 DEBUG fuel_agent.utils.utils [-] Trying to execute command: blkid -o value -s UUID /dev/mapper/os-root
2016-02-25 18:17:00.780 5920 DEBUG fuel_agent.utils.utils [-] Trying to execute command: blkid -o value -s UUID /dev/mapper/os-swap
2016-02-25 18:17:00.794 5920 WARNING fuel_agent.utils.utils [-] Failed to execute command: Unexpected error while running command.
Command: blkid -o value -s UUID /dev/mapper/os-swap
Exit code: 2
Stdout: ''
Stderr: ''
2016-02-25 18:17:00.794 5920 ERROR fuel_agent.cmd.agent [-] Unexpected error while running command.
Command: blkid -o value -s UUID /dev/mapper/os-swap
Exit code: 2
Stdout: ''
Stderr: ''
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent Traceback (most recent call last):
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/cmd/agent.py", line 118, in main
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent getattr(mgr, action)()
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/manager.py", line 779, in do_provisioning
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent self.do_bootloader()
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/manager.py", line 662, in do_bootloader
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent check_exit_code=[0])[0].strip()
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/utils/utils.py", line 133, in execute
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent stderr=stderr, cmd=command)
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent ProcessExecutionError: Unexpected error while running command.
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent Command: blkid -o value -s UUID /dev/mapper/os-swap
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent Exit code: 2
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent Stdout: ''
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent Stderr: ''
2016-02-25 18:17:00.794 5920 TRACE fuel_agent.cmd.agent

http://paste.openstack.org/show/488326/

Tatyanka (tatyana-leontovich) wrote :
tags: added: swarm-blocker
Changed in fuel:
status: New → Confirmed
tags: added: area-python
Alexander Gordeev (a-gordeev) wrote :

According to the manual of blkid:

If the specified token was not found, or no (specified) devices could be identified, an exit code of 2 is returned.
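
To illustrate the exit-code behaviour quoted above, here is a minimal sketch of a wrapper that treats exit code 2 as "no UUID found" instead of an unexpected error. It is not fuel-agent's code: the function name `read_uuid` and the injectable `run` parameter are assumptions made for illustration, and only the blkid arguments are taken from the log in this report.

```python
import subprocess


def read_uuid(device, run=subprocess.run):
    """Return the UUID blkid reports for `device`, or None on exit code 2.

    `run` is injectable (hypothetical parameter) so the logic can be
    exercised without a real block device.
    """
    result = run(
        ["blkid", "-o", "value", "-s", "UUID", device],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode == 2:
        # Per the blkid manual: token not found or device not identified.
        return None
    # Any other non-zero exit code is still raised as an error.
    result.check_returncode()
    return result.stdout.strip()
```

A caller would then have to decide what a `None` result means for provisioning; in this bug it meant the freshly created swap could not be identified at all.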

tags: added: tricky
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/285318

Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Dmitry Bilunov (dbilunov)
status: Confirmed → In Progress
Alexander Gordeev (a-gordeev) wrote :

Could this be related to the recent patch that adds multipath packages to IBP? https://review.openstack.org/#/c/277886/

For some reason, multipathd was active during exactly the period when fuel-agent was creating the LV.

syslog: http://paste.openstack.org/show/488363/
partial log of fuel-agent: http://paste.openstack.org/show/488364/

multipathd and LVM don't get along when it's time to sync with udev, right?

Alexander Gordeev (a-gordeev) wrote :

Is it reproducible? How often does it show up on our BVT?

Dmitry Bilunov (dbilunov) wrote :

Actually, this bug is caused by an unlucky UUID.
As far as I can see, any UUID that matches the pattern xxxxxxxx-8f13-xxxx-xxxx-xxxxxxxxxxxx makes blkid fail,
because blkid treats it as a minix filesystem signature.
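
A quick way to check a given UUID against the pattern above is to look at its second field, which Python's `uuid` module exposes as `time_mid`. This is only a sketch; the helper name `collides_with_minix_magic` and the sample UUIDs are made up for illustration.

```python
import uuid

# Value of the second UUID field (time_mid) that this report says
# blkid mistakes for a minix filesystem signature.
COLLIDING_TIME_MID = 0x8F13


def collides_with_minix_magic(uuid_text):
    """Return True if the UUID matches xxxxxxxx-8f13-xxxx-xxxx-xxxxxxxxxxxx."""
    return uuid.UUID(uuid_text).time_mid == COLLIDING_TIME_MID


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the failing node.
    for sample in ("01234567-8f13-4abc-9def-0123456789ab",
                   "01234567-89ab-4abc-9def-0123456789ab"):
        print(sample, collides_with_minix_magic(sample))
```

Since `time_mid` is a 16-bit field, a uniformly random UUID hits this one value with probability 1/65536.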

Dmitry Bilunov (dbilunov) wrote :

Also, the problem cannot be reproduced on swap partitions smaller than 256M + 8k.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/285318
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=10b07592a9a1cc4ac6e1917f629ce818a06ebc6c
Submitter: Jenkins
Branch: master

commit 10b07592a9a1cc4ac6e1917f629ce818a06ebc6c
Author: Dmitry Bilunov <email address hidden>
Date: Fri Feb 26 17:07:37 2016 +0300

    Avoid swapspace uuid's time_mid collision with minix fs magic

    mkswap generates a random UUID by default. The second field
    of that UUID, time_mid, is written at offset 0x410. If the
    randomly generated value happens to place 0x8f13 at this
    location, blkid mistakenly treats the block device as a minix
    filesystem superblock and then fails reading the structure.

    This bug is triggered with probability of 2^-16.

    We can work around this bug by re-running mkfs several times
    until blkid is able to examine the newly-created filesystem.

    Change-Id: I13209faeb4e1d2798b7b0e0e8382e43803998d36
    Closes-Bug: #1550227
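
The workaround described in this commit message can be sketched roughly as the retry loop below. This is an assumption-laden illustration, not the merged change: `make_swap_with_retries`, `make_swap`, `can_probe` and the default of 5 attempts are all hypothetical names and values, with the two callables standing in for "run mkswap" and "check that blkid can read the device".

```python
def make_swap_with_retries(device, make_swap, can_probe, attempts=5):
    """Re-create swap on `device` until it can be probed.

    make_swap(device): formats the device (gets a fresh random UUID).
    can_probe(device): returns True if blkid can examine the result.
    Returns the number of attempts used; raises RuntimeError if every
    attempt produced a device that could not be probed.
    """
    for attempt in range(1, attempts + 1):
        make_swap(device)
        if can_probe(device):
            return attempt
    raise RuntimeError(
        "could not create a probeable swap on %s after %d attempts"
        % (device, attempts)
    )
```

If each attempt collides independently with probability 2^-16, the chance that all k attempts fail for this reason is 2^(-16k), so even a handful of retries makes the failure practically unreachable.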

Changed in fuel:
status: In Progress → Fix Committed
Alexander Gordeev (a-gordeev) wrote :

Looks like we forgot to add '-c /dev/null' to the blkid call in manager.py

https://github.com/openstack/fuel-agent/blob/master/fuel_agent/manager.py#L661

Moving back to Confirmed status.

Changed in fuel:
status: Fix Committed → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/287695

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/287695
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=ab41c3db58def6a17588cf3472ba231822c59ad3
Submitter: Jenkins
Branch: master

commit ab41c3db58def6a17588cf3472ba231822c59ad3
Author: Dmitry Bilunov <email address hidden>
Date: Thu Mar 3 13:28:53 2016 +0300

    Disable caching for blkid calls

    blkid saves block device metadata cache to a file (default location is
    /dev/.blkid.tab). Because of the probabilistic nature of UUID collision
    bug (which causes blkid to misinterpret partition metadata structures)
    we need to make sure that all data read by blkid comes from the actual
    device, not from the cache.

    Passing "-c /dev/null" effectively suppresses the blkid caching.

    Change-Id: I9e740286ffefb780d06437ed8e56b5f73ad6fa97
    Partial-Bug: 1550227
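
For reference, the cache-bypassing invocation described in this commit message could be assembled like this. It is a sketch only: `blkid_uuid_command` and its `use_cache` parameter are hypothetical, and the remaining arguments mirror the blkid call seen in the log of this report.

```python
def blkid_uuid_command(device, use_cache=False):
    """Build the argument list for reading a device's UUID with blkid.

    With use_cache=False (the default here), "-c /dev/null" is passed so
    that blkid reads the device itself instead of its cache file.
    """
    cmd = ["blkid"]
    if not use_cache:
        cmd += ["-c", "/dev/null"]
    cmd += ["-o", "value", "-s", "UUID", device]
    return cmd


if __name__ == "__main__":
    print(" ".join(blkid_uuid_command("/dev/mapper/os-swap")))
```

Building the list in one place makes it harder to miss a call site, which is what happened with the blkid call in manager.py mentioned earlier in this thread.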

Changed in fuel:
status: In Progress → Fix Committed
Vladimir (vushakov)
tags: added: on-verification
Vladimir (vushakov) wrote :

Verified on:
    Fuel 9.0 iso #91

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/301752

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-agent (stable/8.0)

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: stable/8.0
Review: https://review.openstack.org/301752
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
