Deployment failure due to missing iscsi target

Bug #1316350 reported by Ben Nemec
This bug affects 3 people
Affects: tripleo
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Saw this on a CI run for https://review.openstack.org/#/c/90425/ that failed spuriously. The change is not functional, so it could not have broken anything. Looking through the logs, the most likely culprit is this error in the baremetal-deploy-helper log:

2014-05-05 18:59:53.598 4819 ERROR nova.virt.baremetal.deploy_helper [req-0d5b3022-4b5d-46e4-8677-6d0893ca93db None] deployment to node 2 failed
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper Traceback (most recent call last):
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 288, in run
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper deploy(**params)
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 247, in deploy
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper image_path, preserve_ephemeral)
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 209, in work_on_disk
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper if not is_block_device(dev):
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 112, in is_block_device
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper s = os.stat(dev)
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper OSError: [Errno 2] No such file or directory: '/dev/disk/by-path/ip-192.0.2.2:3260-iscsi-iqn-bd9fccef-bc9b-4b6d-976c-35061fd15fbc-lun-1'
2014-05-05 18:59:53.598 4819 TRACE nova.virt.baremetal.deploy_helper

Ben Nemec (bnemec) wrote :

Been seeing this more lately in CI, so bumping priority.

Changed in tripleo:
importance: Medium → High
Derek Higgins (derekh) wrote :

TripleO novabm CI jobs are plagued by errors accessing the iSCSI target, usually appearing in the nova-baremetal-deploy-helper log.

I can reliably reproduce these (below) and some other examples locally and can consistently eliminate the problem by increasing some sleeps in the nova baremetal driver.

On the hp1 rack, failures due to these problems drop from 20% to about 2 or 3%.
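
The tracebacks in this report fail at os.stat(dev), apparently because the /dev/disk/by-path entry for the iSCSI LUN is not there yet when the helper checks it. The workaround described above lengthens fixed sleeps. A different approach, sketched here and not taken from nova, is to poll for the device until a deadline. The name wait_for_block_device and its timeout and interval defaults are assumptions chosen for illustration only.

```python
import os
import stat
import time


def wait_for_block_device(path, timeout=30.0, interval=1.0):
    """Poll until `path` exists and is a block device, or the timeout expires.

    Hypothetical helper: the name and the default timeout/interval are
    illustrative and do not come from the nova baremetal driver.
    Returns True if the device appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # os.stat follows the by-path symlink to the real device node.
            if stat.S_ISBLK(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            # The entry has not been created yet; keep waiting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could run something like this before the existing is_block_device check and fail the deployment only when it returns False, so a fast node does not pay for a long fixed sleep and a slow one gets more time.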

http://logs.openstack.org/88/117788/3/check-tripleo/check-tripleo-novabm-overcloud-f20-nonha/b25c43d/logs/seed_logs/nova-baremetal-deploy-helper.txt.gz
[req-770c02dc-a476-4b39-b25a-c0b4395dc759 None] deployment to node 9 failed
Traceback (most recent call last):
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 288, in run
    deploy(**params)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 247, in deploy
    image_path, preserve_ephemeral)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 209, in work_on_disk
    if not is_block_device(dev):
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 112, in is_block_device
    s = os.stat(dev)
OSError: [Errno 2] No such file or directory: '/dev/disk/by-path/ip-192.0.2.5:3260-iscsi-iqn-53e4c637-084d-4253-828b-8f779edc362d-lun-1'

http://logs.openstack.org/88/117788/3/check-tripleo/check-tripleo-novabm-overcloud-f20-nonha/b25c43d/logs/seed_logs/nova-baremetal-deploy-helper.txt.gz
sudo[27324]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/local/bin/nova-rootwrap /etc/nova/rootwrap.conf dd if=/mnt/state/var/lib/nova/instances/instance-00000003/disk of=/dev/disk/by-path/ip-192.0.2.7:3260-iscsi-iqn-6c0607c1-29ac-415e-98ca-ec9a47030c6e-lun-1-part3 bs=1M oflag=direct
[-] deployment to node 8 failed
Traceback (most recent call last):
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 288, in run
    deploy(**params)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 247, in deploy
    image_path, preserve_ephemeral)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 218, in work_on_disk
    dd(image_path, root_part)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/cmd/baremetal_deploy_helper.py", line 124, in dd
    check_exit_code=[0])
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/utils.py", line 163, in execute
    return processutils.execute(*cmd, **kwargs)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/processutils.py", line 199, in execute
    sanitized_stderr = strutils.mask_password(stderr)
  File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/strutils.py", line 295, in mask_password
    message = six.text_type(message)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 23: ordinal not in range(128)
Sep 11 13:29:33 host-192-168-1-54 nova-baremetal-deploy-helper[3986]: 2014-09-11 13:29:20.814 3986 TRACE nova.virt.baremetal.deploy_helper

The character...
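
The UnicodeDecodeError in the second traceback hides the real dd failure: on Python 2.7, six.text_type is unicode, and calling it on a byte string decodes with the ASCII codec, which rejects byte 0xe2 (the lead byte of a three-byte UTF-8 sequence). The sketch below reproduces the same class of error on Python 3 and shows a tolerant decode. The to_text helper and the sample bytes are made up for illustration; the actual stderr from the failed run is not shown in this report.

```python
def to_text(data, encoding="utf-8"):
    """Return `data` as str, decoding bytes without raising.

    Hypothetical helper, not part of nova or oslo.
    Undecodable bytes are replaced with U+FFFD.
    """
    if isinstance(data, bytes):
        return data.decode(encoding, errors="replace")
    return str(data)


# Made-up stderr bytes containing UTF-8 curly quotes (each starts with 0xe2).
sample = "tool: cannot open \u2018/dev/example\u2019".encode("utf-8")

# A strict ASCII decode fails at the first non-ASCII byte.
try:
    sample.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# The tolerant decode keeps the message readable.
text = to_text(sample)
```

Decoding stderr this way before masking passwords would let the underlying command error reach the log instead of the decode exception.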


Dan Prince (dan-prince) wrote :

Ironic also has a few sleeps that we should probably look at as well:

http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/deploy_utils.py#n68

Derek Higgins (derekh) wrote :
Steven Hardy (shardy) wrote : potentially eol bug

This bug was reported against an old version of TripleO, and may no longer be valid.

Since it was reported before the start of the liberty cycle (and our oldest stable
branch is stable/liberty), I'm marking this incomplete.

Please reopen this (change the status from incomplete) if the bug is still valid
on a current supported (stable/liberty, stable/mitaka or trunk) version of TripleO,
thanks!

Changed in tripleo:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tripleo because there has been no activity for 60 days.]

Changed in tripleo:
status: Incomplete → Expired