[BVT] Volume failed to be attached to instance

Bug #1525138 reported by Tatyanka
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Artem Panchenko

Bug Description

Steps:
1. Create cluster
2. Add 1 node with controller role
3. Add 2 nodes with compute role
4. Run network verification
5. Deploy the cluster
6. Run network verification
7. Run OSTF

Actual:
The OSTF test 'Create volume and attach it to instance' fails with: Time limit exceeded while waiting for volume becoming 'in-use'
In nova.log:
http://paste.openstack.org/show/481596/
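
For reference, the failing OSTF check boils down to a polling loop like the following (a minimal sketch; the client object and names are illustrative, not the actual OSTF code):

import time

def wait_for_volume_status(cinder, volume_id, status='in-use',
                           timeout=180, interval=5):
    # Poll the volume until it reaches the expected status or the deadline
    # passes; the timeout path is what raises the message quoted above.
    deadline = time.time() + timeout
    while time.time() < deadline:
        volume = cinder.volumes.get(volume_id)  # python-cinderclient-style call
        if volume.status == status:
            return volume
        if volume.status == 'error':
            raise RuntimeError('volume %s went into error state' % volume_id)
        time.sleep(interval)
    raise RuntimeError("Time limit exceeded while waiting for volume "
                       "becoming '%s'" % status)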

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "285"
  build_id: "285"
  fuel-nailgun_sha: "eb187d0eed96226ca0d8c4513ffcafc4d09dee62"
  python-fuelclient_sha: "f96659066e522e28b389de3cc685f6f2aacca3da"
  fuel-agent_sha: "2f18b7596bc7da79d2f28c34f42620b2090d8a35"
  fuel-nailgun-agent_sha: "a33a58d378c117c0f509b0e7badc6f0910364154"
  astute_sha: "e8c753d6ce1405df78d032e88c0d5a1c6f3d17ce"
  fuel-library_sha: "15d966e164a8b8f7d94c70863bd1cc4990703ec8"
  fuel-ostf_sha: "632730169e8c01afe7fd5d78a898f00d4646358b"
  fuel-mirror_sha: "31b9df814960ec69b644ca9b689dacec0c7e10a1"
  fuelmenu_sha: "680b720291ff577f4c058cee25f85e563c96312e"
  shotgun_sha: "a0bd06508067935f2ae9be2523ed0d1717b995ce"
  network-checker_sha: "a3534f8885246afb15609c54f91d3b23d599a5b1"
  fuel-upgrade_sha: "1e894e26d4e1423a9b0d66abd6a79505f4175ff6"
  fuelmain_sha: "1577a306c2c9e7bd12f28c0e16cf3652997da004"

job: https://product-ci.infra.mirantis.net/job/8.0.ubuntu.smoke_neutron/267/consoleText

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

According to the logs, something must have failed on the remote side:

2015-12-11T03:43:46.011151+00:00 debug: Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -T iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff -p 10.109.2.1:3260
2015-12-11T03:43:46.058498+00:00 debug: CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -T iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff -p 10.109.2.1:3260" returned: 21 in 0.047s
2015-12-11T03:43:46.058849+00:00 debug: u'sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -T iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff -p 10.109.2.1:3260' failed. Not Retrying.

http://linux.die.net/man/8/iscsiadm:
   21 ISCSI_ERR_NO_OBJS_FOUND - no records/targets/sessions/portals found to execute operation on.
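
The same check can be reproduced by hand on the compute node (a sketch; run as root, with the target and portal taken from the log above):

import subprocess

target = 'iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff'
portal = '10.109.2.1:3260'
# Re-run the command nova issues, without the rootwrap wrapper.
result = subprocess.run(['iscsiadm', '-m', 'node', '-T', target, '-p', portal],
                        capture_output=True, text=True)
# 0 = node record found; 21 = ISCSI_ERR_NO_OBJS_FOUND, exactly as in the log.
print(result.returncode, result.stdout, result.stderr)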

We'll take a closer look at Cinder logs.

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

At the same time, Cinder reports that the volume connection was initialised successfully:

2015-12-11T03:43:45.741911+00:00 debug: Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show
2015-12-11T03:43:45.813492+00:00 debug: CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show" returned: 0 in 0.072s
2015-12-11T03:43:45.814022+00:00 debug: Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show
2015-12-11T03:43:45.886937+00:00 debug: CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show" returned: 0 in 0.073s
2015-12-11T03:43:45.888047+00:00 debug: Set provider_location to: 10.109.2.1:3260,1 iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff 1
2015-12-11T03:43:45.994404+00:00 info: Initialize volume connection completed successfully.
2015-12-11T03:43:46.000943+00:00 debug: sending reply msg_id: 8624c498023048ff9e6a11b753bd4586 size: 146 reply queue: reply_d99b6ae53b5841e5bdf3fde0c5ec7cd5
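
For readers unfamiliar with the format: assuming the tgt driver convention '<portal>,<tag> <target_iqn> <lun>' (an assumption, not confirmed in this thread), the provider_location string above can be unpacked like this:

provider_location = ('10.109.2.1:3260,1 '
                     'iqn.2010-10.org.openstack:volume-4285d50b-6e50-43a7-9626-1a0e843431ff 1')
portal_and_tag, target_iqn, lun = provider_location.split(' ')
portal = portal_and_tag.split(',')[0]
# nova dials this portal when attaching the volume -- 10.109.2.1:3260 here.
print(portal, target_iqn, lun)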

summary: - [BVT] Volume failed to be attached to intsnace
+ [BVT] Volume failed to be attached to instance
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

So I've taken a closer look, and this looks like a problem with the environment configuration.

Cinder node (effectively, a compute node running cinder-volume):

10: br-storage: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 64:28:fb:8b:0e:ea brd ff:ff:ff:ff:ff:ff
    inet 10.109.2.1/24 brd 10.109.2.255 scope global br-storage
       valid_lft forever preferred_lft forever
    inet6 fe80::5c99:bdff:fe68:426/64 scope link
       valid_lft forever preferred_lft forever

root@node-1:~# netstat -nalp | grep 3260
tcp 0 0 0.0.0.0:3260 0.0.0.0:* LISTEN 15345/tgtd
tcp6 0 0 :::3260 :::* LISTEN 15345/tgtd

root@node-1:~# telnet 10.109.2.1 3260
Trying 10.109.2.1...
Connected to 10.109.2.1.
Escape character is '^]'.
^]
telnet>

Everything works fine from the same host.

Nova compute node (the one an instance is running on):

root@node-2:~# telnet 10.109.2.1 3260
Trying 10.109.2.1...
telnet: Unable to connect to remote host: Connection refused

No connectivity from the other host.
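
The same connectivity check, scripted (a sketch; run it from node-1 and node-2 to reproduce the two results above):

import socket

def can_connect(host, port, timeout=3.0):
    # Return True if a TCP connection to host:port succeeds.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(can_connect('10.109.2.1', 3260))  # True on node-1, False on node-2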

Here we can see that this IP address is actually configured on two different hosts:

root@node-3:~# arping 10.109.2.1
ARPING 10.109.2.1
60 bytes from 52:54:00:7b:88:aa (10.109.2.1): index=0 time=203.450 msec
60 bytes from 52:54:00:7b:88:aa (10.109.2.1): index=1 time=1.001 sec
60 bytes from 64:28:fb:8b:0e:ea (10.109.2.1): index=2 time=3.293 usec
60 bytes from 52:54:00:7b:88:aa (10.109.2.1): index=3 time=1.001 sec
60 bytes from 64:28:fb:8b:0e:ea (10.109.2.1): index=4 time=1.001 sec
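
Two distinct MACs answering for one IP is the conflict signature. A quick helper to detect it (a sketch; it assumes the arping variant whose output is shown above):

import re
import subprocess

def responding_macs(ip, count=5):
    # Collect the distinct MAC addresses answering ARP probes for `ip`.
    out = subprocess.run(['arping', '-c', str(count), ip],
                         capture_output=True, text=True).stdout
    return set(re.findall(r'from ([0-9a-fA-F:]{17})', out))

macs = responding_macs('10.109.2.1')
print('duplicate address!' if len(macs) > 1 else 'ok', sorted(macs))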

And node-2 is really trying to connect to the host system, not to another slave:

root@node-2:~# ssh 10.109.2.1
/* *\
  Welcome to the Product server srv44-bud.infra.mirantis.net

  Please note:
    * You have to upload your key to auth.mirantis.com to access here.
    * Key deploying procedure takes up to 1 hour.
    * All actions are logged and tracked.
    * We're NOT granting sudo to anyone except Product DevOps team.

  On any questions please do not hesitate to contact.
  --
  #infra @Slack
  #fuel-infra @Freenode
\* */
root@10.109.2.1's password:

On the host:

(venv-nailgun-tests-2.9)rpodolyaka@srv44-bud:~$ ip a | grep 10.109.2.1
    inet 10.109.2.1/24 brd 10.109.2.255 scope global fuelbr2158

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

For some reason, we set an incorrect range of IP addresses for this cluster, one that overlaps with the IP address used on the host-side storage bridge created by fuel-devops for this particular environment.

I suggest we take a look at recent changes to fuel-qa and fuel-devops.
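
The conflict is easy to confirm (addresses taken from the outputs above):

import ipaddress

host_bridge_ip = ipaddress.ip_address('10.109.2.1')  # fuelbr2158 on the host
storage_net = ipaddress.ip_network('10.109.2.0/24')  # cluster storage network
# True means the cluster's allocation pool can hand out the host's own IP.
print(host_bridge_ip in storage_net)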

Changed in fuel:
assignee: MOS Nova (mos-nova) → Fuel QA Team (fuel-qa)
status: New → Confirmed
tags: added: area-qa
removed: area-mos
description: updated
Changed in fuel:
status: Confirmed → Incomplete
Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → Artem Panchenko (apanchenko-8)
status: Incomplete → In Progress
Changed in fuel:
importance: Critical → High
Revision history for this message
Artem Panchenko (apanchenko-8) wrote :

Lowering priority to 'High' because this issue is floating and appears to affect only a small subset of tests (I think because the multi-rack tests set correct IP ranges, and in the other cases we don't use a gateway on the management/storage networks, so ARP usually doesn't become an issue thanks to the 'bridge-nf-call-arptables' option being enabled on the nodes).
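
For anyone verifying that last point on a node, the knob lives in /proc (a one-liner sketch):

# '1' means bridged ARP traffic is passed through arptables on the node,
# which is what usually masks the duplicate-IP problem.
with open('/proc/sys/net/bridge/bridge-nf-call-arptables') as f:
    print(f.read().strip())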

tags: added: system-tests
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/257050
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=385b7327e8dae2f111217965c895191340ebf1d8
Submitter: Jenkins
Branch: master

commit 385b7327e8dae2f111217965c895191340ebf1d8
Author: Artem Panchenko <email address hidden>
Date: Sun Dec 13 17:41:23 2015 +0200

    Set correct IP ranges in default network config

    Since default network settings are used for new
    environments, it's necessary to configure valid
    IP ranges (which don't include router's IPs) for
    management, storage and private networks.

    Closes-bug: #1525138
    Change-Id: Ia3587e5333d858271aeea5bd646efe6bb6c27460
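
Illustrative only (not the actual fuel-qa patch): excluding the host-side bridge address from an allocation range amounts to this:

import ipaddress

net = ipaddress.ip_network('10.109.2.0/24')
hosts = list(net.hosts())          # 10.109.2.1 .. 10.109.2.254
usable = hosts[1:]                 # skip .1, occupied by the host-side bridge
print(usable[0], '-', usable[-1])  # 10.109.2.2 - 10.109.2.254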

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

Ran the smoke tests manually; the issue does not reproduce there, nor in today's swarm run, so moving to Fix Released.

Changed in fuel:
status: Fix Committed → Fix Released