Multiple nova-volume services fail to create volumes on the second storage server when using the Nexenta driver

Bug #1039763 reported by Aimon Bustardo
Affects          Status     Importance  Assigned to  Milestone
Cinder           Invalid    Undecided   Unassigned
cinder (Ubuntu)  Won't Fix  High        Unassigned

Bug Description

OS: Ubuntu 12.04
Arch: amd64
Nova Version: 2012.1+stable~20120612-3ee026e-0ubuntu1.2
Storage Driver: Nexenta

Using a single Nexenta server with one nova-volume service works fine. When I add a second nova-volume service pointing to a second Nexenta server, volume creation fails whenever the request lands on the second server. Volumes on the first server still work.

(Both services use the same RabbitMQ settings and talk to the API without problems.)

scheduler conf:
--scheduler_driver=nova.scheduler.multi.MultiScheduler
--volume_scheduler_driver=nova.scheduler.chance.ChanceScheduler
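
For reference, the chance scheduler simply picks a random host from the registered, enabled volume services, so it can only spread requests across as many distinct hosts as appear in the services table. Roughly, and as an illustration only (this is not nova's actual code; the host names are made up):

import random

def pick_volume_host(service_rows):
    """Pick a host for a create_volume request the way a chance-style
    scheduler does: choose at random among distinct, enabled volume hosts."""
    hosts = sorted({row["host"] for row in service_rows
                    if row["topic"] == "volume" and not row["disabled"]})
    if not hosts:
        raise RuntimeError("no enabled volume services registered")
    return random.choice(hosts)

# Both nova-volume services below report the same hostname, so there is
# only ever one candidate and every request lands on the same host.
print(pick_volume_host([
    {"host": "storage01", "topic": "volume", "disabled": False},
    {"host": "storage01", "topic": "volume", "disabled": False},
]))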

nova-volume1 conf:

# VOLUME
--volume_name_template=volume-nova-volumes-1-14-2%08x
--volume_group=nova-volume-1-14-2
--quota_gigabytes=5000
# Nexenta Storage Driver
--volume_driver=nexenta.volume.NexentaDriver
--use_local_volumes=False
--nexenta_host=172.16.14.3
--nexenta_volume=nova-volume-1-14-1
--nexenta_user=<SANITIZED>
--nexenta_password=<SANITIZED>
--nexenta_rest_port=2000
--nexenta_rest_protocol=http

nova-volume2 conf:
# VOLUME
--volume_name_template=volume-nova-volumes-1-14-1%08x
--volume_group=nova-volumes-1-14-1
--quota_gigabytes=5000
# Nexenta Storage Driver
--volume_driver=nexenta.volume.NexentaDriver
--use_local_volumes=False
--nexenta_host=10.50.0.254
--nexenta_volume=nova-volume-1-14-1
--nexenta_user=<SANITIZED>
--nexenta_password=<SANITIZED>
--nexenta_rest_port=2000
--nexenta_rest_protocol=http

I start the second service with a second upstart script that points to the second nova-volume conf. There are no errors on startup and authentication succeeds. The error that appears in the second nova-volume service's log:

nova.service: AUDIT: Starting volume node (version 2012.1-LOCALBRANCH:LOCALREVISION)
nova.volume.nexenta.jsonrpc: DEBUG: [req-62f2b7fe-1d8b-4278-b413-5c18ae3e9b44 None None] Sending JSON data: {"object": "volume", "params": ["nova-volume-1-14-1"], "method": "object_exists"} from (pid=87464) __call__ /usr/lib/python2.7/dist-packages/nova/volume/nexenta/jsonrpc.py:64
nova.volume.nexenta.jsonrpc: DEBUG: [req-62f2b7fe-1d8b-4278-b413-5c18ae3e9b44 None None] Got response: {"tg_flash": null, "result": 1, "error": null} from (pid=87464) __call__ /usr/lib/python2.7/dist-packages/nova/volume/nexenta/jsonrpc.py:79
nova.utils: DEBUG: [req-62f2b7fe-1d8b-4278-b413-5c18ae3e9b44 None None] backend <module 'nova.db.sqlalchemy.api' from '/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.pyc'> from (pid=87464) __get_backend /usr/lib/python2.7/dist-packages/nova/utils.py:658
nova.volume.manager: DEBUG: [req-62f2b7fe-1d8b-4278-b413-5c18ae3e9b44 None None] Re-exporting 0 volumes from (pid=87464) init_host /usr/lib/python2.7/dist-packages/nova/volume/manager.py:96
nova.rpc.common: INFO: Connected to AMQP server on 172.16.14.1:5672
nova.service: DEBUG: Creating Consumer connection for Service volume from (pid=87464) start /usr/lib/python2.7/dist-packages/nova/service.py:178
nova.rpc.amqp: DEBUG: received {u'_context_roles': [u'admin'], u'_context_request_id': u'req-14847c59-101c-4c1a-a7e5-a5f7c9936a15', u'_context_read_deleted': u'no', u'args': {u'volume_id': 21, u'snapshot_id': None}, u'_context_auth_token': '<SANITIZED>', u'_context_is_admin': True, u'_context_project_id': u'7f44c421ea134d8e9d33ef28e1ded1ba', u'_context_timestamp': u'2012-08-21T21:19:35.026564', u'_context_user_id': u'8a90a4d3fe9a476c8cbcc43dc6534d4d', u'method': u'create_volume', u'_context_remote_address': u'127.0.0.1'} from (pid=87464) _safe_log /usr/lib/python2.7/dist-packages/nova/rpc/common.py:160
nova.rpc.amqp: DEBUG: [req-14847c59-101c-4c1a-a7e5-a5f7c9936a15 8a90a4d3fe9a476c8cbcc43dc6534d4d 7f44c421ea134d8e9d33ef28e1ded1ba] unpacked context: {'user_id': u'8a90a4d3fe9a476c8cbcc43dc6534d4d', 'roles': [u'admin'], 'timestamp': '2012-08-21T21:19:35.026564', 'auth_token': '<SANITIZED>', 'remote_address': u'127.0.0.1', 'is_admin': True, 'request_id': u'req-14847c59-101c-4c1a-a7e5-a5f7c9936a15', 'project_id': u'7f44c421ea134d8e9d33ef28e1ded1ba', 'read_deleted': u'no'} from (pid=87464) _safe_log /usr/lib/python2.7/dist-packages/nova/rpc/common.py:160
nova.rpc.amqp: ERROR: [req-14847c59-101c-4c1a-a7e5-a5f7c9936a15 8a90a4d3fe9a476c8cbcc43dc6534d4d 7f44c421ea134d8e9d33ef28e1ded1ba] Exception during message handling
TRACE nova.rpc.amqp Traceback (most recent call last):
TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 253, in _process_data
TRACE nova.rpc.amqp rval = node_func(context=ctxt, **node_args)
TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/manager.py", line 106, in create_volume
TRACE nova.rpc.amqp volume_ref = self.db.volume_get(context, volume_id)
TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 948, in volume_get
TRACE nova.rpc.amqp return IMPL.volume_get(context, volume_id)
TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 120, in wrapper
TRACE nova.rpc.amqp return f(*args, **kwargs)
TRACE nova.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 2403, in volume_get
TRACE nova.rpc.amqp raise exception.VolumeNotFound(volume_id=volume_id)
TRACE nova.rpc.amqp VolumeNotFound: Volume 21 could not be found.
TRACE nova.rpc.amqp

---------------------------------------------------

Aimon Bustardo (aimonb)
description: updated
Revision history for this message
Aimon Bustardo (aimonb) wrote :

Note that both nova-volume services are on the same server. From what I can tell it is round-robining between the two services when I create volumes. I do notice that the service table in the database only references one service. Is this to be expected?
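
One way to check what the service table actually holds is to query it directly; a minimal sketch, assuming the default MySQL backend, an Essex-era services schema, and made-up credentials:

from sqlalchemy import create_engine, text

# Adjust the URL to your own nova database; a MySQL driver such as
# MySQLdb/mysqlclient must be installed for this dialect.
engine = create_engine("mysql://nova:NOVA_PASS@127.0.0.1/nova")

with engine.connect() as conn:
    rows = conn.execute(text(
        "SELECT id, host, `binary`, topic, disabled, report_count "
        "FROM services WHERE topic = 'volume'"))
    for row in rows:
        print(row)

# With both nova-volume processes reporting the same hostname, only a
# single row comes back, matching the single nova-volume entry noted above.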

Revision history for this message
Aimon Bustardo (aimonb) wrote :

Note that if I switch the order in which they start, whichever one starts second is always the one that fails volume creation. Both work fine on their own.

Revision history for this message
Aimon Bustardo (aimonb) wrote :

I have now also tried:

--volume_scheduler_driver=nova.scheduler.simple.SimpleScheduler

and it shows exactly the same bad behavior when a request hits the second server. Could this be a LUN number conflict? They both default to LUN 0.

description: updated
Revision history for this message
Aimon Bustardo (aimonb) wrote :

OK, after tracing the Nexenta logs I have found that the second Nexenta server is never contacted when the failure occurs.

Revision history for this message
Aimon Bustardo (aimonb) wrote :

Aha! OK, it requires that the nova-volume services run on separate hosts. I realized this after looking at the service table in the database and seeing only one nova-volume entry. Having them on different hosts is not an option for us. I am going to write a patch that allows overriding the hostname, in essence spoofing that the services are on different hosts. I will keep this ticket updated with details.
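
The single entry follows from how services register themselves: a service record is effectively keyed by (host, binary), and by default every process on a box reports the same hostname, so two nova-volume processes collapse into one row. A small illustrative sketch of that keying (not nova's actual code; the hostname and IP are placeholders):

def register_service(services, host, binary, topic):
    # Insert-or-update keyed by (host, binary), which is effectively how
    # a nova service announces itself on startup.
    services[(host, binary)] = {"host": host, "binary": binary, "topic": topic}

services = {}
# Two nova-volume processes on one box, both reporting the default hostname:
register_service(services, "storage01", "nova-volume", "volume")
register_service(services, "storage01", "nova-volume", "volume")
print(len(services))  # 1 -- the scheduler only ever sees one volume host

# Overriding the reported host (the "spoofing" idea above) keeps the rows distinct:
register_service(services, "172.16.0.2", "nova-volume", "volume")
print(len(services))  # 2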

Monty Taylor (mordred)
affects: openstack-ci → cinder
Revision history for this message
Huang Zhiteng (zhiteng-huang) wrote :

Hi Aimon,

In your case, having two volume services running on the same physical host requires a bit of a hack. Here's what I would do (a sketch of the resulting flag files follows the steps):
1) Assign two IPs to the nova-volume physical host, say 172.16.0.1/24 and 172.16.0.2/24. You may also have to turn on arp_filter via echo 1 > /proc/sys/net/ipv4/conf/ethX/arp_filter, where ethX is the NIC that carries both IPs.
2) Edit the two volume service configuration files and assign a different IP to each one via the 'host' flag. For example, the first volume service can use 'host=172.16.0.1' and the second can use 'host=172.16.0.2'.
3) Start the two services; they should work fine.
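
Applied to the flag files in the bug description, step 2 would look something like this (a sketch only; the addresses are the example values above, and everything else in each file stays as posted):

# appended to nova-volume1 conf
--host=172.16.0.1

# appended to nova-volume2 conf
--host=172.16.0.2

The host flag overrides the hostname each service registers with, so each nova-volume process should get its own row in the services table and its own RPC topic.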

Revision history for this message
shankao (shankao) wrote :

Assigning to the Ubuntu package

affects: ubuntu → cinder (Ubuntu)
tags: added: precise
Changed in cinder:
status: New → Confirmed
Changed in cinder (Ubuntu):
status: New → Confirmed
importance: Undecided → High
tags: added: driver
Changed in cinder:
status: Confirmed → Invalid
Revision history for this message
John Griffith (john-griffith) wrote :

Running multiple volume services on a single host isn't something that's actually supported right now, and it won't be for nova-volume. The hack that Huang mentioned may work for you if you can get the networking squared away.

Chuck Short (zulcss)
Changed in cinder (Ubuntu):
status: Confirmed → Won't Fix