XenServer ensure_vlan_bridge in VLAN mode

Bug #794645 reported by Édouard Thuleau
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: Salvatore Orlando

Bug Description

Nova revision
XenServer 5.6

When the XenAPI driver needs to create the VLAN network for the first instance of a project, I get this error:

2011-06-08 18:11:08,448 DEBUG nova.rpc [-] received {u'_context_request_id': u'7X9Z6Q908NBVVSJ-SUXB', u'_context_read_deleted': False, u'args': {u'instance_id': 45, u'request_spec': {u'filter': u'nova.scheduler.host_filter.InstanceTypeFilter', u'instance_type': {u'rxtx_quota': 0, u'deleted_at': None, u'name': u'm1.small', u'deleted': False, u'created_at': None, u'updated_at': None, u'memory_mb': 2048, u'vcpus': 1, u'rxtx_cap': 0, u'swap': 0, u'flavorid': 2, u'id': 5, u'local_gb': 20}}, u'admin_password': None, u'injected_files': None, u'availability_zone': None}, u'_context_is_admin': None, u'_context_timestamp': u'2011-06-08T16:11:08Z', u'_context_user': u'user1', u'method': u'run_instance', u'_context_project': u'simple', u'_context_remote_address': u'10.193.118.30'} from (pid=29408) process_data /usr/lib/pymodules/python2.6/nova/rpc.py:202
2011-06-08 18:11:08,448 DEBUG nova.rpc [-] unpacked context: {'timestamp': u'2011-06-08T16:11:08Z', 'msg_id': None, 'remote_address': u'10.193.118.30', 'project': u'simple', 'is_admin': None, 'user': u'user1', 'request_id': u'7X9Z6Q908NBVVSJ-SUXB', 'read_deleted': False} from (pid=29408) _unpack_context /usr/lib/pymodules/python2.6/nova/rpc.py:445
2011-06-08 18:11:11,291 AUDIT nova.compute.manager [7X9Z6Q908NBVVSJ-SUXB user1 simple] instance 45: starting...
2011-06-08 18:11:11,409 DEBUG nova.rpc [-] Making asynchronous call on network.p-hs22-12 ... from (pid=29408) multicall /usr/lib/pymodules/python2.6/nova/rpc.py:475
2011-06-08 18:11:11,410 DEBUG nova.rpc [-] MSG_ID is 13d76ae28e8b41f1832245ebdecb3ed8 from (pid=29408) multicall /usr/lib/pymodules/python2.6/nova/rpc.py:478
2011-06-08 18:11:11,749 DEBUG nova.xenapi_net [-] ENTERING ensure_vlan_bridge in xenapi net from (pid=29408) ensure_vlan_bridge /usr/lib/pymodules/python2.6/nova/network/xenapi_net.py:40
2011-06-08 18:11:12,187 ERROR nova.exception [-] Uncaught exception
(nova.exception): TRACE: Traceback (most recent call last):
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/exception.py", line 87, in _wrap
(nova.exception): TRACE: return f(*args, **kw)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 250, in run_instance
(nova.exception): TRACE: instance_id)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 531, in setup_compute_network
(nova.exception): TRACE: network_ref['bridge'])
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/xenapi_net.py", line 61, in ensure_vlan_bridge
(nova.exception): TRACE: pifs = session.call_xenapi('PIF.get_all_records_where', expr)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/xenapi_conn.py", line 368, in call_xenapi
(nova.exception): TRACE: return tpool.execute(f, *args)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/eventlet/tpool.py", line 76, in tworker
(nova.exception): TRACE: rv = meth(*args,**kwargs)
(nova.exception): TRACE: File "/usr/local/lib/python2.6/dist-packages/XenAPI.py", line 229, in __call__
(nova.exception): TRACE: return self.__send(self.__name, args)
(nova.exception): TRACE: File "/usr/local/lib/python2.6/dist-packages/XenAPI.py", line 133, in xenapi_request
(nova.exception): TRACE: result = _parse_result(getattr(self, methodname)(*full_params))
(nova.exception): TRACE: File "/usr/local/lib/python2.6/dist-packages/XenAPI.py", line 203, in _parse_result
(nova.exception): TRACE: raise Failure(result['ErrorDescription'])
(nova.exception): TRACE: Failure: ['INTERNAL_ERROR', 'Failure("lexing: empty token")']
(nova.exception): TRACE:
2011-06-08 18:11:12,189 ERROR nova [-] Exception during message handling
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 232, in _process_data
(nova): TRACE: rval = node_func(context=ctxt, **node_args)
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/exception.py", line 93, in _wrap
(nova): TRACE: raise Error(str(e))
(nova): TRACE: Error: ['INTERNAL_ERROR', 'Failure("lexing: empty token")']
(nova): TRACE:

The instance goes to the 'shutdown' state, and a network named 'br100' is created on XenServer, but no NIC or VLAN is associated with it.

After that, I can manually set the NIC and VLAN ID for the network through XenCenter and start new instances in this project. No errors are raised, the instances run, and network connectivity works.

Salvatore Orlando (salvatore-orlando) wrote :

Hi Edouard,

Sorry for the delayed reply.

Looking at the provided stack trace, it seems the xapi backend fails to evaluate the filter expression for PIFs.
Can you confirm that FLAGS.vlan_interface is not empty on your system? (It shouldn't be, anyway.)

Also, you could really help me reproduce the failure by posting a few more details, namely:

- version of XenServer/XCP/OSS Xen
- networking stack (linux bridge/OVS)
- complete set of flags for nova-compute

Thanks,
Salvatore

tags: added: vlan
Changed in nova:
assignee: nobody → Salvatore Orlando (salvatore-orlando)
Salvatore Orlando (salvatore-orlando) wrote :

Apologies again, I did not see that you were using XenServer 5.6.

Édouard Thuleau (ethuleau) wrote :

Hi Salvatore,

I use XenServer 5.6 with nova revision 1158 on Ubuntu 10.04.2 and the Linux networking stack (for the moment).

Here is my XenServer compute config:

--verbose
--logdir=/var/log/nova/
--lock_path=/var/lib/nova/lockfiles/

--sql_connection=mysql://root:nova@p-novamaster/nova
--s3_host=p-novamaster
--rabbit_host=p-novamaster
--cc_host=p-novamaster
--ec2_url=http://p-novamaster:8773/services/Cloud
--auth_driver=nova.auth.dbdriver.DbDriver

--connection_type=xenapi
--xenapi_connection_url=https://p-hs22-13
--xenapi_connection_username=root
--xenapi_connection_password=orange
--xenapi_inject_image=false
--rescue_timeout=86400

--network_driver=nova.network.xenapi_net
--network_manager=nova.network.manager.VlanManager
--routing_source_ip=10.193.175.154
--dhcpbridge_flagfile=/etc/nova/nova.conf
--vlan_interface=xenbr1
--public_interface=xenbr0
--dhcpbridge=/usr/bin/nova-dhcpbridge
--ec2_dmz_host=10.193.175.44
#--fixed_range=172.16.1.0/24
--dmz_cidr=10.193.175.44/32

--ca_path=/var/lib/nova/CA/
--keys_path=/var/lib/nova/keys/
--images_path=/media/data/Nova/images/
--buckets_path=/media/data/Nova/buckets/
--networks_path=/var/lib/nova/networks/
--instances_path=/var/lib/nova/instances/

--image_service=nova.image.glance.GlanceImageService
--glance_api_server=p-novamaster

--default_project=simple
--allow_admin_api=True
--allow_project_net_traffic=False

--vpn_image_id=71
--use_project_ca
--cnt_vpn_clients=5

--iscsi_ip_prefix=10.193.175.167

Salvatore Orlando (salvatore-orlando) wrote :

Hi Edouard,

I haven't yet tried to run nova using your conf, but this flag caught my eye:

--vlan_interface=xenbr1

vlan_interface is supposed to be the physical Ethernet device over which VLANs are created.
For xenapi, this should be a PIF, and therefore eth1 (which is usually associated with xenbr1).

As soon as I have run nova with your configuration, I will be able to provide you with more detailed information.

Édouard Thuleau (ethuleau) wrote :

Hi Salvatore,

Yes, you are right. I must set 'vlan_interface' to eth1.
I tried it, but I still have the same problem.

Salvatore Orlando (salvatore-orlando) wrote :

Hi Edouard,

I apologise, but I've been unable to reproduce the exception so far, even with your configuration.
Basically, the problem appears to be an internal xenapi failure when evaluating the filter expression for the PIF.get_all_records_where operation.

I'm afraid I have to ask you for some more information:

1) XenServer build number
     (you can retrieve it by executing 'cat /etc/xensource-inventory | grep BUILD_NUMBER' at dom0 console)
2) PIFs - are there only ethX PIFs or are there other PIFs as well? Do eth PIFs have the VLAN attribute set to '-1'?
     (it would be great to have xe pif-list output, but I understand there might be privacy concerns).

Thanks,
Salvatore

Édouard Thuleau (ethuleau) wrote :

Hi,

1)
# cat /etc/xensource-inventory | grep BUILD_NUMBER
BUILD_NUMBER='47101p'

2) No privacy problem with giving the PIF output:

xe pif-list
uuid ( RO) : 40dbbf5f-ac4a-9c2d-311b-c16cc42a70f0
                device ( RO): eth0
    currently-attached ( RO): true
                  VLAN ( RO): -1
          network-uuid ( RO): 11ab40a2-6e1f-245c-1325-43c518b4f701

uuid ( RO) : cd6264a5-66e4-513f-0923-19fd75b9255a
                device ( RO): eth1
    currently-attached ( RO): true
                  VLAN ( RO): 100
          network-uuid ( RO): a8476c27-5390-e86c-8191-f7d48867c27f

uuid ( RO) : 9774dced-3d9f-f5b1-d4ec-ec0732c46c1e
                device ( RO): eth1
    currently-attached ( RO): true
                  VLAN ( RO): -1
          network-uuid ( RO): 88a9610a-cda1-1382-ac69-14fc9ce14d78

Édouard Thuleau (ethuleau) wrote :

Oops, just to mention, I opened a question about XenServer and Glance here: https://answers.launchpad.net/nova/+question/161683

Perhaps you can answer it.

Édouard Thuleau (ethuleau) wrote :

In the 'xe pif-list' output, the second entry corresponds to the network of a Nova project (VLAN 100, which I set manually).
If no network is set for any Nova project, the command output is:

<XenServer># xe pif-list
uuid ( RO) : 40dbbf5f-ac4a-9c2d-311b-c16cc42a70f0
                device ( RO): eth0
    currently-attached ( RO): true
                  VLAN ( RO): -1
          network-uuid ( RO): 11ab40a2-6e1f-245c-1325-43c518b4f701

uuid ( RO) : 9774dced-3d9f-f5b1-d4ec-ec0732c46c1e
                device ( RO): eth1
    currently-attached ( RO): true
                  VLAN ( RO): -1
          network-uuid ( RO): 88a9610a-cda1-1382-ac69-14fc9ce14d78

Édouard Thuleau (ethuleau) wrote :

I made a small patch.

If the network doesn't exist, the patch creates one, then gets all PIFs and searches them for the one whose device equals the 'vlan_interface' flag and whose VLAN is '-1'. If one is found, it creates a new PIF with the project VLAN ID and associates it with the new network. If none is found, it raises an exception.
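
The lookup step described here could look roughly like the sketch below. It is not the attached patch: the function name, the exception class, and the sample data are hypothetical, and it assumes PIF records arrive as a dict mapping PIF refs to record dicts (the shape XenAPI's PIF.get_all_records returns). Creating the VLAN PIF on the new network is left out.

```python
class PifNotFound(Exception):
    """Hypothetical exception for a missing untagged PIF."""


def find_untagged_pif(pif_records, vlan_interface):
    """Return the ref of the PIF on vlan_interface that carries no VLAN tag."""
    for pif_ref, record in pif_records.items():
        # int() accepts both -1 and "-1", in case values arrive as strings.
        if record['device'] == vlan_interface and int(record['VLAN']) == -1:
            return pif_ref
    raise PifNotFound('no untagged PIF found for device %r' % vlan_interface)


# Sample records modelled on the `xe pif-list` output in this thread;
# the UUIDs stand in for the opaque refs a real session would return.
sample_pifs = {
    '40dbbf5f-ac4a-9c2d-311b-c16cc42a70f0': {'device': 'eth0', 'VLAN': '-1'},
    'cd6264a5-66e4-513f-0923-19fd75b9255a': {'device': 'eth1', 'VLAN': '100'},
    '9774dced-3d9f-f5b1-d4ec-ec0732c46c1e': {'device': 'eth1', 'VLAN': '-1'},
}

untagged = find_untagged_pif(sample_pifs, 'eth1')
```

With this data, the eth1 entry tagged with VLAN 100 is skipped and the untagged eth1 PIF is returned. Passing a bridge name such as 'xenbr1' raises the exception, since no PIF has that device name.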

Thierry Carrez (ttx)
Changed in nova:
status: New → Incomplete
Salvatore Orlando (salvatore-orlando) wrote :

Hi Edouard,

I looked back at your initial bug report and I saw you found the error against rev 1158.
Looking at the changes between nova revisions, I found that the string used to build the filter had been slightly changed, swapping single quotes and double quotes.
This caused the failure, as the xapi backend only accepts filter tokens enclosed in double quotes.

I think the change was due to coding style consistency. I have reverted the string to the old style, and verified that the error does not occur anymore. The fix will be proposed for merge.
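
As an illustration of the quoting issue, here is a minimal sketch of how such a filter string could be built. The function name is hypothetical and the exact expression used in nova is an assumption; the field names come from the PIF attributes discussed in this thread. The point is that single quotes delimit the Python string while the double quotes are part of the query sent to xapi.

```python
def build_pif_filter(vlan_interface):
    """Build a query string for the untagged PIF on the given device."""
    if not vlan_interface:
        # An empty device name would produce an empty token in the query.
        raise ValueError('vlan_interface must not be empty')
    # Tokens inside the expression are wrapped in double quotes; swapping
    # the quote styles would put single quotes around the tokens instead.
    return 'field "device" = "%s" and field "VLAN" = "-1"' % vlan_interface


expr = build_pif_filter('eth1')
```

The resulting string would then be passed as the second argument to session.call_xenapi('PIF.get_all_records_where', expr), as in the traceback at the top of this report.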

If the change gets rejected for coding style issues, I will propose a fix along the lines of the patch you supplied (btw, thanks for providing it).

Thanks,
Salvatore

Changed in nova:
status: Incomplete → In Progress
Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Low
Thierry Carrez (ttx)
Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → diablo-2
Thierry Carrez (ttx)
Changed in nova:
milestone: diablo-2 → 2011.3
status: Fix Committed → Fix Released