Failing to launch VM if RX_TX * network QoS bandwidth is greater than the maximum value of 2147483

Bug #1463184 reported by Raghu Katti on 2015-06-08
This bug affects 1 person
Affects: vmware-nsx
Importance: Undecided
Assigned to: Aaron Rosen

Bug Description

Unable to launch VMs when the net QoS rate (the flavor's RX_TX factor multiplied by the network QoS max bandwidth value) is greater than 2147483.

ERROR Server Error Message: Invalid bandwidth settings. Maximum value is 2147483.

If the default bandwidth is 1 GB/s and the RX_TX factor for a flavor is set to 5, launching an instance fails.

neutron server log :

neutron.openstack.common.rpc.com ERROR Failed to publish message to topic 'notifications.info': [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 579, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 690, in _publish
    publisher = cls(self.conf, self.channel, topic, **kwargs)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 392, in __init__
    super(NotifyPublisher, self).__init__(conf, channel, topic, **kwargs)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 368, in __init__
    **options)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 315, in __init__
    self.reconnect(channel)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 395, in reconnect
    super(NotifyPublisher, self).reconnect(channel)
  File "/usr/lib/python2.7/dist-packages/neutron/openstack/common/rpc/impl_kombu.py", line 323, in reconnect
    routing_key=self.routing_key)
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 82, in __init__
    self.revive(self._channel)
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 216, in revive
    self.declare()
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 102, in declare
    self.exchange.declare()
  File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 166, in declare
    nowait=nowait, passive=passive,
  File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 604, in exchange_declare
    self._send_method((40, 10), args)
  File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 62, in _send_method
    self.channel_id, method_sig, a

neutron.plugins.vmware.api_client log :

ERROR Server Error Message: Invalid bandwidth settings. Maximum value is 2147483.

I think the max bandwidth limit cannot exceed 0x7fffffff (2147483647). Because your RX_TX factor configuration for a 1 GB/s network exceeds that limit, the VMware server is throwing the error.

Also, the 2147483 in your log is a truncation of 2147483647.

--

Changed in neutron:
status: New → Opinion
Raghu Katti (rakatti) wrote :

That's interesting. The following is the exact error message from the log:

neutron.plugins.vmware.api_clien ERROR Server Error Message: Invalid bandwidth settings. Maximum value is 2147483.

I tested creating a flavor with an RX_TX value of 2, which puts the max bandwidth at 2 GB/s. Spinning up a VM works, I think because the value is still below 2147483.

Spinning up a VM fails with RX_TX 3 or greater. So I still believe the max value is capped at 2147483.
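
The behaviour described in this comment can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the "1 GB/s" default is stored as 1,000,000 kb/s (the unit for the cap is given as kb in a later comment), and the function names are hypothetical, not part of neutron or the NSX plugin.

```python
# Cap quoted in the NSX error message ("Maximum value is 2147483").
NSX_MAX_BANDWIDTH_KBPS = 2147483

# Assumption: a "1 GB/s" default bandwidth is expressed as 1,000,000 kb/s.
DEFAULT_BANDWIDTH_KBPS = 1000000


def effective_bandwidth_kbps(base_kbps, rxtx_factor):
    """Bandwidth requested for a port: base value scaled by the flavor's RX_TX factor."""
    return int(base_kbps * rxtx_factor)


def exceeds_cap(base_kbps, rxtx_factor, cap_kbps=NSX_MAX_BANDWIDTH_KBPS):
    """True if the scaled bandwidth is above the cap and would be rejected."""
    return effective_bandwidth_kbps(base_kbps, rxtx_factor) > cap_kbps


if __name__ == "__main__":
    for factor in (1, 2, 3, 5):
        requested = effective_bandwidth_kbps(DEFAULT_BANDWIDTH_KBPS, factor)
        verdict = "rejected" if exceeds_cap(DEFAULT_BANDWIDTH_KBPS, factor) else "accepted"
        print("RX_TX %s: %d kb/s -> %s" % (factor, requested, verdict))
```

Under that assumption, factors 1 and 2 stay below the cap while 3 and 5 exceed it, which matches the results reported here. Note also that 2147483647 (0x7fffffff) divided by 1000 and rounded down is 2147483, so both numbers mentioned in this thread may describe the same limit in different units; that relationship is an observation, not something confirmed by the NSX side.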

Triaging

Changed in neutron:
assignee: nobody → Salvatore Orlando (salvatore-orlando)
no longer affects: neutron
Changed in vmware-nsx:
status: New → Incomplete
Aaron Rosen (arosen) on 2015-06-09
Changed in vmware-nsx:
assignee: nobody → Aaron Rosen (arosen)
Aaron Rosen (arosen) wrote :

Right, I just confirmed. In NSX you cannot create a queue with a max value greater than 2147483 kb.

It looks like we have 2 options for handling this:

1) We could just raise an HTTP 400 error telling the user that the max queue size is greater than 2gb. This isn't great because nova-compute is the one that is going to be getting this error, and the rxtx-factor can push you into this error if one configures it to be a multiple that takes the value above the limit.

2) In neutron we can just cap the value to 2gb if the lqueue max is above this value.

The 3rd option isn't really an option, but I figure I'll put it in here anyway. We won't be able to remove the lqueue from the port if the queue max is above 2gb, because that would require us to modify multiple ports at once and there is no way to guarantee that the operation will succeed.

Do you guys have a preference between 1 & 2 or see another behavior that you would like?
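
The difference between options 1 and 2 can be sketched as follows. This is a minimal illustration, not the plugin's code: `InvalidBandwidth`, `validate_queue_max`, and `cap_queue_max` are hypothetical names, and the cap is assumed to be 2147483 kb as stated above.

```python
# Assumed cap on the lqueue max value, in kb, as reported by NSX.
NSX_MAX_QUEUE_KB = 2147483


class InvalidBandwidth(ValueError):
    """Hypothetical error; a real plugin would translate this to an HTTP 400."""


def validate_queue_max(max_kb, cap_kb=NSX_MAX_QUEUE_KB):
    """Option 1: reject any value above the cap so the caller sees an error."""
    if max_kb > cap_kb:
        raise InvalidBandwidth(
            "Invalid bandwidth settings. Maximum value is %d." % cap_kb)
    return max_kb


def cap_queue_max(max_kb, cap_kb=NSX_MAX_QUEUE_KB):
    """Option 2: silently clamp the value to the cap before creating the queue."""
    return min(max_kb, cap_kb)


if __name__ == "__main__":
    requested = 5000000  # e.g. 1,000,000 kb scaled by an rxtx-factor of 5
    print("option 2 would create a queue with max %d" % cap_queue_max(requested))
    try:
        validate_queue_max(requested)
    except InvalidBandwidth as exc:
        print("option 1 would fail the request: %s" % exc)
```

Option 1 surfaces the problem but fails the VM launch, while option 2 lets the launch succeed with less bandwidth than the flavor asked for.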

Raghu Katti (rakatti) wrote :

Is the 2 GB/s limit due to NSX or a limitation of the underlying OVS?

Aaron Rosen (arosen) wrote :

I'm not sure offhand. The NSX API is what is limiting the value to 2 gb/s. I'll ask the OVS guys and find out. I think NSX on Linux leverages tc, so perhaps there is some limit in that.

Aaron Rosen (arosen) wrote :

I asked the OVS guys and they say there should be no limit in there. I think it's just that the NSX API limits it to 2gbps right now :(

Raghu Katti (rakatti) wrote :

Thanks for the options. Can we make a feature request to remove the cap of 2 GB/s?

Aaron Rosen (arosen) wrote :

I did also file a bug internally against this to see if they can raise the 2gbps limit.
