unrecoverable error using shared routers created with mitaka in rocky

Bug #1802110 reported by Daniel 'f0o' Preussker
This bug affects 1 person
Affects: vmware-nsx
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

We recently updated from Mitaka to Rocky (through Newton, Queens and Pike) and are now noticing that we cannot create routers unless they're exclusive.

All previously created shared routers now fail with this error when we attempt to create a new router interface on them:

Invalid virtualServer IP address: 169.254.169.254, virtualServer IP must be assigned to a Vnic.

Full log entry:

2018-11-07 13:58:44.608 7634 WARNING vmware_nsx.common.utils [req-f6637124-c022-40fb-8a7b-a93954d7799f f21faec0fd864d94a3fdc8ea19a972b4 0626bc0db05d406f977854953fb0c8d1 - 86b9a8d9d1794cbf8ac6ea718eb8a9fc 86b9a8d9d1794cbf8ac6ea718eb8a9fc] Finished retry of vmware_nsx.plugins.nsx_v.vshield.vcns.update_interface for the 17th time after 65.514(s) with args: (edge_id=u'edge-423', vnic={'enableProxyArp': False, 'index': 1, 'enableSendRedirects': True, 'name': 'internal1', 'isConnected': True, 'type': 'internal', 'addressGroups': {'addressGroups': [{'subnetPrefixLength': '24', 'primaryAddress': u'10.48.138.1'}]}, 'portgroupId': u'virtualwire-548', 'mtu': 1500}): RequestBad: Request https://10.135.4.4/api/4.0/edges/edge-423/vnics/1 is Bad, response {"errorCode":14523,"details":"[LoadBalancer] Invalid virtualServer IP address: 169.254.169.254, virtualServer IP must be assigned to a Vnic.","moduleName":"vShield Edge"}

Is there a migration path that brings those shared routers into a state Rocky can use?
We can't even convert shared routers to exclusive, because of a similar issue.

When trying to convert a shared router to an exclusive one we get this error in the logs:
2018-11-07T13:38:23.553418+00:00 ki01 neutron-server - - - 2018-11-07 13:38:23.551 7638 WARNING vmware_nsx.common.utils [req-25401a5a-ec36-40b2-9403-7067f85ec5f9 bdf28a37b5be4e45b0ba3a68878785a1 d0a3e9e1283a48099aa9f9209e8a0dd0 - 86b9a8d9d1794cbf8ac6ea718eb8a9fc 86b9a8d9d1794cbf8ac6ea718eb8a9fc] Finished retry of vmware_nsx.plugins.nsx_v.vshield.vcns.update_interface for the 11th time after 37.167(s) with args: (edge_id=u'edge-465', vnic={'enableProxyArp': False, 'index': 2, 'enableSendRedirects': True, 'name': 'internal2', 'isConnected': True, 'type': 'internal', 'addressGroups': {'addressGroups': [{'subnetPrefixLength': '24', 'primaryAddress': u'192.168.247.4'}]}, 'portgroupId': None, 'mtu': 1500}): RequestBad: Request https://10.135.4.4/api/4.0/edges/edge-465/vnics/2 is Bad, response {"errorCode":10100,"details":"Invalid Vnic configuration for index 2. PortGroup association must be specified for the Connected Vnic.","moduleName":"vShield Edge"}

The database does have the dvportgroup- info, however.
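For triage, the two log entries above can be reduced to the edge ID and the NSX error payload. The following is a minimal sketch, not part of vmware-nsx: the function name `summarize_retry_failure` and the regular expression are hypothetical and inferred only from the two log lines quoted in this report, so other log lines may need a different pattern.

```python
import json
import re

# Pattern inferred from the two "Finished retry of ..." lines in this report.
RETRY_RE = re.compile(
    r"Finished retry of (?P<func>[\w.]+) for the (?P<attempt>\d+)\w{2} time "
    r"after (?P<seconds>[\d.]+)\(s\).*?edge_id=u?'(?P<edge>[^']+)'"
    r".*response (?P<body>\{.*\})\s*$"
)


def summarize_retry_failure(line):
    """Return key fields of a failed-retry log line, or None if it doesn't match."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    body = json.loads(match.group("body"))  # the NSX response is JSON
    return {
        "function": match.group("func"),
        "attempts": int(match.group("attempt")),
        "seconds": float(match.group("seconds")),
        "edge_id": match.group("edge"),
        "error_code": body.get("errorCode"),
        "details": body.get("details"),
    }


# Tail of the first log entry in this report, copied verbatim.
SAMPLE = (
    "Finished retry of vmware_nsx.plugins.nsx_v.vshield.vcns.update_interface "
    "for the 17th time after 65.514(s) with args: (edge_id=u'edge-423', "
    "vnic={'enableProxyArp': False, 'index': 1, 'enableSendRedirects': True, "
    "'name': 'internal1', 'isConnected': True, 'type': 'internal', "
    "'addressGroups': {'addressGroups': [{'subnetPrefixLength': '24', "
    "'primaryAddress': u'10.48.138.1'}]}, 'portgroupId': u'virtualwire-548', "
    "'mtu': 1500}): RequestBad: Request "
    "https://10.135.4.4/api/4.0/edges/edge-423/vnics/1 is Bad, response "
    '{"errorCode":14523,"details":"[LoadBalancer] Invalid virtualServer IP '
    'address: 169.254.169.254, virtualServer IP must be assigned to a Vnic.",'
    '"moduleName":"vShield Edge"}'
)

if __name__ == "__main__":
    print(summarize_retry_failure(SAMPLE))
```

Grouping the results by `error_code` separates the two failure modes in this report (14523 for the virtualServer IP, 10100 for the missing PortGroup).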

Right now we're very much stuck: for critical customers we have to delete the router and recreate it from scratch as exclusive to get it back into a working state.

nsxadmin didn't find any orphaned or missing vnics, edges, or bindings.

Daniel 'f0o' Preussker (dpreussker) wrote :

It turns out that none of the migration scripts backfills neutron_nsx_network_mappings with the dvs_id values of nsxv_tz_network_bindings.

Running a simple UPDATE query fixed the missing PortGroup issue.
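The query itself is not shown in this comment. The sketch below only illustrates the general shape such a backfill could take, using an in-memory SQLite database and a simplified stand-in schema. The two table names come from the comment; every column name (`neutron_id`, `nsx_id`, `network_id`, `dvs_id`), the sample rows, and the helper names are assumptions and must be checked against the real vmware-nsx schema before anything is run on a production database (and only after taking a backup).

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in database; schema and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE neutron_nsx_network_mappings (
            neutron_id TEXT, nsx_id TEXT, dvs_id TEXT);
        CREATE TABLE nsxv_tz_network_bindings (
            network_id TEXT, dvs_id TEXT);
        INSERT INTO neutron_nsx_network_mappings VALUES
            ('net-a', 'pg-a', NULL),     -- missing dvs_id, binding exists
            ('net-b', 'pg-b', 'dvs-2'),  -- already populated
            ('net-c', 'pg-c', NULL);     -- missing dvs_id, no binding
        INSERT INTO nsxv_tz_network_bindings VALUES
            ('net-a', 'dvs-1'),
            ('net-b', 'dvs-2');
        """
    )
    return conn


def backfill_dvs_id(conn):
    """Copy dvs_id into mapping rows that lack one; return the rows changed."""
    cur = conn.execute(
        """
        UPDATE neutron_nsx_network_mappings
           SET dvs_id = (
               SELECT b.dvs_id
                 FROM nsxv_tz_network_bindings AS b
                WHERE b.network_id = neutron_nsx_network_mappings.neutron_id
                  AND b.dvs_id IS NOT NULL
                LIMIT 1)
         WHERE dvs_id IS NULL
           AND EXISTS (
               SELECT 1
                 FROM nsxv_tz_network_bindings AS b
                WHERE b.network_id = neutron_nsx_network_mappings.neutron_id
                  AND b.dvs_id IS NOT NULL)
        """
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    demo = make_demo_db()
    print("rows updated:", backfill_dvs_id(demo))
```

Restricting the update to rows whose `dvs_id` is NULL and that have a matching binding leaves already-populated rows untouched, so running it a second time changes nothing.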

Still investigating the virtualServer issue.

Daniel 'f0o' Preussker (dpreussker) wrote :

Turns out the vnic binding table was missing some entries. I assume it's another db migration issue.

Adding bogus data as the network_id on the entries where the metadata interface is, including all tunnel-ids, solved it.

Now I have a bunch of random bogus data on some of the entries, but that's acceptable.

I'm now able to add interfaces on shared routers again :)
