Cold migration fails

Bug #1708920 reported by Lenny
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Nova cold migration fails intermittently because of a broken connection to the SQL cell database.

Setup: devstack master (pike)
  allinone physical server
  compute physical server
  SR-IOV over Mellanox ConnectX-4 NICs

Scenario:
    Running the tempest cold migration test several times in a row; it fails on the 3rd run.
    #testr run tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration

Issue:
 One of the computes loses its SQL connection to novacell [1].
 Cold migration fails because migration to the same node is not allowed [2].

Logs:
 AllinOne http://52.169.200.208/tmp/cold_migration_bug_20170806/controller/
 Compute http://52.169.200.208/tmp/cold_migration_bug_20170806/compute

[1] novacell Error:
http://52.169.200.208/tmp/cold_migration_bug_20170806/controller/logs/n-cond-cell1.log
http://paste.openstack.org/show/617598/

[2] Compute error
http://52.169.200.208/tmp/cold_migration_bug_20170806/compute/logs/n-cpu.log
http://paste.openstack.org/show/617599/

Tags: cellsv2
Dan Smith (danms) wrote :

The "novacell" error above is actually a failure of a conductor trying to connect to the api database, seemingly because it's not configured. Are you running conductors on your second node? If so, are you missing the api_database config section there?
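
One way to check the point raised above is sketched below. It is a minimal sketch, assuming the conductor reads an INI-style nova configuration file; the function name and the sample values (user, password, host `controller`) are hypothetical and not taken from this report.

```python
import configparser


def has_api_database_connection(conf_text):
    """Return True if conf_text has a non-empty [api_database] connection."""
    # interpolation=None: connection URLs may contain '%' characters.
    # strict=False: tolerate duplicate options, which real configs may have.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    value = parser.get("api_database", "connection", fallback="")
    return bool(value.strip())


# Hypothetical sample configs, for illustration only.
sample_ok = """
[DEFAULT]
debug = True

[api_database]
connection = mysql+pymysql://user:secret@controller/nova_api?charset=utf8
"""
sample_missing = "[DEFAULT]\ndebug = True\n"

print(has_api_database_connection(sample_ok))
print(has_api_database_connection(sample_missing))
```

On a real node the text would be read from whichever configuration file the conductor service is started with; that path depends on the deployment and is not stated in this report.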

Sean Dague (sdague)
Changed in nova:
status: New → Incomplete
tags: added: cellsv2
Lenny (lennyb) wrote :

I am not running a conductor on the second (compute) node.
Migration works when I am not using port=direct, so I believe the configuration is fine.
The issue happens only with port=direct.

Lenny (lennyb) wrote :

Normal port migration works both ways.
Direct port migration works from AllinOne -> Compute.
Direct port migration fails from Compute -> AllinOne.

Allinone (10.224.33.57)
#nova-manage cell_v2 list_cells
http://paste.openstack.org/show/618990/

Compute
#nova-manage cell_v2 list_cells
http://paste.openstack.org/show/618989/

Question: am I supposed to see the Database Connection
 mysql+pymysql://root:****@127.0.0.1/nova_cell0?charset=utf8
on the Compute node, or should it show the AllinOne IP address?
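
To make the question concrete, the following minimal sketch flags whether a database connection URL points at a loopback address, which is only reachable from the host it is evaluated on. The function name is hypothetical. Whether a loopback address is the cause of the failure here is not established in this report.

```python
import ipaddress
from urllib.parse import urlsplit


def connection_host_is_loopback(url):
    """Return True if the host part of a connection URL is a loopback address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); cannot tell without resolving it.
        return False


# URL as shown in the list_cells output quoted above (password masked).
cell0_url = "mysql+pymysql://root:****@127.0.0.1/nova_cell0?charset=utf8"
print(connection_host_is_loopback(cell0_url))
```

The same check applied to a URL using the AllinOne address (10.224.33.57) would report that the host is not loopback.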

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired