Exception during router rescheduling

Bug #1493754 reported by Eugene Nikanorov
This bug affects 1 person
Affects             Status        Importance  Assigned to       Milestone
Mirantis OpenStack  Fix Released  High        Eugene Nikanorov
7.0.x               Fix Released  High        Eugene Nikanorov
8.0.x               Fix Released  High        Eugene Nikanorov

Bug Description

Fuel 7.0 build #287

The following trace is seen:

 28608 ERROR neutron.db.l3_agentschedulers_db [req-a4af4755-6bf4-4082-bf0f-f5ad12e341ac ] Exception encountered during router rescheduling.
 28608 TRACE neutron.db.l3_agentschedulers_db Traceback (most recent call last):
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/neutron/db/l3_agentschedulers_db.py", line 121, in reschedule_routers_from_down_agents
 28608 TRACE neutron.db.l3_agentschedulers_db self.reschedule_router(context, binding.router_id)
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/neutron/db/l3_agentschedulers_db.py", line 263, in reschedule_router
 28608 TRACE neutron.db.l3_agentschedulers_db self._unbind_router(context, router_id, agent['id'])
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/neutron/db/l3_dvrscheduler_db.py", line 357, in _unbind_router
 28608 TRACE neutron.db.l3_agentschedulers_db self.unbind_snat_servicenode(context, router_id)
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/neutron/db/l3_dvrscheduler_db.py", line 317, in unbind_snat_servicenode
 28608 TRACE neutron.db.l3_agentschedulers_db binding = self.unbind_snat(context, router_id)
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/neutron/db/l3_dvrscheduler_db.py", line 265, in unbind_snat
 28608 TRACE neutron.db.l3_agentschedulers_db binding = query.one()
 28608 TRACE neutron.db.l3_agentschedulers_db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2378, in one
 28608 TRACE neutron.db.l3_agentschedulers_db "Multiple rows were found for one()")
 28608 TRACE neutron.db.l3_agentschedulers_db MultipleResultsFound: Multiple rows were found for one()

User impact: If this condition is hit (multiple bindings for the SNAT router), rescheduling will always fail, which can prevent external access from failing over.
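
The failure mode in the trace can be reproduced in isolation. The following is a minimal sketch using only the standard-library sqlite3 module, not the actual Neutron code: the `snat_bindings` table, the `one()` helper, and both `unbind_snat_*` functions are hypothetical stand-ins. The helper only mimics the contract of SQLAlchemy's `Query.one()` (exactly one row, otherwise raise), and the tolerant variant is one possible way to cope with duplicate bindings, not necessarily what the merged fix does.

```python
import sqlite3


class NoResultFound(Exception):
    """Stand-in for sqlalchemy.orm.exc.NoResultFound (assumption)."""


class MultipleResultsFound(Exception):
    """Stand-in for sqlalchemy.orm.exc.MultipleResultsFound (assumption)."""


def one(rows):
    """Mimic the contract of Query.one(): exactly one row, otherwise raise."""
    if not rows:
        raise NoResultFound("No row was found for one()")
    if len(rows) > 1:
        raise MultipleResultsFound("Multiple rows were found for one()")
    return rows[0]


def make_db():
    """Build an in-memory table holding two SNAT bindings for one router."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snat_bindings (router_id TEXT, l3_agent_id TEXT)")
    conn.executemany(
        "INSERT INTO snat_bindings VALUES (?, ?)",
        [("router-1", "agent-a"), ("router-1", "agent-b")])
    return conn


def _bindings_for(conn, router_id):
    """Broad query: filters by router only, so it may return several rows."""
    return conn.execute(
        "SELECT router_id, l3_agent_id FROM snat_bindings "
        "WHERE router_id = ?", (router_id,)).fetchall()


def unbind_snat_strict(conn, router_id):
    """Fails when the router has more than one binding, as in the trace."""
    binding = one(_bindings_for(conn, router_id))
    conn.execute(
        "DELETE FROM snat_bindings "
        "WHERE router_id = ? AND l3_agent_id = ?", binding)
    return binding


def unbind_snat_tolerant(conn, router_id):
    """Removes every binding for the router instead of assuming one."""
    bindings = _bindings_for(conn, router_id)
    conn.execute(
        "DELETE FROM snat_bindings WHERE router_id = ?", (router_id,))
    return bindings
```

With two bindings for `router-1`, `unbind_snat_strict` raises on every call, which matches the "rescheduling will always fail" impact, while `unbind_snat_tolerant` returns both rows and leaves the table empty.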

Upstream bug: https://bugs.launchpad.net/neutron/+bug/1497980

Changed in mos:
assignee: nobody → Eugene Nikanorov (enikanorov)
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: Eugene Nikanorov <email address hidden>
Review: https://review.fuel-infra.org/11317

Changed in mos:
status: Confirmed → In Progress
Timur Nurlygayanov (tnurlygayanov) wrote :

The approximate steps to reproduce:
1. Deploy HA cloud
2. Create several networks
3. Get list of l3 and dhcp agents
4. Ban some agents and start network rescheduling
5. Check logs
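
For the last step, a small script can do the log check. This is only a sketch: the function name `find_rescheduling_failures` is made up for illustration, and the parsing assumes the ERROR/TRACE line layout shown in the bug description. The log file location is not given in this report, so the function takes any iterable of lines.

```python
import re

# Message text taken from the ERROR line in the bug description.
ERROR_MARKER = "Exception encountered during router rescheduling"

# Matches the final traceback line, e.g.
# "... TRACE <logger> MultipleResultsFound: Multiple rows were found ..."
EXCEPTION_RE = re.compile(r"TRACE \S+ (\w+): .+$")


def find_rescheduling_failures(lines):
    """Return one dict per rescheduling error found in the given log lines.

    Each dict holds the ERROR line and the exception class name parsed from
    the TRACE lines that follow it (None if no such line is seen).
    """
    failures = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if " ERROR " in line and ERROR_MARKER in line:
            current = {"error": line.strip(), "exception": None}
            failures.append(current)
        elif current is not None and " TRACE " in line:
            match = EXCEPTION_RE.search(line)
            if match:
                current["exception"] = match.group(1)
        else:
            # Any other line ends the traceback that belongs to the error.
            current = None
    return failures
```

It can be fed an open file object, for example `find_rescheduling_failures(open(path))`, where `path` points at the neutron server log on a controller. An empty result means the error was not logged.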

tags: added: neutron
description: updated
Eugene Bogdanov (ebogdanov) wrote :

Not a 7.0 release blocker; moving to 7.0 Updates.

Fuel Devops McRobotson (fuel-devops-robot) wrote :

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: Eugene Nikanorov <email address hidden>
Review: https://review.fuel-infra.org/11431

description: updated
tags: added: 70mu1-confirmed
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-7.0/2015.1.0)

Reviewed: https://review.fuel-infra.org/11317
Submitter: mos-infra-ci <>
Branch: openstack-ci/fuel-7.0/2015.1.0

Commit: ad89c8872375f48ab8c55771886373b9f5181705
Author: Eugene Nikanorov <email address hidden>
Date: Thu Sep 10 10:06:20 2015

Avoid using .one() doing broad query.

This breaks router rescheduling.

Change-Id: Ie5fe298d5712246a2d3a849b4176b76324950482
Closes-Bug: #1493754

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-8.0/liberty)

Fix proposed to branch: openstack-ci/fuel-8.0/liberty
Change author: Eugene Nikanorov <email address hidden>
Review: https://review.fuel-infra.org/13311

Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/neutron (openstack-ci/fuel-8.0/liberty)

Change abandoned by Ann Kamyshnikova <email address hidden> on branch: openstack-ci/fuel-8.0/liberty
Review: https://review.fuel-infra.org/13311
Reason: Fixed in stable/liberty with https://review.openstack.org/#/c/233114/

tags: removed: 70mu1-confirmed
tags: added: on-verification
Kristina Berezovskaia (kkuznetsova) wrote :

Verified on
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "301"
  build_id: "301"
  nailgun_sha: "4162b0c15adb425b37608c787944d1983f543aa8"
  python-fuelclient_sha: "486bde57cda1badb68f915f66c61b544108606f3"
  fuel-agent_sha: "50e90af6e3d560e9085ff71d2950cfbcca91af67"
  fuel-nailgun-agent_sha: "d7027952870a35db8dc52f185bb1158cdd3d1ebd"
  astute_sha: "6c5b73f93e24cc781c809db9159927655ced5012"
  fuel-library_sha: "5d50055aeca1dd0dc53b43825dc4c8f7780be9dd"
  fuel-ostf_sha: "2cd967dccd66cfc3a0abd6af9f31e5b4d150a11c"
  fuelmain_sha: "a65d453215edb0284a2e4761be7a156bb5627677"
with updates
vlan + neutron, 3 controller nodes and 2 compute nodes

Steps:
1) Create 50 networks, subnets, routers, boot and delete vm (for scheduling on agents)
2) Ban 2 l3-agents
3) Ban 2 dhcp-agents
4) Check neutron logs

This bug helped to find a logic error, so I also checked that the new code is present on the env with updates.

tags: removed: on-verification
tags: added: on-verification
Kristina Berezovskaia (kkuznetsova) wrote :

Verified on
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "196"
  build_id: "196"
  fuel-nailgun_sha: "8d6ef41f1bef84378a61ee98b46aeb2b925fd10f"
  python-fuelclient_sha: "e685d68c1c0d0fa0491a250f07d9c3a8d0f9608c"
  fuel-agent_sha: "1d98edb0468aa70b9b3a43b8422804e9095e7d9d"
  fuel-nailgun-agent_sha: "b56f832abc18aee9a8c603fd6cc2055c5f4287bc"
  astute_sha: "c8400f51b0b92254da206de55ef89d17fdf35393"
  fuel-library_sha: "33c0fa3aada734dc9e6f315197ce0e4a16f5987c"
  fuel-ostf_sha: "11afd5743a12b1006317d3ca7000d1ede77bdae2"
  fuel-createmirror_sha: "994fed9b1ed889718b61a59733275c08c2dd4c64"
  fuelmenu_sha: "d12061b1aee82f81b3d074de74ea27a6e962a686"
  shotgun_sha: "c377d163519f6d10b69a654019d6086ba5f14edc"
  network-checker_sha: "2c62cd52655ea6456ff6294fd63f18d6ea54fe38"
  fuel-upgrade_sha: "1e894e26d4e1423a9b0d66abd6a79505f4175ff6"
  fuelmain_sha: "22fe551f5525d11a1854fd87dbc8c77fae8fec08"
(neutron+vxlan, 3 controller + 2 compute)

There are no traces or errors in the logs after the steps below.
Steps:
1) Create 50 networks, subnets, routers, boot and delete vm (for scheduling on agents)
2) Ban 2 l3-agents
3) Ban 2 dhcp-agents
4) Check neutron logs

This bug helped to find a logic error, so I also checked that the new code from upstream is present on the 8.0 env.

tags: removed: on-verification
tags: added: on-automation
Ekaterina Shutova (eshutova) wrote :
tags: added: covered-automated-test
removed: on-automation