DBIntegrityError when inserting into routerl3agentbindings

Bug #1341765 reported by Eugene Nikanorov
This bug affects 1 person
Affects: neutron
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

Traceback:
 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
 TRACE oslo.messaging.rpc.dispatcher incoming.message))
 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/db/l3_rpc_base.py", line 55, in sync_routers
 TRACE oslo.messaging.rpc.dispatcher l3plugin.auto_schedule_routers(context, host, router_ids)
 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/db/l3_agentschedulers_db.py", line 270, in auto_schedule_routers
 TRACE oslo.messaging.rpc.dispatcher self, context, host, router_ids)
 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/scheduler/l3_agent_scheduler.py", line 114, in auto_schedule_routers
 TRACE oslo.messaging.rpc.dispatcher self.bind_router(context, router_id, l3_agent)
 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/scheduler/l3_agent_scheduler.py", line 156, in bind_router
 TRACE oslo.messaging.rpc.dispatcher 'agent_id': chosen_agent.id})
 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 447, in __exit__
 TRACE oslo.messaging.rpc.dispatcher self.rollback()
 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 58, in __exit__
 TRACE oslo.messaging.rpc.dispatcher compat.reraise(exc_type, exc_value, exc_tb)
 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 444, in __exit__
 TRACE oslo.messaging.rpc.dispatcher self.commit()
 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 354, in commit
 TRACE oslo.messaging.rpc.dispatcher self._prepare_impl()
 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 334, in _prepare_impl
 TRACE oslo.messaging.rpc.dispatcher self.session.flush()
 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 463, in _wrap
 TRACE oslo.messaging.rpc.dispatcher raise exception.DBError(e)
 TRACE oslo.messaging.rpc.dispatcher DBError: (IntegrityError) insert or update on table "routerl3agentbindings" violates foreign key constraint "routerl3agentbindings_router_id_fkey"
 TRACE oslo.messaging.rpc.dispatcher DETAIL: Key (router_id)=(5c70bab4-3176-4a93-ace0-653fdf48cf4c) is not present in table "routers".
 TRACE oslo.messaging.rpc.dispatcher 'INSERT INTO routerl3agentbindings (id, router_id, l3_agent_id) VALUES (%(id)s, %(router_id)s, %(l3_agent_id)s)' {'router_id': u'5c70bab4-3176-4a93-ace0-653fdf48cf4c', 'l3_agent_id': u'9fa2b62d-3a33-40ee-b502-89e615d9a852', 'id': 'e8a59637-2525-4f31-92a4-9622c1d274fb'}

Observed in the gate:
http://logs.openstack.org/44/106744/1/check/check-tempest-dsvm-neutron-pg-2/554a4b8/logs/screen-q-svc.txt.gz?level=TRACE#_2014-07-14_12_43_17_270

Changed in neutron:
assignee: nobody → Prasoon Telang (prasoontelang)
Prasoon Telang (prasoontelang) wrote :

Hi Eugene,

May I know the steps to reproduce this bug? Other than the methods in the trace, I couldn't derive the steps from the logs.

Thanks,
Prasoon

Eugene Nikanorov (enikanorov) wrote :

Hi Prasoon, you need to go to the link I've provided to analyze the failure.
It happened during tempest testing in the gate.
The corresponding review is https://review.openstack.org/#/c/106744; however, it looks totally unrelated to the issue.

Prasoon Telang (prasoontelang) wrote :

Hi,

I analysed the log and here is what I gathered from it.

The router with id 5c70bab4-3176-4a93-ace0-653fdf48cf4c was first created,
then bound via bind_router to l3_agent_id 9fa2b62d-3a33-40ee-b502-89e615d9a852,
then called with index, and later deleted.

After the deletion, bind_router was attempted again with l3_agent_id 9fa2b62d-3a33-40ee-b502-89e615d9a852, and that call threw the error.

I also looked at another router's behaviour in the same log.

For the router id 0371b474-f77c-4fd3-b775-045e68538074, the sequence was:
router creation,
bind_router to l3_agent_id 9fa2b62d-3a33-40ee-b502-89e615d9a85,
then called with index, and later deleted.
But bind_router was not called again after the deletion here.

I am wondering what made bind_router execute twice for the former router.
To be honest, I haven't used routers much, so I couldn't deduce the commands needed to reproduce it. A little guidance would be very helpful.
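
The sequence described above (create, bind, delete, then bind again) can be replayed in isolation to show why the second bind fails. The following is only a minimal sketch under stated assumptions, not neutron's code: it uses an in-memory SQLite database instead of the PostgreSQL backend from the gate job, the two-table schema is heavily simplified, the ON DELETE CASCADE on the binding's foreign key is an assumption, and the helper names make_db, bind_router and replay_sequence are hypothetical.

```python
import sqlite3
import uuid

ROUTER_ID = "5c70bab4-3176-4a93-ace0-653fdf48cf4c"  # router id from the trace
AGENT_ID = "9fa2b62d-3a33-40ee-b502-89e615d9a852"   # l3 agent id from the trace


def make_db():
    """Create a simplified stand-in schema (assumption, not neutron's models)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE routers (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE routerl3agentbindings ("
        " id TEXT PRIMARY KEY,"
        " router_id TEXT REFERENCES routers(id) ON DELETE CASCADE,"
        " l3_agent_id TEXT)"
    )
    return conn


def bind_router(conn, router_id, agent_id):
    """Insert a binding row, mirroring the INSERT shown in the trace."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO routerl3agentbindings (id, router_id, l3_agent_id)"
            " VALUES (?, ?, ?)",
            (str(uuid.uuid4()), router_id, agent_id),
        )


def replay_sequence():
    """Create, bind, delete, bind again; return the error text of the last step."""
    conn = make_db()
    with conn:
        conn.execute("INSERT INTO routers (id) VALUES (?)", (ROUTER_ID,))
    bind_router(conn, ROUTER_ID, AGENT_ID)  # first bind succeeds
    with conn:
        conn.execute("DELETE FROM routers WHERE id = ?", (ROUTER_ID,))
    try:
        bind_router(conn, ROUTER_ID, AGENT_ID)  # router row is gone now
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(replay_sequence())
```

In this sketch the second bind_router call raises an integrity error because the referenced routers row no longer exists, which matches the shape of the failure in the trace. It does not explain why the second bind was issued after the deletion in the gate run.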

Prasoon Telang (prasoontelang) wrote :

I have been trying to trace the log, but I couldn't track down the cause. The only thing I could determine is written in comment #3. To avoid causing further delays, anybody who wants to take up this bug should feel free to take it.

Changed in neutron:
assignee: Prasoon Telang (prasoontelang) → nobody
Changed in neutron:
importance: High → Medium
Changed in neutron:
assignee: nobody → Eugene Nikanorov (enikanorov)
Eugene Nikanorov (enikanorov) wrote :

Moving to incomplete.
This bug hasn't been seen for more than a month now, and some fixes have been made to prevent deadlock.

Changed in neutron:
status: New → Incomplete
assignee: Eugene Nikanorov (enikanorov) → nobody
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired