Jobs "inventory" and "statuscheck" fail after switch OS upgrade

Bug #1486430 reported by Einar Jørgen Haraldseid
This bug affects 1 person
Affects: Network Administration Visualized
Status: Fix Released
Importance: High
Assigned to: Morten Brekkevold

Bug Description

We did a routine IOS upgrade from IOS Version 15.1(2)SY2 to 15.1(2)SY5 on one of our switches (gsw), and since then the jobs "inventory" and "statuscheck" have stopped working.

The switch is a Cisco C6807-XL with a VS-SUP2T-10G supervisor.

The ipdevpoll.log reports a lot of the following:

grep inventory:

2015-08-19 09:58:09,113 [WARNING plugins.uptime.uptime] [inventory gsw-hostname] Detected possible coldboot at 2015-08-17 19:03:37
2015-08-19 09:58:29,362 [ERROR jobs.jobhandler] [inventory gsw-hostname] Save stage failed with unhandled error
2015-08-19 09:58:29,362 [ERROR jobs.jobhandler] [inventory gsw-hostname] Job 'inventory' for gsw-hostname aborted: Job aborted due to save failure (cause=IntegrityError('duplicate key value violates unique constraint "netboxentity_netboxid_source_index_unique"\nDETAIL: Key (netboxid, source, index)=(111, ENTITY-MIB, -5000) already exists.\n',))

grep statuscheck:

2015-08-19 10:04:36,720 [INFO schedule.netboxjobscheduler] [statuscheck gsw-hostname] statuscheck for gsw-hostname failed in 0:00:02.904420. next run in 0:04:59.999959.
2015-08-19 10:09:41,234 [ERROR jobs.jobhandler] [statuscheck gsw-hostname] Save stage failed with unhandled error
2015-08-19 10:09:41,235 [ERROR jobs.jobhandler] [statuscheck gsw-hostname] Job 'statuscheck' for gsw-hostname aborted: Job aborted due to save failure (cause=IntegrityError('duplicate key value violates unique constraint "netboxentity_netboxid_source_index_unique"\nDETAIL: Key (netboxid, source, index)=(111, ENTITY-MIB, -5000) already exists.\n',))

Morten Brekkevold (mbrekkevold) wrote :

Hi, it would be very helpful if you could provide a full traceback from the logs, so we are able to locate the problematic code.

Changed in nav:
assignee: nobody → Morten Brekkevold (mbrekkevold)
Morten Brekkevold (mbrekkevold) wrote :

I have managed to reproduce the problem without a traceback. Your upgrade/reboot has likely resulted in new indexes being assigned to all the physical entities listed in the ENTITY-MIB::entPhysicalTable (and possibly a changed set of entities as well). The ipdevpoll code that tries to predict and resolve database integrity errors in advance can, however, fail under circumstances it did not foresee.

Likely the whole resolve code would not be necessary if the entire NetboxEntity database update ran inside a single transaction. I have confirmed this on my side, but since I cannot be sure I have replicated your issue 100%, we won't know until you upgrade.
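
To illustrate the single-transaction idea, here is a minimal sketch using Python's built-in sqlite3 module. It is not NAV's actual code: the table layout, the column name idx (standing in for index) and the helper replace_entities are assumptions made for the example. It shows stale rows being deleted and re-indexed rows being inserted as one atomic unit, so the unique constraint on (netboxid, source, idx) never sees old and new rows side by side.

```python
import sqlite3


def replace_entities(conn, netboxid, source, entities):
    """Replace all stored entities for one device and source atomically.

    `entities` maps index -> name. Hypothetical helper, not part of NAV.
    """
    # "with conn" commits if the block succeeds and rolls back on any error,
    # so the delete and the inserts take effect together or not at all.
    with conn:
        conn.execute(
            "DELETE FROM netboxentity WHERE netboxid = ? AND source = ?",
            (netboxid, source),
        )
        conn.executemany(
            "INSERT INTO netboxentity (netboxid, source, idx, name) "
            "VALUES (?, ?, ?, ?)",
            [(netboxid, source, idx, name) for idx, name in entities.items()],
        )


conn = sqlite3.connect(":memory:")
# Assumed schema, loosely modelled on the constraint named in the log.
conn.execute(
    "CREATE TABLE netboxentity ("
    "netboxid INTEGER, source TEXT, idx INTEGER, name TEXT, "
    "UNIQUE (netboxid, source, idx))"
)

# Entities as collected before the upgrade.
replace_entities(conn, 111, "ENTITY-MIB", {1: "chassis", 2: "supervisor"})
# After the upgrade the same entities come back under shifted indexes.
replace_entities(conn, 111, "ENTITY-MIB", {2: "chassis", 3: "supervisor"})

rows = conn.execute(
    "SELECT idx, name FROM netboxentity ORDER BY idx"
).fetchall()
print(rows)
```

Because the old rows are gone before the new ones are written, the shifted indexes cannot collide with leftovers, and a failure partway through leaves the previous state untouched.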

Changed in nav:
status: New → In Progress
importance: Undecided → High
milestone: none → 4.3.1
Morten Brekkevold (mbrekkevold) wrote :
Changed in nav:
status: In Progress → Fix Committed
Changed in nav:
status: Fix Committed → Fix Released
Einar Jørgen Haraldseid (einar-haraldseid) wrote :

I have now upgraded to NAV 4.3.1 and can confirm that this has been fixed. There are no more errors in the log, and the watchdog is back to green.
