Network Administration Visualized

Jobs "inventory" and "statuscheck" fail after switch OS upgrade

Bug #1486430 reported by Einar Jørgen Haraldseid on 2015-08-19

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Network Administration Visualized	Fix Released	High	Morten Brekkevold	Network Administration Visualized 4.3.1

Bug Description

We did a routine IOS upgrade from IOS Version 15.1(2)SY2 to 15.1(2)SY5 on one of our switches (gsw), and after that the jobs "inventory" and "statuscheck" have stopped working.

The switch is a Cisco C6807-XL, supervisor VS-SUP2T-10G

The ipdevpoll.log reports a lot of the following:

grep inventory:

2015-08-19 09:58:09,113 [WARNING plugins.uptime.uptime] [inventory gsw-hostname] Detected possible coldboot at 2015-08-17 19:03:37
2015-08-19 09:58:29,362 [ERROR jobs.jobhandler] [inventory gsw-hostname] Save stage failed with unhandled error
2015-08-19 09:58:29,362 [ERROR jobs.jobhandler] [inventory gsw-hostname] Job 'inventory' for gsw-hostname aborted: Job aborted due to save failure (cause=IntegrityError('duplicate key value violates unique constraint "netboxentity_netboxid_source_index_unique"\nDETAIL: Key (netboxid, source, index)=(111, ENTITY-MIB, -5000) already exists.\n',))

grep statuscheck:

2015-08-19 10:04:36,720 [INFO schedule.netboxjobscheduler] [statuscheck gsw-hostname] statuscheck for gsw-hostname failed in 0:00:02.904420. next run in 0:04:59.999959.
2015-08-19 10:09:41,234 [ERROR jobs.jobhandler] [statuscheck gsw-hostname] Save stage failed with unhandled error
2015-08-19 10:09:41,235 [ERROR jobs.jobhandler] [statuscheck gsw-hostname] Job 'statuscheck' for gsw-hostname aborted: Job aborted due to save failure (cause=IntegrityError('duplicate key value violates unique constraint "netboxentity_netboxid_source_index_unique"\nDETAIL: Key (netboxid, source, index)=(111, ENTITY-MIB, -5000) already exists.\n',))

Morten Brekkevold (mbrekkevold) wrote on 2015-08-20:

Hi, it would be very helpful if you could provide a full traceback from the logs, so we are able to locate the problematic code.

Changed in nav:
assignee:	nobody → Morten Brekkevold (mbrekkevold)

Morten Brekkevold (mbrekkevold) wrote on 2015-08-20:

I have managed to reproduce the problem without a traceback. Your upgrade/reboot has likely caused new indexes to be assigned to all the physical entities listed in the ENTITY-MIB::entPhysicalTable (and possibly changed the set of entities as well). The ipdevpoll code that tries to predict and resolve db integrity errors in advance may, however, fail under unforeseen circumstances.
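To illustrate how re-assigned indexes can collide with a unique constraint, here is a minimal sketch using Python's built-in sqlite3 module. It is not NAV code: the table layout, the column name `idx` and the entity names are assumptions made for the example, and the real database is PostgreSQL. It only shows that saving one re-indexed row on its own can hit a stale row that still holds the same index.

```python
import sqlite3

# Hypothetical, simplified stand-in for the netboxentity table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE netboxentity ("
    " id INTEGER PRIMARY KEY, netboxid INTEGER, source TEXT,"
    " idx INTEGER, name TEXT,"
    " UNIQUE (netboxid, source, idx))"
)
conn.executemany(
    "INSERT INTO netboxentity (netboxid, source, idx, name)"
    " VALUES (?, ?, ?, ?)",
    [(111, "ENTITY-MIB", 1, "chassis"), (111, "ENTITY-MIB", 2, "supervisor")],
)

# Assume that after the reboot the device reports the chassis under index 2.
# Saving that single row collides with the stale supervisor row, which
# still occupies index 2.
try:
    conn.execute(
        "UPDATE netboxentity SET idx = 2"
        " WHERE netboxid = 111 AND name = 'chassis'"
    )
    collided = False
except sqlite3.IntegrityError:
    collided = True

# The failed statement changed nothing: the chassis row keeps its old index.
chassis_idx = conn.execute(
    "SELECT idx FROM netboxentity WHERE name = 'chassis'"
).fetchone()[0]
```

Any code that tries to predict such collisions has to get the ordering of updates and deletes right for every possible re-numbering, which is where unforeseen cases can slip through.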

The whole resolve code would likely be unnecessary if the entire NetboxEntity database update ran inside a single transaction. I have confirmed this on my side, but since I cannot be sure I have replicated your issue 100%, we won't know until you upgrade.
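The single-transaction idea can be sketched as follows, again with Python's built-in sqlite3 module rather than NAV's actual code. The function name `replace_entities`, the table layout and the delete-then-insert strategy are assumptions made for the example; the committed fix may work differently. The point is only that when stale rows are removed and new rows written in one transaction, no intermediate state needs conflict resolution, and a failure rolls everything back.

```python
import sqlite3


def replace_entities(conn, netboxid, source, entities):
    """Replace all entities of one (netboxid, source) pair atomically.

    entities maps index -> name, as collected from the device.
    """
    # "with conn" commits on success and rolls back if anything raises.
    with conn:
        conn.execute(
            "DELETE FROM netboxentity WHERE netboxid = ? AND source = ?",
            (netboxid, source),
        )
        conn.executemany(
            "INSERT INTO netboxentity (netboxid, source, idx, name)"
            " VALUES (?, ?, ?, ?)",
            [(netboxid, source, idx, name) for idx, name in entities.items()],
        )


# Hypothetical, simplified stand-in for the netboxentity table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE netboxentity ("
    " netboxid INTEGER, source TEXT, idx INTEGER, name TEXT,"
    " UNIQUE (netboxid, source, idx))"
)

# State before the reboot.
replace_entities(conn, 111, "ENTITY-MIB", {1: "chassis", 2: "supervisor"})
# After the reboot the same entities come back under swapped indexes.
replace_entities(conn, 111, "ENTITY-MIB", {2: "chassis", 1: "supervisor"})

rows = conn.execute(
    "SELECT idx, name FROM netboxentity ORDER BY idx"
).fetchall()
```

Deleting and re-inserting is the bluntest option; it discards row identity, so a real implementation would more likely update existing rows in place within the same transaction.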

Changed in nav:
status:	New → In Progress
importance:	Undecided → High
milestone:	none → 4.3.1

Morten Brekkevold (mbrekkevold) wrote on 2015-08-20:

Fix here: https://nav.uninett.no/hg/stable/rev/1a1c2c3f8899

Changed in nav:
status:	In Progress → Fix Committed

Morten Brekkevold (mbrekkevold) on 2015-08-20

Changed in nav:
status:	Fix Committed → Fix Released

Einar Jørgen Haraldseid (einar-haraldseid) wrote on 2015-08-20:

I have now upgraded to NAV 4.3.1 and can confirm that this has been fixed. There are no more errors in the log and the watchdog is back to green.
