AssertionError prevents Netmap from loading any graph/map

Bug #1398382 reported by Morten Brekkevold
This bug affects 1 person
Affects                             Status         Importance  Assigned to        Milestone
Network Administration Visualized   Fix Released   Medium      Morten Brekkevold
4.2                                 Fix Released   Medium      Morten Brekkevold

Bug Description

A customer installation repeatedly mails a Django traceback from Netmap invocations, with an AssertionError that prevents the map from ever displaying:

> AssertionError: Source and target GwPortPrefix must reside in same VLan for Prefix! Bailing

The end user will only see the nondescript error message "Error loading graph, please try to reload the page". Reloading the page will not help, as the problem occurs while assembling the topology graph.

The full, mailed traceback looks like this:

Traceback (most recent call last):

  File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 109, in get_response
    response = callback(request, *callback_args, **callback_kwargs)

  File "/usr/lib/python2.7/dist-packages/rest_framework/compat.py", line 121, in view
    return self.dispatch(request, *args, **kwargs)

  File "/usr/lib/python2.7/dist-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
    return view_func(*args, **kwargs)

  File "/usr/lib/python2.7/dist-packages/rest_framework/views.py", line 327, in dispatch
    response = self.handle_exception(exc)

  File "/usr/lib/python2.7/dist-packages/rest_framework/views.py", line 324, in dispatch
    response = handler(request, *args, **kwargs)

  File "/usr/lib/python2.7/dist-packages/nav/web/netmap/views.py", line 297, in get
    return Response(get_topology_graph(layer, load_traffic, view))

  File "/usr/lib/python2.7/dist-packages/nav/web/netmap/graph.py", line 46, in get_topology_graph
    return _json_layer3(load_traffic, view)

  File "/usr/lib/python2.7/dist-packages/nav/web/netmap/graph.py", line 86, in _json_layer3
    view)

  File "/usr/lib/python2.7/dist-packages/nav/netmap/topology.py", line 207, in build_netmap_layer3_graph
    traffic)

  File "/usr/lib/python2.7/dist-packages/nav/netmap/metadata.py", line 439, in edge_metadata_layer3
    edge = Edge((nx_edge), source, target, traffic)

  File "/usr/lib/python2.7/dist-packages/nav/netmap/metadata.py", line 260, in __init__
    "Source and target GwPortPrefix must reside in same VLan for "

AssertionError: Source and target GwPortPrefix must reside in same VLan for Prefix! Bailing

Tags: netmap
Changed in nav:
status: Confirmed → In Progress
Morten Brekkevold (mbrekkevold) wrote :

The problem appears to be with the backend code that processes ELINK prefixes and inserts these into the topology graph.

An ELINK peer is a node that is not monitored by NAV, so the code will insert a stubbed ELINK node into the graph it is constructing. The stub code, located in nav.netmap.stubs, consists of stub classes representing Netboxes, GwPortPrefixes and Interfaces. However, the __hash__ implementations of the GwPortPrefix and Interface classes do not ensure uniqueness across all nodes. The biggest problem is the GwPortPrefix stub class, which uses only the gw_ip for hashing. This would be fine if actual IPs were used, but for stub classes, the peer name (derived from the netident) is inserted as the gw_ip. If this peer appears multiple times in the graph, the data will be mixed up, since each GwPortPrefix is supposed to represent an individual router port of this peer.

(In effect, all the GwPortPrefixes of this ELINK peer hash to the same value, which causes conflicts.)
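
The collision described above can be reproduced in isolation. The following is only a minimal sketch, not the actual nav.netmap.stubs code: the class names, the `vlan` attribute and the `__eq__` behaviour are assumptions made for illustration, and the second class is one possible way to avoid the collision, not necessarily what the fix does.

```python
class CollidingStubPrefix(object):
    """Hypothetical stand-in for a stub GwPortPrefix hashed on gw_ip only."""

    def __init__(self, gw_ip, vlan):
        self.gw_ip = gw_ip  # for stubs this holds the peer name, not an IP
        self.vlan = vlan    # assumed attribute, used here to tell ports apart

    def __hash__(self):
        return hash(self.gw_ip)

    def __eq__(self, other):
        # Assumption: equality follows the same single field as the hash.
        return self.gw_ip == other.gw_ip


class DistinctStubPrefix(CollidingStubPrefix):
    """Hypothetical variant whose identity also includes the VLAN."""

    def _key(self):
        return (self.gw_ip, self.vlan)

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return self._key() == other._key()


def count_distinct(cls):
    # Two router ports of the same unmonitored peer, on different VLANs.
    ports = [cls("elink-peer", vlan) for vlan in (10, 20)]
    return len(set(ports))


colliding_count = count_distinct(CollidingStubPrefix)  # both ports collapse
distinct_count = count_distinct(DistinctStubPrefix)    # ports stay separate
```

With the gw_ip-only hash, the two ports are treated as one object by any set or dict, which is the kind of mix-up that can leave an edge with a source and target from different VLANs.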

Morten Brekkevold (mbrekkevold) wrote :

Forgot to close this bug. The fix is here: https://nav.uninett.no/hg/stable/rev/1681cf9850b8

Changed in nav:
status: In Progress → Fix Committed
Changed in nav:
status: Fix Committed → Fix Released
Morten Brekkevold (mbrekkevold) wrote :

The bug appears to still be present in NAV 4.2.2, but due to another data problem:

The code does not allow for multiple ELINKs to the same peer node if the ELINK VLANs have not been assigned a proper net_ident value. An empty net_ident causes a sysname and an interface name for the ELINK peer to be generated, but the generated interface name is always the same, causing hashing/keying errors in the graph.

The remote "fictional" interface should at least be unique per detected link.
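
As a rough illustration of why a constant generated interface name breaks keying, and how a per-link name avoids it, consider the sketch below. Everything in it is hypothetical: the function names, the "unknown-peer" sysname, the "stub" interface name and the integer link ids are placeholders, not values taken from NAV.

```python
def constant_ifname(link_id):
    # Mimics the reported behaviour: every generated name is identical.
    return "stub"


def per_link_ifname(link_id):
    # Hypothetical alternative: derive the name from the detected link.
    return "stub-{}".format(link_id)


def index_links(link_ids, make_ifname, sysname="unknown-peer"):
    """Key each detected link by (sysname, interface name) of the remote end."""
    index = {}
    for link_id in link_ids:
        index[(sysname, make_ifname(link_id))] = link_id
    return index


links = [101, 102, 103]  # three ELINKs to the same peer, no net_ident set

collapsed = index_links(links, constant_ifname)  # later links overwrite earlier
separate = index_links(links, per_link_ifname)   # one entry per link
```

With the constant name, all three links share a single key and only the last one survives in the index. With a name that is unique per detected link, each link keeps its own entry.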
