Agent hangs (in init?)

Bug #1670731 reported by Ian Wells
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
networking-vpp  Fix Released  High        Naveen Joy

Bug Description

The circumstance:

If VPP is restarted first and the agent second during an upgrade (which will work post-resync but is not the recommended strategy right now), the agent hangs.

It has not been possible to get more information on why, but strace shows python hung on a futex. Our guess is that the agent is hung on the VPP reconnect.
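One way to get more than a futex wait out of a hung agent next time is to have the process dump its Python thread stacks on a signal. The following is a minimal sketch using the standard-library faulthandler module; the function name, the log path and the choice of SIGUSR1 are assumptions for illustration and are not part of the agent. It is Unix-only.

```python
import faulthandler
import signal


def install_stack_dump_handler(path, signum=signal.SIGUSR1):
    """Dump all Python thread stacks to `path` whenever `signum` arrives.

    Returns the open file. The caller must keep it open for as long as
    the handler stays registered, because faulthandler writes to its
    file descriptor directly.
    """
    log_file = open(path, "a")
    faulthandler.register(signum, file=log_file, all_threads=True)
    return log_file
```

With this installed at agent startup, `kill -USR1 <agent pid>` should append a traceback for each thread to the file. Only Python frames are shown, so a hang inside native code would appear as the Python call that entered it.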

Planned actions:

- recommend the following VPP upgrade procedure: stop the agent, restart VPP, restart the agent
- ask the VPP project to do some soak testing on connect/disconnect, particularly with unclean disconnects
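The soak test in the second action could be sketched roughly as below. This is an illustration only: `connect` and `disconnect` are hypothetical callables standing in for whatever the real client library provides, and the watchdog simply runs each call in a daemon thread so that a hang is reported instead of blocking the test.

```python
import threading


def run_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread; return True if it finished in time."""
    done = threading.Event()
    errors = []

    def runner():
        try:
            fn()
        except Exception as exc:  # hand failures back to the caller
            errors.append(exc)
        finally:
            done.set()

    threading.Thread(target=runner, daemon=True).start()
    finished = done.wait(timeout_s)
    if errors:
        raise errors[0]
    return finished


def soak_connect_disconnect(connect, disconnect, iterations, timeout_s=5.0):
    """Loop connect/disconnect and report the first iteration that hangs.

    Returns the zero-based index of the hanging iteration, or None if
    every iteration completed within the timeout.
    """
    for i in range(iterations):
        if not run_with_timeout(connect, timeout_s):
            return i
        if not run_with_timeout(disconnect, timeout_s):
            return i
    return None
```

A hung call is left behind in its daemon thread, so this is suited to a throwaway test process rather than to the agent itself. Unclean disconnects would need a separate variant that kills the client process instead of calling `disconnect`.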

Jerome Tollet (jtollet) wrote :

Looping on connect/disconnect as a stress test has been raised with the VPP team.
Ungraceful disconnects have also been raised with the VPP team.

Might be interesting to run the following commands:

DBGvpp# show api ?
  show api clients        Client information
  show api histogram      show api histogram
  show api message-table  Message Table
  show api plugin         show api plugin
  show api ring-stats     Message ring statistics
  show api status         Show API trace status
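To capture all of these in one go when a hang is observed, something like the following could be used. It is a sketch under assumptions: it presumes the `vppctl` binary is on PATH and accepts the command as arguments, and the function and parameter names are made up for illustration. The `run_cli` hook exists so the collection logic can be exercised without a running VPP.

```python
import subprocess

# Subcommands taken from the 'show api ?' listing above.
SHOW_API_SUBCOMMANDS = [
    "clients",
    "histogram",
    "message-table",
    "plugin",
    "ring-stats",
    "status",
]


def collect_api_diagnostics(run_cli=None, timeout_s=10):
    """Return {subcommand: output} for each 'show api <subcommand>'."""
    if run_cli is None:
        def run_cli(command):
            # Assumption: vppctl is installed and allowed to talk to VPP.
            proc = subprocess.run(
                ["vppctl"] + command.split(),
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
            return proc.stdout

    return {sub: run_cli("show api " + sub) for sub in SHOW_API_SUBCOMMANDS}
```

The timeout matters here: if VPP itself is wedged, the CLI call may not return, and `subprocess.run` raises `TimeoutExpired` rather than hanging the collector as well.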

Ian Wells (ijw-ubuntu) wrote :

Note on frequency: seen once so far, to my knowledge.

Ian Wells (ijw-ubuntu) wrote :

This also seems to cause agent hangs if VPP crashes.

Changed in networking-vpp:
assignee: nobody → Naveen Joy (najoy)
importance: Undecided → High
milestone: none → 17.10.0
status: New → Confirmed
Ian Wells (ijw-ubuntu) wrote :

Test some event orderings to characterise bug and see if we still have it:

- agent without VPP (agent should either die or not heartbeat while waiting)
- VPP terminated with agent running (agent should die and remove heartbeat key)
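The expected behaviour in both orderings can be written down as a small model, which may help when turning them into tests. This is a sketch, not the agent's code: a plain dict stands in for the key-value store holding the heartbeat key, and `vpp_alive` is a hypothetical liveness check.

```python
import time


def run_agent(store, key, vpp_alive, max_ticks, interval_s=0.0):
    """Heartbeat while VPP is reachable; clean up and stop when it is not.

    Returns the number of heartbeats written before stopping.
    """
    beats = 0
    try:
        for _ in range(max_ticks):
            if not vpp_alive():
                break  # VPP missing or gone: stop heartbeating
            store[key] = time.time()
            beats += 1
            time.sleep(interval_s)
    finally:
        # Whatever the reason for stopping, do not leave a stale heartbeat.
        store.pop(key, None)
    return beats
```

In the first ordering (agent started without VPP) the model writes no heartbeat at all; in the second (VPP terminated while the agent is running) it stops and removes the key on the first failed check.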

Ian Wells (ijw-ubuntu)
Changed in networking-vpp:
status: Confirmed → Fix Released