Check for mismatched software versions (e.g. neutron-common)

Bug #1919929 reported by Paul Goins
This bug affects 1 person
Affects: OpenStack Neutron Open vSwitch Charm
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

This may not be something for O-S-C (charm-openstack-service-checks) to check directly, but it is an OpenStack-related issue that we could monitor for, via O-S-C or via one or more of the OpenStack charms.

In response to a recent customer issue, it's been suggested that we add some sort of check that detects when software versions deployed on the cloud are inconsistent. Recently a bug fix was applied on one neutron-api unit, but it was also supposed to be applied to the other neutron-api units and to the nova-compute units. We did not know about the miss until I examined all of the nodes manually and noted that the bug we tried to resolve was only partially resolved as a result.

Andrea Ieri (aieri) wrote :

This sounds like something Landscape should alert on.

Paul Goins (vultaire) wrote :

Added the neutron-openvswitch charm, although this may apply more generally to the OpenStack charms overall. Or, if there is a way we can alert on this via Landscape (as Andrea noted), that may be an alternative solution.

Billy Olsen (billy-olsen) wrote :

This doesn't seem like a charm issue at all. I agree with Andrea here that a tool like Landscape is a much more appropriate place for this kind of check.

Changed in charm-neutron-openvswitch:
status: New → Incomplete
status: Incomplete → Invalid
James Troup (elmo) wrote :

Not sure why you think it's not a charm issue? Look at the work done in the Kubernetes charms around snap cohorts for an obvious and trivial counter-example. Landscape deals with machine primitives. It has no concept of what those machines are running. Juju does (and so, correspondingly, do charms). It's easy to see how you could do something within the charm to ensure that there is consistency within an application. I don't see how you could do that within Landscape... unless something like the charm pushed a bunch of information to Landscape.

Xav Paice (xavpaice) wrote :

The problem was caused by inconsistency across multiple units rather than simply needing to run an update. Landscape is OK to tell us when a box needs patches, but that won't tell us that one box is running a totally different version of a key application unless we write a very specific query to do so.

We could include the version of a key common package, say neutron-common, in relation data for all Neutron charms. If the versions do not match, issue a warning via an NRPE check. That means that when we update one unit and not the others, we get a warning from the entire set of related charms that there are updates which need to be applied. E.g. if we update neutron-api but not neutron-openvswitch, which can cause API version mismatches, we'll at least know that one or more units were missed. The same pattern applies to other components which must match. Nova has a bunch of similar examples, and the rules are very specific to the application.
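A minimal sketch of the comparison step in that idea, assuming the neutron-common version reported by each unit has already been collected into a plain dictionary. How that data is gathered (relation data or otherwise) is not shown. The function name, unit names, and version strings are hypothetical, and the return codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 3 UNKNOWN).

```python
from collections import defaultdict

# Conventional Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_WARNING = 1
NAGIOS_UNKNOWN = 3


def check_version_consistency(package, versions_by_unit):
    """Return (exit_code, message) for the versions reported by each unit.

    versions_by_unit maps a unit name to the package version it reported.
    Collecting that mapping is outside the scope of this sketch.
    """
    if not versions_by_unit:
        return NAGIOS_UNKNOWN, f"UNKNOWN: no {package} versions reported"

    # Group units by the version they reported.
    units_by_version = defaultdict(list)
    for unit, version in versions_by_unit.items():
        units_by_version[version].append(unit)

    if len(units_by_version) == 1:
        version = next(iter(units_by_version))
        return NAGIOS_OK, f"OK: {package} is {version} on all units"

    details = "; ".join(
        f"{version}: {', '.join(sorted(units))}"
        for version, units in sorted(units_by_version.items())
    )
    return NAGIOS_WARNING, f"WARNING: {package} version mismatch ({details})"


if __name__ == "__main__":
    # Hypothetical data: one unit was missed during an upgrade.
    code, message = check_version_consistency(
        "neutron-common",
        {
            "neutron-api/0": "2:16.3.0-0ubuntu1",
            "neutron-api/1": "2:16.2.0-0ubuntu2",
            "neutron-openvswitch/0": "2:16.3.0-0ubuntu1",
        },
    )
    print(message)
    raise SystemExit(code)
```

This only tests for equality, so it does not need to know which version is newer. Application-specific rules (which packages must match across which charms) would still have to be supplied by each charm.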

This could mean writing a new Nagios plugin, feeding that with something like a minimum version from relation data or the application version information in Juju, and checking the list of packages on the machine against that version, alerting if the package is a lower version than specified.
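The minimum-version variant could be sketched as follows. This is an illustration under assumptions, not an existing plugin: the function names and the `minimums` mapping are hypothetical, and where the minimum versions come from (relation data or Juju application version information) is left out. It shells out to `dpkg-query -W` and `dpkg --compare-versions` so that Debian version ordering is handled by dpkg itself rather than by string comparison. WARNING is used as the alert level here, which is a choice of this sketch.

```python
import subprocess

# Conventional Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_WARNING = 1


def installed_version(package):
    """Ask dpkg-query for the installed version of a package."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_lower(installed, minimum):
    """Compare two versions using Debian ordering."""
    # dpkg --compare-versions exits 0 when the stated relation holds.
    result = subprocess.run(
        ["dpkg", "--compare-versions", installed, "lt", minimum]
    )
    return result.returncode == 0


def check_minimum_versions(minimums, get_installed=installed_version,
                           lower=is_lower):
    """Return (exit_code, message) given a {package: minimum_version} map.

    get_installed and lower are injectable so the logic can be exercised
    without dpkg being present.
    """
    outdated = []
    for package, minimum in sorted(minimums.items()):
        installed = get_installed(package)
        if lower(installed, minimum):
            outdated.append(f"{package} {installed} < {minimum}")
    if outdated:
        return NAGIOS_WARNING, "WARNING: " + "; ".join(outdated)
    return NAGIOS_OK, "OK: all packages meet the minimum version"
```

On a real unit, the plugin would call `check_minimum_versions` with the minimums it was given, print the message, and exit with the returned code.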

Billy Olsen (billy-olsen) wrote :

Interesting and valid point on the snap cohorts. Cohorts are provided as a primitive by snapd and coordinated by the charms, and that primitive doesn't exist in Debian land. Additionally, the functionality and dependencies are contained within the snap. Unlike strictly confined snaps, Debian packages don't have this snapshot view across servers.

This being said, it is clearly a valid feature request against the charms to be able to detect which versions of packages are installed and report when there is a discrepancy. Regardless of where or how the solution might be implemented, it is a valid feature request.

Changed in charm-neutron-openvswitch:
status: Invalid → Triaged
importance: Undecided → Wishlist
Paul Goins (vultaire)
no longer affects: charm-openstack-service-checks