StarlingX

Bug #1856064
Comment #12

Paul-Ionut Vaduva (pvaduva) wrote on 2020-03-16:

There are two aspects to this bug, i.e. two reasons why it appears:
* On a lock/unlock of compute-0, the ceph osd enters stuck peering
* The ceph osd fails to exit stuck peering

The second issue is addressed by the proposed partial-bug commit, https://review.opendev.org/712117

The first is a more complex problem, as it concerns which IP ceph uses to connect to its
distributed components such as mons and osds. The IP that ceph components use to listen for incoming
connections is configurable in /etc/ceph/ceph.conf; however, the outgoing IP address is not, and it can
be (and on occasion is) the floating IP or the controller-platform-nfs IP (also floating), all of which are
assigned on the management network interface. When a connection is established using one of
those floating IPs, the connection is momentarily interrupted on a host lock/unlock of compute-0 or
during a host-swact.
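
To illustrate the general mechanism (this is not ceph code, only a sketch of the OS-level behaviour): when a client socket is not bound before connecting, the kernel picks the source address, and on an interface that carries several addresses that choice may fall on a floating one. Binding the socket first pins the source address. The helper names `connect_from` and `demo` are hypothetical, and a loopback listener stands in for a remote mon or osd.

```python
import socket


def connect_from(source_ip, dest):
    """Open a TCP connection to dest with the local (source) address pinned."""
    # source_address binds the socket before connect(); port 0 lets the
    # kernel choose an ephemeral port while the IP stays fixed.
    return socket.create_connection(dest, timeout=5, source_address=(source_ip, 0))


def demo():
    # Loopback listener standing in for a remote peer (assumption for the demo).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    dest = server.getsockname()

    # Unpinned: the kernel selects the source address for this connection.
    unpinned = socket.create_connection(dest, timeout=5)
    peer1, _ = server.accept()
    unpinned_src = unpinned.getsockname()[0]

    # Pinned: the source address is the one we bound to.
    pinned = connect_from("127.0.0.1", dest)
    peer2, _ = server.accept()
    pinned_src = pinned.getsockname()[0]

    for s in (unpinned, peer1, pinned, peer2, server):
        s.close()
    return unpinned_src, pinned_src


if __name__ == "__main__":
    print(demo())
```

On a controller, calling `getsockname()` on an established connection in the same way would show whether the source address in use is a floating one.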