changing ceph-{public,cluster}-network post deployment is unsupported

Bug #1384341 reported by Nobuto Murata
This bug affects 3 people
Affects                            Status    Importance  Assigned to
Ceph Monitor Charm                 Triaged   Wishlist    Unassigned
Ceph OSD Charm                     Triaged   Wishlist    Unassigned
OpenStack Ceph Charm (Retired)     Invalid   Low         Unassigned
ceph (Juju Charms Collection)      Invalid   Low         Unassigned
ceph-mon (Juju Charms Collection)  Invalid   Low         Unassigned

Bug Description

lp:~openstack-charmers/charms/trusty/ceph/next
revno. 86

We used the juju bundle at the bottom to set up a multi-network ceph cluster, but ceph.conf ends up using both 10.3.X.X and 192.168.X.X addresses at the same time, while ceph-mon only listens on the 10.3.X.X network. As a result, `start ceph-mon-all` never finishes.

tcp 0 0 10.3.0.103:6789 0.0.0.0:* LISTEN 11628/ceph-mon

ceph.conf
==========
[global]

auth cluster required = cephx
auth service required = cephx
auth client required = cephx

keyring = /etc/ceph/$cluster.$name.keyring
mon host = 10.3.0.103:6789 192.168.104.23:6789 192.168.104.24:6789
==========
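
For contrast, all three machines should have addresses on the configured 10.3.0.0/24 public network, so a healthy deployment would presumably render a single-network mon host line like the following (the .104/.105 addresses are made up for illustration):

==========
public network = 10.3.0.0/24
cluster network = 10.2.0.0/24
mon host = 10.3.0.103:6789 10.3.0.104:6789 10.3.0.105:6789
==========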

ceph:
  series: trusty
  services:
    ceph:
      branch: lp:~openstack-charmers/charms/trusty/ceph/next
      constraints: tags=ceph
      num_units: 3
      options:
        fsid: '6547bd3e-1397-11e2-82e5-53567c8d32dc'
        monitor-secret: 'AQCXrnZQwI7KGBAAiPofmKEXKxu5bUzoYLVkbQ=='
        osd-devices: '/dev/vdb'
        osd-reformat: 'yes'
        ceph-cluster-network: '10.2.0.0/24'
        ceph-public-network: '10.3.0.0/24'

Tags: openstack cts
Revision history for this message
Nobuto Murata (nobuto) wrote :
tags: added: cts
Revision history for this message
Nobuto Murata (nobuto) wrote :

Looks like this is related to bug #1384333. I could work around it with the attached branch, so this might be a duplicate of bug #1384333.

Revision history for this message
James Page (james-page) wrote :

This looks like a race when using the network support in the charm:

mon host = 10.3.0.103:6789 192.168.104.23:6789 192.168.104.24:6789

10.3.0.103 will only be expecting traffic on the 10.3.0.x 'public' network, while the other two mons advertised their 192.168.104.x private addresses instead.

We probably need to switch to not using private-address as the indicator/keys for bootstrap.
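
As a quick check, the address each unit is advertising to its peers can be read back with the unit-get hook tool (unit name assumed, juju 1.18+):

  juju run --unit ceph/0 "unit-get private-address"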

Changed in ceph (Juju Charms Collection):
importance: Undecided → Medium
status: New → Triaged
milestone: none → 15.04
James Page (james-page)
tags: added: openstack
Changed in ceph (Juju Charms Collection):
milestone: 15.04 → 15.07
James Page (james-page)
Changed in ceph (Juju Charms Collection):
milestone: 15.07 → 15.10
Revision history for this message
Florian Haas (fghaas) wrote :

Using the private address for the mon host looks rather nonsensical when the ceph charm is deployed in a MAAS environment, rendering the cluster network and public network options useless. What happens is that the primary interface address will typically be on the MAAS network, which is *not* the public network, and then what the Mon actually listens on is the MAAS network, not the network given in ceph-public-network. In other words, any Ceph client that is on the public network, but not on the MAAS network, won't be able to use the cluster at all.

This looks like a show-stopper bug. Can this be given a higher priority, please?
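
One quick way to see which addresses the mons are actually advertising to clients - i.e. what ends up in the monmap - is to dump it on any cluster node:

  sudo ceph mon dump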

Revision history for this message
Florian Haas (fghaas) wrote :

Looking into this a little more closely, it appears that the charm does attempt to do the right thing in http://bazaar.launchpad.net/~openstack-charmers/charms/trusty/ceph/trunk/view/head:/hooks/utils.py#L74. However, this only fires on mon-relation-changed, so the only way to get this updated is to remove a unit and then redeploy it.

Even so, the charm makes no attempt to update the monmap, so while peers are instructed to find the mons at the new address, the mon doesn't actually start to *listen* on that address, as it will select its listening IP based not on ceph.conf, but on the monmap.

See also: http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address

That means that if you change ceph-public-network in an existing deployment, and you redeploy all units, what you'll end up with is a "mon hosts" entry in all your hosts' ceph.conf that does not contain a single IP that a mon is actually listening on.
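
For reference, the manual ("messy way") re-IP procedure from that page amounts to rewriting the monmap by hand - roughly the following, with mon id, address, and path assumed, and the last step run on each mon while its daemon is stopped:

  ceph mon getmap -o /tmp/monmap
  monmaptool --rm <mon-id> /tmp/monmap
  monmaptool --add <mon-id> <new-ip>:6789 /tmp/monmap
  ceph-mon -i <mon-id> --inject-monmap /tmp/monmap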

Revision history for this message
James Page (james-page) wrote :

I dug into this problem a bit more this morning. In the scenario where ceph-public/cluster-network is provided at deployment time (i.e. not changed after deployment), the monitors should all boot from the address they have on the public network; this assumes that the machine the monitor is running on has an IP on the public network - if not, it falls back to the private-address (as detailed in get_public_addr). If one of the machines does not have an IP on the public network, it will pass its private-address to its peers, resulting in a mixed-address deployment as described in the original bug report. We should probably switch to a hard configuration choice - if ceph-public-network is provided then no fallback should be used in the event of a mis-configured machine network configuration.
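
Until the charm enforces that, a mis-configured unit is easy to spot by hand - something along these lines, assuming the unit names and the 10.3.0.0/24 public network from the bundle above:

  for u in ceph/0 ceph/1 ceph/2; do
    juju run --unit $u "ip -4 -o addr show" | grep -q "10\.3\.0\." \
      || echo "$u has no address on ceph-public-network"
  done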

Changing the values of ceph-public-network and ceph-cluster-network post bootstrap of the cluster is currently not supported by the charm in any way - i.e. it will make no attempt to reconfigure the running daemons to do the magic trick of re-IPing a running storage cluster. Unfortunately this is not made clear in either the README or the config.yaml, and juju does not have the concept of immutable configuration...

James Page (james-page)
Changed in ceph (Juju Charms Collection):
milestone: 15.10 → 16.01
James Page (james-page)
Changed in ceph (Juju Charms Collection):
milestone: 16.01 → 16.04
James Page (james-page)
Changed in ceph (Juju Charms Collection):
milestone: 16.04 → 16.07
Changed in ceph-mon (Juju Charms Collection):
status: New → Triaged
importance: Undecided → Medium
milestone: none → 16.07
Liam Young (gnuoy)
Changed in ceph (Juju Charms Collection):
milestone: 16.07 → 16.10
Changed in ceph-mon (Juju Charms Collection):
milestone: 16.07 → 16.10
James Page (james-page)
Changed in ceph (Juju Charms Collection):
milestone: 16.10 → 17.01
Changed in ceph-mon (Juju Charms Collection):
milestone: 16.10 → 17.01
James Page (james-page)
Changed in ceph-mon (Juju Charms Collection):
milestone: 17.01 → none
Changed in ceph (Juju Charms Collection):
milestone: 17.01 → none
importance: Medium → Low
Changed in ceph-mon (Juju Charms Collection):
importance: Medium → Low
summary: - when enable ceph-public-network and ceph-cluster-network, ceph-mon
- cannot communicate each other
+ changing ceph-{public,cluster}-network post deployment is unsupported
James Page (james-page)
Changed in charm-ceph:
importance: Undecided → Low
status: New → Triaged
Changed in ceph (Juju Charms Collection):
status: Triaged → Invalid
Changed in charm-ceph-mon:
importance: Undecided → Low
status: New → Triaged
Changed in ceph-mon (Juju Charms Collection):
status: Triaged → Invalid
Revision history for this message
Xav Paice (xavpaice) wrote :

Adding ceph-osd because there is the same issue with that charm too.

Revision history for this message
Xav Paice (xavpaice) wrote :

Adding field-medium: we have a production cloud that needs this change in order to migrate from 1Gbps to 10Gbps interfaces and relieve serious performance issues.

Revision history for this message
Xav Paice (xavpaice) wrote :

Same issue on bionic/queens.

I've added new addresses to all units, and run:
juju config ceph-osd ceph-public-network=172.16.1.0/24 ceph-cluster-network=172.16.1.0/24
juju config ceph-mon ceph-public-network=172.16.1.0/24 ceph-cluster-network=172.16.1.0/24

The new addresses appear in ceph.conf under public network, cluster network, public addr and cluster addr, and the local unit's entry in 'mon host' is updated, but the entries for the remaining mons (or all of them, on an OSD host) are not. Restarting ceph-mon does not change the address it listens on.
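
The stale listener is easy to confirm on a mon unit - after the config change and a restart, the daemon is still bound to the old address rather than 172.16.1.x:

  sudo ss -tlnp | grep ceph-mon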

Further, https://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address says "Important Existing monitors are not supposed to change their IP addresses."

I guess that means the workaround for mons is to deploy new units. I suspect therefore that this is a documentation bug?

For OSDs, although the change does update ceph.conf, it doesn't restart the processes (which is actually good, as I'd want to coordinate that). Restarting the ceph-osd processes one by one does appear to change the listener address, and the OSDs come back up OK as long as other hosts have legs on both the old and new networks.
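
That coordinated restart might look roughly like this (systemd unit names as on bionic/queens assumed; setting noout first is my own addition, to avoid rebalancing while OSDs bounce):

  sudo ceph osd set noout
  for id in $(ls /var/lib/ceph/osd | sed 's/^ceph-//'); do
      sudo systemctl restart ceph-osd@$id
      # wait for this OSD to report back up before touching the next one
      until sudo ceph osd tree | grep -Ew "osd\.$id" | grep -q up; do sleep 5; done
  done
  sudo ceph osd unset noout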

I have yet to find out the effect on Cinder, Glance, and Nova for the change. Advice would be appreciated.

Ryan Beisner (1chb1n)
Changed in charm-ceph:
status: Triaged → Invalid
Revision history for this message
James Page (james-page) wrote :

I think the best approach to changing the IPs of the monitors in a cluster is to cycle the units - this would (theoretically) be done by creating three new units in LXD containers in the ceph-mon application, and then, for each of the three original units, performing the operation on the running cluster to remove it from the monmap:

  sudo ceph mon rm <id> (where ID is generally the hostname of the server)

as each mon is removed from the monmap, the associated unit should then also be destroyed.

This is similar in approach to the migration from ceph -> ceph-mon documented here:

  https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-ceph-migration.html

albeit the removal of the original monitor units is not automated.
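
Putting that together, the cycle might look something like the following - unit numbers, placement directives and mon ids are all assumed here:

  juju add-unit ceph-mon -n 3 --to lxd:0,lxd:1,lxd:2
  juju status ceph-mon    # wait for the three new mons to join quorum
  # then, for each original unit in turn:
  juju run --unit ceph-mon/3 "sudo ceph mon rm <id>"
  juju remove-unit ceph-mon/0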

Revision history for this message
James Page (james-page) wrote :

It would appear that the deployment in #9 may not be using network-space binding, as the example uses the old pre-spaces method of configuring network binding.

How the connectivity between the old units and the new units would be maintained also needs to be considered.

Andrew McLeod (admcleod)
Changed in charm-ceph-osd:
status: New → Triaged
importance: Undecided → Low
Ryan Beisner (1chb1n)
Changed in charm-ceph-osd:
importance: Low → Wishlist
Changed in charm-ceph-mon:
importance: Low → Wishlist