Activity log for bug #1211693

Date Who What changed Old value New value Message
2013-08-13 09:19:10 Ian Wells bug added bug
2013-08-13 09:19:40 Ian Wells summary Can't replace mon node with Ceph running First Ceph node provides cluster auth keys and can't be replaced
2013-08-13 09:19:52 Ian Wells description
Old value:
The current Ceph install uses the first mon node to create the admin key. (Conversely, the mon key - used to decide if a node has permission to join the mon quorum - is statically defined in the site.pp file.) This means that if the first mon node is replace - as is permissible and expected in the event of a hardware failure in an HA system - it will recreate the admin key, and retransmit it to the entire cluster, changing the cluster's credentials and likely resulting in odd failures (particularly if anything off-system is using the Ceph cluster).
In parallel, the mon key is hardcoded into the site.pp file and, since it's not obvious that it should be changed and there are no instructions as to what format its replacement should be in, likely all clusters installed will have the same mon key and it serves as no security at all.
I suggest the following:
- in install_os_puppet or similar, create both a mon and admin key and store them on the boot node
- pass the mon and admin keys to all nodes during installation
- pass the mon and admin keys to nodes during reinstallation
Added bonus is that puppet runs fewer times to get the install done, as the admin key is available on the first run (otherwise the mon node takes 3 runs to settle and it's not until the second that the key is put in storedconfig for other nodes to use).
New value:
The current Ceph install uses the first mon node to create the admin key. (Conversely, the mon key - used to decide if a node has permission to join the mon quorum - is statically defined in the site.pp file.) This means that if the first mon node is replaced - as is permissible and expected in the event of a hardware failure in an HA system - it will recreate the admin key, and retransmit it to the entire cluster, changing the cluster's credentials and likely resulting in odd failures (particularly if anything off-system is using the Ceph cluster).
In parallel, the mon key is hardcoded into the site.pp file and, since it's not obvious that it should be changed and there are no instructions as to what format its replacement should be in, likely all clusters installed will have the same mon key and it serves as no security at all.
I suggest the following:
- in install_os_puppet or similar, create both a mon and admin key and store them on the boot node
- pass the mon and admin keys to all nodes during installation
- pass the mon and admin keys to nodes during reinstallation
Added bonus is that puppet runs fewer times to get the install done, as the admin key is available on the first run (otherwise the mon node takes 3 runs to settle and it's not until the second that the key is put in storedconfig for other nodes to use).
2013-08-13 12:08:30 Mark T. Voelker openstack-cisco: status New Triaged
2013-08-13 12:08:31 Mark T. Voelker openstack-cisco: importance Undecided Wishlist
2013-08-13 12:08:38 Mark T. Voelker openstack-cisco: assignee Don Talton (dotalton)
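
The description notes that there are no instructions on the format a replacement mon key should take, and suggests creating the mon and admin keys once on the boot node. As a rough illustration only, the sketch below generates two keys in the base64 layout that Ceph tooling is generally understood to produce (a small little-endian header followed by 16 random bytes). The function name generate_ceph_key is hypothetical and not part of install_os_puppet; where Ceph is already installed, `ceph-authtool --gen-print-key` is the usual way to get such a key.

```python
import base64
import os
import struct
import time


def generate_ceph_key():
    """Return a random cephx-style secret as a base64 string.

    Assumed layout (not taken from the bug report): le16 key type (1 = AES),
    le32 creation seconds, le32 creation nanoseconds, le16 secret length,
    followed by the secret bytes themselves.
    """
    secret = os.urandom(16)
    header = struct.pack(
        "<HIIH",
        1,                 # key type: AES
        int(time.time()),  # creation time, seconds
        0,                 # creation time, nanoseconds
        len(secret),       # length of the secret that follows
    )
    return base64.b64encode(header + secret).decode("ascii")


if __name__ == "__main__":
    # Generate each key once on the boot node and reuse it on every
    # install or reinstall, rather than letting the first mon recreate it.
    print("mon key:  ", generate_ceph_key())
    print("admin key:", generate_ceph_key())
```

Generating the keys once and passing the stored values to every node is what lets a replaced first mon node rejoin without changing the cluster's credentials.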