charm osdizes /dev/sdb, then fails with "looks like /dev/sdb is in use, skipping"

Bug #1751127 reported by Jason Hobbs
This bug affects 1 person
Affects: Ceph OSD Charm
Status: Fix Released
Importance: High
Assigned to: Chris MacNaughton
Milestone: 18.05

Bug Description

A ceph-osd unit is ending up in "No block devices detected using current configuration".

We set our block device to /dev/sdb.

In the unit log we see:

2018-02-22 12:58:50 INFO juju-log mon:37: osdize cmd: ['ceph-disk', 'prepare', '--zap-disk', '--bluestore', '/dev/sdb']

Then later we see:
2018-02-22 13:01:08 INFO juju-log mon:37: Looks like /dev/sdb is in use, skipping.

Here is the bundle and overlay:
http://paste.ubuntu.com/p/JqmwxGhGCt/

crashdump is attached.

Here is a second bundle that reproduced this (this one with bluestore off):
http://paste.ubuntu.com/p/G6WH6MHM83/
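
For context, the "in use, skipping" message means the charm decided the block device (or one of its partitions) was already mounted and moved on. A minimal sketch of that kind of mount check follows; it is illustrative only, not the charm's actual osdize code, and the helper name device_in_use is made up here:

import sys

def device_in_use(dev):
    """Return True if dev (e.g. /dev/sdb) or any of its partitions
    shows up as a mount source in /proc/mounts."""
    with open('/proc/mounts') as mounts:
        sources = [line.split()[0] for line in mounts if line.strip()]
    # /dev/sdb counts as "in use" if /dev/sdb itself or /dev/sdb1,
    # /dev/sdb2, ... is currently mounted somewhere.
    return any(src == dev or src.startswith(dev) for src in sources)

if __name__ == '__main__':
    dev = sys.argv[1] if len(sys.argv) > 1 else '/dev/sdb'
    if device_in_use(dev):
        print('Looks like %s is in use, skipping.' % dev)
    else:
        print('%s looks free; ceph-disk prepare could proceed.' % dev)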

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

We hit this again today on a run that was not bluestore, so it doesn't seem to be bluestore-specific.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Jason, just FYI: I was able to deploy Ceph clusters with bluestore just fine, without similar errors. So if this happens, it is certainly intermittent and probably not bluestore-specific.

From looking at the ceph-disk code I can say that the steps it does are very similar for FileStore and BlueStore.

Revision history for this message
Chris Gregan (cgregan) wrote :

Escalated to Field High to increase visibility.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Digging through the ceph logs on monitor 0 and osd.2 (machine 8), it looks like as soon as osd.2 came online, osd.5 tried to ping it (a normal health check), immediately failed, and started reporting it to the monitors as down, causing osd.2 to eventually kill itself. Is it possible that there are networking issues between the machines on the storage network, potentially MTU issues?

Alternately, if the machine were under heavy enough load and couldn't respond to healthchecks fast enough, that could also cause it to get marked as down.
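
If the MTU hypothesis needs checking, one rough way to probe it from an OSD host is to send non-fragmenting pings at sizes near the expected MTU. The sketch below is illustrative only; the peer address is just osd.5's storage address taken from the mon.log below, and the payload sizes assume 1500-byte and 9000-byte MTUs:

import subprocess

# Rough sketch: probe path MTU between storage-network peers with
# non-fragmenting pings (iputils "ping -M do").  1472 = 1500 - 28 bytes
# of IP/ICMP headers; 8972 = 9000 - 28 for jumbo frames.
peer = '192.168.36.4'  # osd.5's address, taken from the mon.log below
for payload in (1472, 8972):
    rc = subprocess.call(
        ['ping', '-M', 'do', '-c', '3', '-s', str(payload), peer])
    status = 'ok' if rc == 0 else 'failed (fragmentation needed or unreachable)'
    print('payload %d bytes: %s' % (payload, status))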

Attaching a couple of log snippets below:

mon.log

2018-02-22 13:05:10.825046 7f5e88c16700 0 mon.juju-6ba979-0-lxd-0@0(leader) e2 handle_command mon_command({"prefix": "osd ls", "format": "json"} v 0) v1
2018-02-22 13:05:10.825084 7f5e88c16700 0 log_channel(audit) log [DBG] : from='client.? 192.168.36.11:0/3533814954' entity='client.admin' cmd=[{"prefix": "osd ls", "format": "json"}]: dispatch
2018-02-22 13:05:11.397144 7f5e8b41b700 0 log_channel(cluster) log [INF] : Health check cleared: OSD_DOWN (was: 1 osds down)
2018-02-22 13:05:11.397207 7f5e8b41b700 0 log_channel(cluster) log [INF] : Health check cleared: OSD_HOST_DOWN (was: 1 host (1 osds) down)
2018-02-22 13:05:11.401213 7f5e8440d700 1 mon.juju-6ba979-0-lxd-0@0(leader).osd e130 e130: 9 total, 9 up, 9 in
2018-02-22 13:05:11.405136 7f5e8440d700 0 log_channel(cluster) log [INF] : osd.2 192.168.36.6:6800/182399 boot
2018-02-22 13:05:11.405220 7f5e8440d700 0 log_channel(cluster) log [DBG] : osdmap e130: 9 total, 9 up, 9 in
2018-02-22 13:05:11.411304 7f5e88c16700 1 mon.juju-6ba979-0-lxd-0@0(leader).osd e130 prepare_failure osd.2 192.168.36.6:6800/182399 from osd.5 192.168.36.4:6800/501269 is reporting failure:1
2018-02-22 13:05:11.411316 7f5e88c16700 0 log_channel(cluster) log [DBG] : osd.2 192.168.36.6:6800/182399 reported immediately failed by osd.5 192.168.36.4:6800/501269
2018-02-22 13:05:11.411338 7f5e88c16700 1 mon.juju-6ba979-0-lxd-0@0(leader).osd e130 we're forcing failure of osd.2
2018-02-22 13:05:11.411360 7f5e88c16700 0 log_channel(cluster) log [INF] : osd.2 failed (root=default,rack=zone2,host=ariados) (connection refused reported by osd.5)
2018-02-22 13:05:11.614894 7f5e88c16700 1 mon.juju-6ba979-0-lxd-0@0(leader).osd e130 prepare_failure osd.2 192.168.36.6:6800/182399 from osd.5 192.168.36.4:6800/501269 is reporting failure:1
2018-02-22 13:05:11.614912 7f5e88c16700 0 log_channel(cluster) log [DBG] : osd.2 192.168.36.6:6800/182399 reported immediately failed by osd.5 192.168.36.4:6800/501269
2018-02-22 13:05:11.682861 7f5e88c16700 0 mon.juju-6ba979-0-lxd-0@0(leader) e2 handle_command mon_command({"prefix": "config-key put", "key": "cephx.groups.objects", "val": "{\"pools\": [\"default.rgw.buckets\", \"default.rgw\", \"default.rgw.root\", \"default.rgw.control\", \"default.rgw.gc\", \"default.rgw.buckets.index\", \"default.rgw.buckets.extra\", \"default.log\", \"default.intent-log\", \"default.usage\", \"default.users\", \"default.users.email\", \"default.users.swift\", \"default.users.uid\", \".rgw.root\"], \"services\": []}"} v 0) v1
2018-02-22 13:05:11.682907 7f5e88c16700 0 log_channel(audit) log [INF] : from='client.15830 -...


Revision history for this message
Ryan Beisner (1chb1n) wrote :

Bluestore support in the ceph charms is classified as preview/experimental, but is intended for further feature development as soon as possible.

Changed in charm-ceph-osd:
importance: Undecided → Wishlist
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Attached a crashdump from the reproduction with bluestore off.

description: updated
Ryan Beisner (1chb1n)
Changed in charm-ceph-osd:
importance: Wishlist → High
assignee: nobody → Chris MacNaughton (chris.macnaughton)
milestone: none → 18.05
Ryan Beisner (1chb1n)
Changed in charm-ceph-osd:
status: New → In Progress
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

The message in the title of this bug is not really useful, as it means that the charm correctly determined that the block device was mounted and moved on gracefully. What is interesting to me about this bug is the disks on a machine crashing (machine 8 in the original crashdump).

The next time that we see this error, I'd like to get a new crashdump with the load information, as load or networking issues[1] are the only cases that I can think of that would cause the cluster to mark an OSD down almost immediately.

[1]: If the network is to blame, I'd suspect a routing issue, as the OSD ping message is tiny, well under a size that would cause problems with an MTU mismatch.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

If this recurs, please include a new crashdump so that we can re-analyze with the new /var/log/load information. Thank you.

Changed in charm-ceph-osd:
status: In Progress → Incomplete
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Posting the bundle/crashdump from a reproduction that includes load logging.

bundle:
http://paste.ubuntu.com/p/3rG3XyCrs7/

Changed in charm-ceph-osd:
status: Incomplete → New
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

The error in the crashdump attached in #10 is entirely different, as there was a hook error in the mon-relation-changed hook.

Changed in charm-ceph-osd:
status: New → Incomplete
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Given the stack trace:

2018-03-02 22:26:13 INFO juju-log mon:38: osdize cmd: ['ceph-disk', 'prepare', '--fs-type', 'xfs', '--zap-disk', '--filestore', '/dev/sdb']
2018-03-02 22:26:14 DEBUG mon-relation-changed ceph-disk: Error: Device is mounted: /dev/sdb1
2018-03-02 22:26:14 DEBUG worker.uniter.jujuc server.go:178 running hook tool "juju-log"
2018-03-02 22:26:14 ERROR juju-log mon:38: Unable to initialize device: /dev/sdb
2018-03-02 22:26:14 DEBUG mon-relation-changed Traceback (most recent call last):
2018-03-02 22:26:14 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-6/charm/hooks/mon-relation-changed", line 559, in <module>
2018-03-02 22:26:14 DEBUG mon-relation-changed hooks.execute(sys.argv)
2018-03-02 22:26:14 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-6/charm/hooks/charmhelpers/core/hookenv.py", line 800, in execute
2018-03-02 22:26:14 DEBUG mon-relation-changed self._hooks[hook_name]()
2018-03-02 22:26:14 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-6/charm/hooks/mon-relation-changed", line 486, in mon_relation
2018-03-02 22:26:14 DEBUG mon-relation-changed prepare_disks_and_activate()
2018-03-02 22:26:14 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-osd-6/charm/hooks/mon-relation-changed", line 389, in prepare_disks_and_activate
2018-03-02 22:26:14 DEBUG mon-relation-changed config('bluestore'))
2018-03-02 22:26:14 DEBUG mon-relation-changed File "lib/ceph/utils.py", line 1439, in osdize
2018-03-02 22:26:14 DEBUG mon-relation-changed bluestore)
2018-03-02 22:26:14 DEBUG mon-relation-changed File "lib/ceph/utils.py", line 1507, in osdize_dev
2018-03-02 22:26:14 DEBUG mon-relation-changed subprocess.check_call(cmd)
2018-03-02 22:26:14 DEBUG mon-relation-changed File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
2018-03-02 22:26:14 DEBUG mon-relation-changed raise CalledProcessError(retcode, cmd)
2018-03-02 22:26:14 DEBUG mon-relation-changed subprocess.CalledProcessError: Command '['ceph-disk', 'prepare', '--fs-type', 'xfs', '--zap-disk', '--filestore', '/dev/sdb']' returned non-zero exit status 1

It looks like this is actually a reproduction of bug #1746118.
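
A minimal sketch of how the call site in the traceback above could surface ceph-disk's own error text (e.g. "Error: Device is mounted: /dev/sdb1") in the unit log rather than only an unhandled CalledProcessError; this is illustrative only, not the actual lib/ceph/utils.py code:

import subprocess

def prepare_osd_device(dev, fs_type='xfs', zap=True, bluestore=False):
    """Illustrative only -- not the charm's osdize_dev().  Builds a
    ceph-disk prepare command like the one above and logs ceph-disk's
    output when it fails, so messages such as
    'Error: Device is mounted: /dev/sdb1' are not lost."""
    cmd = ['ceph-disk', 'prepare']
    if bluestore:
        cmd.append('--bluestore')
    else:
        cmd.extend(['--fs-type', fs_type, '--filestore'])
    if zap:
        cmd.append('--zap-disk')
    cmd.append(dev)
    try:
        subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as exc:
        print('Unable to initialize device %s: %s'
              % (dev, exc.output.decode(errors='replace')))
        raise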

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

@jason-hobbs, @cgregan: is this bug still occurring? If so, can we get a fresh crashdump from a reproduction that includes the new load information?

While we wait on more information for this bug, there are reviews in progress to improve logging under error conditions during disk changes, as well as to improve disk management in general.

Revision history for this message
Eric Vasquez (envas) wrote :

@icey
We had this occur again. ceph-osd was:
    charm: cs:~chris.macnaughton/ceph-osd-7

Attaching juju crash dump from the failing run.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

That crashdump has a very partial juju status, and almost none of the machine logs?

Revision history for this message
Eric Vasquez (envas) wrote :

Apologies, attached the wrong crash dump.
Here is the right one.

Chris Gregan (cgregan)
Changed in charm-ceph-osd:
status: Incomplete → Confirmed
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Looking at the logs (with additional logging, yay!), I can see that /dev/sdb1 is mounted (briefly) after the charm executes:

2018-04-06 02:32:02 INFO juju-log mon:42: osdize cmd: ['ceph-disk', 'prepare', '--fs-type', 'xfs', '--zap-disk', '--filestore', '/dev/sdb']

Looking at the timestamps, we can see that preparing and zapping the disk took around 30 seconds. Several hook executions later, we see the output of lsblk again with /dev/sdb1 _not_ mounted.

2018-04-06 02:32:35 DEBUG juju-log mon:42: NAME="sdb" MAJ:MIN="8:16" RM="0" SIZE="931.5G" RO="0" TYPE="disk" MOUNTPOINT=""
NAME="sdb1" MAJ:MIN="8:17" RM="0" SIZE="930.5G" RO="0" TYPE="part" MOUNTPOINT="/var/lib/ceph/tmp/mnt.fw3oZa"
NAME="sdb2" MAJ:MIN="8:18" RM="0" SIZE="1G" RO="0" TYPE="part" MOUNTPOINT=""
...snip...
2018-04-06 02:32:52 DEBUG juju-log mon:42: NAME="sdb" MAJ:MIN="8:16" RM="0" SIZE="931.5G" RO="0" TYPE="disk" MOUNTPOINT=""
NAME="sdb1" MAJ:MIN="8:17" RM="0" SIZE="930.5G" RO="0" TYPE="part" MOUNTPOINT=""
NAME="sdb2" MAJ:MIN="8:18" RM="0" SIZE="1G" RO="0" TYPE="part" MOUNTPOINT=""

I wonder if, per the suggestion at the end of ceph-disk's prepare stage, we need to add a call to partprobe or kpartx:

2018-04-06 02:32:33 DEBUG mon-relation-changed Warning: The kernel is still using the old partition table.
2018-04-06 02:32:33 DEBUG mon-relation-changed The new table will be used at the next reboot or after you
2018-04-06 02:32:33 DEBUG mon-relation-changed run partprobe(8) or kpartx(8)
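
If the kernel really is holding on to the old partition table after the zap, a re-read between the zap and the prepare would look roughly like the sketch below. This assumes partprobe is appropriate here (per the warning above) and is not a committed fix:

import subprocess

def reread_partition_table(dev):
    """Sketch: ask the kernel to re-read dev's partition table after
    zapping, then wait for udev to finish processing the change, so a
    following 'ceph-disk prepare' sees the new layout."""
    # partprobe(8) is what the warning above suggests; kpartx(8) would
    # be the alternative for device-mapper setups.
    subprocess.check_call(['partprobe', dev])
    # Wait until udev has settled so /dev entries match the new table.
    subprocess.check_call(['udevadm', 'settle', '--timeout=30'])

reread_partition_table('/dev/sdb')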

Revision history for this message
James Page (james-page) wrote :

$ grep "Monitor hosts are" unit-ceph-osd-4.log
2018-04-06 02:31:46 INFO juju-log mon:42: Monitor hosts are ['10.244.40.228:6789', '10.244.41.5:6789', '192.168.36.14:6789']
2018-04-06 02:31:55 INFO juju-log mon:42: Monitor hosts are ['10.244.40.228:6789', '10.244.41.5:6789', '192.168.36.14:6789']
2018-04-06 02:32:42 INFO juju-log mon:42: Monitor hosts are ['10.244.40.228:6789', '192.168.36.14:6789', '192.168.36.23:6789']
2018-04-06 02:32:49 INFO juju-log mon:42: Monitor hosts are ['10.244.40.228:6789', '192.168.36.14:6789', '192.168.36.23:6789']

syslog on the unit indicates that the udev trigger action to activate the OSD fails until 02:53, just after ceph.conf is written with 2/3 mons actually on the right network address.

The get_mon_hosts function has a fallback which I think is causing this issue; as that fallback was there to deal with upgrades, we should just drop it and rely on a populated set of ceph-public-address values to address the mon cluster.
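
A rough sketch of what dropping that fallback might look like, using charmhelpers' relation accessors and keeping only units that actually publish ceph-public-address; names and details here are illustrative, not the charm's exact code:

from charmhelpers.core.hookenv import (
    relation_ids,
    related_units,
    relation_get,
)

def get_mon_hosts():
    """Sketch only: build the mon host list purely from the
    ceph-public-address values published on the mon relation, with no
    fallback to other addresses."""
    hosts = []
    for relid in relation_ids('mon'):
        for unit in related_units(relid):
            addr = relation_get('ceph-public-address', unit, relid)
            if addr:
                hosts.append('{}:6789'.format(addr))
    return sorted(set(hosts))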

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

There is a new version in my namespace, including the proposal James is making above as well as some additional logging on failure, at cs:~chris.macnaughton/ceph-osd-8. Can you please update your testing to use that branch for a bit to see if it resolves this issue?

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Ok, while I'd love to see if the above change fixes this bug, the fix in https://review.openstack.org/#/c/559964 should prevent it from being a problem to begin with.

Changed in charm-ceph-osd:
status: Confirmed → Fix Committed
Revision history for this message
Ryan Beisner (1chb1n) wrote :

This is a tricky backport: a straight backport would change the type of a config option. We will have to look further into how/if we can selectively backport the change in a way that remains backward-compatible with the current stable config types.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

As an addendum to Ryan's point: this change, excepting the config change, has been backported in https://review.openstack.org/#/c/562488/

Changed in charm-ceph-osd:
status: Fix Committed → Fix Released