Comment 0 for bug 1885560

Ovidiu Poncea (ovidiuponcea) wrote : Running wipedisk before a restore breaks Ceph OSDs leading to data loss

Brief Description
-----------------
Running wipedisk before a restore wipes the Ceph OSD journal data. Without a valid journal, Ceph cannot mount the object store, so the restore fails.

Severity
--------
Critical: System/Feature is not usable due to the defect

Steps to Reproduce
------------------
1. Configure an AIO-SX
2. Backup the system
3. Run wipedisk
4. Reinstall ISO
5. Run ansible restore playbook

Expected Behavior
------------------
The ansible restore playbook should complete successfully.

Actual Behavior
----------------
Ansible fails with:

Error message:
2020-06-24 15:12:20,711 p=11425 u=sysadmin | TASK [recover-ceph-data : Bring up ceph Monitor and OSDs] **********************
2020-06-24 15:12:22,963 p=11425 u=sysadmin | fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/etc/init.d/ceph", "start"], "delta": "0:00:02.003026", "end": "2020-06-24 15:12:22.933556", "msg": "non-zero return code", "rc": 1, "start": "2020-06-24 15:12:20.930530", "stderr": "2020-06-24 15:12:21.790 7f4307d66140 -1 mon.controller-0@-1(probing).mgr e1 Failed to load mgr commands: (2) No such file or directory\n2020-06-24 15:12:22.901 7fd4c2d011c0 -1 journal FileJournal::open: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected ff145265-25f5-4986-87ca-398fcad6fd5b, invalid (someone else's?) journal\n2020-06-24 15:12:22.902 7fd4c2d011c0 -1 filestore(/var/lib/ceph/osd/ceph-0) mount(1871): failed to open journal /var/lib/ceph/osd/ceph-0/journal: (22) Invalid argument\n2020-06-24 15:12:22.902 7fd4c2d011c0 -1 osd.0 0 OSD:init: unable to mount object store\n2020-06-24
[….]
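The failure signature is the all-zero on-disk journal fsid (`00000000-…`), which no longer matches the fsid the OSD expects. A minimal sketch of that check, assuming the usual FileStore layout where the expected fsid lives in `/var/lib/ceph/osd/ceph-0/fsid` and the journal's fsid can be read with `ceph-osd --get-journal-fsid` (the helper function name below is illustrative, not part of any tool):

```shell
#!/bin/sh
# Sketch: detect a wiped or mismatched FileStore journal before attempting
# a restore. The fsid values here are taken from the log excerpt above.
check_fsid_match() {
    expected="$1"   # fsid recorded in the OSD data directory
    actual="$2"     # fsid read from the journal header
    if [ "$actual" = "00000000-0000-0000-0000-000000000000" ]; then
        echo "journal wiped (all-zero fsid)"
        return 1
    elif [ "$expected" != "$actual" ]; then
        echo "fsid mismatch: expected $expected, got $actual"
        return 1
    fi
    echo "journal fsid ok"
}

# On a live node the two inputs would typically come from:
#   expected: cat /var/lib/ceph/osd/ceph-0/fsid
#   actual:   ceph-osd --get-journal-fsid \
#                 --osd-journal /var/lib/ceph/osd/ceph-0/journal
check_fsid_match \
    "ff145265-25f5-4986-87ca-398fcad6fd5b" \
    "00000000-0000-0000-0000-000000000000"
```

With the values from the log this reports a wiped journal, matching the `FileJournal::open` error Ceph emits during the restore.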

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-SX, AIO-DX, and Standard configurations with Ceph enabled.

Branch/Pull Time/Commit
-----------------------
master

Test Activity
-------------
Developer Testing

Workaround
----------
None.