Brief Description
-----------------
Restore fails with an Ansible error when Ceph is enabled and the backup was taken on controller-1.
Severity
--------
Major: System/Feature is usable but degraded
Steps to Reproduce
------------------
1. Install any setup with Ceph enabled
2. Swact to controller-1
3. Take a backup
4. Copy the backup off-site
5. Reinstall controller-0
6. Copy the backup back to controller-0
7. Run the Ansible restore playbook
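The steps above use the standard StarlingX platform backup and restore playbooks. The playbook paths and extra-vars below follow typical StarlingX usage and are a hedged sketch, not the exact commands from this report; the password and tarball values are placeholders.

```shell
# On the active controller (controller-1 after the swact): take the backup.
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml \
    -e "ansible_become_pass=<sysadmin-password> admin_password=<admin-password>"

# After reinstalling controller-0 and copying the backup tarball back
# (e.g. to /home/sysadmin), run the restore playbook.
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml \
    -e "initial_backup_dir=/home/sysadmin \
        backup_filename=<platform-backup-tarball>.tgz \
        ansible_become_pass=<sysadmin-password> admin_password=<admin-password>"
```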
Expected Behavior
------------------
The restore playbook should complete successfully
Actual Behavior
----------------
Ansible fails while starting Ceph:

fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/etc/init.d/ceph", "start"], "delta": "0:00:02.940099", "end": "2020-08-25 02:20:24.630149", "msg": "non-zero return code", "rc": 1, "start": "2020-08-25 02:20:21.690050"}

stderr:
2020-08-25 02:20:22.160 7f8f8e4de140 -1 mon.controller@-1(probing).mgr e1 Failed to load mgr commands: (2) No such file or directory
2020-08-25 02:20:23.226 7f2475d631c0 -1 journal do_read_entry(318771200): bad header magic
2020-08-25 02:20:23.226 7f2475d631c0 -1 journal do_read_entry(318771200): bad header magic
2020-08-25 02:20:24.022 7f2475d631c0 -1 osd.0 52 log_to_monitors {default=true}
2020-08-25 02:20:24.622 7fb9823451c0 -1 OSD id 0 != my id 1

stdout:
=== mon.controller ===
Starting Ceph mon.controller on controller-0...
=== osd.0 ===
Starting Ceph osd.0 on controller-0...
starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
=== osd.1 ===
Mounting xfs on controller-0:/var/lib/ceph/osd/ceph-1
Starting Ceph osd.1 on controller-0...
failed: 'ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph '
Reproducibility
---------------
100% reproducible
System Configuration
--------------------
Any
Branch/Pull Time/Commit
-----------------------
Branch and the time when code was pulled or git commit or cengn load info
Test Activity
-------------
Developer Testing
Workaround
----------
Extract the backup archive, remove the [osd.*] sections from etc/ceph/ceph.conf, and re-create the archive.
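The section-stripping step of the workaround can be sketched as below. This is a minimal sketch, assuming the backup tarball carries etc/ceph/ceph.conf at the archive root as described above; it demonstrates the filter on a generated sample config rather than a real backup, and all file names are placeholders.

```shell
set -e
workdir=$(mktemp -d)

# Sample config standing in for the etc/ceph/ceph.conf extracted from the
# backup archive (fsid and addresses are made up for the demo).
cat > "$workdir/ceph.conf" <<'EOF'
[global]
fsid = 00000000-0000-0000-0000-000000000000
[mon.controller]
mon addr = 192.168.204.2
[osd.0]
osd data = /var/lib/ceph/osd/ceph-0
[osd.1]
osd data = /var/lib/ceph/osd/ceph-1
EOF

# Drop every [osd.*] section (the header line and all keys under it),
# keeping every other section untouched: on each section header, decide
# whether to skip lines until the next header.
awk '/^\[/ {skip = ($0 ~ /^\[osd\./)} !skip' \
    "$workdir/ceph.conf" > "$workdir/ceph.conf.fixed"

cat "$workdir/ceph.conf.fixed"
```

Against a real backup the same filter would run between `tar -xzf` of the archive and `tar -czf` of the edited tree, after which the restore playbook can be re-run on the re-created archive.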
Fix proposed to branch: master
Review: https://review.opendev.org/757520