Stuck with status message: 'ceph-osd/0* maintenance executing (config-changed) Upgrading packages to None'

Bug #1874258 reported by Frode Nordahl
This bug affects 1 person
Affects: Ceph OSD Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

2020-04-22 13:16:53 DEBUG juju-log Hardening function 'config_changed'
2020-04-22 13:16:53 DEBUG juju-log No hardening applied to 'config_changed'
2020-04-22 13:16:53 INFO juju-log old_version: None
2020-04-22 13:16:53 INFO juju-log new_version: None
2020-04-22 13:16:53 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:54 INFO juju-log Attempting to resume possibly failed upgrade.
2020-04-22 13:16:54 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:54 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:54 INFO juju-log Making dir /var/lib/charm/ceph-osd ceph:ceph 555
2020-04-22 13:16:55 INFO juju-log Monitor hosts are ['10.246.114.10:6789', '10.246.114.12:6789', '10.246.114.15:6789']
2020-04-22 13:16:55 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:55 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:56 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:56 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:57 WARNING juju-log SSD Discard autodetection: /dev/disk/by-dname/bcache1 is forcing discard off(sata <= 3.0)
2020-04-22 13:16:57 WARNING juju-log SSD Discard autodetection: /dev/disk/by-dname/bcache2 is forcing discard off(sata <= 3.0)
2020-04-22 13:16:57 WARNING juju-log SSD Discard autodetection: /dev/disk/by-dname/bcache3 is forcing discard off(sata <= 3.0)
2020-04-22 13:16:57 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:57 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:16:58 DEBUG juju-log Writing file /var/lib/charm/ceph-osd/ceph.conf ceph:ceph 644
2020-04-22 13:16:58 INFO juju-log roll_osd_cluster called with None
2020-04-22 13:16:59 INFO juju-log osd_sorted_list: [<charms_ceph.utils.CrushLocation object at 0x7f2c82299c50>, <charms_ceph.utils.CrushLocation object at 0x7f2c82299c18>, <charms_ceph.utils.CrushLocation object at 0x7f2c82299b70>]
2020-04-22 13:16:59 INFO juju-log upgrade position: 0
2020-04-22 13:16:59 INFO juju-log monitor_key_set osd_node-amontons_None_start 1587561419.2874928
2020-04-22 13:16:59 DEBUG config-changed set osd_node-amontons_None_start
2020-04-22 13:16:59 INFO juju-log Rolling
2020-04-22 13:17:00 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:17:00 INFO juju-log Current ceph version is 15.2
2020-04-22 13:17:00 INFO juju-log Upgrading to: None
2020-04-22 13:17:00 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:17:01 INFO juju-log Installing [] with options: ['--option=Dpkg::Options::=--force-confold']
2020-04-22 13:17:01 DEBUG config-changed Reading package lists...
2020-04-22 13:17:01 DEBUG config-changed Building dependency tree...
2020-04-22 13:17:01 DEBUG config-changed Reading state information...
2020-04-22 13:17:01 DEBUG config-changed The following packages were automatically installed and are no longer required:
2020-04-22 13:17:01 DEBUG config-changed python3-asn1crypto python3-xdg
2020-04-22 13:17:01 DEBUG config-changed Use 'apt autoremove' to remove them.
2020-04-22 13:17:01 DEBUG config-changed 0 upgraded, 0 newly installed, 0 to remove and 11 not upgraded.
2020-04-22 13:17:01 DEBUG config-changed Ign:1 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri InRelease
2020-04-22 13:17:01 DEBUG config-changed Hit:2 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri Release
2020-04-22 13:17:04 DEBUG config-changed Hit:4 http://archive.ubuntu.com/ubuntu bionic InRelease
2020-04-22 13:17:04 DEBUG config-changed Hit:5 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
2020-04-22 13:17:04 DEBUG config-changed Hit:6 http://archive.ubuntu.com/ubuntu bionic-security InRelease
2020-04-22 13:18:04 DEBUG config-changed Err:7 http://archive.ubuntu.com/ubuntu bionic-backports InRelease
2020-04-22 13:18:04 DEBUG config-changed Connection failed [IP: 10.246.112.3 8000]
2020-04-22 13:18:06 DEBUG config-changed Reading package lists...
2020-04-22 13:18:06 DEBUG config-changed W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-backports/InRelease Connection failed [IP: 10.246.112.3 8000]
2020-04-22 13:18:06 DEBUG config-changed W: Some index files failed to download. They have been ignored, or old ones used instead.
2020-04-22 13:18:06 INFO juju-log Installing ['ceph', 'gdisk', 'radosgw', 'xfsprogs', 'lvm2', 'parted', 'smartmontools', 'btrfs-tools'] with options: ['--option=Dpkg::Options::=--force-confold']
2020-04-22 13:18:06 DEBUG config-changed Reading package lists...
2020-04-22 13:18:06 DEBUG config-changed Building dependency tree...
2020-04-22 13:18:06 DEBUG config-changed Reading state information...
2020-04-22 13:18:06 DEBUG config-changed btrfs-tools is already the newest version (4.15.1-1build1).
2020-04-22 13:18:06 DEBUG config-changed gdisk is already the newest version (1.0.3-1).
2020-04-22 13:18:06 DEBUG config-changed xfsprogs is already the newest version (4.9.0+nmu1ubuntu2).
2020-04-22 13:18:06 DEBUG config-changed lvm2 is already the newest version (2.02.176-4.1ubuntu3.18.04.2).
2020-04-22 13:18:06 DEBUG config-changed parted is already the newest version (3.2-20ubuntu0.2).
2020-04-22 13:18:06 DEBUG config-changed ceph is already the newest version (15.2.1-0ubuntu1~cloud0).
2020-04-22 13:18:06 DEBUG config-changed radosgw is already the newest version (15.2.1-0ubuntu1~cloud0).
2020-04-22 13:18:06 DEBUG config-changed smartmontools is already the newest version (7.1-1build1~cloud0).
2020-04-22 13:18:06 DEBUG config-changed The following packages were automatically installed and are no longer required:
2020-04-22 13:18:06 DEBUG config-changed python3-asn1crypto python3-xdg
2020-04-22 13:18:06 DEBUG config-changed Use 'apt autoremove' to remove them.
2020-04-22 13:18:06 DEBUG config-changed 0 upgraded, 0 newly installed, 0 to remove and 11 not upgraded.
2020-04-22 13:18:06 WARNING juju-log Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
2020-04-22 13:18:07 DEBUG juju-log Restarting all OSDs to load new binaries
2020-04-22 13:18:07 DEBUG config-changed admin_socket: exception getting command descriptions: [Errno 111] Connection refused
2020-04-22 13:18:07 DEBUG juju-log Command '['ceph', 'daemon', '/var/run/ceph/ceph-osd.6.asok', 'status']' returned non-zero exit status 22.

[ goes on forever ]
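
The log above ends with the charm repeatedly running `ceph daemon /var/run/ceph/ceph-osd.6.asok status` against a socket that refuses connections. The charm's actual retry logic is not shown in this report, so the following is only a minimal sketch of how such a poll could be bounded by a deadline. The names `wait_until` and `osd_socket_responds` are hypothetical and not taken from the charm.

```python
import subprocess
import time


def wait_until(check, timeout=60.0, interval=1.0):
    """Call check() until it returns True or `timeout` seconds elapse.

    Returns True on success and False if the deadline passes first.
    check() is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def osd_socket_responds(asok_path):
    """Return True if `ceph daemon <asok_path> status` exits with status 0.

    This is the same command that appears in the log above. A missing
    `ceph` binary is treated the same as a failed status call.
    """
    try:
        result = subprocess.run(
            ["ceph", "daemon", asok_path, "status"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False
    return result.returncode == 0
```

With a helper like this, a caller could run `wait_until(lambda: osd_socket_responds("/var/run/ceph/ceph-osd.6.asok"), timeout=300)` and report an error on `False` instead of waiting indefinitely.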

Before it got into this state, the unit was brought up with an incorrect value for osd-devices: '/dev/sdb /dev/vdb' instead of '/dev/disk/by-dname/bcache1 /dev/disk/by-dname/bcache2 /dev/disk/by-dname/bcache3'.
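
As an illustration of the kind of check that could catch part of such a misconfiguration, here is a small sketch that lists entries of a space-separated device string that are not block devices on the local machine. The function name `missing_block_devices` is hypothetical, and the check is stricter than the charm may be. On the machine in this report it would presumably flag /dev/vdb, which does not appear in the lsblk output below, but not /dev/sdb, which exists as the backing device for bcache1.

```python
import os
import stat


def missing_block_devices(osd_devices):
    """Return entries of a space-separated device list that are not block devices.

    An entry is reported if the path does not exist or exists but is not
    a block device. Symlinks such as /dev/disk/by-dname/* are followed.
    """
    missing = []
    for path in osd_devices.split():
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Path is absent or cannot be inspected.
            missing.append(path)
            continue
        if not stat.S_ISBLK(mode):
            missing.append(path)
    return missing
```

For example, `missing_block_devices('/dev/sdb /dev/vdb')` could be run on the unit before the configuration value is applied.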

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931.5G 0 disk
├─sda1 8:1 0 512M 0 part /boot/efi
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 930G 0 part
  └─bcache3 252:384 0 930G 0 disk /
sdb 8:16 0 931.5G 0 disk
└─bcache1 252:128 0 931.5G 0 disk
  └─ceph--bb8530b6--571a--4998--b2fe--eb892ef26bbe-osd--block--bb8530b6--571a--4998--b2fe--eb892ef26bbe 253:0 0 931.5G 0 lvm
sdc 8:32 0 931.5G 0 disk
└─bcache2 252:256 0 931.5G 0 disk
  └─ceph--9c454450--886e--41b1--be86--62ab365bebe5-osd--block--9c454450--886e--41b1--be86--62ab365bebe5 253:1 0 931.5G 0 lvm
sdd 8:48 0 931.5G 0 disk
└─bcache0 252:0 0 931.5G 0 disk
  └─ceph--25f5d8af--a141--402d--ae92--4805b38f9c93-osd--block--25f5d8af--a141--402d--ae92--4805b38f9c93 253:2 0 931.5G 0 lvm
nvme0n1 259:0 0 372.6G 0 disk
└─nvme0n1p1 259:1 0 372.6G 0 part
  ├─bcache0 252:0 0 931.5G 0 disk
  │ └─ceph--25f5d8af--a141--402d--ae92--4805b38f9c93-osd--block--25f5d8af--a141--402d--ae92--4805b38f9c93 253:2 0 931.5G 0 lvm
  ├─bcache1 252:128 0 931.5G 0 disk
  │ └─ceph--bb8530b6--571a--4998--b2fe--eb892ef26bbe-osd--block--bb8530b6--571a--4998--b2fe--eb892ef26bbe 253:0 0 931.5G 0 lvm
  ├─bcache2 252:256 0 931.5G 0 disk
  │ └─ceph--9c454450--886e--41b1--be86--62ab365bebe5-osd--block--9c454450--886e--41b1--be86--62ab365bebe5 253:1 0 931.5G 0 lvm
  └─bcache3 252:384 0 930G 0 disk /

Frode Nordahl (fnordahl) changed the summary:
- Stuck with status message: 'ceph-osd/0* maintenance executing 0 10.246.115.255 (config-changed) Upgrading packages to None'
+ Stuck with status message: 'ceph-osd/0* maintenance executing (config-changed) Upgrading packages to None'
Drew Freiberger (afreiberger) wrote:

We ran into something similar on ceph-osd when upgrading the UCA from queens to rocky. One of the OSDs had died and its hardware had been replaced, but the old /var/lib/ceph/osd/ceph-XX directory still existed on one of the ceph-osd units.

We worked around this with the following steps. In our case the OSD was 97, so replace XX with the dead OSD number and X with the ceph-osd unit number.

systemctl stop jujud-unit-ceph-osd-X
cd /var/lib/ceph/osd
cd ceph-XX
ls -al block
# Ensure that the referenced block device does not exist on the system or in lvm, etc before continuing.
rm -rf *
cd /var/lib/ceph/osd
umount /var/lib/ceph/osd/ceph-XX
rmdir ceph-XX
systemctl start jujud-unit-ceph-osd-X
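
To find candidate directories before following the steps above, a read-only check along these lines may help. It is a sketch only: the function name `stale_osd_dirs` is hypothetical, and a dangling `block` symlink is just one signal. As the comment in the steps says, confirm that the device is really absent from the system and from LVM before deleting anything.

```python
import os


def stale_osd_dirs(base="/var/lib/ceph/osd"):
    """Return OSD directories under `base` whose 'block' symlink is dangling.

    Nothing is modified; the result is a sorted list of directory paths.
    """
    stale = []
    if not os.path.isdir(base):
        return stale
    for name in sorted(os.listdir(base)):
        osd_dir = os.path.join(base, name)
        block = os.path.join(osd_dir, "block")
        # islink() is true even for a dangling link, while exists()
        # follows the link and is false when the target is gone.
        if os.path.islink(block) and not os.path.exists(block):
            stale.append(osd_dir)
    return stale
```

Calling `stale_osd_dirs()` on a ceph-osd unit returns the paths to inspect by hand, for example with `ls -al block` as in the steps above.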

Changed in charm-ceph-osd:
status: New → Triaged
importance: Undecided → Medium