os-prober blocks writes to raw partitions

Bug #1384062 reported by Laurent GUERBY
32
This bug affects 5 people
Affects Status Importance Assigned to Milestone
ceph (Juju Charms Collection)
Invalid
Undecided
Unassigned
ceph (Ubuntu)
Invalid
Medium
Unassigned
os-prober (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

This morning automatic package upgrade on our running system:

libsigc++-2.0-0c2a,libssl1.0.0,man-db,libgtk2.0-common,libgtk2.0-bin,
libgtk2.0-0,openssh-sftp-server,openssh-server,
openssh-client,grub-pc,grub-pc-bin,grub2-common,grub-common,openssl,python-cryptography,python-pygraphviz

killed five OSD out of 15 on our ceph 0.80.6 cluster of 5 machines :

root@g2:/var/log/ceph# grep -E ^2014-10-22 ceph-osd.7.log
2014-10-22 07:41:36.783358 7f4d33d55700 -1 journal FileJournal::write_bl : write_fd failed: (1) Operation not permitted
2014-10-22 07:41:36.783617 7f4d33d55700 -1 journal FileJournal::do_write: write_bl(pos=793935872) failed
2014-10-22 07:41:36.800201 7f4d33d55700 -1 os/FileJournal.cc: In function 'void FileJournal::do_write(ceph::bufferlist&)' thread 7f4d33d55700 time 2014-10-22 07:41:36.783629
2014-10-22 07:41:36.847389 7f4d33d55700 -1 *** Caught signal (Aborted) **

root@n7:/var/log/ceph# grep -E ^2014-10-22 ceph-osd.10.log|cut -c-120
2014-10-22 07:42:18.169142 7f9b977df700 -1 journal FileJournal::write_bl : write_fd failed: (1) Operation not permitted

root@n7:/var/log/ceph# grep -E ^2014-10-22 ceph-osd.9.log|cut -c-120
2014-10-22 07:42:17.509579 7f6efa27b700 -1 osd.9 15390 heartbeat_check: no reply from osd.13 since back 2014-10-22 07:41
2014-10-22 07:42:17.509593 7f6efa27b700 -1 osd.9 15390 heartbeat_check: no reply from osd.14 since back 2014-10-22 07:41
2014-10-22 07:42:17.945433 7f6ef6a74700 -1 journal FileJournal::do_write: pwrite(fd=23, hbp.length=4096) failed :(1) Ope
2014-10-22 07:42:17.960678 7f6ef6a74700 -1 os/FileJournal.cc: In function 'void FileJournal::do_write(ceph::bufferlist&)

root@stri:/var/log/ceph# grep -E ^2014-10-22 ceph-osd.13.log
2014-10-22 00:42:01.140574 7fa929b8a700 -1 journal FileJournal::write_bl : write_fd failed: (1) Operation not permitted
2014-10-22 00:42:01.141439 7fa929b8a700 -1 journal FileJournal::do_write: write_bl(pos=3496448000) failed

root@stri:/var/log/ceph# grep -E ^2014-10-22 ceph-osd.14.log
2014-10-22 00:41:54.828719 7f438eb45700 -1 osd.14 15388 heartbeat_check: no reply from osd.7 since back 2014-10-22 00:41:34.499777 front 2014-10-22 00:41:34.499777 (cutoff 2014-10-22 00:41:34.828717)
2014-10-22 00:41:55.241586 7f437217f700 0 -- 192.168.99.246:6811/17136 >> 192.168.99.253:6806/25800 pipe(0x7f439f5fd900 sd=182 :6811 s=0 pgs=0 cs=0 l=0 c=0x7f43a71f1180).accept connect_seq 34 vs existing 33 state standby
2014-10-22 00:42:01.235014 7f438b33e700 -1 journal FileJournal::write_bl : write_fd failed: (1) Operation not permitted
2014-10-22 00:42:01.235032 7f438b33e700 -1 journal FileJournal::do_write: write_bl(pos=4626878464) failed

The OSD all died just after a run of os-prober according to the logs:

Oct 22 07:41:36 g2 os-prober: debug: running /usr/lib/os-probes/mounted/05efi on mounted /dev/sda1

os-prober likely did an operation on the journal partition causing the write errors on the OSD.

Revision history for this message
James Page (james-page) wrote :

I did a quick test and was not able to reproduce - but my test environment is virtual so that may be making a difference.

Changed in ceph (Juju Charms Collection):
status: New → Invalid
Changed in ceph (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

I can reproduce this on Wily with

# apt-cache policy ceph
ceph:
  Installed: 0.94.5-0ubuntu0.15.10.1
  Candidate: 0.94.5-0ubuntu0.15.10.1
  Version table:
 *** 0.94.5-0ubuntu0.15.10.1 0
        500 http://eu.archive.ubuntu.com/ubuntu/ wily-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.94.3-0ubuntu2 0
        500 http://eu.archive.ubuntu.com/ubuntu/ wily/main amd64 Packages

when I configure two OSDs both with their journal partition on a third disk. Running os-prober makes the second OSD fail with

     0> 2016-02-16 15:15:17.604773 7fe2c7ec8700 -1 os/FileJournal.cc: In function 'void FileJournal::write_finish_thread_entry()' thread 7fe2c7ec8700 time 2016-02-16 15:15:17.603014
os/FileJournal.cc: 1426: FAILED assert(0 == "unexpected aio error")

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x556db4ce44f0]
 2: (FileJournal::write_finish_thread_entry()+0x775) [0x556db4bb5925]
 3: (FileJournal::WriteFinisher::entry()+0xd) [0x556db4a990ed]
 4: (()+0x76aa) [0x7fe2d32656aa]
 5: (clone()+0x6d) [0x7fe2d173ceed]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

which seems to indicate that os-prober is doing unsafe things to the journal partitions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ceph (Ubuntu):
status: New → Confirmed
Changed in os-prober (Ubuntu):
status: New → Confirmed
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

Forgot to mention that the Ceph cluster has to be under write load in order to reproduce, i.e. running something like

rados -p rbd bench 600 write -t 1 --show-time --run-length 60

There is no effect of running os-prober if the cluster is idle. Based with that information, though, I can also reproduce the issue by running fio on some partition and os-prober in parallel, getting:

# fio --ioengine=libaio --filename=/dev/sdc4 --bs=64k --rw=randwrite --runtime=300 --size=1G --direct=1 --iodepth=8 --name=a
a: (g=0): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=8
fio-2.2.13
Starting 1 process
fio: io_u error on file /dev/sdc4: Operation not permitted: write offset=531300352, buflen=65536
fio: pid=17543, err=1/file:io_u.c:1596, func=io_u error, error=Operation not permitted

So I think the error has nothing to do with Ceph in particular, but really os-prober should be made more conservative when trying to probe partitions.

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

Function ro_partition in /usr/share/os-prober/common.sh makes a partition read-only. This is called in /usr/lib/os-probes/50mounted-tests before trying to mount the partition being probed with all possible fs-types. Obviously this will break all other processes trying to write to this partition at that time.

summary: - os-prober kills ceph OSD
+ os-prober blocks writes to raw partitions
Changed in ceph (Ubuntu):
status: Confirmed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.