Activity log for bug #1856871

Date Who What changed Old value New value Message
2019-12-18 17:01:34 Eric Desrochers bug added bug
2019-12-18 17:01:58 Eric Desrochers bug task added udev (Ubuntu)
2019-12-18 17:03:09 Eric Desrochers description (updated; the previous revision lacked the note that the stats already exist right after boot and mislabelled the loop3 stat file as /sys/block/loop2/stat) New value:

This is reproducible in Bionic and later. Here's an example running 'focal':

$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic

How to trigger it:

$ sosreport -o block

or, more precisely, the command causing the situation inside the block plugin:

$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52

But if I run it on the next-next unused loop device, in this case /dev/loop3 (which is also unused), there are no errors. While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device. What differentiates loop2 from loop3 and any other unused ones? Three things I have noticed so far:

* The loop device needs to be the next unused loop device (losetup -f).
* A reboot is needed (if some loop modification (snap install, loop mount, ...) has been made at runtime).
* loop2 (or whatever the next unused one is) has non-zero stats, as opposed to the other unused loop devices. The stats exist already, right after the system boots, for the next unused loop device.

/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0

while /dev/loop3 doesn't:

/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0

Explanation of each column: https://www.kernel.org/doc/html/latest/block/stat.html

This tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it cleanly. If loop2 is generating errors and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device. If I query loop3 with 'parted' right after, there are no errors. If I reboot and query loop3 again, then I will get an error. Triggering the errors requires a reboot, and it only impacts the first unused loop device available (losetup -f).

This was tested with focal/systemd, which is very close to the latest upstream code. It has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I suspect a userspace misbehaviour dealing with loop devices at boot.
2019-12-18 17:05:41 Eric Desrochers description (updated: added the meaning of the non-zero stat columns (2 = number of read I/Os processed, 10 = number of sectors read, 1 = number of write I/Os processed) and replaced the Google Meet redirect with the direct link https://www.kernel.org/doc/html/latest/block/stat.html; otherwise unchanged)
2019-12-18 17:07:44 Eric Desrochers tags sts
2019-12-18 17:20:56 Eric Desrochers description (updated: added the error message 'blk_update_request: I/O error, dev loop2, sector 0', noted that the v5.5 kernel tested was a mainline build, and reworded the observations, e.g. the release at boot may not have been clean enough and the misbehaviour may involve loop or block devices; otherwise unchanged)
2019-12-18 17:23:29 Eric Desrochers description (updated with no substantive change; the old and new values are identical apart from formatting)
2019-12-18 17:36:42 Eric Desrochers description (updated: the trigger command was corrected from 'parted -s /dev/$(losetup -f) unit s print' to 'parted -s $(losetup -f) unit s print', since losetup -f already prints the full /dev path; otherwise unchanged)
2019-12-18 17:39:50 Dan Streetman bug added subscriber Dan Streetman
2019-12-18 18:08:14 Eric Desrochers bug watch added https://github.com/sosreport/sos/issues/1897
2019-12-18 18:13:37 Eric Desrochers description (updated: added the error message seen on more recent kernels: 'kernel: [18135.185709] blk_update_request: I/O error, dev loop18, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0'; otherwise unchanged)
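For convenience, the reproduction steps from the description above can be condensed into a few commands. A minimal sketch, assuming a freshly booted system; the next unused loop device name (loop2 in the report) will vary:

$ DEV=$(losetup -f)                          # next unused loop device, e.g. /dev/loop2
$ cat /sys/block/$(basename $DEV)/stat       # non-zero counters right after boot (the anomaly)
$ sudo parted -s $DEV unit s print           # triggers the I/O error on this device only
$ sudo dmesg | grep "dev $(basename $DEV)"   # expect: blk_update_request: I/O error, dev loopN, sector 0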
2019-12-18 19:21:49 Eric Desrochers bug task added snapd (Ubuntu)
2019-12-18 23:05:49 Kai Kasurinen bug added subscriber Kai Kasurinen
2019-12-19 12:53:51 John Lenton snapd (Ubuntu): status New Invalid
2019-12-19 17:04:09 Eric Desrochers bug task added linux (Ubuntu)
2019-12-19 17:30:09 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-12-25 18:23:53 Eric Desrochers bug task added parted (Ubuntu)
2020-03-30 20:48:53 Dan Streetman systemd (Ubuntu): status New Invalid
2020-08-06 21:12:28 Kai Kasurinen udev (Ubuntu): status New Invalid
2021-02-22 16:50:57 Mauricio Faria de Oliveira linux (Ubuntu): status Incomplete In Progress
2021-02-22 16:51:03 Mauricio Faria de Oliveira linux (Ubuntu): importance Undecided Medium
2021-02-22 16:51:09 Mauricio Faria de Oliveira linux (Ubuntu): assignee Mauricio Faria de Oliveira (mfo)
2021-02-22 16:51:15 Mauricio Faria de Oliveira parted (Ubuntu): status New Invalid
2021-02-22 16:53:58 Mauricio Faria de Oliveira linux (Ubuntu): assignee Mauricio Faria de Oliveira (mfo) Eric Desrochers (slashd)
2021-05-25 11:18:37 Eric Desrochers linux (Ubuntu): assignee Eric Desrochers (slashd)
2023-03-14 23:06:11 Mauricio Faria de Oliveira bug added subscriber Mauricio Faria de Oliveira
2023-03-14 23:07:08 Mauricio Faria de Oliveira nominated for series Ubuntu Jammy
2023-03-14 23:07:08 Mauricio Faria de Oliveira bug task added parted (Ubuntu Jammy)
2023-03-14 23:07:08 Mauricio Faria de Oliveira bug task added udev (Ubuntu Jammy)
2023-03-14 23:07:08 Mauricio Faria de Oliveira bug task added linux (Ubuntu Jammy)
2023-03-14 23:07:08 Mauricio Faria de Oliveira bug task added systemd (Ubuntu Jammy)
2023-03-14 23:07:08 Mauricio Faria de Oliveira bug task added snapd (Ubuntu Jammy)
2023-03-14 23:07:27 Mauricio Faria de Oliveira linux (Ubuntu): status In Progress Fix Released
2023-03-14 23:08:12 Mauricio Faria de Oliveira linux (Ubuntu Jammy): status New Fix Released
2023-03-14 23:08:24 Mauricio Faria de Oliveira parted (Ubuntu Jammy): status New Invalid
2023-03-14 23:08:43 Mauricio Faria de Oliveira bug task deleted parted (Ubuntu Jammy)
2023-03-14 23:08:52 Mauricio Faria de Oliveira bug task deleted snapd (Ubuntu Jammy)
2023-03-14 23:09:01 Mauricio Faria de Oliveira bug task deleted udev (Ubuntu Jammy)
2023-03-14 23:09:14 Mauricio Faria de Oliveira bug task deleted systemd (Ubuntu Jammy)
2023-03-20 14:17:02 Dan Streetman removed subscriber Dan Streetman
2023-03-29 20:25:43 Jorge Merlino description (updated: appended the SRU template below to the existing description)

[Impact]

* There's an I/O error on fsync() on a detached loop device if it has previously been attached. The issue is that the write cache is enabled in the attach path in loop_configure() but is not disabled in the detach path; thus it remains enabled on the block device regardless of whether it is attached or not.

* fsync() on detached loop devices can be called by partition tools and by commands run by sosreport, so the unexpected kernel error message might surprise users or even distract from the actual issue being investigated. It might also trigger alerts in logging/monitoring/alerting stacks.

[Fix]

* Disable the write cache in the detach path.

[Test Plan]

* Attach and detach an image to a loop device and check the fsync() return value afterwards:

# DEV=/dev/loop7
# IMG=/tmp/image
# truncate --size 1M $IMG
# losetup $DEV $IMG
# losetup -d $DEV

Before:

# strace -e fsync parted -s $DEV print 2>&1 | grep fsync
fsync(3) = -1 EIO (Input/output error)
Warning: Error fsyncing/closing /dev/loop7: Input/output error
[ 982.529929] blk_update_request: I/O error, dev loop7, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0

After:

# strace -e fsync parted -s $DEV print 2>&1 | grep fsync
fsync(3) = 0

[Where problems could occur]

* The detach path for block devices is modified. The worst-case scenario would be an error when detaching loop devices.
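The write-cache state described in [Impact] can also be observed from userspace. A minimal sketch, assuming the queue/write_cache sysfs attribute (available on modern kernels) reflects the setting that loop_configure() enables; device and image names follow the test plan above:

# cat /sys/block/loop7/queue/write_cache     # before first attach
write through
# losetup /dev/loop7 /tmp/image              # attach enables the write cache
# cat /sys/block/loop7/queue/write_cache
write back
# losetup -d /dev/loop7                      # detach; without the fix the setting remains
# cat /sys/block/loop7/queue/write_cache
write back                                   # with the fix applied, this should read 'write through'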
2023-03-29 20:26:00 Jorge Merlino linux (Ubuntu): milestone focal-updates
2023-03-29 20:26:13 Jorge Merlino linux (Ubuntu): milestone focal-updates
2023-03-29 20:26:59 Jorge Merlino nominated for series Ubuntu Focal
2023-03-29 20:26:59 Jorge Merlino bug task added parted (Ubuntu Focal)
2023-03-29 20:26:59 Jorge Merlino bug task added udev (Ubuntu Focal)
2023-03-29 20:26:59 Jorge Merlino bug task added linux (Ubuntu Focal)
2023-03-29 20:26:59 Jorge Merlino bug task added systemd (Ubuntu Focal)
2023-03-29 20:26:59 Jorge Merlino bug task added snapd (Ubuntu Focal)
2023-03-29 20:27:20 Jorge Merlino bug task deleted udev (Ubuntu Focal)
2023-03-29 20:27:27 Jorge Merlino bug task deleted systemd (Ubuntu Focal)
2023-03-29 20:27:33 Jorge Merlino bug task deleted snapd (Ubuntu Focal)
2023-03-29 20:27:43 Jorge Merlino bug task deleted parted (Ubuntu Focal)
2023-03-29 20:27:59 Jorge Merlino linux (Ubuntu Focal): status New In Progress
2023-03-29 20:28:04 Jorge Merlino linux (Ubuntu Focal): assignee Jorge Merlino (jorge-merlino)
2023-03-30 07:35:04 Stefan Bader linux (Ubuntu Focal): importance Undecided Medium
2023-03-30 07:35:12 Stefan Bader linux (Ubuntu Jammy): importance Undecided Medium
2023-03-30 20:08:13 Jorge Merlino nominated for series Ubuntu Bionic
2023-03-30 20:08:13 Jorge Merlino bug task added parted (Ubuntu Bionic)
2023-03-30 20:08:13 Jorge Merlino bug task added udev (Ubuntu Bionic)
2023-03-30 20:08:13 Jorge Merlino bug task added linux (Ubuntu Bionic)
2023-03-30 20:08:13 Jorge Merlino bug task added systemd (Ubuntu Bionic)
2023-03-30 20:08:13 Jorge Merlino bug task added snapd (Ubuntu Bionic)
2023-03-30 20:08:21 Jorge Merlino bug task deleted parted (Ubuntu Bionic)
2023-03-30 20:08:28 Jorge Merlino bug task deleted snapd (Ubuntu Bionic)
2023-03-30 20:08:36 Jorge Merlino bug task deleted systemd (Ubuntu Bionic)
2023-03-30 20:08:43 Jorge Merlino bug task deleted udev (Ubuntu Bionic)
2023-03-30 20:08:56 Jorge Merlino linux (Ubuntu Bionic): status New In Progress
2023-03-30 20:08:59 Jorge Merlino linux (Ubuntu Bionic): assignee Jorge Merlino (jorge-merlino)
2023-04-06 13:22:28 Stefan Bader linux (Ubuntu Focal): status In Progress Fix Committed
2023-04-06 13:22:32 Stefan Bader linux (Ubuntu Bionic): importance Undecided Medium
2023-04-19 17:46:33 Ubuntu Kernel Bot tags sts kernel-spammed-focal-linux sts verification-needed-focal
2023-04-20 20:27:32 Jorge Merlino tags kernel-spammed-focal-linux sts verification-needed-focal kernel-spammed-focal-linux sts verification-done-focal
2023-05-11 13:00:15 Luke Nowakowski-Krijger linux (Ubuntu Bionic): status In Progress Fix Committed
2023-05-22 09:24:28 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2023-05-22 09:24:28 Launchpad Janitor cve linked 2023-1075
2023-05-22 09:24:28 Launchpad Janitor cve linked 2023-1118
2023-05-22 22:11:36 Ubuntu Kernel Bot tags kernel-spammed-focal-linux sts verification-done-focal kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-needed-focal
2023-05-23 14:49:16 Jorge Merlino tags kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-needed-focal kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-done-focal
2023-06-06 12:31:40 Ubuntu Kernel Bot tags kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-done-focal kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-bluefield sts verification-needed-focal
2023-06-06 12:52:45 Ubuntu Kernel Bot tags kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-bluefield sts verification-needed-focal kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-azure kernel-spammed-focal-linux-bluefield sts verification-needed-focal
2023-10-11 01:08:50 Nobuto Murata bug added subscriber Nobuto Murata