mdcheck_start.service trying to start unexisting file
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
mdadm (Debian) | Fix Released | Unknown | |
mdadm (Ubuntu) | Fix Released | High | Eric Desrochers |
Bionic | Won't Fix | Undecided | Unassigned |
Focal | Fix Released | High | Eric Desrochers |
Bug Description
[Impact]
The mdadm package is missing the mdcheck script. This has two consequences:
In the immediate term, that means that we get failed systemd units on all of our physical machines (because they have mirrored disks) as we upgrade them to 20.04. This raises alarms in our monitoring system as we monitor systemd unit failures.
In the longer-term, this means that the arrays are not being checked. If a drive develops a bad sector, this would normally be caught by the checking and a good copy would be rewritten from the other side of the mirror. Without the checking, that will not happen. If the other drive (the one with the good version of the sector) dies, then that sector's data is lost permanently. The consequences of that depend on what that sector was storing, but it's not good, obviously.
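For reference, the kind of check described above can also be requested by hand through the kernel's md sysfs interface (`sync_action` and `mismatch_cnt` under `/sys/block/<array>/md/`). The following is only a minimal sketch of that idea, not what the mdcheck script itself does; the function names and the `SYSFS_ROOT` override are hypothetical and exist only so the sketch can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: request a consistency check on one md array via sysfs.
# SYSFS_ROOT is a hypothetical override for experimentation; on a real
# system it is left unset and /sys is used.
md_request_check() {
    dev="$1"                                    # array name, e.g. md0
    md_dir="${SYSFS_ROOT:-/sys}/block/$dev/md"
    [ -d "$md_dir" ] || { echo "no such md array: $dev" >&2; return 1; }
    # Only start a check when the array is currently idle.
    state=$(cat "$md_dir/sync_action")
    [ "$state" = "idle" ] || { echo "$dev busy: $state" >&2; return 2; }
    echo check > "$md_dir/sync_action"
}

# Print the mismatch counter the kernel keeps for the array.
md_mismatch_count() {
    cat "${SYSFS_ROOT:-/sys}/block/$1/md/mismatch_cnt"
}
```

On a real machine this would be run as root, for example `md_request_check md0`, with progress visible in /proc/mdstat.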
[Test Case]
* systemctl start mdcheck_
* journalctl -u mdcheck_start
-- Logs begin at Wed 2020-09-23 18:33:35 UTC, end at Wed 2020-09-23 18:40:27 UTC. --
Sep 23 18:40:27 mdadmgroovy systemd[1]: Starting MD array scrubbing...
Sep 23 18:40:27 mdadmgroovy systemd[1515]: mdcheck_
Sep 23 18:40:27 mdadmgroovy systemd[1515]: mdcheck_
Sep 23 18:40:27 mdadmgroovy systemd[1]: mdcheck_
Sep 23 18:40:27 mdadmgroovy systemd[1]: mdcheck_
Sep 23 18:40:27 mdadmgroovy systemd[1]: Failed to start MD array scrubbing.
* ls -altr /usr/share/
ls: cannot access '/usr/share/
* dpkg -l mdadm
ii mdadm 4.1-5ubuntu1 amd64 tool to administer Linux MD arrays (software RAID)
* dpkg -L mdadm | grep -i mdcheck
/lib/systemd/
/lib/systemd/
/lib/systemd/
/lib/systemd/
* Also, we'd like to see whether mdcheck is performed under the 'natural' scheduled execution (so on the nearest Sunday), and have impacted users report feedback supported with logs.
* We found a regression fixed upstream:
https:/
* We then found a regression fix for the above regression fix, pushed it into groovy, and then submitted it upstream to the linux-raid ML:
https:/
* We'd like to see whether, when mdcheck_start is enabled, mdcheck_continue gets enabled too.
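The core of the test case above (a unit whose ExecStart= points at a file the package never shipped) can be checked generically. The sketch below is an illustration only: `unit_missing_execs` is a hypothetical helper, the unit file path is supplied by the caller, and the parsing deliberately handles just the simple single-line `ExecStart=` form.

```shell
# Print every ExecStart= program in a unit file that does not exist on disk.
# The unit file path comes from the caller; nothing here is mdadm-specific.
unit_missing_execs() {
    unit_file="$1"
    # Take the first word after "ExecStart=", dropping systemd's optional
    # prefix characters (-, @, :, +, !).
    sed -n 's/^ExecStart=[-@:+!]*\([^[:space:]]*\).*/\1/p' "$unit_file" |
    while IFS= read -r prog; do
        [ -n "$prog" ] || continue      # "ExecStart=" alone resets the list
        [ -e "$prog" ] || echo "$prog"  # report programs that are missing
    done
}
```

An empty result means every referenced program exists; any printed path is a candidate for the failure shown in the journal output above.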
[Regression Potential]
* 'misc/mdcheck' will be introduced in Ubuntu for the first time, and is pretty young in the Debian mdadm story too (introduced on Sept 12 2020).
No known fix has been added on top of it since Debian introduced it roughly two weeks ago.
$ git log --oneline --grep="mdcheck"
5a3db0f Install misc/mdcheck; turn on hardening; enable dh_lintian. (Closes: #960132)
f258a5e mdcheck: improve cleanup
ea83549 mdcheck: add some logging.
979b1fe mdcheck: be careful when sourcing the output of "mdadm --detail --export"
36dab45 mdcheck: don't git error if not /dev/md?* devices exist.
868ab80 mdcheck: don't pass the '+' to "date".
df881f7 mdcheck: new script to help with regular checks of md arrays.
No new bugs related to the mdcheck introduction have been opened either.
From code inspection, the 'mdcheck' script seems harmless (at least at first glance). Of course, real-world testing across the various RAID types will be needed during the verification phase before concluding, and, if possible, running the script in debug mode (set -xv) might be a good idea to see its workflow in action.
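One way to capture such a debug run without editing the script is to invoke it through bash with the -x and -v flags and keep the trace in a file. This is a minimal sketch that assumes the script is a bash script; `trace_script` is a hypothetical helper, and both the script path and the log path are supplied by the caller.

```shell
# Run a script with bash's -x (trace each command) and -v (echo source
# lines as they are read) flags, saving the trace for later inspection.
# No mdadm path is assumed here; the caller passes the script to trace.
trace_script() {
    script="$1"       # path of the script to trace
    trace_log="$2"    # file that receives the -xv output (stderr)
    shift 2
    bash -xv "$script" "$@" 2> "$trace_log"
}
```

Any remaining arguments are passed through to the traced script unchanged, and the script's normal output still goes to stdout.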
This change will permit 'mdcheck' to be run on the first Sunday of each month for 6 hours (mdcheck_
It's not a script that one would typically run manually on a regular basis.
The script uses 'logger' to write messages to the system log, so we will have a trace of its execution (in addition to the usual systemd unit and timer logs) when it begins, pauses and continues. My upload also adds a patch that makes mdcheck log its completion, so users can tell how long the RAID check took, which I think is paramount information to include with the introduction of this script in Ubuntu.
I would suggest we don't release the package to focal-updates before having at least one sample of a 'natural' scheduled execution on the first Sunday of the month (the next should be October 4th?), and before impacted users have reported feedback supported with logs.
I think running it on Sunday is reasonable (just like fstrim, zfs scrub, ...). Sunday is typically the day when cron jobs and timers run maintenance tasks like that.
One thing I would like to confirm, though it may not be a blocker for this case, is that 'mdcheck_continue' starts fine when its conditions are met, since it has never been tested because 'mdcheck_start' failed on the missing 'mdcheck' script.
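To double-check which dates qualify as the first Sunday of a month, a small helper along these lines can be used. It is a sketch only: `is_first_sunday` is a hypothetical name, and it relies on GNU date (the -d option and the %-d format are GNU extensions).

```shell
# Succeed (exit 0) when the given YYYY-MM-DD date is the first Sunday of
# its month; exit 1 otherwise, or 2 if the date cannot be parsed.
is_first_sunday() {
    dow=$(date -d "$1" +%u) || return 2    # 1 = Monday ... 7 = Sunday
    dom=$(date -d "$1" +%-d) || return 2   # day of month, no zero padding
    [ "$dow" -eq 7 ] && [ "$dom" -le 7 ]
}
```

For example, `is_first_sunday 2020-10-04 && echo yes` prints "yes", which agrees with the October 4th date suggested above.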
[Other Info]
Debian bug:
https:/
salsa commit:
https:/
[Original Description]
mdcheck_
root@d:~# cat /lib/systemd/
ExecStart=
root@d:~# ls -la /usr/share/
ls: cannot access '/usr/share/
ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: mdadm 4.1-2ubuntu3
ProcVersionSign
Uname: Linux 5.3.0-19-generic x86_64
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
Date: Fri Nov 15 13:13:17 2019
Lspci: Error: [Errno 2] No such file or directory: 'lspci': 'lspci'
Lsusb: Error: [Errno 2] No such file or directory: 'lsusb': 'lsusb'
MachineType: HP HP EliteBook x360 1030 G3
ProcEnviron:
LANG=C
TERM=screen
PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=
ProcMDstat:
Personalities :
unused devices: <none>
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/07/2019
dmi.bios.vendor: HP
dmi.bios.version: Q90 Ver. 01.08.01
dmi.board.name: 8438
dmi.board.vendor: HP
dmi.board.version: KBC Version 14.3F.00
dmi.chassis.
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:
dmi.product.family: 103C_5336AN HP EliteBook x360
dmi.product.name: HP EliteBook x360 1030 G3
dmi.product.sku: 5SR46ES#ACB
dmi.sys.vendor: HP
etc.blkid.tab: Error: [Errno 2] No such file or directory: '/etc/blkid.tab'
initrd.files: Error: [Errno 2] No such file or directory: '/boot/
tags: | added: focal |
tags: | added: groovy |
Changed in mdadm (Debian): | |
status: | Unknown → New |
Changed in mdadm (Ubuntu): | |
assignee: | nobody → Eric Desrochers (slashd) |
status: | Confirmed → In Progress |
description: | updated |
Changed in mdadm (Ubuntu): | |
importance: | Undecided → High |
Changed in mdadm (Ubuntu Focal): | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → Eric Desrochers (slashd) |
Changed in mdadm (Ubuntu Bionic): | |
status: | New → Won't Fix |
tags: | added: seg sts removed: sts-sponsors-slashd |
tags: | removed: eoan |
Changed in mdadm (Debian): | |
status: | New → Fix Released |
Status changed to 'Confirmed' because the bug affects multiple users.