juju storage mounting itself over itself

Bug #1830228 reported by Junien Fridrick
This bug affects 6 people
Affects             Status        Importance  Assigned to  Milestone
Canonical Juju      Fix Released  High        Ian Booth
Canonical Juju 3.1  Fix Released  High        Ian Booth
Canonical Juju 3.2  Fix Released  High        Ian Booth

Bug Description

Hi,

I have a model running juju 2.5.4, MAAS provider. We're using juju storage:

$ juju storage -m controller
Unit                 Storage id  Type        Pool    Size   Status    Message
controller-backup/0  location/0  filesystem  rootfs  20GiB  attached

But juju (or _something_) keeps mounting the storage over itself, i.e.:

/dev/sda1 on /srv/controller-backups type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /var/lib/juju/storage/rootfs/0/0 type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /srv/controller-backups type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /var/lib/juju/storage/rootfs/0/0 type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /srv/controller-backups type ext4 (rw,relatime,data=ordered)

full listing at https://pastebin.ubuntu.com/p/YwyQg4KRWp/

More info about the model at https://pastebin.canonical.com/p/cmrqvT9JPd/ (sorry, Canonical-only link).

There's nothing relevant in the juju logs: https://pastebin.canonical.com/p/f2wWb26XcN/ (note, all the logs are from 201*8*)

Thanks

Tags: canonical-is
Revision history for this message
Junien Fridrick (axino) wrote :

Note: this could be from an old version, as we upgraded to 2.5.4 fairly recently.

Changed in juju:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Anastasia (anastasia-macmood) wrote :

@Junien Fridrick (axino),

This is fascinating! I wonder how it got into this state - I wish we had an audit log of the commands that were run...

Do you have any idea what a repro scenario would be?

Revision history for this message
Junien Fridrick (axino) wrote :

@anastasia-macmood,

Sadly, I don't know how to reproduce it. Is it normal that juju storage on MAAS just bind mounts the filesystem over itself? I don't see the point of doing that - it doesn't offer anything other than complexity, and it can potentially hide files.

Anyway, I figured I'd file this bug in case anyone had seen it before. We have a 2.6.2 controller that's not exhibiting this, although it was rebooted a few days ago and doesn't have anything mounted. So perhaps juju changed at some point and decided that, for now, mounting storage with the MAAS provider is not useful?

Revision history for this message
Joseph Phillips (manadart) wrote :
Revision history for this message
Benjamin Allot (ballot) wrote :

We have a jenkins unit randomly mounting its juju storage over itself once in a while.

I've currently disabled the jujud service to avoid it, but I'm happy to debug it if you want - just tell me what I can get you as a report.

I tried "juju-crashdump" but ...

$ juju-crashdump -o ballot/crashdump
Command "juju status --format=yaml" failed
Command "juju storage --format=yaml" failed
Traceback (most recent call last):
  File "/usr/local/bin/juju-crashdump", line 9, in <module>
    load_entry_point('jujucrashdump==0.0.0', 'console_scripts', 'juju-crashdump')()
  File "/usr/local/lib/python2.7/dist-packages/jujucrashdump/crashdump.py", line 407, in main
    filename = collector.collect()
  File "/usr/local/lib/python2.7/dist-packages/jujucrashdump/crashdump.py", line 286, in collect
    self.run_addons()
  File "/usr/local/lib/python2.7/dist-packages/jujucrashdump/crashdump.py", line 184, in run_addons
    juju_status = yaml.load(open('juju_status.yaml', 'r'))
IOError: [Errno 2] No such file or directory: 'juju_status.yaml'

Regards

Revision history for this message
Loïc Gomez (kotodama) wrote :

Hi,

I know there's not much data, but we're still having this issue with 2.9.18-ubuntu-amd64, on two distinct Jenkins units. For now we've disabled all juju agents on the unit going crazy.

Let us know if we can help in any way!

Thank you

Revision history for this message
Junien Fridrick (axino) wrote :

(To add to the comment above: we end up with a /proc/self/mountinfo of around 32k lines, which eventually makes the machine extremely slow, to the point where we get "BUG: soft lockup - CPU#2 stuck for 22s" messages on the console and the instance has to be rebooted.)

Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.9.23
Revision history for this message
Harry Pidcock (hpidcock) wrote :

JUJU-345

Changed in juju:
milestone: 2.9.23 → 2.9.24
Changed in juju:
milestone: 2.9.24 → 2.9.25
Changed in juju:
milestone: 2.9.25 → 2.9.26
Changed in juju:
milestone: 2.9.26 → 2.9.27
Changed in juju:
milestone: 2.9.27 → 2.9.28
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 2.9.28 → 2.9-next
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 2.9-next → 2.9.40
Changed in juju:
milestone: 2.9.40 → 2.9.41
Changed in juju:
milestone: 2.9.41 → 2.9.42
Changed in juju:
milestone: 2.9.42 → 2.9.43
Changed in juju:
milestone: 2.9.43 → 2.9.44
Changed in juju:
milestone: 2.9.44 → 2.9.45
Revision history for this message
Steve Langasek (vorlon) wrote :

Just saw this issue on a juju-deployed instance. The 'ubuntu' charm seems to have, for some reason, a /srv/data that is a bind mount of the rootfs. And there were *16383* bind mounts of the same path under /var/lib/juju/storage.

Noticed because the number of mounts accumulates over time and drastically slows down ssh into the system.

Revision history for this message
Junien Fridrick (axino) wrote :

@vorlon Interesting - which Juju version did you see it with?

Thanks

Revision history for this message
Steve Langasek (vorlon) wrote :

Junien, this is juju 2.9.44-ubuntu-amd64 on the client with juju 2.60.2 on the server.

It's been suggested this may be a problem with the specific revision of the ubuntu charm that we happened to deploy and that the problem is hidden in the current version.

Revision history for this message
Niels Robin-Aubertin (nrobinaubertin) wrote :

It is happening for us too.

We have a postgresql machine unit becoming extremely slow and inspection of /proc/self/mountinfo (when logged in as `ssh ubuntu@IP`) shows a lot of lines like `1098 263120 252:1 /var/lib/juju/storage/rootfs/0/0 /var/snap/charmed-postgresql/common rw,relatime shared:1 - ext4 /dev/vda1 rw,discard,errors=remount-ro`.

I can't get an exact count because grep -c hangs, but if I limit it to the first 5k lines of the file, 4968 match '/var/lib/juju/storage'.

juju client version: 2.9.44-ubuntu-amd64
juju controller version: 2.9.44
ubuntu (postgres machine unit): 22.04.2 LTS
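A streaming count avoids the hang Niels describes when grep has to chew through a multi-gigabyte mountinfo. A minimal Python sketch (the path and pattern are the ones from the comments above; the function name and `limit` cap are mine, purely illustrative):

```python
def count_storage_mounts(mountinfo_path="/proc/self/mountinfo",
                         needle="/var/lib/juju/storage", limit=None):
    """Count mountinfo lines mentioning needle, reading lazily so a
    huge (32k+ line) file never has to fit in memory at once."""
    count = 0
    with open(mountinfo_path) as f:
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break  # optional cap, mirroring the "first 5k lines" trick
            if needle in line:
                count += 1
    return count
```
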

Tom Haddon (mthaddon)
tags: added: canonical-is
Revision history for this message
Tom Haddon (mthaddon) wrote :

We've now seen the same thing in a different juju model using revision 189 from the 8.0/edge channel of the mysql charm.

Revision history for this message
John A Meinel (jameinel) wrote :

From the Juju side, we can see that we have code that calls AttachFilesystems, which had the contract that if source and target were stable, it should be a no-op.

As far as we can tell, the rootfs code has always been wrong with its check for 'is it already mounted'.

The assumption is that if you do:
 mount --bind $SOURCE $TARGET
that you would have
 df --output=source $TARGET
return (in the second line) $SOURCE

eg:
 $ mount --bind /var/lib/juju/data /srv/data
would then have:
 $ df --output=source /srv/data
 Filesystem
 /var/lib/juju/data

In practice, what you actually get is:
 Filesystem
 /dev/sda1

That is a bad assumption on our part, but has been a bad assumption for all of the life of juju.

The other issue that caused us to notice this is that there was a change between juju 2.9.30 and 2.9.44 (I have to dig to find the exact patch, but it was by Simon Richardson).

That patch caused us to be more persistent with mounts, because of some other issues (specifically the order that volumes/blocks show up was not guaranteed, so we needed to try again). That bug fix is causing us to wake up periodically (current period unknown) and essentially call 'mount --bind /source/ /target' over and over again.

To get a workaround, we just need to interrupt something.
Either

1) Put a `df` into the $PATH of `jujud` such that `df --output=source $TARGET` for a bind mount returns $SOURCE rather than SOURCE_DEVICE
2) Put a `mount` into the $PATH of `jujud` so that `mount --bind $SOURCE $TARGET` is a no-op when source is already bind mounted at $TARGET.

For (1) we still haven't found a command that would work for bind mounts, because reading /proc/mounts and 'df' always give the device, and we need to figure out the relative mount location.

https://unix.stackexchange.com/questions/18048/list-only-bind-mounts/346444#346444

seems to say that the information might be available from '/proc/self/mountinfo'.

From what I can tell, the issue is that you get 'mount the filesystem from this block device, taking into account this relative path' but you never have a way to unfold it unambiguously.

We did a test where '/home' was a mounted filesystem ('/dev/nvme0n1p3'); we then did 'mkdir a; mkdir b; mount --bind a b' and got:
198 29 259:3 / /home rw,relatime shared:117 - ext4 /dev/nvme0n1p3 rw
...
6314 198 259:3 /ian/a /home/ian/b rw,relatime shared:117 - ext4 /dev/nvme0n1p3 rw
6443 6314 259:3 /ian/a /home/ian/b rw,relatime shared:117 - ext4 /dev/nvme0n1p3 rw
6581 198 259:3 /ian/a /home/ian/a rw,relatime shared:117 - ext4 /dev/nvme0n1p3 rw

Which says that '/dev/nvme0n1p3' is mounted at '/home' from the '/' of its filesystem, and then mounted again at '/home/ian/b' with '/ian/a' as its relative filesystem.
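Mechanically, each mountinfo line splits at the '-' separator described in proc(5), with the root-within-filesystem in field 4 and the mount point in field 5. A small Python sketch of that parsing, and of spotting the duplicated entries shown above (function names are mine, not Juju's):

```python
def parse_mountinfo_line(line):
    """Split one /proc/self/mountinfo line into named fields (see proc(5)).
    Optional fields end at the '-' separator, so fstype/source follow it."""
    fields = line.split()
    sep = fields.index("-")
    return {
        "mount_id": fields[0],
        "parent_id": fields[1],
        "major_minor": fields[2],
        "root": fields[3],          # path within the filesystem, e.g. /ian/a
        "mount_point": fields[4],   # where it is mounted, e.g. /home/ian/b
        "fstype": fields[sep + 1],
        "source": fields[sep + 2],  # e.g. /dev/nvme0n1p3
    }

def duplicate_mounts(lines):
    """Return the (major:minor, root, mount point) keys that occur more
    than once - i.e. the same thing mounted over itself."""
    seen, dups = set(), set()
    for line in lines:
        if not line.strip():
            continue
        e = parse_mountinfo_line(line)
        key = (e["major_minor"], e["root"], e["mount_point"])
        if key in seen:
            dups.add(key)
        seen.add(key)
    return dups
```

Run over the four sample lines above, this flags only the repeated '/ian/a' → '/home/ian/b' entry; the self-mount at '/home/ian/a' appears once and is left alone.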

There are a few possibilities to give some answers, in increasing complexity but also increasing correctness.

1) 'mount --bind /home/ian/a /home/ian/b' looks at /proc/self/mountinfo and sees that there exists an exact entry for /home/ian/b already, and just refuses to do anything

2) 'mount --bind ...' looks at the same line, and sees that there is a relative directory mount (/ian/a) and a non-relative mount ('/'). It then tacks on '/ian/a' to '/h...


Revision history for this message
Tom Haddon (mthaddon) wrote :

Found some interesting logs in syslog: https://pastebin.canonical.com/p/vDHx8YNsQr/ (sorry, internal only). It looks to me like something happened here:
```
Sep 7 08:02:59 juju-9431ad-prod-events-db-7 snapd[247744]: storehelpers.go:773: cannot refresh:
Sep 7 08:02:59 juju-9431ad-prod-events-db-7 snapd[247744]: no snap revision available as specified: "charmed-postgresql"
```
and then shortly after that we get repeating messages like this:
```
Sep 7 08:31:56 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: start operation timed out. Terminating.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Killing process 1888670 (snapd) with signal SIGKILL.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Failed with result 'timeout'.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: Failed to start Snap Daemon.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Consumed 3min 43.451s CPU time.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Scheduled restart job, restart counter is at 3.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: Stopped Snap Daemon.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: snapd.service: Consumed 3min 43.451s CPU time.
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 systemd[1]: Starting Snap Daemon...
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 snapd[1889451]: overlord.go:272: Acquiring state lock file
Sep 7 08:33:27 juju-9431ad-prod-events-db-7 snapd[1889451]: overlord.go:277: Acquired state lock file
```
None since the instance was rebooted. I've given John access to the full syslog in case that's helpful.

Revision history for this message
Ian Booth (wallyworld) wrote :

I have a working prototype fix. I'll land something in the next day or so - there are a few things to tidy up first.

Changed in juju:
status: Triaged → In Progress
assignee: nobody → Ian Booth (wallyworld)
Revision history for this message
Ian Booth (wallyworld) wrote :
Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
John A Meinel (jameinel) wrote :

This is a 'mount' command that detects when it is being run in '--bind' mode, and otherwise just execs to the normal mount that can be found in $PATH.
It checks to see if the desired source and target appear to already exist, in which case it prints that the mount already exists and exits happily. In all other cases it just execs into mount and lets mount handle everything.

I did test on my machine, seeing that it did the right thing when asking to mount a relative path to another relative path, as well as absolute paths, as well as when there were nested mounts (a mounted to b, and a mounted to b/c).

I tested that I could install it as '/usr/local/bin/mount' and it would be executed and it would find the real mount.
I tested that it allowed mounting a directory to itself (which is allowed) but no further.

I'm not sure exactly where to put it in $PATH, but you should be able to look at the environment of the jujud process to find a location in PATH that jujud looks at but that wouldn't otherwise impact any other commands.
That said, it is programmed defensively to thunk to mount for just about everything, so is unlikely to break things if it was put in system paths.
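John's actual wrapper isn't attached here, but the behaviour he describes could be sketched roughly as follows. Everything in this sketch is an assumption (Python rather than whatever he used, `REAL_MOUNT`'s location, and a target-only check; his version also handles relative paths and matches the source):

```python
"""Rough sketch of the wrapper described above: intercept
'mount --bind SRC DST', skip the call when DST is already a mount
point, and exec the real mount for every other invocation."""
import os
import sys

REAL_MOUNT = "/bin/mount"  # assumed path of the real binary

def already_bound(dst, mountinfo="/proc/self/mountinfo"):
    """Return True if dst already appears as a mount point in mountinfo."""
    dst = os.path.realpath(dst)
    with open(mountinfo) as f:
        for line in f:
            fields = line.split()
            # fields[4] is the mount point; a fuller check would also
            # compare fields[3] (the root within the filesystem) to SRC.
            if len(fields) > 4 and fields[4] == dst:
                return True
    return False

def main(argv):
    if len(argv) >= 4 and argv[1] == "--bind" and already_bound(argv[3]):
        print("%s is already mounted; skipping" % argv[3])
        return 0
    # Thunk to the real mount for everything else.
    os.execv(REAL_MOUNT, [REAL_MOUNT] + argv[1:])

# Installed as e.g. /usr/local/sbin/mount, it would end with:
#   if __name__ == "__main__":
#       sys.exit(main(sys.argv))
```

Note the target-only check is looser than John's description: it would also skip a first self-mount of an existing mount point, so treat it as an illustration of the mountinfo lookup, not a drop-in replacement.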

Revision history for this message
Alex Lutay (taurus) wrote :

Thank you for the lightning fix in Juju 2.9.45!

Is it possible to include it in 3.1.6 as well?
The 3.1.6 is the primary Juju version for Data team: tests/demos/recommendations/...

Tnx!

Revision history for this message
Ian Booth (wallyworld) wrote :

Yes, it will be in 3.1.6 also :-)

Revision history for this message
Steve Langasek (vorlon) wrote :

Have deployed this workaround as /usr/local/sbin/mount; after 24 hours, there are no redundant mounts:

$ mount|grep srv
/dev/vda1 on /srv/data type ext4 (rw,relatime,discard,errors=remount-ro)
$ mount|grep juju
$

Changed in juju:
status: Fix Committed → Fix Released