storage gone after attaching/detaching

Bug #1955857 reported by Erik Lönroth
This bug affects 2 people
Affects: Canonical Juju
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

I'm experiencing an issue with Juju storage.

Cloud: AWS
jaas version: 2.9.18
juju version: 2.9.22-ubuntu-amd64 (snap)

I deploy a charm which has storage defined.

After attaching and detaching a few times, I suddenly ended up with my storage missing:

$ juju storage
Unit  Storage id  Type        Pool  Size    Status     Message
      logdata/0   filesystem  ebs   1.0GiB  attaching  mount failed: mount: /logs: special device /dev/nvme1n1 does not exist.: exit status 32

Here is my charm code: https://github.com/erik78se/juju-operators-examples/blob/main/storage-filesystem/src/charm.py

The steps leading up to the problem are as the picture shows, but in general it reproduces like this:

1. The charm deploys and gets its storage attached.
2. I detach the storage from unit/0.
3. I attach the same storage to unit/0.
... repeat a few times.

... error occurs.
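The steps above can be scripted roughly as follows. This is only a sketch: the unit name `unit/0` and storage ID `logdata/0` come from this report, while the cycle count, the pause between commands, and the helper names `cycle_commands` and `run_cycles` are assumptions made for illustration. It relies on the `juju detach-storage` and `juju attach-storage` commands being available on the PATH.

```python
import subprocess
import time


def cycle_commands(unit, storage_id):
    """Return the detach/attach command pair for one cycle."""
    return [
        ["juju", "detach-storage", storage_id],
        ["juju", "attach-storage", unit, storage_id],
    ]


def run_cycles(unit, storage_id, cycles=5, pause=10.0,
               run=subprocess.run, sleep=time.sleep):
    """Detach and re-attach the storage `cycles` times.

    `pause` (an assumed value) gives Juju and AWS some time to settle
    after each command. Returns the list of commands that were issued.
    """
    issued = []
    for _ in range(cycles):
        for cmd in cycle_commands(unit, storage_id):
            run(cmd, check=True)  # raises CalledProcessError if juju fails
            issued.append(cmd)
            sleep(pause)
    return issued


if __name__ == "__main__":
    # Names taken from this report; adjust to your own deployment.
    run_cycles("unit/0", "logdata/0")
```

Running `juju storage` between cycles should show whether the storage is stuck in "attaching".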

Nothing in the logs.

I can find the volume in the AWS volume list, so the volume is not gone, but Juju is stuck in "attaching".

If I manually attach it from the console, the disk appears as nvme1n1 (as expected) but Juju has no clue about it and the error remains.

If I then run "juju attach-storage" again, it seems to work.

However, the error returns again.

Tags: aws storage
Ian Booth (wallyworld) wrote :

At first glance, it looks like a race condition / AWS eventual consistency issue.

When the volume is attached manually from the AWS console, Juju will not know about the block device immediately, but it should eventually, since the machine agent polls for block devices and reports them back to the controller.
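To illustrate the general idea of polling for a block device, here is a minimal sketch. It is not Juju's actual machine agent code: the function name `wait_for_device`, the timeout, and the polling interval are assumptions chosen for the example. It simply checks whether a device path such as /dev/nvme1n1 exists until a deadline passes.

```python
import os
import time


def wait_for_device(path, timeout=60.0, interval=1.0,
                    exists=os.path.exists, clock=time.monotonic,
                    sleep=time.sleep):
    """Poll until `path` exists.

    Returns True if the device appeared before `timeout` seconds
    elapsed, False otherwise. The path is always checked at least once.
    """
    deadline = clock() + timeout
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Device name taken from the error message in this report.
    print(wait_for_device("/dev/nvme1n1", timeout=30.0))
```

A mount attempt made only after such a check succeeds would avoid the "special device does not exist" error, at the cost of a delay.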

Changed in juju:
milestone: none → 2.9.23
importance: Undecided → High
status: New → Triaged
tags: added: aws storage
Ian Booth (wallyworld) wrote :

I did a test with the postgresql charm.
After detaching the storage, the AWS node does get the device "/dev/nvme1n1" removed as expected.
Then, after attaching a second time, immediately running "status --storage" gives:

attaching mount failed: mount: /srv/pgdata: special device /dev/nvme1n1 does not exist.: exit status 32

However, waiting a few seconds and running status again shows the storage is now correctly attached.
Juju reports the error on the first attempt(s) but then succeeds once AWS eventually attaches the volume.

Can you confirm that waiting a short time works for you?
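One way to check this without re-running the command by hand is to poll the storage status for a while. The sketch below assumes that `juju storage --format json` prints a top-level "storage" map keyed by storage ID, with a "status" object containing a "current" field; verify that layout against your Juju version. The helper names, the number of attempts, and the delay are illustrative only.

```python
import json
import subprocess
import time


def storage_status(storage_json, storage_id):
    """Return the current status of one storage instance, or None.

    Assumes the layout {"storage": {<id>: {"status": {"current": ...}}}}.
    """
    data = json.loads(storage_json)
    entry = data.get("storage", {}).get(storage_id, {})
    return entry.get("status", {}).get("current")


def fetch_storage_json():
    """Run `juju storage --format json` and return its output."""
    result = subprocess.run(["juju", "storage", "--format", "json"],
                            check=True, capture_output=True, text=True)
    return result.stdout


def wait_until_attached(storage_id, attempts=12, delay=5.0,
                        fetch=fetch_storage_json, sleep=time.sleep):
    """Poll until the storage reports "attached"; False if it never does."""
    for attempt in range(attempts):
        if storage_status(fetch(), storage_id) == "attached":
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Storage ID taken from the original report.
    print(wait_until_attached("logdata/0"))
```

If this returns False after a minute or so, the failure is likely more than a short-lived delay on the AWS side.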

Changed in juju:
status: Triaged → Incomplete
Changed in juju:
milestone: 2.9.23 → 2.9.24
Changed in juju:
milestone: 2.9.24 → 2.9.25
Changed in juju:
milestone: 2.9.25 → 2.9.26
Changed in juju:
milestone: 2.9.26 → 2.9.27
Changed in juju:
milestone: 2.9.27 → 2.9.28
Changed in juju:
milestone: 2.9.28 → 2.9.29
Changed in juju:
milestone: 2.9.29 → 2.9.30
John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.30 → 2.9-next
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9-next → none
Launchpad Janitor (janitor) wrote :

[Expired for Canonical Juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired