Failed to upload resource "tools": Put https://...: resource#hw-health/tools not found

Bug #1979981 reported by Peter Sabaini
Affects: Canonical Juju
Status: Incomplete
Importance: High
Assigned to: Ian Booth

Bug Description

Context: this cloud was recently upgraded from xenial to bionic, and Juju was upgraded from 2.4 to 2.9.31.

We're monitoring hardware via the hw-health charm, deployed from cs:hw-health-13

The existing machines are reporting no issues, but when deploying new machines the charm complains about a missing tools resource.

```
juju charm-resources cs:hw-health-13
Resource Revision
tools 0

juju list-resources hw-health
No resources to display.
```

However, when I attempt to attach the resource, I get an error:

```
juju attach-resource hw-health tools=./tools.zip
ERROR failed to upload resource "tools": Put https://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/applications/hw-health/resources/tools: resource#hw-health/tools not found
```

In the unit log I can see a similar error, not being able to download the resource:

```
2022-06-24 16:40:27 INFO unit.hw-health/1550.juju-log server.go:319 nrpe-external-master:228: status-set: maintenance: Installing from attached resource
2022-06-24 16:40:27 INFO unit.hw-health/1550.juju-log server.go:319 nrpe-external-master:228: status-set: maintenance: Installing tool MegaCLI
2022-06-24 16:40:27 WARNING unit.hw-health/1550.nrpe-external-master-relation-joined logger.go:60 ERROR could not download resource: HTTP request failed: Get https://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/units/unit-hw-health-1550/resources/tools: resource#hw-health/tools not found
2022-06-24 16:40:27 ERROR unit.hw-health/1550.juju-log server.go:319 nrpe-external-master:228: Missing Juju resource: tools - alternative method is not available yet
```

If I can help out with any diagnostics, or if there are any questions, please let me know.

John A Meinel (jameinel) wrote :

Slightly different symptoms than: https://bugs.launchpad.net/juju/+bug/1975726 but they seem related. (Both about resources, both about attach-resource, and both not reproducible on a clean deploy)

John A Meinel (jameinel) wrote :

Note that the errors are different, as the other one said:
```
 ERROR got bad data from server: unsupported resource type ""
```

Changed in juju:
importance: Undecided → High
milestone: none → 2.9-next
status: New → Triaged
Peter Sabaini (peter-sabaini) wrote :

Subscribe ~field-high

Peter Sabaini (peter-sabaini) wrote :

For clarity, the existing hw-health application should have the tools resource attached already; it was attached prior to the upgrade.

But it appears that the resource was somehow lost during the upgrade:

$ juju list-resources hw-health
No resources to display.

For what it's worth, as a test I deployed a hw-health-test app with the same version; I can attach that tools resource fine there:

$ juju deploy cs:hw-health-13 hw-health-test

$ juju attach-resource hw-health-test tools=./tools.zip

$ juju list-resources hw-health-test
Resource Supplied by Revision
tools admin 2022-06-28T15:19

John A Meinel (jameinel) wrote :

While the error is different from bug #1975726, it could be a different code path with the same root cause (the way we interacted with resources that had been updated was wrong prior to 2.9.32).
If there is a rush to get this working again, upgrading to 2.9.32 is likely the most expedient path. We haven't yet been able to reproduce the original failure, so we can't confirm that the 2.9.31 bug is the common cause (the original bug doesn't reproduce unless you trigger a refresh from charmhub, which doesn't naturally happen for 24 hours).

Ian Booth (wallyworld) wrote :

I had a brief look at the code and it may not be the same bug: attaching a resource first attempts to read whatever resource metadata exists, and that is where the not found error appears to be coming from.

It would be helpful to get a dump of the "resources" and "units" collections so we can see what's going on.

Peter Sabaini (peter-sabaini) wrote :

We have upgraded to 2.9.32 but unfortunately we're still seeing this:

juju attach-resource hw-health tools=./tools.zip
ERROR failed to upload resource "tools": Put https://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/applications/hw-health/resources/tools: resource#hw-health/tools not found

Peter Sabaini (peter-sabaini) wrote :

I have uploaded a tar with a sanitized json export of the units/resources collections (Canonical only, sorry):

https://private-fileshare.canonical.com/~sabaini/89ba28e3-7e4e-485c-bec4-e3af4d4e83f0/lp1979981-collections.tar.gz

Ian Booth (wallyworld) wrote :

I think I've found the issue. It appears to be in the juju attach-resource CLI code. I'll try to reproduce it; if I can't, I can still give you a "juju" CLI binary to try, which should fix it. I'll report back here once I have something further.

Changed in juju:
milestone: 2.9-next → 2.9.33
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → In Progress
Ian Booth (wallyworld) wrote :

So, looking at the database export, there's a document missing from the resources collection. This is why the attach-resource CLI code is failing with the not found error.

For each resource, there should be at least two records in the "resources" collection, e.g.

"_id" : "12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#hw-health/tools#charmstore"

and

"_id" : "12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#hw-health/tools"

For the "hw-health" application and "tools" resource, the second record above has been deleted according to the db export. I could not reproduce how this happened. I bootstrapped a 2.4.7 controller, deployed ubuntu, hw-health etc., and upgraded to 2.9.32. No issues.

I did manage to see the same symptoms if I manually deleted the relevant resource record.
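
A minimal sketch of that two-record invariant, written as a plain Python function over a list of `_id` strings (for instance, taken from a JSON export of the "resources" collection). The function name and the `#charmstore` suffix handling are assumptions made for illustration; this is not Juju code, and it ignores any other id variants that may exist in a real database.

```python
CHARMSTORE_SUFFIX = "#charmstore"


def find_missing_primary_docs(doc_ids):
    """Return primary resource ids that are expected but absent.

    For every id of the form "<model>:resource#<app>/<name>#charmstore",
    the matching "<model>:resource#<app>/<name>" id is expected as well.
    """
    ids = set(doc_ids)
    missing = []
    for doc_id in sorted(ids):
        if doc_id.endswith(CHARMSTORE_SUFFIX):
            # Strip the suffix to get the id of the companion record.
            primary = doc_id[: -len(CHARMSTORE_SUFFIX)]
            if primary not in ids:
                missing.append(primary)
    return missing


if __name__ == "__main__":
    model = "12fb71bc-4e18-47f8-898d-ab608eefcc41"
    exported_ids = [f"{model}:resource#hw-health/tools#charmstore"]
    for primary in find_missing_primary_docs(exported_ids):
        print("missing:", primary)
```

Run against the ids from this report, the sketch would flag the `resource#hw-health/tools` record as missing, which matches what the export showed.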

A quick fix here might just be to replace the missing record. Open a mongodb shell and run:

juju:PRIMARY> var new_doc = db.resources.findOne({_id: '12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#hw-health/tools#charmstore'})
juju:PRIMARY> new_doc._id='12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#hw-health/tools'
juju:PRIMARY> db.resources.insert(new_doc)
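
The same copy-and-rename step can be sketched in Python on an exported document, which makes it easy to check the result before touching the database. This is only an illustration: `rebuild_primary_doc` is a hypothetical helper, the `name` field in the usage example is a placeholder rather than a confirmed part of the schema, and the type check is there because the step needs a single document (what `findOne` returns in the mongo shell) rather than a cursor (what `find` returns).

```python
import copy

CHARMSTORE_SUFFIX = "#charmstore"


def rebuild_primary_doc(charmstore_doc):
    """Return a copy of the charmstore record carrying the primary _id."""
    if not isinstance(charmstore_doc, dict):
        raise TypeError("expected a single document (dict), not a cursor or list")
    doc_id = charmstore_doc.get("_id", "")
    if not doc_id.endswith(CHARMSTORE_SUFFIX):
        raise ValueError(f"not a charmstore resource document: {doc_id!r}")
    # Copy every field unchanged; only the _id differs between the two records.
    new_doc = copy.deepcopy(charmstore_doc)
    new_doc["_id"] = doc_id[: -len(CHARMSTORE_SUFFIX)]
    return new_doc


if __name__ == "__main__":
    source = {
        "_id": "12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#hw-health/tools#charmstore",
        "name": "tools",  # placeholder field for illustration only
    }
    print(rebuild_primary_doc(source)["_id"])
```

The original document is left untouched, so the output can be compared with the source field by field before anything is inserted.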

Changed in juju:
milestone: 2.9.33 → none
status: In Progress → Incomplete
Ian Booth (wallyworld) wrote :

I'll mark this as incomplete for now; we can see if the workaround suggested above solves it.

Peter Sabaini (peter-sabaini) wrote :

Hey Ian,

the workaround succeeded! No idea how that DB record got lost, but adding it back indeed made re-attaching the zip succeed.

In the absence of constraints that check for data well-formedness, I wonder if it would make sense for Juju to implement a sanity check of the DB structure to detect this kind of situation?

Many thanks for the debugging.

Peter Sabaini (peter-sabaini) wrote :

Unsubbing field-high as we have a workaround

Heather Lanigan (hmlanigan) wrote :

@wallyworld, do you remember what the issue was? Is there a fix we need? This has come up again on the same config.

Ian Booth (wallyworld) wrote :

I didn't find a root cause. All I could do was notice that the db was missing an expected record, and manually adding it back fixed the problem. If the customer can help with any steps to reproduce, that would be great.

Gui Maluf Balzana (guimalufb) wrote :

I faced the same issue on the same cloud as reported by @peter-sabaini, but with a charm that had no previously attached resource.

The workaround didn't work, as it threw a different error:
```
var new_doc = db.resources.find({_id: '12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#neutron-api/policyd-override#charmstore'})
new_doc._id='12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#neutron-api/policyd-override'
db.resources.insert(new_doc)

juju attach-resource neutron-api policyd-override=create-port-binding-profile-override.zip --debug --verbose
08:56:40 INFO juju.cmd supercommand.go:56 running juju [2.9.44 02d498631e196f2a37f9b7c3b5c31bdcb1dad333 gc go1.20.5]
08:56:40 DEBUG juju.cmd supercommand.go:57 args: []string{"/snap/juju/23558/bin/juju", "attach-resource", "neutron-api", "policyd-override=create-port-binding-profile-override.zip", "--debug", "--verbose"}
08:56:40 INFO juju.juju api.go:86 connecting to API addresses: [x.x.x.x:17070 x.x.x.x:17070 x.x.x.x:17070]
08:56:40 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/api"
08:56:40 INFO juju.api apiclient.go:687 connection established to "wss://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/api"
08:56:40 DEBUG juju.api monitor.go:35 RPC connection died
ERROR failed to upload resource "policyd-override": Put https://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/applications/neutron-api/resources/policyd-override: got invalid data from DB: unsupported resource type ""
08:56:40 DEBUG cmd supercommand.go:537 error stack:
got invalid data from DB: unsupported resource type ""
/build/snapcraft-juju-b12ec14ae8781cd65bb197dc27b9cc5f/parts/juju/build/vendor/gopkg.in/httprequest.v1/client.go:307: Put https://x.x.x.x:17070/model/12fb71bc-4e18-47f8-898d-ab608eefcc41/applications/neutron-api/resources/policyd-override
/build/snapcraft-juju-b12ec14ae8781cd65bb197dc27b9cc5f/parts/juju/build/vendor/gopkg.in/httprequest.v1/client.go:185:
github.com/juju/juju/api/client/resources.Client.Upload:118:
github.com/juju/juju/cmd/juju/resource.(*UploadCommand).upload:155:
github.com/juju/juju/cmd/juju/resource.(*UploadCommand).Run:140: failed to upload resource "policyd-override"
```

I upgraded the Juju controller and model to 2.9.44 and rolled back the workaround, after which the resource was attached successfully.

```
db.resources.deleteOne({_id: '12fb71bc-4e18-47f8-898d-ab608eefcc41:resource#neutron-api/policyd-override'})

$ juju attach-resource neutron-api policyd-override=policyd-override.zip
```
