[3.0-rc2] k8s test suite fails with libjuju client error

Bug #1993533 reported by Bas de Bruijne
This bug affects 1 person
Affects                     Status       Importance  Assigned to  Milestone
Canonical Juju              Invalid      Undecided   Unassigned
Charmed Kubernetes Testing  In Progress  High        Adam Dyess

Bug Description

In testrun https://solutions.qa.canonical.com/testruns/testRun/3caa41ba-fa23-4e66-bae5-ecf1896a45fb, which is ck8s 1.25 on baremetal maas with juju 3.0 rc2 on arm64, the installation of k8s is successful but the validation fails with:

```
unexpected facade SecretsTriggerWatcher found, unable to decipher version to use
unknown common facade version for Storage
unknown common facade version for Subnets
unknown common facade version for Uniter
unknown common facade version for UpgradeSeries
unknown common facade version for UpgradeSteps
Task exception was never retrieved
future: <Task finished name='Task-17' coro=<Model._watch.<locals>._all_watcher() done, defined at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/model.py:1043> exception=Exception("No facade ModelConfig in facades {'ActionPruner': 1, 'AgentTools': 1, 'Annotations': 2, 'ApplicationScaler': 1, 'Block': 2, 'Bundle': 6, 'CAASAdmission': 1, 'CAASApplication': 1, 'CAASApplicationProvisioner': 1, 'CAASFirewaller': 1, 'CAASModelConfigManager': 1, 'CAASModelOperator': 1, 'CAASOperator': 1, 'CAASOperatorProvisioner': 1, 'CAASOperatorUpgrader': 1, 'CAASUnitProvisioner': 2, 'CharmDownloader': 1, 'CharmRevisionUpdater': 2, 'Cleaner': 2, 'CredentialManager': 1, 'Deployer': 1, 'DiskManager': 2, 'EntityWatcher': 2, 'ExternalControllerUpdater': 1, 'FanConfigurer': 1, 'FilesystemAttachmentsWatcher': 2, 'FirewallRules': 1, 'HighAvailability': 2, 'HostKeyReporter': 1, 'ImageMetadata': 3, 'ImageMetadataManager': 1, 'InstancePoller': 4, 'KeyManager': 1, 'KeyUpdater': 1, 'LeadershipService': 2, 'LifeFlag': 1, 'LogForwarding': 1, 'Logger': 1, 'MachineActions': 1, 'MachineUndertaker': 1, 'MetricsAdder': 2, 'MetricsDebug': 2, 'MetricsManager': 1, 'MigrationFlag': 1, 'MigrationMinion': 1, 'MigrationStatusWatcher': 1, 'NotifyWatcher': 1, 'OfferStatusWatcher': 1, 'Payloads': 1, 'PayloadsHookContext': 1, 'Pinger': 1, 'ProxyUpdater': 2, 'RaftLease': 1, 'Reboot': 2, 'RelationStatusWatcher': 1, 'RelationUnitsWatcher': 1, 'RemoteRelationWatcher': 1, 'ResourcesHookContext': 1, 'RetryStrategy': 1, 'Secrets': 1, 'SecretsManager': 1, 'Singular': 2, 'Spaces': 6, 'StatusHistory': 2, 'StorageProvisioner': 4, 'StringsWatcher': 1, 'Undertaker': 1, 'UnitAssigner': 1, 'Upgrader': 1, 'VolumeAttachmentPlansWatcher': 1, 'VolumeAttachmentsWatcher': 2}")>
Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/model.py", line 1046, in _all_watcher
    model_config = await self.get_config()
  File "/home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/model.py", line 1992, in get_config
    config_facade = client.ModelConfigFacade.from_connection(
  File "/home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/client/_client.py", line 66, in from_connection
    raise Exception('No facade {} in facades {}'.format(facade_name,
Exception: No facade ModelConfig in facades {'ActionPruner': 1, 'AgentTools': 1, 'Annotations': 2, 'ApplicationScaler': 1, 'Block': 2, 'Bundle': 6, 'CAASAdmission': 1, 'CAASApplication': 1, 'CAASApplicationProvisioner': 1, 'CAASFirewaller': 1, 'CAASModelConfigManager': 1, 'CAASModelOperator': 1, 'CAASOperator': 1, 'CAASOperatorProvisioner': 1, 'CAASOperatorUpgrader': 1, 'CAASUnitProvisioner': 2, 'CharmDownloader': 1, 'CharmRevisionUpdater': 2, 'Cleaner': 2, 'CredentialManager': 1, 'Deployer': 1, 'DiskManager': 2, 'EntityWatcher': 2, 'ExternalControllerUpdater': 1, 'FanConfigurer': 1, 'FilesystemAttachmentsWatcher': 2, 'FirewallRules': 1, 'HighAvailability': 2, 'HostKeyReporter': 1, 'ImageMetadata': 3, 'ImageMetadataManager': 1, 'InstancePoller': 4, 'KeyManager': 1, 'KeyUpdater': 1, 'LeadershipService': 2, 'LifeFlag': 1, 'LogForwarding': 1, 'Logger': 1, 'MachineActions': 1, 'MachineUndertaker': 1, 'MetricsAdder': 2, 'MetricsDebug': 2, 'MetricsManager': 1, 'MigrationFlag': 1, 'MigrationMinion': 1, 'MigrationStatusWatcher': 1, 'NotifyWatcher': 1, 'OfferStatusWatcher': 1, 'Payloads': 1, 'PayloadsHookContext': 1, 'Pinger': 1, 'ProxyUpdater': 2, 'RaftLease': 1, 'Reboot': 2, 'RelationStatusWatcher': 1, 'RelationUnitsWatcher': 1, 'RemoteRelationWatcher': 1, 'ResourcesHookContext': 1, 'RetryStrategy': 1, 'Secrets': 1, 'SecretsManager': 1, 'Singular': 2, 'Spaces': 6, 'StatusHistory': 2, 'StorageProvisioner': 4, 'StringsWatcher': 1, 'Undertaker': 1, 'UnitAssigner': 1, 'Upgrader': 1, 'VolumeAttachmentPlansWatcher': 1, 'VolumeAttachmentsWatcher': 2}
```
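The warnings at the top of the log suggest that the client only keeps a facade when it and the controller share a version for it, which would explain why `ModelConfig` is absent from the dictionary in the exception. Below is a minimal sketch of that kind of negotiation. It is not libjuju's actual code: the function name `build_common_facades` and every version number are made up for illustration.

```python
def build_common_facades(client_known, controller_offers):
    """Pick one version per facade that both sides understand.

    client_known and controller_offers map facade names to lists of versions.
    Returns (common, warnings); facades without a shared version are left out.
    """
    common = {}
    warnings = []
    for name, offered in controller_offers.items():
        known = client_known.get(name)
        if known is None:
            # The client has never heard of this facade.
            warnings.append(
                f"unexpected facade {name} found, unable to decipher version to use"
            )
            continue
        shared = set(known) & set(offered)
        if not shared:
            # Both sides know the facade, but no version overlaps.
            warnings.append(f"unknown common facade version for {name}")
            continue
        common[name] = max(shared)
    return common, warnings


# Illustrative (invented) versions: an older client against a newer controller.
client_known = {"Bundle": [5, 6], "ModelConfig": [1, 2]}
controller_offers = {"Bundle": [6], "ModelConfig": [3], "SecretsTriggerWatcher": [1]}

common, warnings = build_common_facades(client_known, controller_offers)
for line in warnings:
    print(line)
print("ModelConfig" in common)
```

With inputs like these, any later lookup of `ModelConfig` in `common` fails, which matches the shape of the exception in the traceback.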

This is not an issue that we have seen before in the k8s test suite; it looks to me like a bug in libjuju. This test suite uses juju==2.9.8. Is that compatible with juju 3.0?

Crashdumps and configs can be found here:
https://oil-jenkins.canonical.com/artifacts/3caa41ba-fa23-4e66-bae5-ecf1896a45fb/index.html

Tags: cdo-qa

George Kraft (cynerva) wrote :

I don't know if python-libjuju 2.9.8 is compatible with Juju 3.0 or not, hopefully the Juju folks can speak to that.

The latest version of our test suite uses python-libjuju 2.9.11, which may or may not have the same issue. We are currently unable to update to python-libjuju 3.0+ due to breaking changes. I believe our main blocker is this issue: https://github.com/juju/python-libjuju/issues/719

Caner Derici (cderici) wrote :

Pylibjuju 2.9.8 is not compatible with juju 3.0. You'll need at least libjuju 3.0.0. I suggest using the latest one. In fact, we're currently fixing some of the known issues that popped up recently and we're about to make a new release with those fixes very soon, so updating libjuju to 3.0.3 (the version we'll be releasing) in a couple of days should fix your problems.

Keep in mind that with the major version bump there might be some user code that needs to be adapted. Feel free to ping us on MM or file a bug on libjuju in that case.

Caner Derici (cderici) wrote :

So there are two different things under discussion here:

@Bas, are you guys able to update libjuju to the latest? If you can upgrade to at least 2.9.11, then you shouldn't be having any facade issues. If you could update to the latest, then https://github.com/juju/python-libjuju/pull/745 should have fixed the 3.0 facade issues. I'm gonna need a little bit more information on this. Thanks!

@George, are you saying that the action issue is blocking your update to the latest libjuju, or that actions not working properly is blocking your test suite/release process? Can you also give some more information? Thanks!

George Kraft (cynerva) wrote :

If we upgrade the test suite to libjuju 3.x, then we encounter that action issue and it makes the tests fail. Our workaround for now is to stay on libjuju 2.9.11 and test against Juju 2.x controllers so that we're still able to test and release our charms.

So I'd say our test/release process is not currently blocked, but we cannot update the test suite and our test/release process to use libjuju 3.x until that issue is fixed.

Bas de Bruijne (basdbruijne) wrote :

I scheduled an extra test run with the latest k8s test suite, which is pinned to pylibjuju 2.9.11. Do I understand correctly that pylibjuju 2.9.11 is compatible with juju 3.0rc2?

Caner Derici (cderici) wrote :

@Bas That's a negative. Pylibjuju 2.9.11 is not compatible with juju 3.0. However, pylibjuju >=3.0 should continue to support Juju >=2.9.

If you want to use Juju 3.0, then you'll need pylibjuju >=3.0.0 (we're planning on releasing libjuju 3.0.3 tomorrow or early next week, I highly recommend using that one for consistent results)
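The compatibility rule stated above can be sketched as a small check. This is only an illustration of the statements in this thread, not an official support matrix: the helper names `major_minor` and `libjuju_supports` are hypothetical, and treating a 2.x client against a 2.x controller as supported is an assumption based on the earlier comment that libjuju 2.9.11 is tested against Juju 2.x controllers.

```python
import re


def major_minor(version):
    """Extract (major, minor) from strings such as '2.9.11', '3.0-rc2' or '3.0rc2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def libjuju_supports(libjuju_version, juju_version):
    """Return True if, per this thread, the client should work with the controller."""
    client = major_minor(libjuju_version)
    controller = major_minor(juju_version)
    if client >= (3, 0):
        # "pylibjuju >=3.0 should continue to support Juju >=2.9"
        return controller >= (2, 9)
    # Pylibjuju 2.9.x is not compatible with Juju 3.0.
    # Assumption: 2.x clients are fine with 2.x controllers.
    return controller < (3, 0)


print(libjuju_supports("2.9.11", "3.0-rc2"))
print(libjuju_supports("3.0.4", "3.0-rc2"))
```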

@George Thank you for the explanation. I'm working on the Action.run stuff. However, just to make sure I understand correctly: you made the changes required to adapt to 3.0 (the run & run_action methods becoming asynchronous etc.; see https://github.com/juju/python-libjuju/issues/707 for detailed discussion), and after those changes you're intermittently getting inconsistent values in the result.success field (e.g. "pending" instead of "completed"). That's what's preventing you from upgrading to libjuju 3.x, right?

George Kraft (cynerva) wrote :

That's my understanding, yes. Adam updated our test suite to libjuju 3.0.1[1][2], worked through some initial issues that included updating our unit.run calls to have the newly needed await[3], encountered and reported the action issue I mentioned before[4], then reverted us back to libjuju 2.9.11[5]. I'll ask Adam to look at this thread in case he has any further details that might help.

[1]: https://github.com/charmed-kubernetes/jenkins/pull/976
[2]: https://github.com/charmed-kubernetes/jenkins/pull/978
[3]: https://github.com/charmed-kubernetes/jenkins/pull/982/files#diff-ba7b4974291be4e7299ba770949340a1e2c2fb9b3a22d6d16cac8f1d0187c4d9R532-R533
[4]: https://github.com/juju/python-libjuju/issues/719
[5]: https://github.com/charmed-kubernetes/jenkins/pull/987

Adam Dyess (addyess) wrote :

The only thing blocking k8s integration testing with libjuju 3.0.0 is [issue-719][1] where

    from datetime import datetime

    # Run a command on a unit:
    action = await ubuntu.units[0].run("sleep 30")
    before = datetime.now()

    # Await that result:
    result = await action.wait()

    # Rather than getting "completed", we get "pending"?
    assert result.success == "completed", "Shouldn't be 'pending' -- only completed or failed"
    after = datetime.now()
    # (after - before) should be >= 30 seconds, but instead it's far less.

libjuju doesn't actually await the result of the long-running command.
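For reference, here is a self-contained sketch of the timing check described above that runs without a controller. `FakeAction` is a hypothetical stand-in, not a libjuju class, and it uses a short sleep instead of 30 seconds; it only shows the behaviour the test expects from a correct `wait()`.

```python
import asyncio
import time


class FakeAction:
    """Hypothetical stand-in for an action; finishes after `seconds`."""

    def __init__(self, seconds):
        self._seconds = seconds
        self.status = "pending"

    async def wait(self):
        # A correct wait() returns only once the command has finished.
        await asyncio.sleep(self._seconds)
        self.status = "completed"
        return self


async def timed_wait(action):
    """Await the action and report its final status and the elapsed time."""
    start = time.monotonic()
    result = await action.wait()
    return result.status, time.monotonic() - start


status, elapsed = asyncio.run(timed_wait(FakeAction(0.2)))
print(status, round(elapsed, 2))
```

If `wait()` returned early, as reported in the issue, the status would still be "pending" and the elapsed time would be well below the sleep duration.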

We can modify one line to test out a new release of libjuju [here][2]

[1]: https://github.com/juju/python-libjuju/issues/719
[2]: https://github.com/charmed-kubernetes/jenkins/blob/main/jobs/integration/utils.py#L598-L603

Adam Dyess (addyess) wrote :

In preparation for the upcoming libjuju 3.0.3 release, I've prepared this branch and PR [1].

[1]: https://github.com/charmed-kubernetes/jenkins/pull/1072

Caner Derici (cderici) wrote :

https://github.com/juju/python-libjuju/pull/753 is landing, as a fix for the Action.run issue.

George Kraft (cynerva) wrote :

Thanks Caner. It looks like we're unblocked now. We should be able to land Adam's PR after testing/review to resolve this issue.

Changed in charmed-kubernetes-testing:
importance: Undecided → High
status: New → In Progress
assignee: nobody → Adam Dyess (addyess)

Ian Booth (wallyworld) wrote :

Marking as Invalid for Juju since the fix landed in libjuju

Changed in juju:
status: New → Invalid
tags: added: cdo-qa

Bas de Bruijne (basdbruijne) wrote :

Can we get a new tag with this fix on it?

Ian Booth (wallyworld) wrote :

@bas AFAIK there's already a libjuju 3.0.4, which I think has the fix

https://github.com/juju/python-libjuju/releases/tag/3.0.4

Bas de Bruijne (basdbruijne) wrote :

@ian Yes thanks, it's released on the juju side. My question is if we can get a release of the k8s test suite where libjuju is pinned to this version.

Adam Dyess (addyess) wrote :

As mentioned in LP#1997593, there's now a k8s test suite release with `juju==3.0.4`, tagged `1.26`.
