It is not possible to actively use two offers of the same application at the same time

Bug #1815179 reported by Tytus Kurek
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Ian Booth
Milestone: 2.5.1

Bug Description

I am in the process of implementing a patch for the OpenStack Swift charms and am facing an issue with setting up Juju cross-controller relations. The patched charms I am using can be found at:

https://github.com/tytus-kurek/charm-swift-proxy/tree/swift-multi-region
https://github.com/tytus-kurek/charm-swift-storage/tree/swift-multi-region

When deploying the environment within the same Juju model, everything works fine. The bundle I am using for the single-model scenario looks as follows:

series: bionic
machines:
  "0":
    constraints: tags=swift
    series: bionic
  "1":
    constraints: tags=swift
    series: bionic
services:
  keystone:
    charm: cs:keystone
    num_units: 1
    options:
      admin-password: admin
      token-provider: fernet
      worker-multiplier: 0.25
    to:
    - lxd:0
  mysql:
    charm: cs:percona-cluster
    num_units: 1
    options:
      innodb-buffer-pool-size: 256M
      max-connections: 1000
    to:
    - lxd:0
  swift-proxy-region1:
    charm: /home/guardian/git/charm-swift-proxy
    num_units: 1
    options:
      enable-multi-region: true
      read-affinity: "r1=100"
      region: "RegionOne"
      replicas: 2
      write-affinity: "r1, r2"
      write-affinity-node-count: 1
      zone-assignment: manual
    to:
    - lxd:0
  swift-proxy-region2:
    charm: /home/guardian/git/charm-swift-proxy
    num_units: 1
    options:
      enable-multi-region: true
      read-affinity: "r2=100"
      region: "RegionTwo"
      replicas: 2
      write-affinity: "r2, r1"
      write-affinity-node-count: 1
      zone-assignment: manual
    to:
    - lxd:1
  swift-storage-region1:
    charm: /home/guardian/git/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 1
      zone: 1
    to:
    - 0
  swift-storage-region2:
    charm: /home/guardian/git/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 2
      zone: 1
    to:
    - 1
relations:
  - [ "keystone:shared-db", "mysql:shared-db" ]
  - [ "keystone:identity-service", "swift-proxy-region1:identity-service" ]
  - [ "keystone:identity-service", "swift-proxy-region2:identity-service" ]
  - [ "swift-proxy-region1:swift-storage", "swift-storage-region1:swift-storage" ]
  - [ "swift-proxy-region1:swift-storage", "swift-storage-region2:swift-storage" ]
  - [ "swift-proxy-region2:swift-storage", "swift-storage-region1:swift-storage" ]
  - [ "swift-proxy-region2:swift-storage", "swift-storage-region2:swift-storage" ]
  - [ "swift-proxy-region1:master", "swift-proxy-region2:slave" ]

The problem starts when I try to segregate the applications so that:
- keystone, mysql, swift-proxy-region1 and swift-storage-region1 are hosted on one Juju controller (maas-region1),
- swift-proxy-region2 and swift-storage-region2 are hosted on another Juju controller (maas-region2).
The controllers use different MAAS clouds and there is full network connectivity between them and the hosted machines. Everything runs on my laptop on two separate bridges.

The following part works:

cat <<EOF > /tmp/swift-region1.yaml
series: bionic
machines:
  "0":
    constraints: tags=swift
    series: bionic
services:
  keystone:
    charm: cs:keystone
    num_units: 1
    options:
      admin-password: admin
      token-provider: fernet
      worker-multiplier: 0.25
    to:
    - lxd:0
  mysql:
    charm: cs:percona-cluster
    num_units: 1
    options:
      innodb-buffer-pool-size: 256M
      max-connections: 1000
    to:
    - lxd:0
  swift-proxy-region1:
    charm: /home/guardian/git/charm-swift-proxy
    num_units: 1
    options:
      enable-multi-region: true
      read-affinity: "r1=100"
      region: "RegionOne"
      replicas: 1
      write-affinity: "r1, r2"
      write-affinity-node-count: 1
      zone-assignment: manual
    to:
    - lxd:0
  swift-storage-region1:
    charm: /home/guardian/git/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 1
      zone: 1
    to:
    - 0
relations:
  - [ "keystone:shared-db", "mysql:shared-db" ]
  - [ "keystone:identity-service", "swift-proxy-region1:identity-service" ]
  - [ "swift-proxy-region1:swift-storage", "swift-storage-region1:swift-storage" ]
EOF

cat <<EOF > /tmp/swift-region2.yaml
series: bionic
machines:
  "0":
    constraints: tags=swift
    series: bionic
services:
  swift-proxy-region2:
    charm: /home/guardian/git/charm-swift-proxy
    num_units: 1
    options:
      enable-multi-region: true
      read-affinity: "r2=100"
      region: "RegionTwo"
      replicas: 1
      write-affinity: "r2, r1"
      write-affinity-node-count: 1
      zone-assignment: manual
    to:
    - lxd:0
  swift-storage-region2:
    charm: /home/guardian/git/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 2
      zone: 1
    to:
    - 0
relations:
  - [ "swift-proxy-region2:swift-storage", "swift-storage-region2:swift-storage" ]
EOF

juju switch maas-region1
juju add-model swift-region1
juju deploy /tmp/swift-region1.yaml
juju switch maas-region2
juju add-model swift-region2
juju deploy /tmp/swift-region2.yaml

juju switch maas-region1
juju offer keystone:identity-service
juju switch maas-region2
juju consume maas-region1:admin/swift-region1.keystone keystone
juju relate swift-proxy-region2 keystone
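At this point the keystone relation can be sanity-checked from either controller; a minimal sketch using standard Juju commands (model and application names as above):

```shell
# On the offering side: list offers in the current model and how many
# connections each currently has.
juju switch maas-region1
juju offers

# On the consuming side: the consumed application appears as a SAAS
# entry in status, and the cross-model relation is listed alongside
# the local ones.
juju switch maas-region2
juju status --relations
```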

juju switch maas-region1
juju offer swift-proxy-region1:master swift-proxy-region1-master
juju switch maas-region2
juju consume maas-region1:admin/swift-region1.swift-proxy-region1-master
juju relate swift-proxy-region2:slave swift-proxy-region1-master

juju switch maas-region1
juju offer swift-proxy-region1:swift-storage swift-proxy-region1-swift-storage
juju switch maas-region2
juju consume maas-region1:admin/swift-region1.swift-proxy-region1-swift-storage

Now when I try to run the following command:

juju relate swift-storage-region2 swift-proxy-region1-swift-storage

the following things happen:
- "swift-storage-relation-joined" and "swift-storage-relation-changed" hooks are executed on swift-proxy-region1
- "swift-storage-relation-joined" and "swift-storage-relation-changed" hooks are NOT executed on swift-storage-region2
- the following error messages are displayed in maas-region2 controller logs:

bf1c6107-cb8c-425b-8a19-b9afe8fa651e: machine-0 2019-02-08 09:41:22 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-region1-swift-storage:swift-storage swift-storage-region2:swift-storage: connection is shut down
bf1c6107-cb8c-425b-8a19-b9afe8fa651e: machine-0 2019-02-08 09:41:22 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-region1-swift-storage:swift-storage swift-storage-region2:swift-storage: connection is shut down
bf1c6107-cb8c-425b-8a19-b9afe8fa651e: machine-0 2019-02-08 09:41:22 ERROR juju.worker.remoterelations remoteapplicationworker.go:73 error in remote application worker for swift-proxy-region1-swift-storage: consuming relation change {RelationToken:61becab7-a77d-4703-8b13-4fe50587bf67 ApplicationToken:22dc8b56-8a65-463b-874d-2903629b94e4 Life: ForceCleanup:<nil> Suspended:<nil> SuspendedReason: ChangedUnits:[{UnitId:0 Settings:map[ingress-address:172.18.0.222 private-address:172.18.0.222 rsync_allowed_hosts:172.18.0.220 172.19.0.108 timestamp:1549618776.5363462 egress-subnets:172.18.0.222/32]}] DepartedUnits:[] Macaroons:[0xc002b35110]} from remote model a9fb3d72-f9ae-4199-8f88-6b7f2f322b10: application "swift-proxy-region1-master" is not a member of "swift-proxy-region1-swift-storage:swift-storage swift-storage-region2:swift-storage"

The same thing happens when I change the order of the offers and relations:

juju switch maas-region1
juju offer swift-proxy-region1:swift-storage swift-proxy-region1-swift-storage
juju switch maas-region2
juju consume maas-region1:admin/swift-region1.swift-proxy-region1-swift-storage
juju relate swift-storage-region2 swift-proxy-region1-swift-storage

juju switch maas-region1
juju offer swift-proxy-region1:master swift-proxy-region1-master
juju switch maas-region2
juju consume maas-region1:admin/swift-region1.swift-proxy-region1-master
juju relate swift-proxy-region2:slave swift-proxy-region1-master

Only the error messages are slightly different.

It looks like it is not possible to actively use two offers of the same application at the same time. I am attaching full logs from the controllers and hosted units.

Tytus Kurek (tkurek) wrote :
Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.5.2
assignee: nobody → Ian Booth (wallyworld)
importance: Undecided → High
status: New → Triaged
Ian Booth (wallyworld) wrote :

To help debug the issue, we need the log level set to DEBUG for the following packages, before the bundles are deployed and offers created etc:

juju.worker.remoterelations
juju.apiserver.common.crossmodel
juju.apiserver.crossmodelrelations

Ian Booth (wallyworld) wrote :

I think I've managed to reproduce locally. Now I've got to figure out how to fix it.

Ian Booth (wallyworld) wrote :
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.5.2 → 2.5.1
status: Triaged → Fix Committed
Ante Karamatić (ivoks) wrote :

Thank you Ian, it seems that this fix does indeed solve the problem. Please keep in mind that we have not tested upgrades, only new deployments.

John A Meinel (jameinel) wrote :
Tytus Kurek (tkurek) wrote :

I have just tested the upgrade and it behaves as follows: the error message is still displayed in the controller logs, but the issue seems to be gone.

Anyway, thank you very much for the patch!

Ian Booth (wallyworld) wrote :

@tkurek, I'd love it if you could run up a system with logging configured as per https://bugs.launchpad.net/juju/+bug/1815179/comments/2 and attach the logs. If there are errors logged, I'd like to be able to understand what's happening. In my test setup, the logs were clean.

Changed in juju:
status: Fix Committed → Fix Released
Tytus Kurek (tkurek) wrote :

@wallyworld: The test was performed in the field, so I have limited control over this environment, but I'll enable logging and attach the logs if I have an opportunity.

Just to let you know, it was tested in the following way:
1) Build a snap from the patched source
2) Upgrade the Juju client from the built snap
3) Upgrade the agents from version 2.5.0 to 2.5.1 (the controllers were NOT re-bootstrapped)

After that, the same error message is still displayed in the controller logs, but the issue is gone.

Ian Booth (wallyworld) wrote :

Did you have any multi-offers before the upgrade to the new snap? These will continue to have errors. Were the errors new and continuous?

Tytus Kurek (tkurek) wrote :

No, there were no offers before. The error messages are:

fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone3:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone3:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone1:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone1:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone2:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations relationunitsworker.go:74 error in relation units worker for swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone2:swift-storage: connection is shut down
fbc86642-4058-4f57-8bf6-a74cbbcea6fb: machine-0 2019-02-15 15:09:27 ERROR juju.worker.remoterelations remoteapplicationworker.go:73 error in remote application worker for swift-proxy-plwarb1-swift-storage: consuming relation change {RelationToken:25c0a41a-acb0-4414-8882-bd0bacb90863 ApplicationToken:1252686a-a8eb-4459-8a8e-d5ec1d4dd498 Life: ForceCleanup:<nil> Suspended:<nil> SuspendedReason: ChangedUnits:[{UnitId:0 Settings:map[rings_url:http://10.235.57.220/swift-rings egress-subnets:10.235.57.220/32 private-address:10.235.57.220 rsync_allowed_hosts:10.235.57.217 10.235.5.108 10.235.5.109 10.235.57.218 10.235.57.219 10.235.5.107 trigger:8c649558-14e0-497e-8eba-b275fc74a9e1 swift_hash:multi-region timestamp:1550146003.3573492 broker_timestamp:1550146740.041177 ingress-address:10.235.57.220]}] DepartedUnits:[] Macaroons:[0xc420196af0]} from remote model 1da892dd-74f6-470e-893e-37ccbecf401a: application "remote-1252686aa8eb44598a8ed5ec1d4dd498" is not a member of "swift-proxy-plwarb1-swift-storage:swift-storage swift-storage-hubudb2-zone1:swift-storage"

Ian Booth (wallyworld) wrote :

Thanks for the extra logs. We really need DEBUG turned on for the packages in comment #2 in order to be able to fully diagnose the issue. Without this additional information, it's difficult to determine what's happening. Ideally the debugging would be turned on before the offer is made.

Tytus Kurek (tkurek) wrote :

@wallyworld: How do I turn DEBUG on? Assume I have access to the Juju controller. What are the steps? I'll capture the logs if I have an opportunity.

Ian Booth (wallyworld) wrote :

You change debugging levels by setting the logging-config model config value.

$ juju model-config logging-config="juju.worker.remoterelations=DEBUG;juju.apiserver.common.crossmodel=DEBUG;juju.apiserver.crossmodelrelations=DEBUG"

Ideally the above would be done before creating an offer and attempting to relate to it.
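To capture the resulting output, the controller's debug log can be tailed and filtered to the same modules; a sketch (flags as in 2.x-era Juju):

```shell
# Stream the controller model's log, limited to the cross-model
# relation workers whose log level was raised above.
juju debug-log -m controller \
    --include-module juju.worker.remoterelations \
    --include-module juju.apiserver.common.crossmodel \
    --include-module juju.apiserver.crossmodelrelations
```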

Tytus Kurek (tkurek) wrote :

@wallyworld: Attached are the logs from both controllers after today's redeployment. Before creating the offers, both models had been destroyed and the logging had been configured.
