Bug #2038494 “Juju isn't handling errors from ec2 provider re fi...” : Bugs : Canonical Juju

Thomas Miller (tlmiller) on 2023-10-05

Changed in juju:
importance:	Undecided → High

Shunde Zhang (shunde-zhang) on 2023-10-05

tags:

added: sts

Joseph Phillips (manadart) on 2023-10-05

Changed in juju:
status:	New → Triaged

Joseph Phillips (manadart) on 2024-01-25

Changed in juju:
assignee:	nobody → Nicolas Vinuesa (nvinuesa)
milestone:	none → 3.1.8

Harry Pidcock (hpidcock) on 2024-02-19

Changed in juju:
milestone:	3.1.8 → 3.3.3


Nicolas Vinuesa (nvinuesa) wrote on 2024-02-22:

#1

@tlmiller I cannot reproduce this; maybe I'm doing something wrong. This is the scenario I'm following (with both juju 3.1.7 and 3.3.1):

```
juju bootstrap aws/eu-west-3 c
juju add-model m
juju deploy ubuntu
juju exec --unit ubuntu/0 open-port 8080/tcp
```
At this point I see the following log:
```
controller-0: 18:34:11 INFO juju.worker.firewaller opened port ranges [8080/tcp from 0.0.0.0/0,::/0] on "machine-0"
```
And the inbound rule has been correctly added to the security group.

Now, if I manually remove the rule from the security group in the aws console, and then run:
```
juju unexpose ubuntu
```
then I see the following log:
```
controller-0: 18:35:16 INFO juju.worker.firewaller closed port ranges [8080/tcp from 0.0.0.0/0,::/0] on "machine-0"
```
And no error.

The same happens the other way around (if I manually create the rule before exposing the app).

Do you have a reproducer?
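
When repeating a scenario like the one above, it can help to filter the controller log down to what the firewaller actually did. The following is a minimal sketch, not part of the original report: the function names `firewaller_port_changes` and `firewaller_has_no_errors` are hypothetical, and the patterns assume the log line layout shown in the excerpts above (log level immediately followed by the module name).

```shell
#!/bin/sh
# Hypothetical helper: print only the firewaller's port-range changes
# from log text on stdin. Prints nothing if there are none.
firewaller_port_changes() {
    grep -E 'juju\.worker\.firewaller (opened|closed) port ranges' || true
}

# Hypothetical helper: succeed only if the firewaller logged no ERROR
# line in the log text on stdin.
firewaller_has_no_errors() {
    ! grep -qF 'ERROR juju.worker.firewaller'
}
```

One possible use, assuming the controller log has been saved to a file named `controller.log` (a hypothetical name), is `firewaller_port_changes < controller.log` followed by `firewaller_has_no_errors < controller.log && echo "no firewaller errors"`.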

Nicolas Vinuesa (nvinuesa) on 2024-02-23

Changed in juju:
status:	Triaged → Invalid

Ian Booth (wallyworld) on 2024-02-27

Changed in juju:
milestone:	3.3.3 → 3.3.4

Joseph Phillips (manadart) on 2024-02-28

Changed in juju:
status:	Invalid → Incomplete


Koo Zhong Zheng (kzz333) wrote on 2024-03-01:

#2

Hello,

For your information, I have looked into this bug again. It involved manual changes to the network security group on the affected cloud machines in cases 00369877 and 00369883.

1) The loss of service connection was due to the removal of port 17070, which is required during upgrade. Below are the sample errors and status after this port was removed in my test environment:

# juju status
Unit       Workload  Agent  Machine  Public address  Ports  Message
easyrsa/1  unknown   lost   6        10.6.1.94              agent lost, see 'juju show-status-log easyrsa/1'

Machine  State  Address    Inst id                               Series  AZ    Message
6        down   10.6.1.94  5b1612b0-65b8-4f43-819a-96195647729f  jammy   nova  ACTIVE

# sample juju logs showing the failures during upgrade
2024-03-01 09:15:03 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: [75f369] "machine-6" cannot open api: unable to connect to API: dial tcp 10.6.1.96:17070: i/o timeout
2024-03-01 09:16:09 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: [75f369] "machine-6" cannot open api: unable to connect to API: dial tcp 252.1.96.1:17070: i/o timeout
2024-03-01 09:17:34 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: [75f369] "machine-6" cannot open api: unable to connect to API: dial tcp 252.1.96.1:17070: i/o timeout
2024-03-01 09:19:08 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: [75f369] "machine-6" cannot open api: unable to connect to API: dial tcp 10.6.1.96:17070: i/o timeout

2) The looping of closing and opening ports happens because juju becomes confused about the previous and current state of the ports during the upgrade, after unexpected manual alterations were made. Juju expects the network security group to be juju-managed only. This looping is most likely to happen when a newer application version changes or adds rules in the network security group.

3) Furthermore, the non-official custom script had also removed other useful ingress ports, such as port 22 (ssh), as well as egress ports used to contact other services.

4) Hence, I suggest setting the status of this bug report to "Invalid", since this is unlikely to be a bug caused by juju logic itself.

Best Regards,
Koo
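
To see at a glance which API endpoints an agent is failing to reach in logs like the sample under point 1), the timeouts can be grouped by address. This is a minimal sketch, not part of the original comment: the function name `api_timeouts_by_endpoint` is hypothetical, and the pattern assumes the exact "unable to connect to API: dial tcp HOST:PORT: i/o timeout" wording seen in those sample lines.

```shell
#!/bin/sh
# Hypothetical helper: count "unable to connect to API" dial timeouts
# per host:port in log text on stdin.
# Output: one "host:port count" line per endpoint, sorted by endpoint.
api_timeouts_by_endpoint() {
    sed -n 's|.*unable to connect to API: dial tcp \([^ ]*\): i/o timeout.*|\1|p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

If every reported endpoint uses port 17070, that is consistent with the controller API port being blocked, as described in point 1).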