juju does not expose ports in OCI

Bug #1834974 reported by David Lawson
This bug affects 3 people
Affects         Status         Importance  Assigned to      Milestone
Canonical Juju  Fix Released   High        Nicolas Vinuesa
2.7             Won't Fix      High        Unassigned

Bug Description

While working on getting HA controllers working in the OCI cloud, I tried running "open-port 17070" and "open-port 37017" via juju run, the standard method of opening ports manually. Both commands appeared to complete successfully, but neither actually opened its port. OCI uses host firewalls as well as per-network security groups, and it appears juju does not yet manipulate those host firewalls when exposing ports.

Revision history for this message
Joseph Phillips (manadart) wrote :

Can you provide the version of juju that you were running here?

The instance level firewalling was added in this patch:
https://github.com/juju/juju/pull/10288

The releases with this change are 2.5.8, 2.6.5 and edge.

Changed in juju:
assignee: nobody → Joseph Phillips (manadart)
status: New → Incomplete
Revision history for this message
David Lawson (deej) wrote :

Sorry, yeah, this was 2.6.4.

Revision history for this message
Joseph Phillips (manadart) wrote :

If you can confirm that this resolves it for you, I will close the ticket.

Revision history for this message
David Lawson (deej) wrote :

Doesn't appear to be fixed in 2.6.5:

(mojo-prod-oracle-archive-mirror-controller)prod-oracle-archive-mirror-controller@wekufe:~$ juju enable-ha
maintaining machines: 0
adding machines: 1, 2

(mojo-prod-oracle-archive-mirror-controller)prod-oracle-archive-mirror-controller@wekufe:~$ jsft
Model Controller Cloud/Region Version SLA Timestamp
controller oracle-archive-mirror-controller oracle/us-ashburn-1 2.6.5 unsupported 19:53:34Z

Machine State DNS Inst id Series AZ Message
0 started 132.145.201.142 ocid1.instance.oc1.iad.abuwcljshgwm4nlbs5vc5y34s2i3bp2ingxsdzokkqf274jdjotjx26ruu5q bionic running
1 pending 129.213.156.42 ...aqus6a bionic running
2 pending 129.213.35.154 ...sw6caq bionic running

The juju status is from about three hours after enabling HA; this is the same symptom I saw with 2.6.4.

Changed in juju:
status: Incomplete → New
Changed in juju:
status: New → Triaged
Revision history for this message
David Lawson (deej) wrote :

Sorry, I got a little confused about which bug was updated; I thought this was the controller HA bug. This one still appears to be a problem in 2.6.5, though. You can see I've tried to open a port here:

(mojo-prod-oracle-archive-mirror-controller)prod-oracle-archive-mirror-controller@wekufe:~$ jsft
Model Controller Cloud/Region Version SLA Timestamp
controller oracle-archive-mirror-controller oracle/us-ashburn-1 2.6.5 unsupported 13:32:04Z

App Version Status Scale Charm Store Rev OS Notes
ubuntu 18.04 active 1 ubuntu jujucharms 12 ubuntu

Unit Workload Agent Machine Public address Ports Message
ubuntu/0* active idle 0 132.145.201.142 37017/tcp ready

And on the unit, no firewall rules have been added:

ubuntu@juju-6eb109-0:~$ sudo iptables -L -v -n
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source destination
 730K 169M ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:17070
3373K 1650M ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
  143 8521 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0
81934 8542K ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
    5 160 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp spt:123
  530 26268 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
 497K 30M REJECT all -- * * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited

Revision history for this message
Junien F (axino) wrote :

You need to "juju expose ubuntu" before complaining :)

Revision history for this message
David Lawson (deej) wrote :

My impression is that it shouldn't matter and "open-port X" should open port X, but I may be mistaken. Regardless, exposure doesn't impact this failure mode, I've exposed telegraf on this unit and there's no firewall rule for it.

Revision history for this message
Richard Harding (rharding) wrote :

open-port doesn't change any firewall rules. It records in the model that, if the application is exposed, those ports should be opened. It doesn't open them itself, because not every deployment of an application is meant for exposed/public use.

expose then looks at the list of ports that are defined to be opened and adjusts the firewall rules.

It appears that's not working appropriately in this case.
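The two-step flow described above can be sketched as a small shell helper. This is only an illustration: the function name `open_and_expose` and the `JUJU` override variable are hypothetical, and the `juju run --unit` form is the Juju 2.x syntax used elsewhere in this report.

```shell
# Hypothetical helper, not part of Juju. Set JUJU=echo for a dry run.
JUJU="${JUJU:-juju}"

# open_and_expose APP UNIT PORT/PROTO
open_and_expose() {
    app=$1 unit=$2 port=$3
    # Step 1: record the port in the model (per the description above,
    # this alone is not expected to change any firewall rule).
    $JUJU run --unit "$unit" "open-port $port" || return 1
    # Step 2: expose the application so the recorded ports get opened.
    $JUJU expose "$app"
}
```

For example, `JUJU=echo; open_and_expose ubuntu ubuntu/0 37017/tcp` prints the two juju invocations without running them.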

Juju manages the firewall at the cloud API level and doesn't manage machine-based software like iptables. Looking at the provider code:

https://github.com/juju/juju/blob/develop/provider/oci/firewall.go

It looks like the firewall is not implemented.

There is support around security lists though in the networking code. Looking at the docs we should be using https://docs.cloud.oracle.com/iaas/Content/Network/Concepts/securitylists.htm but we'd not expect a firewall to show any updates on the host machine.

Thanks for the added info.

Revision history for this message
David Lawson (deej) wrote :

Interestingly, as I mentioned in bug 1834972, juju DOES actually open the API port in the host firewall when it bootstraps, so it's interacting with the firewall to some extent.

Changed in juju:
importance: Undecided → High
Revision history for this message
Tim Penhey (thumper) wrote :

Expose should only be necessary to expose the port outside of the model. Meaning open-port should make the connection available to other applications within the model.

If there is per instance firewalling using ip tables, then open-port should at least make that port available to the model. Exposing it should then have the firewall updated to accept external connections as well.

Revision history for this message
Ian Booth (wallyworld) wrote :

I'm marking this as Fix Committed as it appears that the core issue mentioned in this bug - the need for instance level firewalling to support open-port - has been done in

https://github.com/juju/juju/pull/10288

There's also commentary about HA not working due to port 37017 not being opened between controller instances. This is bug 1834972 and is fixed in

https://github.com/juju/juju/pull/10531

which will appear in Juju 2.6.9 (already it's available in the 2.6 edge snap).

Please feel free to re-open if necessary.

Changed in juju:
status: Triaged → Fix Committed
Revision history for this message
Haw Loeung (hloeung) wrote :

Re-opening this. With Juju 2.7.2 and ports open, firewall rules are still missing:

Juju status:

| content-cache active 2 content-cache local 0 ubuntu exposed
| ...
| content-cache/1* active idle 1 150.136.239.201 80/tcp,9145/tcp ready

Yet, firewall rules on the unit:

| ubuntu@juju-5687ef-1:~$ sudo iptables -L -vn
| Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
| pkts bytes target prot opt in out source destination
| 461K 427M ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
| 13 697 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0
| 23980 1975K ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
| 0 0 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp spt:123
| 994 59196 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
| 4589 228K REJECT all -- * * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
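The comparison above (ports shown in juju status versus rules present on the unit) can be automated with a short shell function. This is a rough sketch, not a Juju tool: the name `missing_ports` is hypothetical, and it only looks for per-port ACCEPT rules in numeric (`-n`) listings, so it ignores broader rules that might also admit the traffic.

```shell
# missing_ports PORTS < iptables-listing
# PORTS is the comma-separated "Ports" field from juju status,
# e.g. "80/tcp,9145/tcp". Stdin is the output of `iptables -L INPUT -n`
# (with or without -v). Prints each port that has no ACCEPT rule for
# its protocol and destination port.
missing_ports() {
    rules=$(cat)
    for entry in $(printf '%s' "$1" | tr ',' ' '); do
        port=${entry%/*}    # "80/tcp" -> "80"
        proto=${entry#*/}   # "80/tcp" -> "tcp"
        if ! printf '%s\n' "$rules" |
            grep -Eq "ACCEPT +$proto .*dpt:$port( |\$)"; then
            printf '%s\n' "$entry"
        fi
    done
}
```

On the unit, something like `sudo iptables -L INPUT -n | missing_ports "80/tcp,9145/tcp"` would then list the ports that juju status reports as open but that have no matching rule.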

Changed in juju:
status: Fix Committed → Confirmed
assignee: Joseph Phillips (manadart) → nobody
Revision history for this message
Haw Loeung (hloeung) wrote :

Also, can the rules use something else rather than icmp-host-prohibited? That gives back 'No route to host' which is wrong:

| $ nc -vz 150.136.239.201 80
| nc: connect to 150.136.239.201 port 80 (tcp) failed: No route to host

This made me waste a bit of time trying to figure out if it's routing somewhere between our network (and my local network) and OCI.

For TCP, perhaps tcp-reset?

For UDP, or the rest, icmp-port-unreachable?
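As an illustration of that suggestion, the final REJECT rule could be chosen per protocol roughly as follows. This is a sketch of the proposal only, not what Juju or the OCI images actually configure; the function name `reject_rule` is hypothetical, and it prints the iptables arguments rather than applying them.

```shell
# reject_rule PROTO
# Prints (does not apply) an INPUT-chain REJECT rule for PROTO.
# tcp-reset is only valid together with -p tcp, so every other
# protocol falls back to icmp-port-unreachable.
reject_rule() {
    case $1 in
        tcp) printf '%s\n' '-A INPUT -p tcp -j REJECT --reject-with tcp-reset' ;;
        *)   printf '%s\n' "-A INPUT -p $1 -j REJECT --reject-with icmp-port-unreachable" ;;
    esac
}
```

With a TCP reset, a client such as nc would typically report "Connection refused" instead of "No route to host", which points at the host firewall rather than at routing.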

Revision history for this message
Gary.Wang (gary-wzl77) wrote :

Deployed Anbox Cloud on OCI with juju 2.9.32 and hit the same issue.
See the output of juju status:
https://paste.ubuntu.com/p/WT8ZsK7Y3V/

1. Access the LXD daemon from an external network(my host):

$ nc -vz 132.145.160.190 8443
nc: connect to 132.145.160.190 port 8443 (tcp) failed: No route to host

2. Access the LXD daemon from the AMS unit
  - With the private address (the IP addresses of the LXD unit and the AMS unit are in the same subnet)
    $ juju ssh ams/0 -- "nc -vz 10.0.0.122 8443"
      nc: connect to 10.0.0.122 port 8443 (tcp) failed: No route to host
  - With the public address
    $ juju ssh ams/0 -- "nc -vz 132.145.160.190 8443"
      nc: connect to 132.145.160.190 port 8443 (tcp) failed: No route to host

3. Access the LXD daemon from the LXD unit
  - With the private address
    $ juju ssh lxd/0 -- "nc -vz 10.0.0.122 8443"
      Connection to 10.0.0.122 8443 port [tcp/*] succeeded!
  - With the public address
    $ juju ssh lxd/0 -- "nc -vz 132.145.160.190 8443"
      nc: connect to 132.145.160.190 port 8443 (tcp) failed: No route to host

-------------------------------------------------------
 Manually set up an iptables rule to expose the LXD daemon with
  $ juju ssh lxd/0 -- "sudo iptables -I INPUT 1 -p tcp -m state --state NEW -m tcp --dport 8443 -j ACCEPT"

4. Access the LXD daemon from the LXD unit
  - With the private address
    $ juju ssh lxd/0 -- "nc -vz 10.0.0.122 8443"
      Connection to 10.0.0.122 8443 port [tcp/*] succeeded!
  - With the public address
    $ juju ssh lxd/0 -- "nc -vz 132.145.160.190 8443"
      Connection to 132.145.160.190 8443 port [tcp/*] succeeded!

5. Access the LXD daemon from the AMS unit
  - With the private address
    $ juju ssh ams/0 -- "nc -vz 10.0.0.122 8443"
      Connection to 10.0.0.122 8443 port [tcp/*] succeeded!
  - With the public address
    $ juju ssh ams/0 -- "nc -vz 132.145.160.190 8443"
     Connection to 132.145.160.190 8443 port [tcp/*] succeeded!

Revision history for this message
John A Meinel (jameinel) wrote :

We definitely won't fix this in 2.7, we can take a look for 2.9 / 3.0

Ian Booth (wallyworld)
tags: added: oracle-provider
Revision history for this message
Joseph Phillips (manadart) wrote :

I just checked this using "juju expose <app> --to-cidrs <my IP>/32".

This worked to give me access, and I can see the rule affecting the change.

ACCEPT tcp -- <my IP> anywhere tcp dpt:http /* juju ingress */

Revision history for this message
Joseph Phillips (manadart) wrote :

Made it in as far back as 2.5
https://github.com/juju/juju/pull/10288

Revision history for this message
Nicolas Vinuesa (nvinuesa) wrote :

So, I have tried the QA steps from @manadart's PR https://github.com/juju/juju/pull/10288 and indeed they now fail on 2.9.46. The problem seems to be that the sshInstanceConfigurator fails to change or read the iptables rules, which kills the firewaller worker over and over:

```
machine-0: 18:52:15 ERROR juju.worker.dependency "firewaller" manifold worker returned unexpected error: cannot respond to units changes for "machine-0", "90d83707-6a90-48f5-850f-f1599433d42b": configuring ports for address "": : subprocess encountered error code 255
machine-0: 18:54:21 ERROR juju.worker.dependency "firewaller" manifold worker returned unexpected error: cannot respond to units changes for "machine-1", "90d83707-6a90-48f5-850f-f1599433d42b": configuring ports for address "": : subprocess encountered error code 255
machine-0: 18:56:23 ERROR juju.worker.dependency "firewaller" manifold worker returned unexpected error: cannot respond to units changes for "machine-0", "90d83707-6a90-48f5-850f-f1599433d42b": configuring ports for address "": : subprocess encountered error code 255
```

I'm going to investigate.
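To see which machines are affected, log lines like the ones above can be tallied per machine. A minimal sketch, assuming the log text is piped in on stdin in the format shown; the function name `firewaller_failures_by_machine` is hypothetical.

```shell
# firewaller_failures_by_machine < log
# Counts "firewaller" manifold failures per machine named in the
# "units changes for ..." part of each error line.
firewaller_failures_by_machine() {
    awk '
        /"firewaller" manifold worker returned unexpected error/ {
            if (match($0, /units changes for "machine-[0-9]+"/)) {
                m = substr($0, RSTART, RLENGTH)
                gsub(/units changes for |"/, "", m)   # keep only machine-N
                count[m]++
            }
        }
        END { for (m in count) print m, count[m] }
    ' | sort
}
```

It could be fed from a saved log file or from the output of `juju debug-log -m controller`.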

Changed in juju:
assignee: nobody → Nicolas Vinuesa (nvinuesa)
status: Confirmed → In Progress
Revision history for this message
Nicolas Vinuesa (nvinuesa) wrote :

@deej after discussing with @jameinel, we have found a workaround to unblock you (and everyone else coming across this bug). The juju system public key is missing from the machines you create, which prevents the controller from updating iptables and opening the ports you need. The basic idea is to add that key to your model.

These are the steps:
1) Retrieve the public key corresponding to the juju system key (located in the controller machine's authorized keys). Its comment field should read `Juju:juju-system-key`:
$ juju ssh -m controller 0
ubuntu@juju-165b85-0:~$ cat ~/.ssh/authorized_keys
ssh-rsa aaa Juju:juju-client-key
ssh-ed25519 bbb Juju:nicolas@home
ssh-rsa ccc Juju:juju-system-key

(in this case copy the last line, containing the comment `Juju:juju-system-key`)
If you are running HA, repeat this process for all controller machines and copy all of the corresponding keys.

2) Add this (these if you are on HA) key(s) to your model:
$ juju add-ssh-key "ssh-rsa ccc Juju:juju-system-key"

Now you should be able to open ports.

If you want a separate scenario, you can try:
$ juju bootstrap OCI-Cloud c
$ juju add-model m
{steps 1 and 2}
$ juju deploy ubuntu -n 2
$ juju deploy 'juju-qa-network-health'
$ juju expose network-health
$ juju add-relation ubuntu network-health
$ juju run --unit ubuntu/0 curl <public IP of machine 1>:8039

The last step of this scenario should return `pass`.
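Steps 1 and 2 of the workaround can be scripted. The following is a minimal sketch under the assumption that the key line looks like the example in step 1; the function name `system_keys` is hypothetical.

```shell
# system_keys < authorized_keys
# Prints every key line whose comment is exactly Juju:juju-system-key.
# Carriage returns are stripped first, since output captured over an
# ssh session may contain them.
system_keys() {
    tr -d '\r' | grep -E ' Juju:juju-system-key$'
}
```

One possible use is `juju ssh -m controller 0 -- 'cat ~/.ssh/authorized_keys' | system_keys | while IFS= read -r key; do juju add-ssh-key "$key"; done`, repeated for each controller machine when running HA.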

Revision history for this message
Haw Loeung (hloeung) wrote :

@nvinuesa, @deej is no longer with Canonical and no longer assigned to work on this.

We (Canonical IS) have also put our Oracle OCI work on hold, so we are unable to test and confirm this until we resume that work.

Thanks and appreciate the updates.

Revision history for this message
Nicolas Vinuesa (nvinuesa) wrote :
Changed in juju:
status: In Progress → Fix Committed
milestone: none → 3.3.1
milestone: 3.3.1 → 3.3-rc1
Changed in juju:
status: Fix Committed → Fix Released