incorrect controller address for cross model consumers
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju | Fix Committed | High | Joseph Phillips | | |
Bug Description
For a cross-model relation, the consuming controller queries the offering controller to find out which IP addresses it should connect to in order to access the offer. There are two APIs available:
// APIHostPortsForAgents returns the collection of API addresses that should
// be used by agents.
// If the controller model is CAAS type, the return will be the controller
// k8s service addresses in cloud service.
// If there is no management network space configured for the controller,
// or if the space is misconfigured, the return will be the same as
// APIHostPortsForClients.
// Otherwise the returned addresses will correspond with the management net space.
// If there is no document at all, we simply fall back to APIHostPortsForClients.
func (st *State) APIHostPortsForAgents
or
// APIHostPortsForClients returns the collection of *all* known API addresses.
func (st *State) APIHostPortsForClients
The API being used is APIHostPortsForAgents.
For k8s offers, the effect is that the consuming side tries to resolve the FQDN address and fails, causing its workers to bounce. On k8s there is no space support, so the only difference between the agent and client addresses is the FQDN. On machine controllers, this can break CMR if the consuming controller can't reach addresses in the offering controller's management space.
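To illustrate the k8s failure mode, here is a minimal Go sketch that separates addresses a remote consumer can dial directly from those that first need DNS resolution. The function name `splitByHostKind` and both sample addresses are hypothetical and not taken from Juju; only the standard library `net` package is used.

```go
package main

import (
	"fmt"
	"net"
)

// splitByHostKind partitions host:port strings into those whose host part is
// an IP literal and those whose host part is a name that must be resolved.
// Hypothetical helper for illustration; it is not part of Juju.
func splitByHostKind(hostPorts []string) (ipLiterals, dnsNames []string) {
	for _, hp := range hostPorts {
		host, _, err := net.SplitHostPort(hp)
		if err != nil {
			// Not in host:port form; classify the whole string instead.
			host = hp
		}
		if net.ParseIP(host) != nil {
			ipLiterals = append(ipLiterals, hp)
		} else {
			dnsNames = append(dnsNames, hp)
		}
	}
	return ipLiterals, dnsNames
}

func main() {
	// Assumed example of what an agent-facing address list might contain on k8s.
	agentAddrs := []string{
		"controller-service.controller-demo.svc.cluster.local:17070",
		"10.152.183.20:17070",
	}
	ips, names := splitByHostKind(agentAddrs)
	fmt.Println("dialable without DNS:", ips)
	fmt.Println("need DNS resolution:", names)
}
```

A cluster-internal name like the one above would typically only resolve inside the offering cluster, which is consistent with the consuming side failing on it.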
Where the offer and consumer are in different controllers, and in the absence of extended network modelling (which Juju does not yet support), Juju should default to choosing the most public addresses available, which means APIHostPortsForClients.
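The proposed default can be sketched in Go as below. This is an illustration under stated assumptions, not the committed fix: `controllerAddrs` and `addrsForConsumer` are hypothetical names, addresses are plain strings rather than Juju's network types, and the same-controller branch is an assumption, since the report only specifies the cross-controller case.

```go
package main

import "fmt"

// controllerAddrs is a hypothetical stand-in for what an offering controller
// knows about its own API endpoints. It is not a Juju type.
type controllerAddrs struct {
	UUID   string   // UUID of the offering controller
	Agent  []string // agent-facing addresses (may be limited to the management space)
	Client []string // all known API addresses
}

// addrsForConsumer picks the addresses to hand to the consumer of an offer.
// A consumer in a different controller is treated as a client and gets the
// client addresses. A consumer in the same controller keeps the agent
// addresses (an assumption), falling back to client addresses if none exist.
func addrsForConsumer(offering controllerAddrs, consumerControllerUUID string) []string {
	if consumerControllerUUID != offering.UUID || len(offering.Agent) == 0 {
		return offering.Client
	}
	return offering.Agent
}

func main() {
	offering := controllerAddrs{
		UUID:   "ctrl-a",
		Agent:  []string{"10.0.0.5:17070"},
		Client: []string{"10.0.0.5:17070", "203.0.113.7:17070"},
	}
	fmt.Println("cross-controller consumer:", addrsForConsumer(offering, "ctrl-b"))
	fmt.Println("same-controller consumer: ", addrsForConsumer(offering, "ctrl-a"))
}
```

Keying the decision on the controller UUID keeps the policy independent of any network modelling: a remote controller never receives a list restricted to a space it may not be able to route to.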
Changed in juju:
assignee: nobody → Joseph Phillips (manadart)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
milestone: 3.4.5 → 3.4.6
For cross-model interactions across controllers, I think it is correct to model the connection to the remote controller as a "Client" connection, not as an "Agent" connection. (APIHostPortsForAgents is very much for things that are in the "same" location as the controller, and are thus expected to be able to route to the controllers at `juju-ha-space`.)
If you consider a CMR that had one controller in GCE and one controller in AWS, I would expect that the agents running in GCE would access the controller in an HA space that was local to GCE, and the agents in AWS would access the controller in the HA space that was local to AWS, but that the GCE controller should reach the AWS controller as a normal 'client' connection.