juju 3.5 / jammy
rabbitmq-server charm channel 3.9/stable rev 188
When deploying rabbitmq-server in a MicroCloud environment (microceph, microovn, lxd) with a juju profile that enables the OVN network, the rabbitmq-server charm sits at "(install) Installing/upgrading RabbitMQ packages" indefinitely.
Inspecting the machine shows that the rabbitmq-server Debian package fails to install. The package eventually fails during configuration (dpkg --configure) with the following output:
```
BOOT FAILED
===========
Exception during startup:
error:{badmatch,{error,timeout}}
rabbit_prelaunch_dist:dist_port_use_check_fail/2, line 127
rabbit_prelaunch_dist:setup/1, line 22
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0>
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> BOOT FAILED
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> ===========
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> Exception during startup:
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0>
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> error:{badmatch,{error,timeout}}
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0>
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> rabbit_prelaunch_dist:dist_port_use_check_fail/2, line 127
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> rabbit_prelaunch_dist:setup/1, line 22
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> rabbit_prelaunch:do_run/0, line 115
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> rabbit_prelaunch:run_prelaunch_first_phase/0, line 32
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> supervisor:do_start_child_i/3, line 414
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> supervisor:do_start_child/2, line 400
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> supervisor:-start_children/2-fun-0-/3, line 384
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0> supervisor:children_map/4, line 1250
2024-09-16 19:46:48.733741+00:00 [erro] <0.130.0>
rabbit_prelaunch:do_run/0, line 115
rabbit_prelaunch:run_prelaunch_first_phase/0, line 32
supervisor:do_start_child_i/3, line 414
supervisor:do_start_child/2, line 400
supervisor:-start_children/2-fun-0-/3, line 384
supervisor:children_map/4, line 1250
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> supervisor: {local,rabbit_prelaunch_sup}
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> errorContext: start_error
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> reason: {badmatch,{error,timeout}}
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> offender: [{pid,undefined},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {id,prelaunch},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {mfargs,{rabbit_prelaunch,run_prelaunch_first_phase,[]}},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {restart_type,transient},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {significant,false},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {shutdown,5000},
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0> {child_type,worker}]
2024-09-16 19:46:49.737111+00:00 [erro] <0.130.0>
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> crasher:
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> initial call: application_master:init/4
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> pid: <0.128.0>
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> registered_name: []
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> exception exit: {{shutdown,
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> {failed_to_start_child,prelaunch,
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> {badmatch,{error,timeout}}}},
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> {rabbit_prelaunch_app,start,[normal,[]]}}
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> in function application_master:init/4 (application_master.erl, line 142)
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> ancestors: [<0.127.0>]
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> message_queue_len: 1
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> messages: [{'EXIT',<0.129.0>,normal}]
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> links: [<0.127.0>,<0.44.0>]
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> dictionary: []
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> trap_exit: true
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> status: running
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> heap_size: 376
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> stack_size: 29
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> reductions: 168
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0> neighbours:
2024-09-16 19:46:49.737314+00:00 [erro] <0.128.0>
2024-09-16 19:46:49.741671+00:00 [noti] <0.44.0> Application rabbitmq_prelaunch exited with reason: {{shutdown,{failed_to_start_child,prelaunch,{badmatch,{error,timeout}}}},{rabbit_prelaunch_app,start,[normal,[]]}}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{badmatch,{error,timeout}}}},{rabbit_prelaunch_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{badmatch,{error,timeout}}}},{rabbit_prelaunch_app,start,[normal,[]]}}})
Crash dump is being written to: erl_crash.dump...done
Job for rabbitmq-server.service failed because the control process exited with error code.
See "systemctl status rabbitmq-server.service" and "journalctl -xeu rabbitmq-server.service" for details.
invoke-rc.d: initscript rabbitmq-server, action "start" failed.
● rabbitmq-server.service - RabbitMQ Messaging Server
Loaded: loaded (/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Mon 2024-09-16 19:47:07 UTC; 5ms ago
Process: 7331 ExecStart=/usr/lib/rabbitmq/bin/rabbitmq-server (code=exited, status=1/FAILURE)
Main PID: 7331 (code=exited, status=1/FAILURE)
Status: "Standing by"
CPU: 523ms
dpkg: error processing package rabbitmq-server (--configure):
installed rabbitmq-server package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
rabbitmq-server
```
It should also be noted that /etc/hosts is never populated with an entry for the machine's IP address and container name.
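Since the missing /etc/hosts entry may matter here, a quick way to see whether the machine can resolve its own hostname is sketched below. This is only an illustrative check, not something the charm or the package runs; the helper name `resolve_hostname` is made up for this example.

```python
import socket


def resolve_hostname(name):
    """Return the sorted unique addresses `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Resolution failed, e.g. no /etc/hosts entry and no DNS record.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = socket.gethostname()
    addrs = resolve_hostname(host)
    if addrs:
        print(f"{host} resolves to: {', '.join(addrs)}")
    else:
        print(f"{host} does not resolve; check /etc/hosts and DNS")
```

Running this inside the affected machine and inside a working bridged one would show whether name resolution differs between the two setups.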
The failure occurs in both an LXD container and an LXD virtual machine.
If the same scenario is applied to a profile/model with local bridge networking enabled, the charm installs as expected. This is true for both containers and virtual machines; bridged networking does not exhibit this behavior.
This feels more like an issue in the rabbitmq-server Debian package than in the charm itself. The charm simply runs the apt install command, and it's failing there.
Given that, I'm wondering whether the rabbitmq service fails to start within a reasonable time but completes after the fact. The following line in the trace suggests that the RabbitMQ (distribution) port is already in use:
rabbit_prelaunch_dist:dist_port_use_check_fail/2, line 127
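
To test the "port already in use" theory directly, something like the following could be run inside the machine while the package is stuck. It is a rough sketch under assumptions: it uses RabbitMQ's default ports (25672 for inter-node distribution, 4369 for epmd), which may not match a customised setup, and the helper names `port_is_free` and `can_connect` are invented for this example.

```python
import errno
import socket

DIST_PORT = 25672  # default distribution port (AMQP 5672 + 20000); assumed not overridden
EPMD_PORT = 4369   # default epmd port; assumed not overridden


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # permission problems etc. are not "port in use"
    return True


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    print("distribution port free:", port_is_free(DIST_PORT))
    print("epmd reachable via hostname:", can_connect(socket.gethostname(), EPMD_PORT))
```

If the distribution port turns out to be free while the connection via the hostname times out, that would point at name resolution or the OVN network path rather than a genuine port conflict, but that is speculation until checked on the affected machine.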