bifrost on ubuntu fails on dnsmasq task in bifrost

Bug #1844714 reported by Radosław Piliszek
This bug affects 2 people
Affects        Status   Importance  Assigned to  Milestone
kolla-ansible  Triaged  Medium      Unassigned
Train          Triaged  Medium      Unassigned
Ussuri         Triaged  Medium      Unassigned

Bug Description

Task name: Ensure dnsmasq is running with current config
Error message:
Unable to start service dnsmasq: Job for dnsmasq.service failed because the control process exited with error code.

Train release.

See: https://b32d03500643014759d0-c3c850383d3c4b677d9bd09df1596ea2.ssl.cf1.rackcdn.com/682292/3/check/kolla-ansible-bifrost-ubuntu-source/5b991e2/primary/logs/ansible/deploy-bifrost

and: https://review.opendev.org/682292

mgoddard commented:
I think I remember seeing something like this - the DNS resolver in bifrost's dnsmasq clashes with the local resolver on ubuntu. Workaround was to disable DNS in dnsmasq by setting disable_dnsmasq_dns in bifrost.yml.

Side note:
Tagged as CI, as this might well be due to how CI is wired rather than a problem in general, but it is worth investigating nonetheless.

Tags: bifrost ubuntu
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Indeed, Ubuntu images use a local caching resolver in the form of unbound. It binds to localhost:53 (udp/tcp, lo, 127.0.0.1) on the host. There is also systemd-resolved on 127.0.0.53:53, which the resolver does not use.

Is this the default Ubuntu Server behavior?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

There is unbound running on CentOS images in CI as well - must be something else.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

On CentOS:
dnsmasq: failed to create listening socket for 127.0.0.1: Address already in use

but bifrost could not care less. :-)
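
The "Address already in use" error above can be reproduced outside dnsmasq. Below is a minimal sketch, assuming Python 3 is available on the host; the helper name `address_in_use` is hypothetical and is not part of bifrost, kolla-ansible or dnsmasq. It only attempts to bind the address and reports whether the bind fails with EADDRINUSE, which is what happens when another service such as unbound already holds 127.0.0.1:53.

```python
import errno
import socket


def address_in_use(host, port, sock_type=socket.SOCK_STREAM):
    """Return True if binding (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration only. Any other error
    (for example EACCES when binding a port below 1024 without
    root privileges) is re-raised unchanged.
    """
    with socket.socket(socket.AF_INET, sock_type) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        # The bind succeeded, so nothing else held the address;
        # the socket is closed again when the with-block exits.
        return False
```

Run as root, `address_in_use("127.0.0.1", 53)` checks the TCP side and `address_in_use("127.0.0.1", 53, socket.SOCK_DGRAM)` the UDP side. This does not say which process holds the port, only that the address is taken.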

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :
no longer affects: bifrost
Revision history for this message
Jan Horstmann (janhorstmann) wrote :

I have not fully understood the openstack gate yet, but from "https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/nodepool-base/README.rst" I gather that unbound is rather deeply entangled in it.
Maybe it would be possible to use dnsmasq's "--except-interface" parameter to exclude 127.0.0.1 as a workaround. At least when running in CI?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

We get duplicate reports now: https://bugs.launchpad.net/kolla-ansible/+bug/1859503
so this issue is no longer confined to CI.

summary: - CI: bifrost-ubuntu job fails on dnsmasq task in bifrost
+ bifrost on ubuntu fails on dnsmasq task in bifrost