nova dns lookups can block the nova api process leading to 503 errors.

Bug #1964149 reported by sean mooney
This bug affects 1 person
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  Medium      sean mooney
Train                     In Progress   Undecided   Unassigned
Ussuri                    In Progress   Undecided   Unassigned
Victoria                  Fix Released  Undecided   Unassigned
Wallaby                   Fix Released  Undecided   Unassigned
Xena                      Fix Released  Undecided   Unassigned

Bug Description

We currently have 4 possibly related downstream bugs in which DNS lookups
result in 503 errors. Nova does not monkey patch DNS with eventlet's greendns, so name resolution is a blocking call.

Specifically, we have seen calls to socket.getaddrinfo in py-amqp block the API
when using IPv6.

https://bugzilla.redhat.com/show_bug.cgi?id=2037690
https://bugzilla.redhat.com/show_bug.cgi?id=2050867
https://bugzilla.redhat.com/show_bug.cgi?id=2051631
https://bugzilla.redhat.com/show_bug.cgi?id=2056504

Copying a summary of the RCA from one of the bugs:

What happens:

- A request comes in which requires rpc, so a new connection to rabbitmq is to be established

- The hostname(s) from the transport_url setting are ultimately passed to py-amqp, which attempts to resolve the hostname to an ip address so it can set up the underlying socket and connect

- py-amqp explicitly tries to resolve with AF_INET first, and only if that fails does it try AF_INET6[1]

- The customer environment is primarily IPv6. Attempting to resolve the hostname via AF_INET fails in nss_hosts (the /etc/hosts file only has IPv6 addrs), and falls through to nss_dns

- Something about the customer DNS infrastructure is slow, so it takes a long time (~10 seconds) for this IPv4-lookup to fail.

- py-amqp finally tries with AF_INET6 and the hostname is resolved immediately via nss_hosts because the entry is in the /etc/hosts
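The resolution order described in the steps above can be sketched roughly as follows. This is a simplified illustration, not py-amqp's actual code (see [1] for that); the function name `resolve_ipv4_first` and the injectable `resolver` parameter are hypothetical and exist only so the behaviour can be exercised without a real DNS server.

```python
import socket


def resolve_ipv4_first(host, port, resolver=socket.getaddrinfo):
    """Try AF_INET first and fall back to AF_INET6 only if that fails.

    Hypothetical helper mirroring the order described in the RCA.
    """
    last_error = None
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            # Each call blocks the calling thread until the resolver
            # answers or gives up. A slow IPv4 failure therefore delays
            # the IPv6 attempt by the full failure time.
            return family, resolver(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror as exc:
            last_error = exc
    raise last_error


# Usage with a fake resolver that only knows an IPv6 address, similar to
# an /etc/hosts file that contains only IPv6 entries.
calls = []


def ipv6_only_resolver(host, port, family, socktype):
    calls.append(family)
    if family == socket.AF_INET:
        raise socket.gaierror(socket.EAI_NONAME, "no IPv4 address")
    return [(family, socktype, 6, "", ("::1", port, 0, 0))]


family, infos = resolve_ipv4_first("rabbit.example", 5672, ipv6_only_resolver)
```

In an IPv6-only environment the AF_INET attempt always has to fail before the AF_INET6 attempt runs, so the cost of that failure is paid on every new connection.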

Critically, because nova explicitly disables greendns[2] with eventlet, the *entire* nova-api worker is blocked for the duration of the slow name resolution, because socket.getaddrinfo is a blocking call into glibc.

[1] https://github.com/celery/py-amqp/blob/1f599c7213b097df07d0afd7868072ff9febf4da/amqp/transport.py#L155-L208
[2] https://github.com/openstack/nova/blob/master/nova/monkey_patch.py#L25-L36
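The effect of a blocking lookup on a cooperative scheduler can be shown with a small analogy. The sketch below uses the standard library's asyncio instead of eventlet (an assumption made so it runs without extra dependencies), and `time.sleep` stands in for a slow `socket.getaddrinfo`. All names here are hypothetical. It counts how often a background "heartbeat" task gets to run while the lookup is in progress.

```python
import asyncio
import time


def slow_lookup(delay):
    # Stand-in for a getaddrinfo call that takes a long time to fail.
    time.sleep(delay)
    return "resolved"


async def heartbeat(log, stop):
    # Represents other requests the same worker should keep serving.
    while not stop.is_set():
        log.append("tick")
        await asyncio.sleep(0.01)


async def ticks_during_lookup(blocking, delay=0.2):
    log = []
    stop = asyncio.Event()
    task = asyncio.create_task(heartbeat(log, stop))
    await asyncio.sleep(0)  # let the heartbeat start
    before = len(log)
    if blocking:
        # Called directly: the event loop cannot switch tasks until
        # the call returns.
        slow_lookup(delay)
    else:
        # Offloaded: the event loop keeps scheduling other tasks.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, slow_lookup, delay)
    during = len(log) - before
    stop.set()
    await task
    return during


blocked = asyncio.run(ticks_during_lookup(blocking=True))
offloaded = asyncio.run(ticks_during_lookup(blocking=False))
print("ticks while blocked:", blocked)
print("ticks while offloaded:", offloaded)
```

With the direct call the heartbeat makes no progress at all during the lookup, which corresponds to every request on the worker stalling. greendns addresses the same problem differently, by doing the lookup in a way that yields to other green threads rather than by using a thread pool.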

nova currently disables the greendns monkey patch because of a very old bug seen on CentOS 6 with Python 2.6 in the Havana release of nova: https://bugs.launchpad.net/nova/+bug/1164822
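For anyone checking whether a given process is affected, here is a minimal sketch of a check. It assumes that the opt-out is expressed through the `EVENTLET_NO_GREENDNS` environment variable and that eventlet treats the value `yes` (case-insensitively) as "do not patch DNS" when it is imported. The helper name `greendns_disabled` is hypothetical.

```python
import os


def greendns_disabled(environ=os.environ):
    """Return True if the environment asks eventlet to skip greendns.

    Assumption: eventlet reads EVENTLET_NO_GREENDNS at import time and
    only the value "yes" disables the DNS monkey patch.
    """
    return environ.get("EVENTLET_NO_GREENDNS", "").lower() == "yes"


# Usage: inspect explicit mappings rather than the live environment.
with_workaround = greendns_disabled({"EVENTLET_NO_GREENDNS": "yes"})
without_workaround = greendns_disabled({})
```

Because the variable is assumed to be read when eventlet is imported, the check only reflects reality if it sees the same environment that the service saw at startup.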

eventlet added IPv6 support in v0.17, the same release that added Python 3 support, back in 2015:
https://github.com/eventlet/eventlet/issues/8#issuecomment-75490457

so we should not need to work around the lack of ipv6 support anymore.
https://review.opendev.org/c/openstack/nova/+/830966

sean mooney (sean-k-mooney) wrote :

nit: technically this could block on any DNS lookup, not just in the API, for example in the compute agent when making calls to neutron, etc.

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
importance: Undecided → Medium
status: New → Triaged
tags: added: api yoga-rc-potential
Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/830966
Committed: https://opendev.org/openstack/nova/commit/fe1ebe69f358cbed62434da3f1537a94390324bb
Submitter: "Zuul (22348)"
Branch: master

commit fe1ebe69f358cbed62434da3f1537a94390324bb
Author: Sean Mooney <email address hidden>
Date: Fri Feb 25 11:09:50 2022 +0000

    reenable greendns in nova.

    Back in the days of centos 6 and python 2.6 eventlet
    greendns monkeypatching broke ipv6. As a result nova
    has run without greendns monkey patching ever since.
    This removes that old workaround allowing modern
    eventlet to use greendns for non blocking dns lookups.

    Closes-Bug: #1964149
    Change-Id: Ia511879d2f5f50a3f63d180258abccf046a7264e

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.0.0.0rc1

This issue was fixed in the openstack/nova 25.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/833411

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/833435

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/833436

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/833437

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/833438

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/833411
Committed: https://opendev.org/openstack/nova/commit/a913ab1aabb827771e54ef64b579cebe44ad53d1
Submitter: "Zuul (22348)"
Branch: stable/xena

commit a913ab1aabb827771e54ef64b579cebe44ad53d1
Author: Sean Mooney <email address hidden>
Date: Fri Feb 25 11:09:50 2022 +0000

    reenable greendns in nova.

    Back in the days of centos 6 and python 2.6 eventlet
    greendns monkeypatching broke ipv6. As a result nova
    has run without greendns monkey patching ever since.
    This removes that old workaround allowing modern
    eventlet to use greendns for non blocking dns lookups.

    Closes-Bug: #1964149
    Change-Id: Ia511879d2f5f50a3f63d180258abccf046a7264e
    (cherry picked from commit fe1ebe69f358cbed62434da3f1537a94390324bb)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/833435
Committed: https://opendev.org/openstack/nova/commit/edb8bcb0294fb1488c49ad76f2422378d0c495eb
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit edb8bcb0294fb1488c49ad76f2422378d0c495eb
Author: Sean Mooney <email address hidden>
Date: Fri Feb 25 11:09:50 2022 +0000

    reenable greendns in nova.

    Back in the days of centos 6 and python 2.6 eventlet
    greendns monkeypatching broke ipv6. As a result nova
    has run without greendns monkey patching ever since.
    This removes that old workaround allowing modern
    eventlet to use greendns for non blocking dns lookups.

    Closes-Bug: #1964149
    Change-Id: Ia511879d2f5f50a3f63d180258abccf046a7264e
    (cherry picked from commit fe1ebe69f358cbed62434da3f1537a94390324bb)
    (cherry picked from commit a913ab1aabb827771e54ef64b579cebe44ad53d1)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.2.2

This issue was fixed in the openstack/nova 23.2.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 24.2.0

This issue was fixed in the openstack/nova 24.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/833438
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/nova/+/833436
Committed: https://opendev.org/openstack/nova/commit/f02099ea537aa3feba15cfea900a58b5c593f343
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit f02099ea537aa3feba15cfea900a58b5c593f343
Author: Sean Mooney <email address hidden>
Date: Fri Feb 25 11:09:50 2022 +0000

    reenable greendns in nova.

    Back in the days of centos 6 and python 2.6 eventlet
    greendns monkeypatching broke ipv6. As a result nova
    has run without greendns monkey patching ever since.
    This removes that old workaround allowing modern
    eventlet to use greendns for non blocking dns lookups.

    Closes-Bug: #1964149
    Change-Id: Ia511879d2f5f50a3f63d180258abccf046a7264e
    (cherry picked from commit fe1ebe69f358cbed62434da3f1537a94390324bb)
    (cherry picked from commit a913ab1aabb827771e54ef64b579cebe44ad53d1)
    (cherry picked from commit edb8bcb0294fb1488c49ad76f2422378d0c495eb)

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/ussuri)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/833437
Reason: stable/ussuri branch of openstack/nova transitioned to End of Life and is about to be deleted. To be able to do that, all open patches need to be abandoned.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova victoria-eom

This issue was fixed in the openstack/nova victoria-eom release.
