getaddrinfo() returns -11 (EAI_SYSTEM) instead of -2

Bug #1154599 reported by Scott Moser
This bug affects 7 people
Affects               Status        Importance  Assigned to  Milestone
eglibc (Ubuntu)       Fix Released  Medium      Unassigned
  Raring              Won't Fix     Undecided   Unassigned
python2.7 (Ubuntu)    Fix Released  Medium      Unassigned
  Raring              Won't Fix     Undecided   Unassigned

Bug Description

$ cat lookup.py
#!/usr/bin/python
import sys, socket
names = ["slashdot.org", "foooooooooowhizzzzzzzz.com"]
if len(sys.argv) > 1:
   names = sys.argv[1:]
for iname in names:
    try:
        result = socket.getaddrinfo(iname, None, 0, 0, socket.SOCK_STREAM,
                                    socket.AI_CANONNAME)
        for (fam, stype, proto, cname, sockaddr) in result:
            sys.stdout.write("cname=%s, sockaddr=%s\n" % (cname, sockaddr))
    except socket.gaierror as error:
        sys.stderr.write("%s failed lookup" % iname)

$ python2.7 lookup.py
cname=slashdot.org, sockaddr=('216.34.181.45', 0)
Traceback (most recent call last):
  File "/tmp/x.py", line 10, in <module>
    socket.AI_CANONNAME)
socket.error: [Errno 2] No such file or directory

shell returned 1

$ dpkg -S /usr/bin/python2.7
python2.7-minimal: /usr/bin/python2.7
$ dpkg-query --show python2.7-minimal
python2.7-minimal 2.7.3-16ubuntu1

This is a behavioral change from quantal (2.7.3-5ubuntu4).

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: python 2.7.3-10ubuntu5
ProcVersionSignature: Ubuntu 3.5.0-21.32-generic 3.5.7.1
Uname: Linux 3.5.0-21-generic x86_64
ApportVersion: 2.9.1-0ubuntu1
Architecture: amd64
Date: Wed Mar 13 09:52:55 2013
EcryptfsInUse: Yes
InstallationDate: Installed on 2011-10-19 (511 days ago)
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
MarkForUpload: True
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: python-defaults
UpgradeStatus: Upgraded to raring on 2013-01-07 (64 days ago)

Scott Moser (smoser) wrote :
affects: python-defaults (Ubuntu) → python2.7 (Ubuntu)
Scott Kitterman (kitterman) wrote :

Works for me.

Scott Moser (smoser) wrote :

simplified a little bit:

#!/usr/bin/python
import sys, socket
names = ["slashdot.org.", "foooooooooowhizzzzzzzz.com."]
if len(sys.argv) > 1:
   names = sys.argv[1:]
for iname in names:
    try:
        sys.stdout.write("== %s ==\n" % iname)
        result = socket.getaddrinfo(iname, None)
        for (fam, stype, proto, cname, sockaddr) in result:
            sys.stdout.write(" sockaddr=%s\n" % str(sockaddr))
    except socket.gaierror as error:
        sys.stderr.write("%s failed lookup\n" % iname)

It must be local networking related, because it works for me elsewhere also.
But here:
== slashdot.org. ==
  sockaddr=('216.34.181.45', 0)
  sockaddr=('216.34.181.45', 0)
  sockaddr=('216.34.181.45', 0)
== foooooooooowhizzzzzzzz.com. ==
Traceback (most recent call last):
  File "/tmp/lookup.py", line 9, in <module>
    result = socket.getaddrinfo(iname, None)
socket.error: [Errno 2] No such file or directory

I'm not really sure what differs between the two systems that are failing and the ones that work. I even tried installing dnsmasq, as my failing system was a desktop system.
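
When comparing machines, it can help to have a variant of the script above that reports which exception class was raised instead of dying with a traceback. The following is a minimal sketch, assuming Python 3.3 or later; `probe` and its `lookup` parameter are hypothetical names introduced here so that the function can be exercised without a network.

```python
import socket


def probe(name, lookup=socket.getaddrinfo):
    """Return (kind, code) describing how a lookup of `name` ended.

    `lookup` is a hypothetical injection point (not in the original
    script) so the function can be tested without touching the network.
    """
    try:
        lookup(name, None)
    except socket.gaierror as exc:
        # getaddrinfo() returned an EAI_* status that Python reports
        # as gaierror; args[0] carries that status.
        return ("gaierror", exc.args[0])
    except OSError as exc:
        # Any other OSError, including the case where getaddrinfo()
        # returned EAI_SYSTEM and Python reported errno instead.
        return ("oserror", exc.errno)
    return ("ok", 0)
```

Calling `probe("foooooooooowhizzzzzzzz.com.")` on a working and on a failing system should then show directly whether the two differ in exception class, in code, or in both.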

Changed in python2.7 (Ubuntu):
importance: Undecided → Medium
Changed in python:
status: Unknown → New
Barry Warsaw (barry) wrote :

For me, this fails on Ubuntu 13.04 both with system Python and upstream's hg trunk, so it's unlikely to be an Ubuntu delta to Python. It does *not* fail for me with hg trunk on Wheezy. I've closed the Python bug as invalid.

Barry Warsaw (barry) wrote :

The problem is that getaddrinfo() returns different error codes on Ubuntu.

On Wheezy, getaddrinfo() returns -2 with errno set to 2 (ENOENT). On Raring, getaddrinfo() returns -11 (EAI_SYSTEM) with errno also set to 2 (ENOENT). Python has this bit of code to set the error type based on the return value of getaddrinfo() -- see Modules/socketmodule.c:

static PyObject *
set_gaierror(int error)
{
    PyObject *v;

#ifdef EAI_SYSTEM
    /* EAI_SYSTEM is not available on Windows XP. */
    if (error == EAI_SYSTEM)
        return set_error();
#endif

#ifdef HAVE_GAI_STRERROR
    v = Py_BuildValue("(is)", error, gai_strerror(error));
#else
    v = Py_BuildValue("(is)", error, "getaddrinfo failed");
#endif
    if (v != NULL) {
        PyErr_SetObject(socket_gaierror, v);
        Py_DECREF(v);
    }

    return NULL;
}

So now it should be obvious why a different error type is getting raised in Python (i.e. socket.error vs socket.gaierror). This is likely a change in glibc's getaddrinfo() and nothing to do with Python.

Wheezy has libc6 2.13-38, Raring has libc6 2.17-0ubuntu4
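
The effect of that C branch can be modelled in a few lines of Python. This is only an illustrative sketch for Python 3, not CPython's implementation: `classify` is a hypothetical name, and -11 for EAI_SYSTEM is the glibc value quoted in this report.

```python
import os
import socket

EAI_SYSTEM = -11  # glibc value as quoted in this report; an assumption elsewhere


def classify(status, err):
    """Build the exception a set_gaierror()-style dispatch would produce.

    status: return value of getaddrinfo(); err: errno at that moment.
    """
    if status == EAI_SYSTEM:
        # System error: the EAI status is dropped and errno is reported.
        return OSError(err, os.strerror(err))
    # Any other status becomes a gaierror carrying the EAI code.
    return socket.gaierror(status, "getaddrinfo failed")


wheezy = classify(-2, 2)    # status / errno observed on Wheezy
raring = classify(-11, 2)   # status / errno observed on Raring
print(isinstance(wheezy, socket.gaierror), wheezy.args[0])
print(isinstance(raring, socket.gaierror), raring.args[0])
```

Under this model the Wheezy pair yields a gaierror while the Raring pair yields a plain system error, which is why an `except socket.gaierror` clause stops matching.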

Barry Warsaw (barry) wrote :

Reproducible with pure C, so this has nothing to do with Python.

no longer affects: python
affects: python2.7 (Ubuntu) → eglibc (Ubuntu)
Barry Warsaw (barry) wrote :

$ gcc -g -o gaitest gaitest.c
$ ./gaitest

Wheezy:

% ./gaitest
status = -2, errno = 2

Raring:

% ./gaitest
status = -11, errno = 2

summary: - dns lookup failure raises socket.error not socket.gaierror
+ getaddrinfo() returns -11 instead of -2
summary: - getaddrinfo() returns -11 instead of -2
+ getaddrinfo() returns -11 (EAI_SYSTEM) instead of -2
Matthias Klose (doko)
Changed in python2.7 (Ubuntu):
importance: Undecided → Medium
Barry Warsaw (barry) wrote :

Yep, experimental's 2.17-0experimental2 prints the same output as raring.

Adam Conrad (adconrad) wrote :

Colour me confused, but this seems to be behaving exactly as the manpage suggests it can. Where is the actual bug in eglibc here? It seems monumentally unclever for python to raise a different exception type depending on how gai() returns, which seems to be what's causing the headache here.

Barry Warsaw (barry) wrote : Re: [Bug 1154599] Re: getaddrinfo() returns -11 (EAI_SYSTEM) instead of -2

On Apr 03, 2013, at 03:04 PM, Adam Conrad wrote:

>Colour me confused, but this seems to be behaving exactly as the manpage
>suggests it can. Where is the actual bug in eglibc here?

It's a change in behavior somewhere between 2.13 and 2.17.

>It seems monumentally unclever for python to raise a different exception type
>depending on how gai() returns, which seems to be what's causing the headache
>here.

Except that it's been that way in Python for *years*. It might be worth
changing this for 3.4, but it can't be changed in anything earlier.

OTOH, by Python 3.3 it probably makes no sense to catch socket.error directly
because of the exception hierarchy reorganization in PEP 3151. Now,
socket.error *is* OSError and socket.gaierror is a subclass of OSError, so
going forward, it makes sense just to catch OSError when you don't care about
the distinction.
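
For code that does not care about the distinction, the approach described above might look like the following sketch. It assumes Python 3.3 or later; `resolve` and its `lookup` parameter are hypothetical names used only for illustration.

```python
import socket


def resolve(name, lookup=socket.getaddrinfo):
    """Return the sockaddr tuples for `name`, or None if the lookup failed.

    Assumes Python 3.3+, where socket.error is an alias of OSError and
    socket.gaierror is a subclass of it. `lookup` is a hypothetical hook
    that lets the function be tested without a network.
    """
    try:
        return [sockaddr
                for (_fam, _stype, _proto, _cname, sockaddr)
                in lookup(name, None)]
    except OSError:
        # Covers socket.gaierror (EAI_NONAME and friends) as well as the
        # plain OSError raised when getaddrinfo() returns EAI_SYSTEM.
        return None
```

With a single `except OSError` clause, a caller sees the same result whether the C library reported the failure as -2 or as -11.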

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in eglibc (Ubuntu):
status: New → Confirmed
Changed in python2.7 (Ubuntu):
status: New → Confirmed
Martin Capitanio (capnm) wrote :

Note: the glibc original is already fixed.
http://sourceware.org/bugzilla/show_bug.cgi?id=15339

Thomas Hood (jdthood) wrote :

(Copied here from a post by me to debian-devel.)

Executive summary: getaddrinfo() returns different values
depending on the OS and on nsswitch.conf settings, making it
very difficult to use getaddrinfo() return values to decide how
to handle an error.

Here are the results of further experiments with getaddrinfo().
I am using the attached x.c program. It tries to look up the
valid domain name 'www.google.com' and an invalid name
four times:
* once with empty /etc/resolv.conf (in which case the resolver tries 127.0.0.1:53)
* once with /etc/resolv.conf pointing to a working nameserver on my LAN
* once with empty /etc/resolv.conf (again)
* once with /etc/resolv.conf pointing to an IP address where there is no working nameserver

OS is Debian 7.0.

================================
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 2
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
================================

As I saw before, status is always -2 (EAI_NONAME). The manpage
doesn't say that errno is significant in that case. (It is significant
when status is -11.)

Helmut got different results. Is the difference between my machine
and Helmut's machine attributable to some diff in nsswitch.conf,
perhaps? I have:

    hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

I tested next with

    hosts: dns

and got different results.

================================
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
================================

Now errno for an empty or incorrect resolv.conf is 111 (ECONNREFUSED).
And with a correct resolv.conf and a bogus domain name, errno is 101
(ENETUNREACH). That doesn't make too much sense, but as I
said, I don't think we are supposed to pay attention to errno if
status is -2.

Next I ran the program on Ubuntu 13.04 with "hosts: dns".

================================
Making resolv.conf empty
Results of looking up www.google.com: status = -11, errno = 111
Results of looking up a bogus name: status = -11, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = ...


Thomas Hood (jdthood) wrote :

Martin Capitanio (capnm) wrote on 2013-07-04 (#14):
> Note: the glibc original is already fixed.
> http://sourceware.org/bugzilla/show_bug.cgi?id=15339

The patch in question was included in (Debian) eglibc 2.17-7.

Testing Debian 7.0 with libc6 2.17-7 I find that the status is -2 if no nameserver can be reached or the name does not exist and -3 if the external interface is deconfigured.

Testing Ubuntu 13.04 with libc6 2.17-0ubuntu5 I find that the status is -11 if no nameserver can be reached, -2 if the name does not exist and -11 if the external interface is deconfigured.

See also the discussion "getaddrinfo() return value chaos" on debian-devel.

    http://lists.debian.org/debian-devel/2013/07/msg00235.html

Adam Conrad (adconrad) wrote :

This was fixed in saucy in eglibc 2.17-7ubuntu1

Changed in eglibc (Ubuntu):
status: Confirmed → Fix Released
Thomas Hood (jdthood) wrote :

Running my test program I find that the new eglibc returns status -2 where the old one returned -11.

New situation: As with Debian 7.0 with libc6 2.17-7, with the default nsswitch.conf, the returned status / errno is now -2 / 2 (EAI_NONAME / ENOENT) if no nameserver can be reached or the name does not exist.

As before, shortening the value of "hosts:" in nsswitch.conf to just "dns" changes errno from 2 to 111 (ECONNREFUSED) when resolv.conf is empty, to 110 (ETIMEDOUT) when the indicated external nameserver can't be reached, and to 101 (ENETUNREACH) when the external nameserver says that the name does not exist. Note, though, that errno isn't supposed to be significant for any return status except -11 (EAI_SYSTEM). The returned status / errno is always -3 / 11 (EAI_AGAIN / EAGAIN) if the external interface is deconfigured, even if /etc/resolv.conf is empty, which surprises me a bit.

Here is the program output for the indicated values of "hosts:" in /etc/nsswitch.conf. If you compare these results with earlier results I posted, note that I changed the order of the tests in the program.

# # With "hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4"
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 2

# # With "hosts: dns"
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 110
Results of looking up a bogus name: status = -2, errno = 110
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101

# # After taking down the external network interface
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -3, errno = 11
Results of looking up a bogus name: status = -3, errno = 11
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -3, errno = 11
Results of looking up a bogus name: status = -3, errno = 11
Making resolv.conf empty
Results of looking up www.google.com: status = -3, errno = 11
Results of looking up a bogus name: status = -3, errno = 11
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = -3, errno = 11
Results of looking up a bogus name: status = -3, errno = 11

# dpkg -l libc...


Thomas Hood (jdthood) wrote :

Ha. And if I restore "hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4" and take down network interfaces then the status / errno is different again: it's -5 / 110 (EAI_FAMILY / ETIMEDOUT). Onzin, as the Dutch say.

# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110
Making resolv.conf empty
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110

Thomas Hood (jdthood) wrote :

What the return values of getaddrinfo() should be is being discussed in an upstream bug report: http://sourceware.org/bugzilla/show_bug.cgi?id=15726

Thomas Hood (jdthood) wrote :

Bug still present.

$ dpkg -l libc6
[...]
ii libc6:amd64 2.17-93ubuntu4 amd64

$ grep hosts /etc/nsswitch.conf
hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

$ sudo su
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 2
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2

# ifconfig eth0 down
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110
Writing correct nameserver option to resolv.conf
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -5, errno = 110
Results of looking up a bogus name: status = -5, errno = 110

# cat x.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>   /* sleep() */

struct addrinfo *res;

void check_google()
{
    int status;
    status = getaddrinfo("www.google.com", NULL, NULL, &res);
    printf("Results of looking up www.google.com: status = %d, errno = %d\n", status, errno);
    status = getaddrinfo("sjfkdsjfswfloo0f02938sjf28398sd.com", NULL, NULL, &res);
    printf("Results of looking up a bogus name: status = %d, errno = %d\n", status, errno);
}

int main()
{
    FILE *fp;

    printf("Making resolv.conf empty\n");
    fp = fopen("/etc/resolv.conf", "w+"); fclose(fp);
    sleep(1);
    check_google();

    printf("Writing nameserver option to resolv.conf\n");
    fp = fopen("/etc/resolv.conf", "w+"); fprintf(fp, "nameserver 193.67.79.39\n"); fclose(fp);
    sleep(1);
    check_google();

    printf("Writing incorrect nameserver option to resolv.conf\n");
    fp = fopen("/etc/resolv.conf", "w+"); fprintf(fp, "nameserver 192.168.5.4\n"); fclose(fp);
    sleep(1);
    check_google();
}

Thomas Hood (jdthood) wrote :

I created a new report, bug #1295229.

Thomas Hood (jdthood) wrote :

By the way, I was wrong earlier: return value -5 is not EAI_FAMILY but EAI_NODATA.

$ grep -r EAI_ /usr/include
/usr/include/netdb.h:# define EAI_BADFLAGS -1 /* Invalid value for `ai_flags' field. */
/usr/include/netdb.h:# define EAI_NONAME -2 /* NAME or SERVICE is unknown. */
/usr/include/netdb.h:# define EAI_AGAIN -3 /* Temporary failure in name resolution. */
/usr/include/netdb.h:# define EAI_FAIL -4 /* Non-recoverable failure in name res. */
/usr/include/netdb.h:# define EAI_FAMILY -6 /* `ai_family' not supported. */
/usr/include/netdb.h:# define EAI_SOCKTYPE -7 /* `ai_socktype' not supported. */
/usr/include/netdb.h:# define EAI_SERVICE -8 /* SERVICE not supported for `ai_socktype'. */
/usr/include/netdb.h:# define EAI_MEMORY -10 /* Memory allocation failure. */
/usr/include/netdb.h:# define EAI_SYSTEM -11 /* System error returned in `errno'. */
/usr/include/netdb.h:# define EAI_OVERFLOW -12 /* Argument buffer overflow. */
/usr/include/netdb.h:# define EAI_NODATA -5 /* No address associated with NAME. */
/usr/include/netdb.h:# define EAI_ADDRFAMILY -9 /* Address family for NAME not supported. */
/usr/include/netdb.h:# define EAI_INPROGRESS -100 /* Processing request in progress. */
/usr/include/netdb.h:# define EAI_CANCELED -101 /* Request canceled. */
/usr/include/netdb.h:# define EAI_NOTCANCELED -102 /* Request not canceled. */
/usr/include/netdb.h:# define EAI_ALLDONE -103 /* All requests done. */
/usr/include/netdb.h:# define EAI_INTR -104 /* Interrupted by a signal. */
/usr/include/netdb.h:# define EAI_IDN_ENCODE -105 /* IDN encoding failed. */

Changed in python2.7 (Ubuntu):
status: Confirmed → Fix Released
Rolf Leggewie (r0lf) wrote :

Raring has reached the end of its life and is no longer receiving any updates. Marking the Raring task for this ticket as "Won't Fix".

Changed in eglibc (Ubuntu Raring):
status: New → Won't Fix
Changed in python2.7 (Ubuntu Raring):
status: New → Won't Fix