Comment 13 for bug 326718

floid (jkanowitz) wrote :

Hm. Specific to Heni's issue here:

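Nothing really rings a bell for me right now, but `dig` working reliably is telling: `dig` sends its queries straight to the nameservers listed in /etc/resolv.conf and never consults nsswitch, so packets must be getting through, and you're having some type of trouble in the system resolver or nsswitch.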
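To see the other half of that comparison, here is a minimal Python sketch that asks the system resolver (the path that goes through nsswitch.conf) for a name, so the result can be set beside what `dig` reports. The function name `system_lookup` is my own invention; `socket.getaddrinfo` is the standard library call and is assumed here to be backed by glibc's resolver, as on a stock Ubuntu install.

```python
import socket

def system_lookup(name):
    """Resolve `name` the way ordinary applications do (via the libc resolver).

    Returns ("ok", [addresses]) on success or ("error", message) on failure,
    so a failing lookup can be compared with what `dig` says for the same name.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror as exc:
        return ("error", str(exc))
    # Each entry is (family, type, proto, canonname, sockaddr); the address
    # string is the first element of sockaddr.
    return ("ok", sorted({info[4][0] for info in infos}))

if __name__ == "__main__":
    import sys
    for host in sys.argv[1:]:
        print(host, system_lookup(host))
```

If this reports an error for a name that `dig` resolves fine, that points at the nsswitch/resolver side rather than the network.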

One way to achieve a little more determinism is to remove or reorder the "mdns4_minimal" entry on the hosts line of /etc/nsswitch.conf; for instance, change:
hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
to:
hosts: files dns mdns4

This removes the priority given to Avahi/mDNS lookups, which I see people have sometimes blamed for erratic resolver behavior. I doubt this is really the culprit, but it never hurts to simplify while troubleshooting. (One thing that has bothered me: what is the rationale for making [NOTFOUND=return] the default there? For desktop installs, shouldn't it be =continue, so if mDNS is somehow "accidentally" tried -- say in situations with both mDNS and a .local domain in the conventional DNS -- there's the best chance that the right thing happens in the end?)
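For anyone who would rather script the nsswitch.conf change than hand-edit it, here is a rough Python sketch of the same edit applied to the file's text. The helper name `simplify_hosts_line` is made up, and it assumes the bracketed action (such as [NOTFOUND=return]) is a single token with no spaces inside it. It only transforms a string; writing the result back to /etc/nsswitch.conf (after making a backup) is left to you.

```python
def simplify_hosts_line(conf_text):
    """Drop "mdns4_minimal" and the [action] that follows it from the hosts line.

    Other lines are returned unchanged. Assumes the bracketed action is one
    whitespace-free token, e.g. "[NOTFOUND=return]".
    """
    result = []
    for line in conf_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == "hosts":
            kept = []
            drop_next_action = False
            for token in rest.split():
                if token == "mdns4_minimal":
                    drop_next_action = True
                    continue
                if drop_next_action and token.startswith("["):
                    # This action belonged to the entry we just removed.
                    drop_next_action = False
                    continue
                drop_next_action = False
                kept.append(token)
            line = "hosts: " + " ".join(kept)
        result.append(line)
    return "\n".join(result)

if __name__ == "__main__":
    before = "hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4"
    print(simplify_hosts_line(before))
```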

...

For a basic primer on how DNS works, I've always found djb's writings helpful and to-the-point, if technical. Have a look at http://cr.yp.to/djbdns/intro-dns.html and the rest of the djbdns documentation if you need a quick understanding of how the protocol works. You'll probably have to pull out `tcpdump` or another sniffer to get to the bottom of your situation (and if using tcpdump, remember the -n option, or the act of tcpdumping will itself generate a lot of DNS lookups!).

Things have gotten more complicated recently, as resolvers have added more protections against poisoning and as they try to be more IPv6-ready or otherwise deal with the reality of a world with both A and AAAA records. (I think any major changes would've come after glibc 2.7, unless "ubuntu4" includes a backported security patch -- I only see up to ubuntu3 in the changelog!) Still, the primer should get you started. I need to eat my own dogfood and analyze my own dumps, but I was hoping someone who could read some of the more arcane aspects 'by eye' would come along and spare me nights flipping back and forth with reference material. :}
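If you do end up reading dumps by eye, it may help to know what a plain query looks like on the wire. Below is a small Python sketch that builds the bytes of a standard DNS question for either an A or an AAAA record. The function name `build_query` and its defaults are my own; the layout (12-byte header, length-prefixed labels, then type and class) is the ordinary DNS message format, and this only constructs the packet without sending anything.

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record

def build_query(name, qtype=QTYPE_A, query_id=0):
    """Return the raw bytes of a single-question, recursion-desired DNS query."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, and zero
    # answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question trailer: QTYPE, then QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)

if __name__ == "__main__":
    print(build_query("example.com", QTYPE_AAAA, query_id=0x1234).hex())
```

A resolver that asks for both A and AAAA sends two such questions differing only in the type field, which is one thing to look for when comparing a working dump against a failing one.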

...

In your case, if it was working "until recently," and nothing obvious is haywire in the packet dumps, it would be interesting to figure out when you last upgraded the libc6 package, what the previous version was, and whether downgrading to it magically solves your problem. That's what the kernel kids call "bisecting," a fancy word for flipping between versions until you narrow down which one started causing problems! I don't have a lot of experience with apt forensics, but a record of that change might be hidden somewhere.
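One place such a record usually lives is /var/log/dpkg.log (plus rotated copies such as dpkg.log.1). As a rough Python sketch, assuming the common line shape "DATE TIME upgrade PACKAGE OLD_VERSION NEW_VERSION", something like the following pulls out the upgrade history for one package. The function name `package_history` is hypothetical, and the sample line in the usage section is illustrative, not taken from a real log.

```python
def package_history(log_lines, package="libc6"):
    """Return (date, time, old_version, new_version) for each install/upgrade
    of `package` found in dpkg.log-style lines."""
    history = []
    for line in log_lines:
        fields = line.split()
        # Expected shape: DATE TIME ACTION PACKAGE OLD_VERSION NEW_VERSION
        if len(fields) < 6 or fields[2] not in ("install", "upgrade"):
            continue
        # Newer dpkg versions append ":arch" to the package name.
        name = fields[3].split(":", 1)[0]
        if name == package:
            history.append((fields[0], fields[1], fields[4], fields[5]))
    return history

if __name__ == "__main__":
    # Illustrative line only; on a real system, read /var/log/dpkg.log instead:
    #   with open("/var/log/dpkg.log") as f: print(package_history(f))
    sample = ["2009-02-10 09:15:02 upgrade libc6 2.7-10ubuntu3 2.7-10ubuntu4"]
    for entry in package_history(sample):
        print(entry)
```

The old version it reports is the one to try downgrading to.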