Mozilla Thunderbird Mail and News

Thunderbird complains there is no connection on resume from suspend, when there is a connection

Reported by Michael Doube on 2010-05-23
This bug affects 18 people
Affects               Status     Importance  Assigned to  Milestone
Mozilla Thunderbird   Confirmed  Medium
thunderbird (Ubuntu)             Low         Unassigned

Bug Description

Binary package hint: thunderbird

When I resume from suspend, network manager quickly re-establishes a WiFi internet connection. With an active connection, I try to check my mail with Thunderbird, but it complains that it can't connect to the IMAP server. This seems to happen after longer suspend periods because if I try to replicate the behaviour by quickly suspending and resuming (suspend of a few seconds), I get no warning.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: thunderbird 3.0.4+nobinonly-0ubuntu4
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic x86_64
Architecture: amd64
Date: Sun May 23 12:26:28 2010
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Alpha amd64 (20100412)
ProcEnviron:
 LANG=en_GB.utf8
 SHELL=/bin/bash
SourcePackage: thunderbird

Micah Gersten (micahg) wrote :

Thank you for reporting this to Ubuntu. Does it then work on the second try? This usually works for me.

Changed in thunderbird (Ubuntu):
status: New → Incomplete
Michael Doube (michael-doube) wrote :

Micah, yes, I have to flick to another folder, say "Sent Items", and then back to Inbox for Thunderbird to stop complaining and notice that there is an active connection. I also have to dismiss all the "no connection" warning dialogs (painful if it is checking more than one account).

Micah Gersten (micahg) wrote :

I can confirm this. This should probably be upstreamed.

Changed in thunderbird (Ubuntu):
status: Incomplete → Confirmed

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.9.2.3) Gecko/20100423 Ubuntu/10.04 (lucid) Firefox/3.6.3
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4

When I resume from suspend, network manager quickly re-establishes a WiFi internet connection. With an active connection, I try to check my mail with Thunderbird, but it complains that it can't connect to the IMAP server. This seems to happen after longer suspend periods because if I try to replicate the behaviour by quickly suspending and resuming (suspend of a few seconds), I get no warning.

This could be a duplicate of a bug that is reported as fixed,
https://bugzilla.mozilla.org/show_bug.cgi?id=473483
although that one was reported against Mac OS X.

Reproducible: Always

Steps to Reproduce:
1. Open Thunderbird, check mail
2. Suspend laptop, let it sleep for a while
3. Resume laptop, wait for connection, go to inbox
4. Thunderbird complains there is no connection
5. Mail can be read by going to some other folder, then back to Inbox
Actual Results:
Thunderbird complains there is no connection

Expected Results:
Thunderbird checks mail without complaining

Reported on Launchpad:
https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/584529?comments=all

Michael Doube (michael-doube) wrote :

This could be a re-emergence of https://bugzilla.mozilla.org/show_bug.cgi?id=473483, which was meant to be fixed more than a year ago

Changed in thunderbird:
status: Unknown → New

Is there anything in Tools -> Error Console?

If you launch TB from a terminal, is there anything on that terminal's stdout?

Nothing in stdout

In Tools->Error Console I see this, several times:
Error: [Exception... "update.locale file doesn't exist in either the XCurProcD or GreD directories" nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)" location: "JS frame :: file:///usr/lib/thunderbird-3.0.4/components/nsUpdateService.js :: getLocale :: line 549" data: no]
Source File: file:///usr/lib/thunderbird-3.0.4/components/nsUpdateService.js
Line: 549

Will keep an eye on this log.

I can confirm this bug. It happens on first click on "receive messages" every time I come back from suspend after "a while" as Michael notes. It happens with POP accounts, not only IMAP. Second click on "receive messages" works fine.

Perhaps this "while" is the "check for messages every X minutes" time?

No relevant information on stdout or error console.

I have Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.9.1.10) Gecko/20100520 and Thunderbird 3.0.5

Sorry, seems a dup.

Micah Gersten (micahg) wrote :

Marking this Triaged as we have an upstream bug.

Changed in thunderbird (Ubuntu):
importance: Undecided → Low
status: Confirmed → Triaged

It seems no one's working on this one even when it is quite annoying, but at least status should be changed to "confirmed"

(In reply to comment #6)
> It seems no one's working on this one even when it is quite annoying, but at
> least status should be changed to "confirmed"

Why? We can't reproduce it, so something external might be causing the issue.

Michał Gołębiowski (mgol) wrote :

Upstream cannot reproduce it, so maybe it's Ubuntu's fault?

So maybe some OS pattern? I use Ubuntu 10.04 amd64 with Thunderbird 3.0.6.

I use opensuse 11.3 TB 3.1.2, NetworkManager and nm-applet as frontend

Ubuntu 10.04 amd64 TB 3.0.6 here too.

Changed in thunderbird:
importance: Unknown → Medium
Stefan Oschkera (steff35) wrote :

Ubuntu 10.04, issue persists also in TB version 3.0.10

https://bugzilla.redhat.com/show_bug.cgi?id=510005
I believe Fedora 14 also has this problem. On the other hand, firefox in Fedora can detect the connection state and go to offline mode, so this seems possible.

I can confirm that this also happens with Fedora 14/TB 3.1.6 using NetworkManager/nm-applet.

After resuming from suspend I always get the "Failed to connect to server" pop up for every account the first time I hit "Get Mail" for that account, unless I give it enough time to make an automatic background check (which, by the way, seems to take a *very* long time after resuming - a lot longer than the configured 10 minute check interval, which is why I always need to click "Get Mail" in the first place). As others have noted, it does not happen for short suspends.

Maybe someone could at least provide some ideas on how this could be diagnosed further?

An nsSocketTransport:5 log might help.
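
For anyone wanting to capture such a log on Linux, here is a minimal Python sketch. It assumes the binary is on the PATH as `thunderbird` and that NSPR logging is switched on through the `NSPR_LOG_MODULES` and `NSPR_LOG_FILE` environment variables. The helper names and the log path are made up for illustration.

```python
import os
import subprocess


def nspr_log_env(modules, log_file, base_env=None):
    """Return a copy of the environment with NSPR logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Modules are comma separated, each optionally followed by ":level".
    env["NSPR_LOG_MODULES"] = ",".join(modules)
    env["NSPR_LOG_FILE"] = log_file
    return env


def launch_with_logging(log_file, binary="thunderbird"):
    """Start the mail client with socket transport logging (assumed binary name)."""
    env = nspr_log_env(["timestamp", "nsSocketTransport:5"], log_file)
    return subprocess.Popen([binary], env=env)
```

Calling `launch_with_logging("/tmp/tb-nspr.log")`, then suspending, resuming and clicking "Get Mail" should leave the relevant entries in that file.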

Matthias G. (matgnt) wrote :

Seems to be solved in the latest thunderbird update for 10.04 (thunderbird 3.1.7+build3+nobinonly-0ubuntu0.1).

Matthias G. (matgnt) wrote :

To correct myself, it's not really solved, but the annoying popup message is gone. The message now (sometimes) appears through the desktop notification system in the top right corner.

Stefan Oschkera (steff35) wrote :

As mentioned above, Thunderbird has recently been updated to 3.1.7 in Lucid (10.04 LTS). I confirm the observations of Matthias (#23): the issue still persists, though it's now less annoying, as the error message is displayed by Ubuntu's notification subsystem, which at least doesn't require any further action by the user. Personally, however, I wouldn't consider the issue resolved.

Michael Doube (michael-doube) wrote :

10.10 has handled this in the notification area for a while now but I agree with Stefan, the underlying issue is not solved. Thunderbird is still complaining and I still have to click on an Inbox to kick it into action.

Created attachment 499753
nsSocketTransport:5 log

nsSocketTransport:5 log demonstrating the problem. The log was taken from a TB instance with a single IMAP account.
2010-12-25 21:58:42.339538 TB starts
2010-12-26 00:47:59.055689 Last log entry before suspend
2010-12-26 09:02:30.261338 First log entry after wakeup
2010-12-26 09:02:53.550120 First "Get Mail" click (fails) (?)
2010-12-26 09:02:57.739030 Second "Get Mail" click (successful) (?)
2010-12-26 09:03:08.135790 Third "Get Mail" click (successful) (?)
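
Taking the timestamps above at face value (the click entries are marked as uncertain), the gaps between them can be worked out with a few lines of Python. This is only arithmetic on the values listed; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def gap(start, end):
    """Return the time elapsed between two log timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


suspended_for = gap("2010-12-26 00:47:59.055689", "2010-12-26 09:02:30.261338")
wake_to_first_click = gap("2010-12-26 09:02:30.261338", "2010-12-26 09:02:53.550120")
first_to_second_click = gap("2010-12-26 09:02:53.550120", "2010-12-26 09:02:57.739030")

print(suspended_for)          # 8:14:31.205649
print(wake_to_first_click)    # 0:00:23.288782
print(first_to_second_click)  # 0:00:04.188910
```

So the machine slept for a little over eight hours, the failing click came about 23 seconds after wakeup, and the successful one about 4 seconds after that.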

I don't have any experience of interpreting nspr logs, so take this with a grain of salt...

It seems that almost immediately after wakeup, TB tries to re-establish the connection but hostname lookup fails (NS_ERROR_UNKNOWN_HOST). This probably makes sense because the network is not yet up at that time.

After a while, I click "Get Mail". TB tries to open a new connection, but immediately fails again with NS_ERROR_UNKNOWN_HOST. It appears that this result is cached from the previous attempt.

I will add resolver logging too to see if that gives any additional insight.

Created attachment 499762
nsSocketTransport:5,nsHostResolver:5 log

Another log adding nsHostResolver:5. The behaviour was a little bit different this time in that it actually updated the mailbox before I got a chance to hit "Get Mail" a second time. First time, though, I got the error message as usual.
AFAICT it confirms that nsHostResolver is indeed caching the initial lookup failure (timestamp 2010-12-26 15:54:33.688102).
It appears that nsHostResolver treats any lookup failure as NS_ERROR_UNKNOWN_HOST which is perhaps a bit too naive. Ideally, it would distinguish between a communication failure and an actual NXDOMAIN condition, and avoid caching the former, or at least use a much shorter cache timeout.
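
To make the suggestion concrete, here is a rough Python sketch of a lookup cache that only remembers failures it considers definitive. It is not how nsHostResolver is written. The class name, the TTL values and the choice of `EAI_NONAME` as a stand-in for NXDOMAIN are all assumptions, and which error code `getaddrinfo` really returns while the network is down is still an open question here.

```python
import socket
import time

# Assumption: EAI_NONAME approximates "the name really does not exist".
DEFINITIVE_ERRORS = {socket.EAI_NONAME}


class HostCache:
    """Caches successful lookups and definitive failures, never transient ones."""

    def __init__(self, lookup=socket.getaddrinfo, ttl=60.0, negative_ttl=60.0,
                 clock=time.monotonic):
        self._lookup = lookup
        self._ttl = ttl
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._entries = {}  # host -> (expires_at, result or exception)

    def resolve(self, host, port=None):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and entry[0] > now:
            if isinstance(entry[1], Exception):
                raise entry[1]
            return entry[1]
        try:
            result = self._lookup(host, port)
        except socket.gaierror as err:
            if err.errno in DEFINITIVE_ERRORS:
                self._entries[host] = (now + self._negative_ttl, err)
            # Anything else (e.g. EAI_AGAIN) is treated as transient: not cached.
            raise
        self._entries[host] = (now + self._ttl, result)
        return result
```

With this shape, a lookup that fails because the interface is not up yet is simply retried on the next request instead of being answered from the cache.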

Does the log show anything interesting?

My theory, which matches what Christer described in comment 16, is that we try to connect immediately after wakeup, but because the cached host name has expired we do a complete name resolution again. The system is not ready that soon (a few milliseconds after wakeup) to return a result. We should then not cache the unknown-host state.

I'm not a Linux developer (I don't see this behavior on a Windows machine, but that may just be related to record expiration times), and I'm not directly a DNS code maintainer either (however, I can do the fix myself).

Adding Michal, since he may know better what type of error state we get from getaddrinfo without a network connection up on a Linux system.

Actually, as I tried to explain, what's going on is this:

1. After wakeup TB immediately tries to reconnect
2. TB calls nsHostResolver to resolve the server name which immediately gets an error because at this time, the network interface is not yet up
3. nsHostResolver treats this error (in fact, any error) as an unknown host, and puts that "fact" in its cache
4. TB recognizes that the request is not initiated by the user and therefore suppresses any UI level error message

5. User comes around and clicks "Get Mail"
6. TB again calls nsHostResolver to resolve the server name, which returns "unknown host" from its cache
7. This time the request IS initiated by the user so an error message is shown
8. Not until the cache entry expires can the user successfully reconnect
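
The sequence above can be sketched as a small simulation. This is a toy model in Python, not Thunderbird code: the resolver class, the 60 second TTL, the host name and the address are all invented for illustration.

```python
class NaiveResolver:
    """Caches every outcome, including failures, for `ttl` seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.cache = {}  # host -> (expires_at, address or None)

    def resolve(self, host, now, network_up):
        cached = self.cache.get(host)
        if cached is not None and cached[0] > now:
            return cached[1]
        # None stands for "unknown host"; any failure is recorded that way.
        address = "192.0.2.10" if network_up else None
        self.cache[host] = (now + self.ttl, address)
        return address


def get_mail(resolver, host, now, network_up, user_initiated):
    address = resolver.resolve(host, now, network_up)
    if address is None:
        return "error dialog" if user_initiated else "silent failure"
    return "connected"


resolver = NaiveResolver(ttl=60)
timeline = [
    # Steps 1-4: automatic reconnect right after wakeup, network still down.
    get_mail(resolver, "imap.example.org", now=0, network_up=False, user_initiated=False),
    # Steps 5-7: user clicks "Get Mail" with the network up, cache still poisoned.
    get_mail(resolver, "imap.example.org", now=20, network_up=True, user_initiated=True),
    # Step 8: the cached failure has expired, so the lookup is repeated.
    get_mail(resolver, "imap.example.org", now=61, network_up=True, user_initiated=True),
]
print(timeline)  # ['silent failure', 'error dialog', 'connected']
```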

Having said that, it seems that this problem is gone in TB 10.0!

/C

(In reply to Christer Palm from comment #19)
> Having said that, it seems that this problem is gone in TB 10.0!

Thanks Christer. Based on this I close this as WFM.

Changed in thunderbird:
status: New → Invalid
mlaverdiere (mlaverdiere) wrote :

In my case, this was still present with TB 10 under Ubuntu 11.10 and now, under 12.04 beta. So I'm not sure it should be marked as invalid...

User Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.79 Safari/535.11

Steps to reproduce:

Make sure you're not connected to the network (in my case it's wifi, I don't know if this is relevant)
Open Thunderbird
Connect to network.
Check that the network connection works by opening a web page in a browser
Go to Thunderbird and click "get mail"

Actual results:

"looked up gmail.com" (or whatever is the mail server domain) appears on the status bar and stays there for a while. Then it disappears.
Mail is not checked, and no error message is shown.
This happens systematically, 100% of the times.

Then I click "get mail" again, and it ALWAYS works the second time.

Expected results:

thunderbird should have connected to the mail server and checked for new messages the first time.

This issue has existed for as long as I've used Thunderbird, back to version 3 or something. On older versions it was only _slightly_ better, in that you would get a balloon notification (or is it called a toast?) saying "unable to connect" and a (wrong) message on the status bar saying "no messages to download" on the first try.

(In reply to matteo sisti sette from comment #0)
> Actual results:
> "looked up gmail.com" (or whatever is the mail server domain) appears on
> the status bar and stays there for a while. Then it disappears.
> Mail is not checked, and no error message is shown.
> This happens systematically, 100% of the times.
> Then I click "get mail" again, and it ALWAYS works the second time.

This phenomenon can occur if an IPv6-related problem exists in your environment.
- The first DNS lookup (IPv6 address resolution) takes very long
  => timeout, Tb is unable to log in.
- By the time of Tb's second DNS lookup, IPv6 address resolution has already been done by
  the first request, so the second DNS lookup ends within a short period.
This started to occur from Tb 3, because the default of network.dns.disableIPv6 was changed from true to false in Tb 3.

What happens if network.dns.disableIPv6=true is set?
(restart Tb after setting change to avoid needless problems)

Because this is Wifi, the wireless connection is established upon the first network request from the PC. That may take a long time, and the first DNS lookup by Tb may time out.

How long does "first ping(or tracert) imap.gmail.com after re-boot" take in your environment?
(1) Re-boot PC, (2) First "ping imap.gmail.com", (3) Second "ping imap.gmail.com", (4) Start Tb with network.dns.disableIPv6=false(default) or true, (5) Gmail IMAP folder access, Get Msgs etc.
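
A quick way to check the slow-first-lookup idea without involving Tb at all is to time the system resolver directly. The Python sketch below times `socket.getaddrinfo` separately for IPv4 and IPv6, several times in a row. The function names are illustrative, and it measures the operating system's resolver, not Thunderbird's own DNS code, so it can only hint at where the delay is.

```python
import socket
import time


def time_lookup(host, family, lookup=socket.getaddrinfo):
    """Return (seconds, outcome) for one lookup restricted to an address family."""
    start = time.perf_counter()
    try:
        lookup(host, None, family)
        outcome = "ok"
    except socket.gaierror as err:
        outcome = f"failed: {err}"
    return time.perf_counter() - start, outcome


def compare_lookups(host, attempts=2, lookup=socket.getaddrinfo):
    """Time IPv4 and IPv6 lookups for `host`, `attempts` times each."""
    rows = []
    for attempt in range(1, attempts + 1):
        for name, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
            seconds, outcome = time_lookup(host, family, lookup)
            rows.append((attempt, name, seconds, outcome))
    return rows
```

Running `for row in compare_lookups("imap.gmail.com"): print(row)` right after connecting shows whether the first IPv6 lookup is noticeably slower than the second.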

I can confirm this is an issue in Thunderbird 12 on Ubuntu 12.04.

Changed in thunderbird:
importance: Medium → Unknown
status: Invalid → Unknown
Changed in thunderbird:
importance: Unknown → Medium
status: Unknown → Confirmed
Michael Doube (michael-doube) wrote :

Another version combination where this bug is still present:

Ubuntu 11.04 with Thunderbird 13.0.1 (from the PPA)

mlaverdiere (mlaverdiere) wrote :

I'm on Ubuntu 12.10 beta 2 with TB 15.0.1 and so far, I don't experience this bug anymore.

matteo, can you reply to comment 2 and comment 1?

Vincent, Steve, can you reproduce?

Reply to comment 1: Exact same behavior after setting network.dns.disableIPv6=true (and restarting Tb) (tested twice)

Regarding comment 2: too much work to test, however the premise "connection is established upon first network request" doesn't make any sense to me. Also note the fourth step in "steps to reproduce".

Yes, I can reproduce with Thunderbird 17.0.2 on Ubuntu 12.04.

This needs more analysis.

It would not surprise me if this is already described in a hard to find (duplicate) bug.

 matteo, is not the precise wording of what you see "Failed to connect to server"?
(*not* "unable to connect")

Wayne, neither of the two!!!!
As I mentioned in my report, what I see is "Looked up gmail.com..." (which is complete nonsense) and then nothing. It doesn't show any error message at all.

IN SOME PREVIOUS VERSION it did show an error message, but I can't check now the exact wording; it may well have been "Failed to connect to server", as you say.

(In reply to matteo sisti sette from comment #10)
> As I mentioned in my report, what I see is "Looked up gmail.com..."
> (which is complete nonsense)

This message/situation is the same as when an FQDN is not found in DNS, and is easily seen with a POP3 account defined with a dummy/non-existent server such as x.x.x.
The message says "DNS is looked up".
What is the basis of your "nonsense" about a message related to DNS lookup?

> and then nothing. It doesn't show any error message at all.
> IN SOME PREVIOUS VERSION it did show an error message,
> but I can't check now the exact wording;
> it may well have been "Failed to connect to server", as you say.

With a POP3 account defined with a dummy/non-existent server such as x.x.x, old Tb showed a connection error message when the server was not found in DNS.
Recent Tb stopped showing the annoying connection error message.
  While looking up DNS, "Looking up".
  After the end of the DNS lookup, "Looked up".
  If the server is not found, stop further action,
  because trying to connect to a non-existent server is nonsense.

Because the phenomenon occurs with Wifi, and occurs when the Wifi router doesn't have a server connection yet, it may be the following.
- On the PC, the DNS server is defined as 192.168.0.1.
- The Wifi router's local IP address is 192.168.0.1,
  i.e. the Wifi router behaves as a proxy server to DNS, or as a DNS server.
- The Wifi router gets the actual DNS address of its provider upon connection
  establishment with its server.
- The Wifi router returns "not found" if the actual IP address of its
  provider's DNS is not known yet.

If you can reproduce your problem consistently,
(A) Before you try to connect to the server from Tb,
connect to 192.168.0.1 from a browser and check the Wifi router's WAN side status. Is the DNS address always set? If a DNS IP address is shown, do "tracert or ping the-ip-address-of-DNS" at a terminal. Is a response returned within a reasonable period?
Write down the actual DNS IP address.
(B) The actual DNS address usually does not change very frequently.
Before you try to connect to the server from Tb, do "tracert written-down-ip-address-of-DNS" at a terminal. Is a response returned within a reasonable period?
(C) How about "tracert or ping ???.gmail.com"?
    IIRC, I already requested this from you in comment #2...

This has nothing to do with the wifi router.
I see the exact same issue when I use a USB mobile broadband modem instead.
(whoops, I thought I had already mentioned that, I hadn't).

Also note (again) my steps to reproduce the issue:

 Steps to reproduce:

 Make sure you're not connected to network
 Open Thunderbird
 Connect to network.
 CHECK THAT THE NETWORK CONNECTION WORKS BY OPENING A WEB PAGE IN A BROWSER
 Go to Thunderbird and click "get mail"

So, between connecting to the network and having TB check mail, I check that everything else works by surfing the web with a browser. No matter how much time I wait, the FIRST time TB tries to fetch mail it systematically fails, even if it is several minutes after connecting to network; the SECOND time it succeeds.

Frankly, I wouldn't look for the cause of this outside TB.

By the way, my DNS is NOT defined as 192.168.1.1, it is 8.8.4.4 (Google's DNS).

Regarding the other issue, the NONSENSE is that:
1. when connection to the server fails for whatever reason, no matter whether it is because the host is unreachable or because the name can't be resolved, it must show a message telling you what the failure is (such as "couldn't resolve domain name", or whatever), not a message that tells you what was the last thing it did and which doesn't even tell you whether it was a success or failure! ("looked up xxxx", as in "I looked up the domain. Don't ask me whether I found it or not"). When you tell a program to do something it must end in either of two ways: (a) "I'm finished doing what you asked me: the result is xxxx", or (b) "I couldn't complete the task because something went wrong: the error was xxxxx".
2. Even if you are not connected to the internet at all, it still says "looked up gmail.com"!!!! That's complete nonsense.

(In reply to matteo sisti sette from comment #12)
> I see the exact same issue when I use a USB mobile broadband modem instead.
> By the way, my DNS is NOT defined as 192.168.1.1, it is 8.8.4.4 (Google's DNS).

If the problem happens when there is no route to the internet (the phone cable is pulled out of the modem), it's perhaps one of the following:
- timeout in DNS server access
- Tb's problem when the PC's IP address is changed by DHCP retention,
  DNS cache is not cleared, ...
If the problem happens when other software such as a browser can access the internet normally, other components of Tb such as SMTP can send mail, other mail clients can access ...gmail.com with no problem even though Tb fails with "looked up gmail.com", etc.,
it's perhaps one of the following:
- timeout in DNS server access in Tb
- Tb's problem when the PC's IP address is changed by DHCP retention,
  DNS cache is not cleared, ... (known issues)

If you can reproduce your problem consistently, do following, please.
(i) Before trying to access the server from Tb, go to Work Offline, then go back to Work Online. This disconnects from the server and restarts the network connection from scratch. IIRC, DNS cache related issues and IP address change related issues are bypassed by this.
(ii) Get DNS server access log with timestamp.
     See bug 402793 comment #28.
> Win example :
> SET NSPR_LOG_MODULES=timestamp,nsHostResolver:5,imap:5,pop3:5
Timeout in DNS server access?

>
>
> Regarding the other issue, the NONSENSE is that:
> 1. when connection to the server fails for whatever reason, no matter
> whether it is because the host is unreachable or because the name can't be
> resolved, it must show a message telling you what the failure is (such as
> "couldn't resolve domain name", or whatever), not a message that tells you
> what was the last thing it did and which doesn't even tell you whether it
> was a success or failure! ("looked up xxxx", as in "I looked up the domain.
> Don't ask me whether I found it or not"). When you tell a program to do
> something it must end in either of two ways: (a) "I'm finished doing what
> you asked me: the result is xxxx", or (b) "I couldn't complete the task
> because something went wrong: the error was xxxxx".
> 2. Even if you are not connected to the internet at all, it still says
> "looked up gmail.com"!!!! That's complete nonsense.

(In reply to matteo sisti sette from comment #12)
> 2. Even if you are not connected to the internet at all,
> it still says "looked up gmail.com"!!!! That's complete nonsense.
(i) Open a separate bug for error message improvement.
    Keep "one problem per bug" at B.M.O, please.
(ii) Until the message is improved, please read "looked up ..." as "looked up ..., but it failed", adding a complement such as ", but it failed" or ", but it's not found", as you like, in your head.

(In reply to WADA from comment #14)
> (In reply to matteo sisti sette from comment #12)
> > 2. Even if you are not connected to the internet at all,
> > it still says "looked up gmail.com"!!!! That's complete nonsense.
> (i) Open separate bug for error message improvement.
> Keep "one problem per bug" at B.M.O, please.
> (ii) Until message will be improved, read "looked up ..." as "looked up ...,
> but it failed", with complement string like ",but it failed", ",but it's not
> found", ..., as you like, in your brain, please.

I saw this too. I'm not sure if a bug exists for it. I started to look a few days ago but did not finish.

WADA, thanks for all the ideas. I hope to check some of them. However, my sense is that the problem is in Thunderbird, because I do not have trouble with Firefox.

matteo, please be sure to use the exact wording of what you see in messages and dialogs, not an approximation. Exact wording allows us to better search Bugzilla and the source code.
