DNS resolution thread lock

Bug #887255 reported by dlh on 2011-11-07
This bug affects 1 person
Affects: Armagetron Advanced
Importance: Undecided
Assigned to: Unassigned

Bug Description

Browsing a subculture on the Mac OS X client (which uses pthreads) results in a deadlock when doing DNS resolution.

I've attempted to create a recording, but it doesn't reproduce the issue. The manual steps to reproduce:

1. Browse a subculture, such as generalconsumption.org:4533.
2. The game locks up.

Here is a backtrace:

"""
#0 0x955a1b42 in semaphore_wait_signal_trap ()
#1 0x955a7646 in pthread_mutex_lock ()
#2 0x000b6f8b in boost::unique_lock<boost::mutex>::lock () at tools/tMutex.h:944
#3 0x000b6f8b in tBackgroundSyncEvent::Sync (s=@0xea5224) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:947
#4 0x000b6ff1 in tBackgroundSyncEvent::Sync (this=0xe) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:934
#5 0x000b76bb in st_SyncBackgroundThreads () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:860
#6 0x000bcb17 in st_DoToDo () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tToDo.cpp:85
#7 0x0008a831 in lock_guard [inlined] () at tools/tMutex.h:1157
#8 0x0008a831 in nDNSResolver::Wait (this=0xea5190) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1158
#9 0x00087988 in tJUST_CONTROLLED_PTR<nDNSResolver>::Release () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1732
#10 0x00087988 in ~tJUST_CONTROLLED_PTR [inlined] () at tools/tSafePTR.h:376
#11 0x00087988 in nAddress::CompleteDNS (this=0xe97650) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1733
#12 0x00087a30 in nAddress::IsSet (this=0xe97650) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1716
#13 0x0007f768 in nServerInfoBase::Connect (this=0xbfffbf8c, loginType=Login_Post0252, socket=0x0) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nServerInfo.cpp:2757
#14 0x00081db7 in nServerInfo::GetFromMaster (masterInfo=0xbfffbf8c, fileSuffix=0xea072c "_0") at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nServerInfo.cpp:1656
#15 0x0013b37a in gServerBrowser::BrowseSpecialMaster (master=0xbfffbf8c, prefix=0xe <Address 0xe out of bounds>) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tron/gServerBrowser.cpp:220
"""

Manuel Moos (z-man) wrote :

Epsy's recent change caused a stack-allocated object to be added to a hidden linked list, and the object is destroyed from that list. I suspect that's causing the threading trouble you observe: the callstack is from shortly after the destruction, while the object (the gServerInfoFavorite) is still actively being used, and the nAddress part of it contains thread sync code.
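
To illustrate the kind of lifetime problem described above, here is a minimal sketch of a self-registering object whose list only observes it and never destroys it. This is not the Armagetron code or the committed fix; `Registry`, `TrackedInfo` and `SizeWhileAlive` are hypothetical names made up for this example. The assumption is that a stack object stays safe in such a list only if the list does not own it and the object removes itself before its storage goes away.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a global list that tracks live objects.
// It stores plain pointers and never deletes what it points to.
class Registry {
public:
    static Registry& Instance() {
        static Registry registry;
        return registry;
    }
    void Add(const void* item) { items_.push_back(item); }
    void Remove(const void* item) {
        items_.erase(std::remove(items_.begin(), items_.end(), item),
                     items_.end());
    }
    bool Contains(const void* item) const {
        return std::find(items_.begin(), items_.end(), item) != items_.end();
    }
    std::size_t Size() const { return items_.size(); }

private:
    std::vector<const void*> items_;
};

// Hypothetical object that registers itself on construction.
// The destructor unregisters it, so the registry never keeps a
// pointer into a stack frame that has already been left.
class TrackedInfo {
public:
    TrackedInfo() { Registry::Instance().Add(this); }
    ~TrackedInfo() { Registry::Instance().Remove(this); }
    // Copying would leave the copy unregistered, so forbid it.
    TrackedInfo(const TrackedInfo&) = delete;
    TrackedInfo& operator=(const TrackedInfo&) = delete;
};

// Returns the registry size observed while a stack object is alive.
std::size_t SizeWhileAlive() {
    TrackedInfo info;  // lives on the stack, like the object in the report
    assert(Registry::Instance().Contains(&info));
    return Registry::Instance().Size();
}
```

After `SizeWhileAlive()` returns, the registry is back to its previous size, so nothing can later destroy or touch the dead stack object through the list. If the list is instead meant to own and destroy its entries, the objects would have to be heap-allocated rather than placed on the stack.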

I didn't even get that far in local testing, because the debugging memory manager complained earlier, so please verify whether the issue is solved.

dlh (dlh) on 2011-11-07
Changed in armagetronad:
status: New → Fix Committed