DNS resolution thread lock

Bug #887255 reported by dlh
This bug affects 1 person
Affects: Armagetron Advanced
Status: Fix Committed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Browsing a subculture on the Mac OS X client (which uses pthreads) results in a deadlock when doing DNS resolution.

I've attempted to create a recording, but it doesn't reproduce the issue. The manual steps to reproduce:

1. Browse a subculture, such as generalconsumption.org:4533.
2. The game locks up.

Here is a backtrace:

"""
#0 0x955a1b42 in semaphore_wait_signal_trap ()
#1 0x955a7646 in pthread_mutex_lock ()
#2 0x000b6f8b in boost::unique_lock<boost::mutex>::lock () at tools/tMutex.h:944
#3 0x000b6f8b in tBackgroundSyncEvent::Sync (s=@0xea5224) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:947
#4 0x000b6ff1 in tBackgroundSyncEvent::Sync (this=0xe) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:934
#5 0x000b76bb in st_SyncBackgroundThreads () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tRecorder.cpp:860
#6 0x000bcb17 in st_DoToDo () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tools/tToDo.cpp:85
#7 0x0008a831 in lock_guard [inlined] () at tools/tMutex.h:1157
#8 0x0008a831 in nDNSResolver::Wait (this=0xea5190) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1158
#9 0x00087988 in tJUST_CONTROLLED_PTR<nDNSResolver>::Release () at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1732
#10 0x00087988 in ~tJUST_CONTROLLED_PTR [inlined] () at tools/tSafePTR.h:376
#11 0x00087988 in nAddress::CompleteDNS (this=0xe97650) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1733
#12 0x00087a30 in nAddress::IsSet (this=0xe97650) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nSocket.cpp:1716
#13 0x0007f768 in nServerInfoBase::Connect (this=0xbfffbf8c, loginType=Login_Post0252, socket=0x0) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nServerInfo.cpp:2757
#14 0x00081db7 in nServerInfo::GetFromMaster (masterInfo=0xbfffbf8c, fileSuffix=0xea072c "_0") at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/network/nServerInfo.cpp:1656
#15 0x0013b37a in gServerBrowser::BrowseSpecialMaster (master=0xbfffbf8c, prefix=0xe <Address 0xe out of bounds>) at /Users/lee/Projects/OSS/armagetronad/bzr/0.4-armagetronad-subculture_crash/MacOS/../src/tron/gServerBrowser.cpp:220
"""

Manuel Moos (z-man) wrote :

Epsy's recent change caused a stack-allocated object to be added to a hidden linked list, and the list later destroys it. I suspect that is causing the threading trouble you observe: the call stack is from shortly after the destruction, while the object (the gServerInfoFavorite) is still actively being used, and the nAddress part of it contains thread sync code.

I didn't even get that far in local testing, because the debugging memory manager complained earlier. Please verify whether the issue is solved.
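
The lifetime hazard described in this comment can be illustrated with a minimal sketch. It is not the actual Armagetron code or the committed fix: `Entry` and `Registry` are hypothetical stand-ins for an object like gServerInfoFavorite and for the hidden linked list. The sketch assumes one way to avoid the problem, namely letting the list destroy only the objects it owns and merely observe stack-allocated ones.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object such as gServerInfoFavorite.
struct Entry {
    static int live;                 // number of Entry objects currently alive
    Entry() { ++live; }
    ~Entry() { --live; }
    Entry(const Entry&) = delete;
    Entry& operator=(const Entry&) = delete;
};
int Entry::live = 0;

// Hypothetical stand-in for the hidden list. It destroys only what it owns.
class Registry {
public:
    // Heap objects are handed over; the registry destroys them in Clear().
    Entry* AddOwned(std::unique_ptr<Entry> e) {
        owned_.push_back(std::move(e));
        return owned_.back().get();
    }
    // Stack objects are only observed; their scope destroys them.
    void AddBorrowed(Entry* e) { borrowed_.push_back(e); }
    void Clear() { owned_.clear(); borrowed_.clear(); }
    std::size_t Size() const { return owned_.size() + borrowed_.size(); }

private:
    std::vector<std::unique_ptr<Entry>> owned_;
    std::vector<Entry*> borrowed_;
};

// Returns the number of live entries right after Clear(),
// while the stack object is still in scope.
int LiveAfterClear() {
    Registry registry;
    Entry onStack;                                // automatic storage
    registry.AddBorrowed(&onStack);               // observed, not owned
    registry.AddOwned(std::make_unique<Entry>()); // owned by the registry
    assert(registry.Size() == 2);
    registry.Clear();                             // destroys only the heap entry
    return Entry::live;                           // onStack is still alive here
}
```

If `Clear()` destroyed every registered pointer instead, `onStack` would be destroyed while its scope still uses it, which matches the kind of use-after-destruction suspected above.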

dlh (dlh)
Changed in armagetronad:
status: New → Fix Committed