Looking at the traceback that John Clemens just posted -- it looks pretty obviously like another variation of the same class of deadlock that I described before:
- BeaconTimeout() ran from a timer, and calls into MlmeHandler()
- MlmeHandler() runs the table-based state machine
- The state machine calls MlmeJoinReqAction()
- Right at the top of the function, MlmeJoinReqAction() does del_timer_sync(&pAd->MlmeAux.BeaconTimer);
- But we're running from the BeaconTimer callback already
and voila, we have another "wait for timer to finish from within timer callback" deadlock.
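To make the call chain above concrete, here is a small userspace model of it. This is only a sketch, not driver or kernel code: every toy_* name is made up for illustration, and where the real del_timer_sync() would wait forever, the model returns TOY_WOULD_DEADLOCK so the situation can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of a kernel timer. All toy_* names are hypothetical. */
struct toy_timer {
    bool pending;                         /* queued to fire later */
    bool handler_running;                 /* its callback is executing right now */
    void (*handler)(struct toy_timer *);  /* timer callback */
};

enum toy_result { TOY_NOT_PENDING = 0, TOY_DELETED = 1, TOY_WOULD_DEADLOCK = -1 };

/* What the modelled MlmeJoinReqAction() saw when it tried to cancel the timer. */
static enum toy_result last_join_result;

/* Models del_timer(): dequeue only, never waits for a running handler. */
static enum toy_result toy_del_timer(struct toy_timer *t)
{
    bool was_pending = t->pending;
    t->pending = false;
    return was_pending ? TOY_DELETED : TOY_NOT_PENDING;
}

/* Models del_timer_sync(): the real one waits until the handler has finished.
 * If the caller IS the handler, that wait can never end, so the model reports
 * the condition instead of spinning. */
static enum toy_result toy_del_timer_sync(struct toy_timer *t)
{
    if (t->handler_running)
        return TOY_WOULD_DEADLOCK;
    return toy_del_timer(t);
}

/* Stands in for MlmeJoinReqAction(): cancels the beacon timer first thing. */
static void toy_join_req_action(struct toy_timer *beacon_timer)
{
    last_join_result = toy_del_timer_sync(beacon_timer);
}

/* Stands in for BeaconTimeout() -> MlmeHandler() -> state machine. */
static void toy_beacon_timeout(struct toy_timer *t)
{
    toy_join_req_action(t);
}

/* Models the timer core running an expired timer's handler. */
static enum toy_result toy_fire(struct toy_timer *t)
{
    assert(t->handler != NULL);
    t->pending = false;
    t->handler_running = true;
    t->handler(t);
    t->handler_running = false;
    return last_join_result;
}
```

Firing a timer whose handler is toy_beacon_timeout() makes toy_fire() return TOY_WOULD_DEADLOCK, which is the point where the real driver hangs. The same toy_del_timer_sync() call made from outside the handler returns normally.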
Looking at the rt61 driver source, I really feel it was a mistake to merge this into the Ubuntu kernel in the first place. I realize that there are probably some people who are using the rt61 driver without lock ups, but at this point I don't think feisty should ship with rt61 enabled by default.
I think the reason this bug is having a more severe impact now is that all kernels are built with CONFIG_SMP now, so del_timer_sync() is never converted to del_timer(). So another possible fix would be to enable the code in rtmp.h that does:
#undef del_timer_sync
#define del_timer_sync(x) del_timer(x)
that would probably convert this easily-triggered deadlock into much rarer strange crashes on SMP systems. (Although I know that almost every modern system is dual-core at least)
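For anyone unsure why that two-line override takes effect even when the kernel provides del_timer_sync() as a real function, here is a rough standalone sketch of the preprocessor mechanism. The demo_* names are invented for the example and are not kernel or rt61 symbols. Once the function-like macro is defined, every later call in the same translation unit is rewritten to the non-waiting variant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel or rt61 symbols. */
struct demo_timer { int pending; };

static int demo_plain_calls;  /* times the non-waiting variant ran */
static int demo_sync_calls;   /* times the waiting variant ran */

/* Stands in for del_timer(): dequeue and return whether it was pending. */
int demo_del_timer(struct demo_timer *t)
{
    int was_pending = t->pending;
    demo_plain_calls++;
    t->pending = 0;
    return was_pending;
}

/* Stands in for the SMP del_timer_sync(), which would also wait for the handler. */
int demo_del_timer_sync(struct demo_timer *t)
{
    int was_pending = t->pending;
    demo_sync_calls++;
    t->pending = 0;
    return was_pending;
}

/* Same shape as the override in rtmp.h: from here on, calls are rerouted. */
#undef demo_del_timer_sync
#define demo_del_timer_sync(x) demo_del_timer(x)

/* Driver-style code written against the "sync" name. */
int demo_driver_cancel(struct demo_timer *t)
{
    assert(t != NULL);
    return demo_del_timer_sync(t);  /* expands to demo_del_timer(t) */
}
```

After one call to demo_driver_cancel(), demo_plain_calls is 1 and demo_sync_calls is still 0. The caller never waits for a running handler, which is why the deadlock goes away and also why a handler on another CPU could still be running when the call returns.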