Hang at ORTS.Train.UpdateSignalState? (X.1634, freeze when running AI service)

Bug #1189811 reported by Dennis A T
Affects: Open Rails
Status: Fix Released
Importance: Medium
Assigned to: r.roeterdink
Milestone: (none listed)

Bug Description

OR freezes at the moment a particular AI train is due to start.
This consist will run with no problems on the AI path when selected in Explore mode.
The location of the Player train does not seem to affect this freeze.
Windows does not detect that OR has stopped responding even after a quarter hour and there is continuing CPU activity.
OR will close when the "Close Window" button is pressed.
No helpful information appears on the log (attached).
A Crash dump is attached.

Additional:
Removing the suspect AI service cured the freeze.
Changing the AI consist did not cure the freeze.
Moving start and end points of the AI path cured the freeze.

Tags: crash signals
Dennis A T (dennisat)
information type: Public → Private
Revision history for this message
James Ross (twpol) wrote :

Thanks for the dump; I will try and analyse it in the next couple of days.

Dennis A T (dennisat)
description: updated
James Ross (twpol)
Changed in or:
assignee: nobody → James Ross (twpol)
cjakeman (cjakeman)
Changed in or:
milestone: none → 1.0
status: New → Triaged
Revision history for this message
Dennis A T (dennisat) wrote :

Open Rails Log after freeze

Revision history for this message
James Ross (twpol) wrote :

My guess is that the updater process has got into a loop inside one of the update methods. The sound process looks normal, the loader is waiting for something to load, and the render process is waiting for the updater to finish, while the updater process is here:

> ORTS.Train.UpdateSignalState(Int32)
> ORTS.Train.Update(Single)
> ORTS.AITrain.AIUpdate(Single, Double, Boolean)
> ORTS.AI.AIUpdate(Single, Boolean)
> ORTS.AI.Update(Single)
> ORTS.Simulator.Update(Single)
> ORTS.Viewer3D.Update(Single, ORTS.RenderFrame)
> ORTS.UpdaterProcess.Update()

The most likely suspect of this list would be ORTS.Train.UpdateSignalState, since I don't think any of the others can actually loop indefinitely.

tags: added: crash signals
Changed in or:
assignee: James Ross (twpol) → nobody
importance: Undecided → Medium
Dennis A T (dennisat)
information type: Private → Public
Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Hello,

attached are two special versions of RunActivity.exe which print lots of debug information.
First to run is RunActivity_print.exe - this version prints a series of files which contain analysis data of the route and train build-up, e.g. section, signal positions, train routes etc. This will help in sorting through the relevant route data.
The second is RunActivity_loop.exe - this version prints info on the train behaviour as well as specific info of the routine where (hopefully) the loop occurs.
Please ensure you have a directory C:\temp as that is where the output will be generated.
PLEASE RUN THE PROGRAMS ONLY ONCE !!
Not all data is cleared at the start of the programs - running the program more than once will garble the information and make it useless.
Also, please make sure the loop occurs fairly near the beginning of the activity, otherwise the files will get extremely large.
The RunActivity_print.exe can be stopped as soon as the OR window opens - all info is printed during initialization of the program.
The RunActivity_loop.exe must be run until you are sure the program is looping.

Thanks for your help.

Changed in or:
assignee: nobody → r.roeterdink (r-roeterdink)
Revision history for this message
Dennis A T (dennisat) wrote :

Attached are the files generated by the debug process. Hope I haven't snarled it up.

Dennis

Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Thanks - data was quite all right.
But, alas, it has not yet revealed where the loop is. Back to the drawing board.
Will be continued.

James Ross (twpol)
summary: - Freeze when running AI service
+ Hang at ORTS.Train.UpdateSignalState? (X.1634, freeze when running AI
+ service)
Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Next attempt - same procedure as before, but only running the LOOP test is required.
Let's hope that this will lead to the problem spot.

Thanks,

  Rob Roeterdink

Revision history for this message
Dennis A T (dennisat) wrote :

Generated files are attached

Dennis

Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Hello Dennis,

thanks for the data.
But, alas, it still does not reveal anything of where this elusive loop is supposed to be.
I'm rather beginning to doubt the loop is in this routine at all.
So, this is not really getting anywhere - change of course is required.
I've downloaded the route (version 6 - are you using that as well?).
With other things to do, I need a few days to install this properly in my environment and get things running.
I'll be back on this when that's done.

To be continued ....

Regards,
    Rob Roeterdink

Revision history for this message
Dennis A T (dennisat) wrote :

I'm still using version 5 of this route because I've yet to download any activities that use the extension added in version 6.

The activity I'm running is included in uktrainsim file 20779. Unfortunately, there are quite a few stock downloads that need to be made as well - all detailed in the ReadMe included with the file.

As I remember, when I went to version 5, the failing activity needed to be re-saved in AE because of out-of-date services and, I think, an AI path difference (not the AI causing the hang!).

The AI causing the failure, "atw sarn - bridgend (153)" starts an hour into the activity but to get a quick result for your debug data, I moved its start time forward 1 hour to 14:20, just after the activity starts.

As I mentioned in the original report, moving the start point of the AI path forward a mile or two cured the hang. Also, driving the AI consist in explore mode on its AI path causes no problems.

I should find time sometime this weekend to install version 6 and re-test the activity.

Dennis

Revision history for this message
Dennis A T (dennisat) wrote :

I've now installed version 6 and the hang still happens. I've also noticed that OR is not totally unresponsive: you can switch between cameras, and a single operation of one of the cab controls is acknowledged but not acted upon.

Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Found it.
It's not a loop, it is not even a 'real' program error.
The problem is caused by a data error in the route's tdb-file.
The speedpost with TDB ref. no. 5196 (just south of Sarn) is referenced twice in the list of objects for section 2635.

What happens because of this is that the train checks the next object ahead - which is speedpost 5196.
The check then moves on from this position to look for the next item - which again is speedpost 5196 (two items at the same location are set slightly apart so they can both be found).
The program takes the distance of this object - but there is only one actual item, and so it starts its search again from the first position. So the next object found is ... 5196. And so it goes on until the program has collected millions of references to this speedpost, and the loop processing them all just takes too long - and the whole thing collapses.
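
The mechanism described above can be illustrated with a toy model. This is only a sketch and not Open Rails code: the function name `collect_items_ahead`, the `max_steps` cap, and the item numbers other than 5196 are all made up for illustration. The model assumes that each found reference is resolved back to the single real item, so the search always resumes just past the first reference to that item.

```python
def collect_items_ahead(item_refs, max_steps=20):
    """Toy model of the search (hypothetical, not the Open Rails routine).

    item_refs is the ordered list of item reference numbers for one section.
    Returns (found, finished); finished is False if the step cap was hit.
    """
    # Only one real item exists per reference number, so a number always
    # resolves to the position of its FIRST reference in the list.
    first_index = {}
    for i, ref in enumerate(item_refs):
        first_index.setdefault(ref, i)

    found = []
    index = 0
    steps = 0
    while index < len(item_refs):
        if steps >= max_steps:
            return found, False  # cap reached: the search is not advancing
        ref = item_refs[index]
        found.append(ref)
        # Resume just past the real item's position. For a duplicated
        # reference this lands on the second copy again, every time.
        index = first_index[ref] + 1
        steps += 1
    return found, True


clean = collect_items_ahead([5190, 5196, 5200])
broken = collect_items_ahead([5190, 5196, 5196, 5200])
print(clean)           # the search ends after three items
print(len(broken[0]))  # the list keeps growing until the cap stops it
```

Without the cap, the second call would never finish, and the `found` list would keep growing in the way described above.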

To resolve this, there are two options. One is to use the route-editor to remove and then reinstate this speedpost (tile -6203 14927, x/y/z pos : -381.522 44.1154 -13.2029).
The other is to edit the tdb and remove the spurious entry - search for "( 5196 )" and you can't miss it (don't forget to alter the number at the top of the list).
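
For anyone wanting to check other sections for the same kind of error, a minimal sketch of the check follows. It assumes the reference numbers for one section have already been extracted into a plain list of integers; it does not parse the tdb-file itself, and the function names and the sample numbers other than 5196 are hypothetical.

```python
from collections import Counter


def find_duplicate_refs(item_refs):
    """Return {reference number: occurrences} for numbers listed more than once."""
    counts = Counter(item_refs)
    return {ref: n for ref, n in counts.items() if n > 1}


def remove_duplicate_refs(item_refs):
    """Drop repeated references, keeping the first of each and the original order.

    Also returns the new length, since the count at the top of the list
    has to be altered to match.
    """
    seen = set()
    cleaned = []
    for ref in item_refs:
        if ref not in seen:
            seen.add(ref)
            cleaned.append(ref)
    return cleaned, len(cleaned)


section_refs = [5190, 5196, 5196, 5200]  # made-up list containing the duplicate
print(find_duplicate_refs(section_refs))
print(remove_duplicate_refs(section_refs))
```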

I will try and see how I can protect the program against such errors but that may take a little longer.
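
One possible form such protection could take is sketched below on a toy model of the search. This is an assumption for illustration only and not the change that was (or will be) made in Open Rails: the function name and the warning text are hypothetical. The idea is to remember which items have already been collected and to step over any reference that points at one of them.

```python
def collect_items_ahead_guarded(item_refs):
    """Toy search that skips repeated references (hypothetical sketch)."""
    # As in the failing case, a reference number resolves to the position
    # of its first occurrence, because only one real item exists.
    first_index = {}
    for i, ref in enumerate(item_refs):
        first_index.setdefault(ref, i)

    found = []
    warnings = []
    seen = set()
    index = 0
    while index < len(item_refs):
        ref = item_refs[index]
        if ref in seen:
            # Duplicate reference: report it and step past it instead of
            # resolving it back to the first position again.
            warnings.append("duplicate reference to item %d ignored" % ref)
            index += 1
            continue
        seen.add(ref)
        found.append(ref)
        index = first_index[ref] + 1
    return found, warnings


print(collect_items_ahead_guarded([5190, 5196, 5196, 5200]))
```

Each pass through the loop either collects a new item or moves one position forward, so the search always ends, and the warning gives the route author something to act on.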

Revision history for this message
Dennis A T (dennisat) wrote :

Thank you for your efforts.
I'm attempting to contact the route author to inform him of this TDB inconsistency.

I've had a very similar problem before on the London & South East route, also corrected (avoided!) by altering the AI path. I'll revisit that and see if it still exists on version 0.9. Would TSUtils show up the error you found?

Dennis

Revision history for this message
r.roeterdink (r-roeterdink) wrote :

Quote : "Would TSUtils show up the error you found?"

I do not think so - as far as I know, TSUtils checks the integrity of the tdb-file but not the references.
As the problem has been cleared I will mark this bug as fixed (although no patch is provided).

Changed in or:
status: Triaged → Fix Committed
James Ross (twpol)
Changed in or:
status: Fix Committed → Fix Released