crash after adding particles in parallel mode

Bug #724396 reported by Sergei Dorofeenko
This bug affects 2 people
Affects: Yade
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

Yade crashes in parallel mode if particles are added after the simulation has been run.
The attached script runs without errors in serial mode (the -j 1 option),
but in parallel mode (-j 2 or more) it crashes at the second O.run().

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

It seems something is wrong with ForceContainer; see the attached log file.

Revision history for this message
Anton Gladky (gladky-anton) wrote :

Hi, Sergei.

I have tried your script, but I cannot reproduce the crash.

Will try on other systems.

Revision history for this message
Anton Gladky (gladky-anton) wrote :

I have tried it on Ubuntu 10.04

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

It seems the cause is the following:

After new particles are added, the first thread still runs on the "old" particles, so it does not resize _forceData in ensureSize() (ForceContainer.hpp:87).
Meanwhile, at the moment the first thread adds a new force to _forceData, the second thread starts running on the "new" particles and resizes _forceData in ensureSize(). After this, the first thread segfaults.

The bug is fixed by adding "forces.resize(bodies->size());" to Scene.cpp after line 76.
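
For illustration, a minimal C++ sketch of the race described above and of the proposed pre-sizing, assuming a simplified container with a single storage vector shared by all threads (the names are hypothetical, not the actual Yade classes):

    // Illustrative only, not the real ForceContainer.
    #include <cstddef>
    #include <vector>

    struct Vec3 { double x, y, z; };   // stand-in for Yade's Vector3r

    struct SharedForceContainer {
        std::vector<Vec3> _forceData;               // shared by all worker threads
        void ensureSize(std::size_t id) {
            if (_forceData.size() <= id)
                _forceData.resize(id + 1);          // may reallocate the buffer
        }
        void addForce(std::size_t id, const Vec3& f) {
            ensureSize(id);
            Vec3& v = _forceData[id];               // thread A keeps this reference...
            // ...if thread B now calls ensureSize() with a larger id, the buffer
            // is reallocated, v dangles, and the write below segfaults.
            v.x += f.x; v.y += f.y; v.z += f.z;
        }
    };

    // The proposed workaround: size the container once for all bodies before
    // the engine loop starts, so no thread ever grows it mid-iteration.
    void presizeForces(SharedForceContainer& forces, std::size_t nBodies) {
        forces._forceData.resize(nBodies);
    }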

Vaclav, could you please fix it?

Revision history for this message
Václav Šmilauer (eudoxos) wrote :

Yes, you're right, the locking logic in ForceContainer::resize is not sufficient. What you propose does not solve the problem of adding bodies inside the loop, though -- that is what a factory does, for instance.

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

I want to rewrite ensureSize() so that each thread resizes only its "own" vector, not all of them.
Then no locking is needed at all. Vaclav, do you agree?
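
A rough sketch of that idea, assuming one storage vector per OpenMP thread; the type and function names are illustrative, not the actual ForceContainer code:

    #include <cstddef>
    #include <vector>
    #include <omp.h>

    struct Vec3 { double x, y, z; };        // stand-in for Yade's Vector3r
    typedef std::vector<Vec3> vvector;

    struct PerThreadForceContainer {
        std::vector<vvector> _forceData;                 // one vvector per thread
        PerThreadForceContainer() : _forceData(omp_get_max_threads()) {}

        void ensureSize(std::size_t id, int thread) {
            vvector& own = _forceData[thread];           // only this thread grows it
            if (own.size() <= id) own.resize(id + 1);
        }
        void addForce(std::size_t id, const Vec3& f) {
            const int thread = omp_get_thread_num();
            ensureSize(id, thread);                      // no lock needed: private data
            Vec3& v = _forceData[thread][id];
            v.x += f.x; v.y += f.y; v.z += f.z;
        }
    };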

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

Apparently _forceData will then have to store vvector*, not vvector.
Because if it stores vvector (as it does now), _forceData[A].resize(N) could affect _forceData[B]. Or not?

Revision history for this message
Václav Šmilauer (eudoxos) wrote :

I don't think so. I think sizeof(vector<T>) is always sizeof(size_t)+sizeof(void*); the array itself is allocated dynamically on the heap. As for the independent resizing: sounds fine, though sizes should be re-synced in sync() before the summation loop, otherwise it will run slower due to in-loop size checks.
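
A small self-contained C++ check of that point, under the assumption that plain std::vector is used for the inner storage (Yade's real types differ); the size equalization at the end mimics the re-sync step mentioned above:

    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <vector>

    int main() {
        // Each inner vector owns its own heap buffer, so resizing
        // _forceData[A] leaves _forceData[B]'s data untouched.
        std::vector<std::vector<double>> forceData(2);
        forceData[1].assign(100, 1.0);
        const double* bDataBefore = forceData[1].data();   // B's heap buffer

        forceData[0].resize(1000000);                      // grow A only
        assert(forceData[1].data() == bDataBefore);        // B is unaffected

        // Equalize sizes once before the summation loop (the sync() step),
        // so the loop itself needs no per-element size checks.
        std::size_t maxSize = 0;
        for (const auto& v : forceData) maxSize = std::max(maxSize, v.size());
        for (auto& v : forceData) v.resize(maxSize, 0.0);
        return 0;
    }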

Changed in yade:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Anton Gladky (gladky-anton) wrote :

I have done some performance tests on a 6-core machine.

The simulation was carried out with different numbers of threads and the execution time was measured. In the first case it was a clean build of the latest trunk revision (2781); in the second case ForceContainer used only the "non-parallel flavor" even in multi-threading mode, i.e. the "parallel flavor" was deleted from ForceContainer.hpp (patch provided).

Number of bodies was 47412, number of iterations was 5000.

Results are presented below (execution time).

-j   Parallel   Non-Parallel      %
------------------------------------
 1        997            984    1.3
 2        821            800    2.6
 3        764            736    3.6
 4        752            709    5.7
 5        745            699    6.2
 6        742            693    6.6

So using the non-parallel ForceContainer makes the simulation noticeably faster (from about 1% to more than 6%).

My proposal is to delete the parallel implementation of ForceContainer. It would improve performance and "fix" this bug.

Any ideas or opinions?

Revision history for this message
Václav Šmilauer (eudoxos) wrote :

Do you mean that you used the non-parallel ForceContainer with an OpenMP-enabled simulation run with multiple threads? If so, threads could have been overwriting each other's data. https://www.yade-dem.org/doc/prog.html#forcecontainer explains why threads must have separate data (unless you want to use locking, which will very likely be much less efficient).
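
As a toy illustration of why separate per-thread data avoids the problem (hypothetical names, not Yade's implementation): with a single shared array, concurrent "+=" from several OpenMP threads is a data race and silently loses updates; with one buffer per thread there is no race, and sync() just sums the buffers.

    #include <algorithm>
    #include <cstddef>
    #include <vector>
    #include <omp.h>

    struct ToyForces {
        std::vector<std::vector<double>> perThread;   // [thread][bodyId]
        std::vector<double> summed;                   // filled by sync()

        explicit ToyForces(std::size_t nBodies)
            : perThread(omp_get_max_threads(), std::vector<double>(nBodies, 0.0)),
              summed(nBodies, 0.0) {}

        void addForce(std::size_t id, double f) {
            // Each thread writes only into its own buffer: no lock required.
            perThread[omp_get_thread_num()][id] += f;
        }
        void sync() {
            // Single-threaded summation of the per-thread buffers.
            std::fill(summed.begin(), summed.end(), 0.0);
            for (const auto& buf : perThread)
                for (std::size_t i = 0; i < summed.size(); ++i) summed[i] += buf[i];
        }
    };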

Revision history for this message
Anton Gladky (gladky-anton) wrote :

Yes, you are right. The proposed way breaks simulations (--check does not pass with -j > 1).

I added locks to the non-parallel implementation, and the performance goes down, as you said:

-j   Parallel   Non-Parallel      %
------------------------------------
 1        997           1002   -0.5
 2        821            842   -2.6
 3        764            825   -8
 4        752            908  -20.7
 5        745           1132  -51.9
 6        742           1180  -59

Revision history for this message
Anton Gladky (gladky-anton) wrote :

I tried to implement the idea indicated here: https://bugs.launchpad.net/yade/+bug/724396/comments/7

Sergei, could you please test it?
Vaclav, could you please have a look at it?

Patch is attached.

Thanks

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

A fixed patch is attached.
Compilation and the test model.py from this thread are OK.
Reading forces from Python and a performance benchmark still need to be tested.

Revision history for this message
Anton Gladky (gladky-anton) wrote :

The last patch gives a segmentation fault in --test:

yade --test
Math: Matrix3 operations ... ok
Math: Quaternion operations ... ok
Math: Vector2 operations ... ok
Math: Vector3 operations ... ok
Core: correct types are instantiated ... ok
Core: dispatcher ctors with functors ... ok
Core: Attr::hidden ... ok
Core: InteractionLoop special ctor ... ok
Core: invalid attribute access raises AttributeError ... ok
Core: Attr::noSave ... Segmentation fault

I don't have a debug build at the moment to provide a backtrace.

Revision history for this message
Anton Gladky (gladky-anton) wrote : Re: [Bug 724396] Re: crash after adding particles in parallel mode

Backtrace is attached

Anton

Revision history for this message
Sergei Dorofeenko (sergei.dorofeenko) wrote :

fixed

Revision history for this message
Anton Gladky (gladky-anton) wrote :

I have checked the last patch, and everything works well for me!
Performance is now slightly better than without the patch, so I think we did not degrade it.

I modified model.py a little bit (attached); it now reads the forces of every body at each step. I did not see any problem with that either, so our doubts about incorrect behaviour of the get*Single() functions seem unfounded (I hope).

I did finally get a crash with the unpatched version of Yade; there is no problem with the patched version.

Vaclav, could you please have a look at the last patch: https://bugs.launchpad.net/yade/+bug/724396/+attachment/1908010/+files/fix_724396-v3.diff

If it is OK, we will apply it to all active branches of Yade.

Thanks.

Revision history for this message
Anton Gladky (gladky-anton) wrote :

Fixed in r2788.

Thanks all!

Changed in yade:
status: Confirmed → Fix Released