20 May 2007
rodeo 0.9b3 and "show collisions"
- solver tab: "show collisions" works (it is thread-safe now). Collisions are displayed while the solver continues to work in the background.
- solver works with an unlimited number of collisions (no more "alloca"). rodeo used to crash when about 3000 collisions happened at once (out of stack memory).
rodeo 0.9b2 and parallel collisions
+ ODE engine should be completely reentrant now, so I can call it in parallel (no more global static variables)
+ multi-thread support optimized (pre-allocated threads) for collisions
- Animator "Status" report removed ("Cleaning up Objects" etc.), because the window sometimes wouldn't close

There are 20 threads (= parallel executions) created for the ODE engine by default. They constantly wait for jobs, so Rodeo will certainly use up all your CPU resources when you hit "Solve".
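To illustrate the idea of pre-allocated threads that wait for jobs, here is a minimal sketch in Python. It is not Rodeo's or ODE's code: the names WorkerPool and boxes_overlap are hypothetical, and the box test is only a stand-in for a real collision check. The sketch also differs from the behaviour described above in one respect: its workers block on a queue while idle, so they sleep instead of consuming CPU. Python threads will not speed up CPU-bound work because of the interpreter lock, so this shows the structure only.

```python
import queue
import threading


class WorkerPool:
    """Fixed set of threads, created once and reused for every batch of jobs."""

    def __init__(self, num_threads=20):
        self._jobs = queue.Queue()
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_threads)
        ]
        for t in self._threads:
            t.start()

    def _run(self):
        while True:
            job = self._jobs.get()  # blocks (sleeps) until a job arrives
            if job is None:  # sentinel: shut this worker down
                self._jobs.task_done()
                return
            func, args, results, index = job
            try:
                results[index] = func(*args)
            finally:
                self._jobs.task_done()

    def map(self, func, arg_tuples):
        """Run func(*args) for every tuple and return results in order."""
        results = [None] * len(arg_tuples)
        for i, args in enumerate(arg_tuples):
            self._jobs.put((func, args, results, i))
        self._jobs.join()  # wait until every queued job has finished
        return results

    def close(self):
        for _ in self._threads:
            self._jobs.put(None)
        for t in self._threads:
            t.join()


def boxes_overlap(a, b):
    """Axis-aligned overlap test; each box is (min_xyz, max_xyz). Hypothetical stand-in."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[k] <= bmax[k] and bmin[k] <= amax[k] for k in range(3))


if __name__ == "__main__":
    pool = WorkerPool(num_threads=4)
    cube = ((0, 0, 0), (1, 1, 1))
    near = ((0.5, 0.5, 0.5), (1.5, 1.5, 1.5))
    far = ((5, 5, 5), (6, 6, 6))
    print(pool.map(boxes_overlap, [(cube, near), (cube, far)]))
    pool.close()
```

Creating the threads once avoids paying the thread start-up cost for every frame; whether idle workers should sleep or spin is a trade-off between latency and CPU load.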

The ODE engine had to be fixed in several places for parallel execution. I think I have fixed all the crashes now.

Benchmark on dual-core:
1280 cubes, solve frame 0-20
1 thread, optimized: 72 sec total - 5 sec write = 67 sec solve
1 thread, parallel: 75 sec total - 5 sec write = 70 sec solve
20 threads, parallel: 60 sec total - 5 sec write = 55 sec solve

speedup compared to 1 thread optimized : factor 1.2
speedup compared to 1 thread parallel : factor 1.3
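
The speedup factors follow directly from the solve times in the benchmark. A small sketch of the arithmetic, using only the numbers listed above (the variable names are made up for illustration):

```python
# Total times in seconds from the dual-core benchmark; writing takes 5 sec each.
WRITE_SEC = 5
totals = {
    "1 thread optimized": 72,
    "1 thread parallel": 75,
    "20 threads parallel": 60,
}

# Solve time = total time minus the time spent writing to Animator.
solve = {name: total - WRITE_SEC for name, total in totals.items()}

# Speedup of the 20-thread run relative to each single-thread run.
fastest = solve["20 threads parallel"]
speedup = {name: s / fastest for name, s in solve.items()}

for name in ("1 thread optimized", "1 thread parallel"):
    print(f"speedup compared to {name}: factor {speedup[name]:.1f}")
```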

The engine has two parts: collision and time-stepping. Collision is parallelized now.
rodeo 0.9b1
things that happened since 0.8b22:
- minor interface cleanup (e.g. "Joints" and "About" tab)
- speedup in collision handling (1280 cubes project, solve frames 0-20, before: 1 min 25 sec, after: 1 min 12 sec)
- memory saving (about 1 MB saved in the 1280 cubes project)
- speedup in writing positions to Animator (calculating/writing 1280 absolute positions/rotations, before: 450 msec, after: 270 msec)


The joints tab is for information only. You can edit the joint settings in Animator. If you click an object, the OpenGL window will show the pivot and the degrees of freedom the object possesses (e.g. locked X-coordinate).


It is possible to save roughly 1/4 of the solving time. Right now I am doing all calculations in "double precision"; it is possible to switch to "single precision". However, I don't know how that would affect the quality, stability etc. of the simulation.
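
As a rough illustration of what is at stake with single precision, the sketch below adds a small step many times, once in double and once rounded to single precision after every addition. This is not Rodeo or ODE code and says nothing definite about the simulation itself; the helper names, the step size of 0.01 and the count of 10000 are assumptions picked for the example.

```python
import struct


def to_single(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single):
    """Add `step` to a running total `count` times in the chosen precision."""
    if single:
        step = to_single(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_single(total)  # emulate a single-precision add
    return total


exact = 100.0  # 10000 steps of 0.01
err_double = abs(accumulate(0.01, 10000, single=False) - exact)
err_single = abs(accumulate(0.01, 10000, single=True) - exact)
print("double precision error:", err_double)
print("single precision error:", err_single)
```

The single-precision total drifts away from 100 by far more than the double-precision total does. Whether a drift of that kind matters for a rigid-body solve depends on the scene, which is why a visual check of the result is still needed.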

1280 cubes project, solve frames 0-20:
double precision, 85 sec total - 20*.45 sec writing = 76 sec solving
float precision, 67 sec total - 20*.45 sec writing = 58 sec solving = 76% of 76 seconds, saved 24%
The solution looks okay.
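
The percentages above can be checked with a few lines of arithmetic. The sketch uses only the figures quoted in the benchmark (20 frames, 0.45 sec of writing per frame); the variable names are made up for illustration.

```python
FRAMES = 20
WRITE_PER_FRAME_SEC = 0.45  # time spent writing positions per frame

write_sec = FRAMES * WRITE_PER_FRAME_SEC       # 9 sec of writing in total
double_solve = 85 - write_sec                  # solve time, double precision
single_solve = 67 - write_sec                  # solve time, single precision

ratio = single_solve / double_solve            # share of the double-precision time
saved = 1 - ratio                              # share of the time saved

print(f"double: {double_solve:.0f} sec, single: {single_solve:.0f} sec")
print(f"single precision needs {ratio:.0%} of the time, saving {saved:.0%}")
```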

I suspect the gain is larger when dealing with mesh geometries instead of "abstract cubes". I will test that later. Right now I do not want to risk breaking the plugin's code.


Using multiprocessor code for collision handling proves to be less effective than single-CPU code *on a dual-core machine*. Lots of safety guards slow down the code. Maybe there is a speedup on quad-core machines. For now, the MP code is disabled except for the OpenGL drawing routines, which run in parallel with the solver.

dual-core, 1280 cubes, solve frame 0-20, parallel threads started by collision engine
1 thread (optimized for 1 thread): 1:12 min
1 thread (MP-code): 1:15 min
8 threads: 1:27 min
100 threads: 1:33 min

Also, the threads cause problems with EIAS6.5.2 on a MacTel.