My program does not actually use multithreading. It just divides the calculation among several processes, and the parent process collects the results from shared memory once they finish. All the code is CPU bound, with no I/O and a relatively small footprint (20 MB per process).
I started with this approach when the first multithreaded Pentium 4s came out, but it did not improve the speed at all.
With the advent of true multicore processors (Pentium D), I got almost linear speedup with the number of cores.
So my logic was pretty straightforward: start as many processes as there are cores, and ignore the threads. I tested the program on an Atom N270, which has one core and two threads, and the program was about 30% faster when I ran it with two processes. I did not know what to make of it.
But recently I tested my program on a 4-core, 8-thread processor (i7-3610QM), and I got surprising results. My best speedup was when I started 6 processes, not 4 or 8. These are the times:
Processes   Time (seconds)
1           4.07
2           2.09
3           1.44
4           1.09
5           0.87
6           0.76
7           0.99
8           1.01
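To make the numbers easier to compare, here is a small sketch that turns the measured times into speedup and per-process efficiency relative to the single-process run. The function name `speedup_table` is only illustrative; the data is copied from the table above.

```python
# Times in seconds from the table above (i7-3610QM, 4 cores / 8 hardware threads).
times = {1: 4.07, 2: 2.09, 3: 1.44, 4: 1.09, 5: 0.87, 6: 0.76, 7: 0.99, 8: 1.01}


def speedup_table(times):
    """Return {n: (speedup, efficiency)} relative to the single-process time."""
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in times.items()}


table = speedup_table(times)
best = min(times, key=times.get)  # process count with the lowest time

for n, (speedup, efficiency) in sorted(table.items()):
    print(f"{n} processes: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
print("fastest:", best, "processes")
```

Speedup stays close to linear up to 4 processes, keeps growing more slowly at 5 and 6, and then drops at 7 and 8.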
Since I cannot test my software on all processor configurations, I need a reliable formula to calculate the optimal number of processes for the best speedup.
I could also run a benchmark and set the optimal value during program installation, but I dislike such a zero-knowledge approach.
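For reference, the install-time benchmark idea could look roughly like the sketch below. It is not the actual program: `burn` and `run_with` are hypothetical stand-ins for the real calculation, and the iteration count is an arbitrary assumption. The timing takes the best of a few repeats to reduce noise.

```python
import time
from multiprocessing import Pool


def burn(iterations):
    """Hypothetical CPU-bound work unit standing in for the real calculation."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def run_with(n_procs, total_iterations=2_000_000):
    """Split the (assumed) workload evenly across n_procs worker processes."""
    chunk = total_iterations // n_procs
    with Pool(n_procs) as pool:
        pool.map(burn, [chunk] * n_procs)


def benchmark(run, counts, repeats=3):
    """Time run(n) for each n in counts; keep the fastest of `repeats` runs."""
    results = {}
    for n in counts:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)
            samples.append(time.perf_counter() - start)
        results[n] = min(samples)
    return results


def best_count(results):
    """Process count with the lowest measured time."""
    return min(results, key=results.get)
```

A caller would do something like `best_count(benchmark(run_with, range(1, os.cpu_count() + 1)))`, placed under an `if __name__ == "__main__":` guard on platforms that spawn worker processes. Note that this sketch includes process start-up time in each measurement.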
I no longer have a clear mental model of what a core versus a thread is on the CPU. Note that I am not talking about programming models, but about the hardware capabilities of the CPU. So a thread is like half of an extra core, but not quite ;-)
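One way to put the "half of an extra core, but not quite" idea to work is to treat the number of cores and the number of hardware threads as the lower and upper bounds of the search, rather than looking for an exact formula. The sketch below assumes exactly that; `candidate_counts` is a hypothetical helper. Python's `os.cpu_count()` reports logical CPUs (hardware threads), and the physical core count has to be supplied by the caller because the standard library does not report it.

```python
import os


def candidate_counts(physical_cores, logical_cpus=None):
    """Process counts worth trying: one per core up to one per hardware thread."""
    if logical_cpus is None:
        # os.cpu_count() counts logical CPUs, i.e. hardware threads.
        logical_cpus = os.cpu_count() or physical_cores
    logical_cpus = max(logical_cpus, physical_cores)
    return list(range(physical_cores, logical_cpus + 1))


print(candidate_counts(4, 8))  # i7-3610QM from the post: 4 cores, 8 threads
print(candidate_counts(1, 2))  # Atom N270 from the post: 1 core, 2 threads
```

Both measured optima in this post (2 processes on the Atom, 6 on the i7) fall inside these ranges, but that is only two data points, not a rule.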
dpeterc wrote:
...Since I cannot test my software on all processor configurations, I need a reliable formula to calculate the optimal number of processes for the best speedup.
I could also run a benchmark and set the optimal value during program installation, but I dislike such a zero-knowledge approach.
I no longer have a clear mental model of what a core versus a thread is on the CPU. Note that I am not talking about programming models, but about the hardware capabilities of the CPU. So a thread is like half of an extra core, but not quite ;-)

[SergeyK] Did you try to verify your real results with Amdahl's Law for Parallel Computing? ... en.wikipedia.org/wiki/Amdahl's_law
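As a rough way to check the measurements against Amdahl's law, the sketch below solves S = 1 / ((1 - p) + p / n) for the parallel fraction p, using only the runs with up to 4 processes (one per physical core), and then extrapolates. The function names are illustrative. Amdahl's law assumes n identical processors, which hardware threads sharing a core are not, so the extrapolation beyond 4 is only a reference line.

```python
def parallel_fraction(speedup, n):
    """Solve Amdahl's law S = 1 / ((1 - p) + p / n) for p."""
    return (1 - 1 / speedup) / (1 - 1 / n)


def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's law for parallel fraction p on n processors."""
    return 1 / ((1 - p) + p / n)


# Measured times (seconds) for 1 to 4 processes on the i7-3610QM.
times = {1: 4.07, 2: 2.09, 3: 1.44, 4: 1.09}
base = times[1]

estimates = {n: parallel_fraction(base / t, n) for n, t in times.items() if n > 1}
p = sum(estimates.values()) / len(estimates)  # simple average of the estimates

print(f"estimated parallel fraction: {p:.3f}")
for n in (5, 6, 7, 8):
    print(f"Amdahl prediction for {n} processes: {amdahl_speedup(p, n):.2f}x")
```

The predicted speedup never decreases as n grows, so Amdahl's law on its own cannot reproduce the slowdown measured at 7 and 8 processes.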