Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

What tools are available to check the performance of the operating system scheduler on a multicore platform?

I am using a Linux platform (Ubuntu). I want to know what tools are available to check the performance of the operating system scheduler on a multicore platform, so that I can find out whether the scheduler is making the best use of the multicore architecture.
I have moved this from the General Contest forum. Thanks for your question. We'll see if we can find someone who can provide an answer.

Aubrey W.
Intel Software Network Support
Honored Contributor III
This is a tough issue to describe, because you have to consider:

Overhead of the scheduler

Impact of the scheduler on a mix of single-threaded applications (undersubscribed, fully subscribed and oversubscribed)

Impact of the scheduler on a single application that is multi-threaded (undersubscribed, fully subscribed and oversubscribed)

Impact of the scheduler on multiple applications that are multi-threaded (undersubscribed, fully subscribed and oversubscribed)

And variations of the above when threads are affinity pinned and when they are not pinned.

Then factor in whether the application is I/O intensive, FPU intensive, integer intensive, or a blend of these.

Next comes whether the applications are cache sensitive or not.

Then there is the issue of whether the system has HyperThreading.
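
To compare the pinned and unpinned variations mentioned above, one option on Linux is to set the CPU affinity of the process yourself and time the same workload both ways. The following is only a minimal sketch, assuming Python 3 on Linux (where `os.sched_setaffinity` is available); `busy_work` and `timed_run` are hypothetical names used for illustration and stand in for your real application.

```python
import os
import time


def busy_work(n=200_000):
    """CPU-bound loop used as a stand-in workload (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(cpus=None):
    """Time busy_work, optionally pinned to the given set of CPU ids.

    Returns (elapsed_seconds, affinity_mask_used). The original affinity
    mask of the process is restored before returning.
    """
    original = os.sched_getaffinity(0)  # 0 means "the calling process"
    try:
        if cpus is not None:
            os.sched_setaffinity(0, cpus)
        start = time.perf_counter()
        busy_work()
        elapsed = time.perf_counter() - start
        return elapsed, os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    unpinned_s, unpinned_mask = timed_run()
    pinned_s, pinned_mask = timed_run({allowed[0]})
    print(f"unpinned on CPUs {sorted(unpinned_mask)}: {unpinned_s:.4f}s")
    print(f"pinned on CPUs {sorted(pinned_mask)}: {pinned_s:.4f}s")
```

A single short run like this says little on its own; repeating it under undersubscribed, fully subscribed and oversubscribed loads is what exposes differences in scheduler behavior.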

The following information is from a single application on a system with HyperThreading. The application is floating-point and memory-access intensive (no I/O). Run on Ubuntu 10.0
I got some time on a Dell R610 with dual Intel Xeon 5570 processors.
The readers of this mailing list might find it of interest.
Results from running fluidanimate using in_500K.fluid with 100 iterations
Runtimes using QuickThread threading toolkit (columns: thread count, time in ROI, speedup relative to 1 thread):
1  Total time spent in ROI:         92.494s  1.0000x
2  Total time spent in ROI:         48.265s  1.9164x
3  Total time spent in ROI:         35.771s  2.5857x
4  Total time spent in ROI:         28.770s  3.2149x
5  Total time spent in ROI:         23.912s  3.8681x
6  Total time spent in ROI:         21.912s  4.2212x
7  Total time spent in ROI:         20.918s  4.4217x
8  Total time spent in ROI:         18.428s  5.0192x
9  Total time spent in ROI:         18.897s  4.8946x * note 1
10 Total time spent in ROI:         18.396s  5.0279x
11 Total time spent in ROI:         18.002s  5.1380x
12 Total time spent in ROI:         17.991s  5.1411x
13 Total time spent in ROI:         17.946s  5.1540x
14 Total time spent in ROI:         16.071s  5.7553x
15 Total time spent in ROI:         16.057s  5.7604x
16 Total time spent in ROI:         14.398s  6.4241x
17 Total time spent in ROI:         41.042s  2.2536x * note 2
18 Total time spent in ROI:        553.489s  0.1671x * note 3
Each processor has 4 cores with HyperThreading
Total of 8 cores and 16 hardware threads
fluidanimate is a floating point and memory access intensive application.
Note 1:
On this configuration, QuickThread distributes work to cores first, then backfills the HyperThread siblings.
The result is a fairly steady slope from 1 thread to 8 threads (the full set of cores), then a shallower slope as the HT siblings are filled in.
Note 2:
At 17 threads the 16 hardware threads are oversubscribed. Note the adverse effect on cache.
Note 3:
At 18 threads the runtime is about 13.5 times the 17-thread runtime, so the adverse effect on cache appears to be exponential.
Additional run data would provide some insight, as would profiling.
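
For anyone who wants to redo the arithmetic in the table, the speedup column is the 1-thread runtime divided by the n-thread runtime. The sketch below recomputes it for a subset of the rows and adds a per-thread efficiency figure (speedup divided by thread count). The efficiency figure is not part of the original results, and the names `roi_seconds`, `speedup` and `efficiency` are hypothetical.

```python
# Runtimes in seconds, copied from a subset of rows in the table above,
# keyed by thread count.
roi_seconds = {
    1: 92.494,
    2: 48.265,
    4: 28.770,
    8: 18.428,
    16: 14.398,
    17: 41.042,
    18: 553.489,
}


def speedup(times, threads):
    """Speedup relative to the 1-thread run."""
    return times[1] / times[threads]


def efficiency(times, threads):
    """Speedup per software thread (1.0 would be perfect linear scaling)."""
    return speedup(times, threads) / threads


if __name__ == "__main__":
    for n in sorted(roi_seconds):
        print(f"{n:2d} threads: speedup {speedup(roi_seconds, n):.4f}x, "
              f"efficiency {efficiency(roi_seconds, n):.3f}")
```

Because the machine has 8 cores and 16 hardware threads, efficiency per software thread is expected to fall once the HT siblings are in use, so a drop between 8 and 16 threads is not by itself a sign of poor scheduling.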

Jim Dempsey
Honored Contributor III
Parallel programming frameworks such as OpenMP, TBB, and MPI have their own facilities to augment what the OS scheduler does. Until you can be more specific about your interests, you aren't likely to get relevant answers.