Hello,
I ran into a problem while benchmarking OpenBLAS and MKL. The problem sizes ranged from 16000 to 18000 in steps of 64 (i.e. 16000, 16064, 16128, ..., 18000). The benchmark ran on a cluster with 24 nodes of Haswell architecture (two sockets, 30 MB cache). My question is: why does performance drop sharply when the size is 16384? Both libraries show the same drop at that size. The cache miss rate also increases significantly at this size, which explains the lower performance, but why does it happen at this particular size? I do not have much programming experience, so any thoughts would be appreciated.
Sorry to bother you,
Thanks.
Size | OpenBLAS (speed, MFLOPS) | MKL (speed, MFLOPS) |
16256 | 738278.342719 | 803630.559752 |
16320 | 734915.036548 | 805445.625905 |
16384 | 661585.465594 | 642552.265062 |
16448 | 719808.609165 | 797099.170117 |
16512 | 745339.961848 | 804849.076513 |
16576 | 742787.216771 | 803981.951285 |
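One thing that stands out is that 16384 is 2^14, the only exact power of two among these sizes. A possible (unconfirmed) contributing factor is cache set aliasing: with a power-of-two leading dimension, addresses one row or column apart can all map to the same cache set. The sketch below only illustrates that arithmetic. It assumes 8-byte doubles, 64-byte cache lines and 64 sets (the geometry of a 32 KB, 8-way L1 data cache); `distinct_sets` is a hypothetical helper, not part of OpenBLAS or MKL, and both libraries pack data into contiguous buffers internally, so the real cause may differ.

```python
def distinct_sets(n, elem_bytes=8, line_bytes=64, num_sets=64, accesses=512):
    """Count the cache sets touched when striding through memory by n elements.

    Assumed geometry (not measured on the cluster): 64-byte lines, 64 sets.
    """
    stride = n * elem_bytes  # bytes between consecutive accesses
    return len({(i * stride // line_bytes) % num_sets for i in range(accesses)})


if __name__ == "__main__":
    for n in (16256, 16320, 16384, 16448, 16512, 16576):
        print(n, distinct_sets(n))
```

Under these assumptions, size 16384 maps every strided access to a single set, while the neighbouring sizes spread over 4 or 8 sets. That would be consistent with the higher miss rate at 16384, but it is only a model of one cache level and not a measurement.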