Using the STREAM benchmark and Intel's MLC, I observe a large difference in reported memory bandwidth.
STREAM reports (MB/s) - Copy: 13041, Scale: 12850, Add: 14436, Triad: 14340
Intel's MLC reports (MB/s) - ALL Reads: 75823, 3:1 Reads-Writes: 74216, 2:1 Reads-Writes: 73818, 1:1 Reads-Writes: 69407 and Stream-triad like: 70701.
So the MLC Stream-triad like value is 70.7 GB/s, versus 14.3 GB/s for the STREAM Triad, a ratio of roughly 4.9.
I am curious to understand the difference. Is it because of concurrency? MLC spawns several threads, whereas I did not compile STREAM with OpenMP, so it runs as a single thread.
Thanks and best regards
Jim
Yes, the primary difference is concurrency.
The size of the difference depends on the system under test, both its physical configuration (model, number of sockets, DIMMs per channel) and its BIOS configuration (snooping mode, memory redundancy mode, etc.), and on how STREAM was compiled and run.
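To make the concurrency point concrete, here is a minimal sketch of a STREAM-style Triad kernel. It is not the actual STREAM source; the function names `triad`, `triad_bytes`, and `triad_mb_per_s` are hypothetical and chosen only for illustration. The bandwidth helper assumes the usual Triad accounting of two reads and one write per element, with 1 MB taken as 10^6 bytes.

```c
#include <stddef.h>

/* Triad kernel: a[i] = b[i] + scalar * c[i].
 * The OpenMP pragma is only compiled in when OpenMP is enabled;
 * otherwise the loop runs on a single thread. */
void triad(double *a, const double *b, const double *c,
           double scalar, size_t n)
{
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}

/* Bytes counted for one Triad pass over n elements:
 * two arrays read and one array written (assumed accounting). */
double triad_bytes(size_t n)
{
    return 3.0 * (double)sizeof(double) * (double)n;
}

/* Bandwidth in MB/s (1 MB = 1e6 bytes) for one pass that took `seconds`. */
double triad_mb_per_s(size_t n, double seconds)
{
    return triad_bytes(n) / 1.0e6 / seconds;
}
```

Built without OpenMP, the loop is serial, which matches the single-threaded run described in the question. Built with OpenMP enabled (for example `-fopenmp` with GCC) and run with `OMP_NUM_THREADS` set to the number of cores, the same loop is split across threads, which should bring the result closer to what a multi-threaded tool reports. How close it gets will depend on the factors listed above.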
Makes sense. Thanks much.
Jim
Could you share the MLC and stream/stream_omp results measured on the same system?