Hello, experts,
Assume I have a 4-core machine where each core has a 2 MB LLC slice and the LLC is inclusive of L2.
1) If I use single-threaded MKL, will the MKL instance use 2 MB of the LLC or all 8 MB?
2) If I use OpenMP threads to control the parallelism, will the MKL instance determine the available LLC based on the number of threads?
Any help is appreciated. Thanks.
Best Regards
Hello Ying,
Thanks for your kind help.
Assume that on an i7-4770K I have a 4-thread application, and each thread calls the single-threaded sgemm routine.
My question is this: assuming the LLC is inclusive (as it was before Skylake Server), each sgemm generates its own memory traffic and may evict data that belongs to the other threads from the LLC. If a single-threaded sgemm uses the whole LLC, this becomes much worse. May I know whether this situation can happen?
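To make the concern concrete, here is a rough back-of-the-envelope sketch in Python. It only compares the combined size of the A, B and C matrices for concurrent sgemm calls against the 8 MB LLC assumed above (4 slices of 2 MB). The function names are made up for illustration, and the estimate is deliberately crude: a blocked implementation keeps only part of each matrix in cache at a time, so this says nothing about what MKL actually does internally.

```python
BYTES_PER_FLOAT = 4  # sgemm works on single-precision floats


def sgemm_footprint_bytes(m, n, k):
    """Total bytes of A (m x k), B (k x n) and C (m x n) for C = A * B."""
    return BYTES_PER_FLOAT * (m * k + k * n + m * n)


def llc_pressure(num_threads, m, n, k, llc_bytes):
    """Combined footprint of concurrent sgemm calls divided by LLC capacity.

    A value above 1.0 means the matrices cannot all stay in the LLC at once,
    so the threads can be expected to evict each other's data.
    """
    return num_threads * sgemm_footprint_bytes(m, n, k) / llc_bytes


# Assumption taken from the question: 4 LLC slices of 2 MB each.
LLC_BYTES = 4 * 2 * 1024 * 1024

if __name__ == "__main__":
    for size in (256, 512, 1024):
        ratio = llc_pressure(4, size, size, size, LLC_BYTES)
        print(f"4 threads, {size}x{size} sgemm: {ratio:.3f} x LLC capacity")
```

With these assumptions, four 256x256 calls fit in the LLC together, while four 512x512 calls already exceed it, which is the case I am worried about.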
Best Regards
Hello Yan,
I recommend using Intel VTune Amplifier XE. It can report LLC misses, so you can compare runs and see whether the situation becomes worse.
Best Regards,
Ying