Hello,
I was wondering whether anyone has compared the performance of the various (or any) MKL or IPP functions on an Intel64 platform using large Linux pages (2MiB) vs the "standard" 4KiB pages.
Large pages offer several benefits, including lower TLB miss rates and a larger contiguous space for prefetching (2MiB vs 4KiB).
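To put a number on the TLB benefit: the amount of memory a TLB can map without missing (its "reach") is roughly the entry count times the page size. The sketch below is only arithmetic. The 64-entry TLB is a hypothetical figure chosen for illustration, not the size for any particular Intel processor, and real hardware often has a different number of entries for each page size.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space covered by `entries` TLB entries."""
    return entries * page_size


ENTRIES = 64  # hypothetical entry count, for illustration only

small = tlb_reach(ENTRIES, 4 * KIB)
large = tlb_reach(ENTRIES, 2 * MIB)

print(f"4KiB pages: {small // KIB} KiB of reach")
print(f"2MiB pages: {large // MIB} MiB of reach")
print(f"ratio: {large // small}x")
```

With the same number of entries, the reach scales with the page size, so 2MiB pages cover 512 times as much memory as 4KiB pages.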
I was also wondering whether there is any way to use large pages other than the hugeTLBfs library.
On other systems one can specify the page size per segment and obtain tangible speedups in applications suffering from TLB shortages. On that note, a smaller page size (e.g., 64 to 128 KiB) would be preferable to the 2MiB (or 1GiB) page size.
Is there any direction at Intel to make large pages more easily accessible to applications?
regards,
--Michael
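On the question of alternatives to the hugeTLBfs library: one mechanism that exists on Linux is transparent huge pages, where a program hints a memory region with `madvise(MADV_HUGEPAGE)`. What follows is a minimal sketch, not a recommendation from the thread. It assumes Linux, Python 3.8 or later, and a kernel with transparent huge pages set to `madvise` or `always`. The function name `map_with_thp_hint` is made up for this example.

```python
import mmap


def map_with_thp_hint(size: int):
    """Create a private anonymous mapping and hint it for huge pages.

    Returns (mapping, accepted). `accepted` only means the kernel took
    the hint; it does not prove that huge pages back the region.
    """
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    # MADV_HUGEPAGE is only defined where the platform provides it.
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    accepted = False
    if advice is not None:
        try:
            buf.madvise(advice)
            accepted = True
        except OSError:
            # The kernel rejected the hint (for example, THP not built in).
            pass
    return buf, accepted


region, hinted = map_with_thp_hint(8 * 1024 * 1024)
region[0] = 1  # touch the mapping so the kernel has to back it
print("huge page hint accepted:", hinted)
```

Whether the region actually ends up on 2MiB pages is up to the kernel. The `AnonHugePages` field in `/proc/meminfo` is one place to check after touching the memory.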
3 Replies
Hello Michael,
This is the Intel Performance Analysis Tool forum, which focuses on discussion of tool usage and issues.
Yes. VTune Performance Analyzer can measure CPU cycles and event counts (DTLB events, for example) for any program, including user programs that call functions from MKL or IPP.
For specific questions about IPP or MKL, please use the forums below:
http://software.intel.com/en-us/forums/intel-integrated-performance-primitives/
http://software.intel.com/en-us/forums/intel-math-kernel-library/
Thanks, Peter
Hi Peter,
If I use large pages in my code, I can still use VTune to measure the relevant VM events with large vs regular pages, right?
Thanks,
Michael
If you're out to mis-characterize the performance you get with huge pages, yes, you can do it with VTune. As to discovering the new issues introduced by large pages, VTune probably isn't relevant.