Intel® oneAPI Threading Building Blocks

Huge pages on Linux?

t_redeske
Beginner
Our team is interested in using the TBB memory allocation library on Linux. Is there a way to have this library use huge pages instead of 4KB pages?
Dmitry_Vyukov
Valued Contributor I
Quoting - t.redeske
Our team is interested in using the TBB memory allocation library on Linux. Is there a way to have this library use huge pages instead of 4KB pages?

There is currently no support for large pages. However, I believe it can be added quite easily. All one has to do is:
1. Set mmapRequestSize to the large page size (2MB, or whatever it is on your system). The current definition is 1MB:
static size_t mmapRequestSize = 0x0100000;
2. Patch the getRawMemory() function to allocate large pages.
3. Patch the freeRawMemory() function to free large pages.
That's it.
Since the raw block size currently used by TBB is 1MB, which is quite close to the large page size (2MB), I think the allocation algorithm itself can be left as is.
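
Steps 2 and 3 could be sketched roughly as follows. This is only an illustration, not the actual TBB code: getRawMemorySketch, freeRawMemorySketch, roundUpToHugePage and kHugePageSize are hypothetical names, the 2MB page size is an assumption, and the real getRawMemory()/freeRawMemory() signatures may differ. The sketch relies on the Linux MAP_HUGETLB flag of mmap and falls back to regular pages when no huge pages are available.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Assumed huge page size; check Hugepagesize in /proc/meminfo on your system.
const std::size_t kHugePageSize = 2 * 1024 * 1024;

// Round a request up to a whole number of huge pages.
std::size_t roundUpToHugePage(std::size_t bytes) {
    return (bytes + kHugePageSize - 1) / kHugePageSize * kHugePageSize;
}

// Hypothetical stand-in for getRawMemory(): try huge pages first and
// fall back to regular pages if the huge page mapping fails.
void* getRawMemorySketch(std::size_t bytes, bool* usedHugePages) {
    assert(usedHugePages != nullptr);
    const std::size_t size = roundUpToHugePage(bytes);
    void* p = MAP_FAILED;
#ifdef MAP_HUGETLB
    p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    *usedHugePages = (p != MAP_FAILED);
    if (p == MAP_FAILED) {
        p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical stand-in for freeRawMemory(): the length passed to munmap
// must be rounded the same way as at allocation time.
int freeRawMemorySketch(void* p, std::size_t bytes) {
    return munmap(p, roundUpToHugePage(bytes));
}
```

The fallback matters because the huge page mmap fails when the administrator has not reserved any huge pages.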

Dmitry_Vyukov
Valued Contributor I
Quoting - t.redeske
Our team is interested in using the TBB memory allocation library on Linux. Is there a way to have this library use huge pages instead of 4KB pages?

Btw, are large pages useful on Linux?
I've found that they are completely useless on Windows. Right after system boot I am able to allocate about a hundred large pages, but after ten minutes of work I am unable to allocate any. Windows fragments physical memory so badly that there is simply no space left for large pages. (I tested on Windows Vista.)

jimdempseyatthecove
Honored Contributor III
Quoting - Dmitriy Vyukov

Btw, are large pages useful on Linux?
I've found that they are completely useless on Windows. Right after system boot I am able to allocate about a hundred large pages, but after ten minutes of work I am unable to allocate any. Windows fragments physical memory so badly that there is simply no space left for large pages. (I tested on Windows Vista.)


Dmitriy,

Maybe things are OK when you have 128GB of physical RAM (or some other "large" amount of physical RAM)?

Jim
Alexey-Kukanov
Employee
Quoting - t.redeske
Our team is interested in using the TBB memory allocation library on Linux. Is there a way to have this library use huge pages instead of 4KB pages?

Maybe I am missing something, but it seems to me that the way the TBB allocator works does not depend on the page size. It just relies on the OS mechanisms that map physical memory into the virtual address space, and is agnostic of mapping details such as pages. If I am mistaken and there are problems with the allocator when large pages are enabled, I would like to know about them.
t_redeske
Beginner
Quoting - Dmitriy Vyukov

There is currently no support for large pages. However, I believe it can be added quite easily. All one has to do is:
1. Set mmapRequestSize to the large page size (2MB, or whatever it is on your system). The current definition is 1MB:
static size_t mmapRequestSize = 0x0100000;
2. Patch the getRawMemory() function to allocate large pages.
3. Patch the freeRawMemory() function to free large pages.
That's it.
Since the raw block size currently used by TBB is 1MB, which is quite close to the large page size (2MB), I think the allocation algorithm itself can be left as is.


This is in the TBB source code? I haven't looked at that yet. Does TBB use mmap to get more memory? With TCMalloc, it is possible to set an environment variable to tell it where to mmap memory from, which makes it easy to hook up to huge pages.
t_redeske
Beginner
Quoting - Dmitriy Vyukov

Btw, are large pages useful on Linux?
I've found that they are completely useless on Windows. Right after system boot I am able to allocate about a hundred large pages, but after ten minutes of work I am unable to allocate any. Windows fragments physical memory so badly that there is simply no space left for large pages. (I tested on Windows Vista.)


They are very useful if you have a long-running process that starts soon after boot and stays up for a long time. We've seen 5-25% performance improvements using them.
t_redeske
Beginner

Quoting - Alexey Kukanov
Maybe I am missing something, but it seems to me that the way the TBB allocator works does not depend on the page size. It just relies on the OS mechanisms that map physical memory into the virtual address space, and is agnostic of mapping details such as pages. If I am mistaken and there are problems with the allocator when large pages are enabled, I would like to know about them.

In order to use huge pages, you have to tell the OS to reserve some. Then your program has to ask for them specifically. libhugetlbfs can be used to have malloc() and other calls allocate from huge pages. The following link shows details: http://www.ibm.com/developerworks/systems/library/es-lop-leveragepages/

We would like to combine the performance boost we get from huge pages with the boost we get from TBB.
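
As a rough illustration of the "reserve some" step: on Linux the reserved pool is reported in /proc/meminfo (the HugePages_Total, HugePages_Free and Hugepagesize fields). The sketch below reads those fields so a program can check whether any huge pages are reserved before asking for them. HugePageInfo, parseHugePageInfo and readHugePageInfo are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Values of interest from /proc/meminfo; -1 means "field not found".
struct HugePageInfo {
    long total = -1;       // HugePages_Total: pages reserved in the pool
    long freePages = -1;   // HugePages_Free: pages not yet allocated
    long pageSizeKb = -1;  // Hugepagesize, in kB
};

// Parse meminfo-style "Key:   value [kB]" lines from any stream.
HugePageInfo parseHugePageInfo(std::istream& in) {
    HugePageInfo info;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        long value = 0;
        if (!(fields >> key >> value)) continue;  // skip malformed lines
        if (key == "HugePages_Total:") info.total = value;
        else if (key == "HugePages_Free:") info.freePages = value;
        else if (key == "Hugepagesize:") info.pageSizeKb = value;
    }
    return info;
}

// Convenience wrapper that reads the live values on Linux.
HugePageInfo readHugePageInfo() {
    std::ifstream meminfo("/proc/meminfo");
    return parseHugePageInfo(meminfo);
}
```

If total comes back as 0, an administrator still has to reserve pages (for example through the vm.nr_hugepages sysctl) before a huge page request can succeed.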
t_redeske
Beginner
Quoting - Dmitriy Vyukov

There is currently no support for large pages. However, I believe it can be added quite easily. All one has to do is:
1. Set mmapRequestSize to the large page size (2MB, or whatever it is on your system). The current definition is 1MB:
static size_t mmapRequestSize = 0x0100000;
2. Patch the getRawMemory() function to allocate large pages.
3. Patch the freeRawMemory() function to free large pages.
That's it.
Since the raw block size currently used by TBB is 1MB, which is quite close to the large page size (2MB), I think the allocation algorithm itself can be left as is.


Looking at the code, I can see what to do: I'll need to change the mmap calls. It would be great if the next version allowed setting an environment variable to dictate where to mmap from, so that huge pages could be mapped in with no code changes. This would also need a change to mmapRequestSize (by environment variable or otherwise).
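
A minimal sketch of what such an environment-variable hook might look like, assuming the variable points to a mounted hugetlbfs directory. EXAMPLE_HUGETLBFS_DIR and mapRawMemory are made-up names for illustration only and are not part of TBB. When the variable is unset, the sketch falls back to an ordinary anonymous mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical variable name; assumed to hold the path of a hugetlbfs mount.
const char* const kDirVariable = "EXAMPLE_HUGETLBFS_DIR";

// Map `bytes` of memory, backed by a file in the hugetlbfs directory if the
// variable is set. `bytes` should be a multiple of the huge page size, which
// is why the request size would have to change as well.
void* mapRawMemory(std::size_t bytes) {
    const char* dir = std::getenv(kDirVariable);
    if (dir != nullptr && dir[0] != '\0') {
        std::string path = std::string(dir) + "/rawmem-XXXXXX";
        int fd = mkstemp(&path[0]);  // creates a unique file, fills in XXXXXX
        if (fd >= 0) {
            unlink(path.c_str());    // the mapping keeps the file alive
            void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE, fd, 0);
            close(fd);
            if (p != MAP_FAILED) return p;
        }
    }
    // Variable unset or huge page mapping failed: use regular pages.
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}
```

Note that the file-backed path is only safe when the directory really is a hugetlbfs mount; on an ordinary filesystem the empty file would not back the mapping.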
Alexey-Kukanov
Employee
Quoting - t.redeske
In order to use huge pages, you have to tell the OS to reserve some. Then your program has to ask for them specifically. libhugetlbfs can be used to have malloc() and other calls allocate from huge pages. The following link shows details: http://www.ibm.com/developerworks/systems/library/es-lop-leveragepages/

We would like to combine the performance boost we get from huge pages with the boost we get from TBB.

Oh, I see - the problem is in the need to specially ask for large pages. I thought that it would be enough to just say the program wants to use large pages. Do you know if Windows operates the same way in this regard?

And, could you please enter a feature request into our bug tracker?
Alexey-Kukanov
Employee

Updating the thread with the current state of huge page support (to make it more relevant for search):

The Intel TBB memory allocator has supported huge pages on Linux since v4.1, i.e. for several years. Huge pages must be enabled explicitly, either by calling scalable_allocation_mode(TBBMALLOC_USE_HUGE_PAGES, 1) or by setting the TBB_MALLOC_USE_HUGE_PAGES environment variable to 1; the latter is useful when you substitute the standard malloc routines with the tbbmalloc_proxy library. Of course, the system/kernel must first be configured to allocate huge pages.

Since TBB 2017 Update 7, the TBB memory allocator has also supported so-called transparent huge pages, which the Linux kernel allocates automatically when suitable. Because huge pages can have a negative impact on application performance, their use still has to be enabled explicitly, in the same way as described above.
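
Whether the kernel hands out transparent huge pages at all depends on its system-wide setting, which Linux exposes in /sys/kernel/mm/transparent_hugepage/enabled as a line such as "always [madvise] never", with the active mode in brackets. A small sketch for checking that setting follows; activeThpMode and readThpMode are hypothetical helper names, not part of TBB.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Return the bracketed (active) choice, e.g. "madvise" from
// "always [madvise] never"; returns an empty string if none is marked.
std::string activeThpMode(const std::string& contents) {
    const std::string::size_type open = contents.find('[');
    if (open == std::string::npos) return "";
    const std::string::size_type close = contents.find(']', open + 1);
    if (close == std::string::npos) return "";
    return contents.substr(open + 1, close - open - 1);
}

// Read the live setting; yields "" if the file is missing or unreadable.
std::string readThpMode() {
    std::ifstream file("/sys/kernel/mm/transparent_hugepage/enabled");
    std::string line;
    std::getline(file, line);
    return activeThpMode(line);
}
```

A result of "never" means the kernel will not provide transparent huge pages regardless of what the allocator requests.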

Additional information about the subject can be found in the documentation: https://www.threadingbuildingblocks.org/docs/help/reference/memory_allocation/c_interface_to_scalable_allocator.html
