Intel® oneAPI Threading Building Blocks

static library for massively parallel computer

rjharrison
First, thanks to the developers for a very impressive package.

I'm working (unsuccessfully so far) to strip the shared library dependency
out of TBB so that I can use it on a massively parallel computer that does
not support shared libraries (a common situation). Has anyone
succeeded at this, or could anyone point me in the right direction?

The issue is how to either remove/replace the dlopen() etc. calls in tbb_misc.cpp or
to provide functional stubs for the dl routines in a static environment.
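
For the second option, here is a minimal sketch of what "functional stubs" for the dl routines could look like. This is only an illustration under assumptions: the `stub_` prefix and the helper `try_load_optional_library` are hypothetical names chosen so the snippet compiles anywhere. In a real static build the stubs would presumably be given the standard names with C linkage so that existing call sites link unchanged. Every stub reports failure, so any caller that checks the return value sees "library not available".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the dl* routines on a platform without shared
// library support. Each one simply reports failure.

// No shared object can be loaded, so always return a null handle.
void* stub_dlopen(const char* /*filename*/, int /*flags*/) { return nullptr; }

// There is never a valid handle to search, so no symbol is ever found.
void* stub_dlsym(void* /*handle*/, const char* /*symbol*/) { return nullptr; }

// Nothing was ever opened, so closing reports an error (nonzero).
int stub_dlclose(void* /*handle*/) { return -1; }

// Fixed, human-readable reason for the failures above.
const char* stub_dlerror() {
    return "dynamic loading is not supported in this static build";
}

// Hypothetical caller: treats a failed load as "optional library is absent".
bool try_load_optional_library(const char* name) {
    void* handle = stub_dlopen(name, 0);
    return handle != nullptr;
}

// Small self-check showing the behaviour a caller would observe.
void check_stubs() {
    assert(!try_load_optional_library("libexample.so"));
    assert(stub_dlerror() != nullptr);
}
```

Whether this is enough depends on every caller handling a null handle gracefully, which I have not verified.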

Robert

robert-reed
Thanks for the compliment. We're certainly very excited about the capabilities and possibilities of TBB.

Regarding dynamic loading, the only libraries TBB currently tries to open using FillDynamicLinks are malloc and lib_ittnotify, a library used to communicate with Intel Thread Analysis tools. FillDynamicLinks can fail, and in each case the fallback is innocuous: for malloc, the cache_aligned_allocator just falls back to the generic malloc and free. If loading lib_ittnotify fails, tools like VTune analyzer and Intel Thread Checker won't have the hooks they normally have. Perhaps all you need to do is comment out the dlopen() line and force the function to return false?
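
To make the fallback idea concrete, here is a rough sketch of the pattern, not the actual TBB source. Everything in it is an assumption for illustration: the `LinkEntry` struct, `fill_dynamic_links_stub`, the handler variables, and the library and symbol names passed in. It only shows that when the link-filling function is forced to return false, the allocator handlers end up pointing at the generic malloc and free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical descriptor: a symbol name and where to store its address.
struct LinkEntry {
    const char* name;
    void** target;
};

// Stand-in for a link-filling routine with the dlopen() call removed.
// It never loads anything and always reports failure.
bool fill_dynamic_links_stub(const char* /*library*/,
                             const LinkEntry* /*table*/,
                             std::size_t /*count*/) {
    return false;
}

using alloc_fn = void* (*)(std::size_t);
using free_fn = void (*)(void*);

// Generic fallbacks that wrap the C library allocator.
void* default_allocate(std::size_t bytes) { return std::malloc(bytes); }
void default_deallocate(void* ptr) { std::free(ptr); }

// Handlers that an allocator would call through (illustrative names).
static alloc_fn allocate_handler = nullptr;
static free_fn deallocate_handler = nullptr;

// Try the optional library first; on failure use the generic allocator.
void initialize_allocator_handlers() {
    LinkEntry table[] = {
        {"example_malloc", reinterpret_cast<void**>(&allocate_handler)},
        {"example_free", reinterpret_cast<void**>(&deallocate_handler)},
    };
    if (!fill_dynamic_links_stub("libexample_malloc.so", table, 2)) {
        allocate_handler = &default_allocate;
        deallocate_handler = &default_deallocate;
    }
}
```

After `initialize_allocator_handlers()` runs, calls through `allocate_handler` and `deallocate_handler` go straight to the C library, which is the innocuous fallback described above.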

Just curious: on which massively parallel computer and operating system are you trying to get TBB to work? We're always interested in, and keen to encourage, more TBB porting efforts.

rjharrison
Thanks v. much for the suggestion ... I will give it a shot and report the results back here.

The present priorities are the petascale Cray XT-5 systems at UT, ORNL and elsewhere (100+K cores each). These only support statically linked applications. The IBM BG/P does seemingly support shared libraries but I have no data on scalability.

Robert

Jeff_Hammond1
"The IBM BG/P does seemingly support shared libraries but I have no data on scalability."

Let's just say shared libs are not the preferred way to do things on BG/P. A major issue is that the compute nodes lack local disk, so calling dlopen() from 40K nodes means a filesystem metadata blitzkrieg and takes a while. There is a ramdisk workaround, but it requires sysadmin assistance and the solution is not general. From what I hear, it took a lot of work to get Python working at scale.

Jeff Hammond
Argonne Leadership Computing Facility
jhammond@mcs.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond

