I would like to instruct the compiler and the linker on how to lay out the following code (I am using Visual Studio and the Intel C++ compiler for Windows):
    /* large chunk of code before */
    static uint64_t a;
    static uint64_t b;
    static uint64_t c;
    static uint64_t d;

    static void CALLBACK foo( args ){
        /* short code using a, b, c, d */
    }
    /* large chunk of code after */
Specifically, I would like the .EXE file to place a, b, c, d, and foo() contiguously, in this order, to maximize cache benefit.
The following modifications seem to achieve the goal, but the external driver that invokes foo() does not like the fact that foo() is inside a segment, and thus crashes. Maybe I did not use the pragmas correctly.
    #pragma code_seg(push, stack1, ".my_text")
    __declspec(allocate(".my_text")) static uint64_t a;
    __declspec(allocate(".my_text")) static uint64_t b;
    __declspec(allocate(".my_text")) static uint64_t c;
    __declspec(allocate(".my_text")) static uint64_t d;

    __declspec(code_seg(".my_text")) static void CALLBACK foo( args ){
        /* do something using a, b, c, d */
    }
    #pragma code_seg(pop, stack1)
On the other hand, moving foo() outside the segment works, but then I have no guarantee that foo() is placed right after a, b, c, d in the .EXE:
    #pragma code_seg(push, stack1, ".my_text")
    __declspec(allocate(".my_text")) static uint64_t a;
    __declspec(allocate(".my_text")) static uint64_t b;
    __declspec(allocate(".my_text")) static uint64_t c;
    __declspec(allocate(".my_text")) static uint64_t d;
    #pragma code_seg(pop, stack1)

    static void CALLBACK foo( args ){
        /* do something using a, b, c, d */
    }
Any help is appreciated, thanks.
-Roberto
- Tags:
- CC++
- Development Tools
- General Support
- Intel® C++ Compiler
- Intel® Parallel Studio XE
- Intel® System Studio
- Optimization
- Parallel Computing
- Vectorization
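A note on the data half of this layout question: the four variables can be kept contiguous and in declaration order without any section pragmas by grouping them in one aligned struct. The sketch below is an alternative technique, not what the code in the question does. The names `HotData`, `hot`, `kCacheLine`, and `shares_one_cache_line` are hypothetical, the 64-byte line size is an assumption about the target CPU, and the plain `foo` signature stands in for the real `CALLBACK`. It does not control where the code of `foo` ends up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// Grouping the variables in one struct fixes their relative order,
// and alignas keeps the whole group inside a single cache line.
struct alignas(kCacheLine) HotData {
    std::uint64_t a;
    std::uint64_t b;
    std::uint64_t c;
    std::uint64_t d;
};

static_assert(offsetof(HotData, a) == 0,  "a comes first");
static_assert(offsetof(HotData, b) == 8,  "b follows a");
static_assert(offsetof(HotData, c) == 16, "c follows b");
static_assert(offsetof(HotData, d) == 24, "d follows c");
static_assert(sizeof(HotData) == kCacheLine, "padded to one line");

static HotData hot;  // zero-initialized static storage

// Stand-in for the callback body: touches all four variables.
static std::uint64_t foo(std::uint64_t x) {
    hot.a += x;
    hot.b += x;
    hot.c += x;
    hot.d += x;
    return hot.a + hot.b + hot.c + hot.d;
}

// Reports whether the first byte of a and the last byte of d
// fall in the same kCacheLine-sized block of memory.
bool shares_one_cache_line() {
    const auto first = reinterpret_cast<std::uintptr_t>(&hot.a) / kCacheLine;
    const auto last =
        (reinterpret_cast<std::uintptr_t>(&hot.d) + sizeof(hot.d) - 1) / kCacheLine;
    assert(first <= last);
    return first == last;
}
```

On typical x86 cores the first-level cache is split into separate instruction and data caches, so keeping the four variables together is probably where most of the data-side benefit comes from, whether or not the function body sits next to them.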
Hi Roberto,
One way to do it is to place your static data variables and the function in a section declared with #pragma section.
As per the documentation, the code_seg pragma specifies "the text section (segment) where functions are stored in the object (.obj) file."
Please refer to the code snippet below:
    /* large chunk of code before */
    #pragma section(".my_section",read,write)
    __declspec(allocate(".my_section")) static uint64_t a;
    __declspec(allocate(".my_section")) static uint64_t b;
    __declspec(allocate(".my_section")) static uint64_t c;
    __declspec(allocate(".my_section")) static uint64_t d;

    __declspec(code_seg(".my_section")) static void CALLBACK foo(args) {
        /* do something using a, b, c, d */
    }
    /* large chunk of code after */
Let us know if it helps.
Regards,
Rahul
Thank you Rahul, I will try. Do you think the effect could be different from my second snippet of code? I will let you know asap.
Edit: the external driver crashes, unless I wrap foo into a lambda function, which moves its code outside the section. I suspect that this driver does not like callbacks to be placed in segments or so. Thanks anyway :)
-Roberto
Hi,
I had assumed that the issue was with the code_seg pragma, so I suggested that you try the section pragma instead.
Try the code snippet below (I have allocated the lambda function foo in the section, and it seems to work now) and let us know.
    #include <cstdint>   // uint64_t
    #include <iostream>

    #pragma section(".my_section",read,write)
    __declspec(allocate(".my_section")) static uint64_t a;
    __declspec(allocate(".my_section")) static uint64_t b;
    __declspec(allocate(".my_section")) static uint64_t c;
    __declspec(allocate(".my_section")) static uint64_t d;

    // The lambda object foo is allocated in the same section as a, b, c, d
    __declspec(allocate(".my_section")) static auto foo = [=](int x) -> void {
        std::cout << "Hello " << x << "\n";
        std::cout << &a << " " << &b << " " << &c << " " << &d << "\n";
    };

    int main() {
        std::cout << "Address of foo: " << &foo << "\n";
        foo(5); // prints Hello and addresses
        return 0;
    }
If you want the lambda function foo to capture by reference, change the capture clause from [=] to [&].
Specify the arguments of your choice in place of (int x).
Let us know if it works for you.
Regards,
Rahul
Rahul, it works! However, I apologize, as I did not explain myself well. The variable foo is allocated right after a, b, c, d, as you say, but I wanted foo's body to be there as well, or very close by. I modified your code with the compiler intrinsic _ReturnAddress() to print a code address for the call to foo (strictly, the address foo returns to), and it is quite distant from &foo.

    #include <cstdint>   // uint64_t
    #include <iostream>
    #include <intrin.h>

    #pragma intrinsic(_ReturnAddress)

    #pragma section(".my_section",read,write)
    __declspec(allocate(".my_section")) static uint64_t a;
    __declspec(allocate(".my_section")) static uint64_t b;
    __declspec(allocate(".my_section")) static uint64_t c;
    __declspec(allocate(".my_section")) static uint64_t d;

    __declspec(allocate(".my_section")) static auto foo = [=](int x) -> void {
        // _ReturnAddress() yields the address this function returns to,
        // i.e. a code address at the call site, not a data address
        auto thisAddress = _ReturnAddress();
        std::cout << "... foo's body: " << thisAddress << "\n";
        std::cout << "Hello " << x << "\n";
        std::cout << &a << " " << &b << " " << &c << " " << &d << "\n";
    };

    int main() {
        std::cout << "Address of foo: " << &foo << "\n";
        foo(5); // prints Hello and addresses
        return 0;
    }
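The gap described above follows from what `&foo` denotes: `foo` is a closure object, which is data, while the compiled body of the lambda is a separate function that the compiler emits with the rest of the code. A minimal portable sketch of the distinction is below. The names `calls`, `FooFn`, `FooAddresses`, `foo_addresses`, and `print_foo_addresses` are hypothetical, and the code address shown is that of the function the captureless lambda converts to, which the compiler may implement as a small forwarding stub rather than the body itself.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

static int calls = 0;

// A captureless lambda stored in a static variable: the variable holds
// an (empty) closure object, not the machine code of the body.
static auto foo = [](int x) { calls += x; };

using FooFn = void (*)(int);

struct FooAddresses {
    std::uintptr_t object;  // where the closure object lives (data)
    std::uintptr_t code;    // where the callable code lives
};

FooAddresses foo_addresses() {
    // A captureless lambda converts implicitly to a plain function pointer.
    FooFn fn = foo;
    return { reinterpret_cast<std::uintptr_t>(&foo),
             reinterpret_cast<std::uintptr_t>(fn) };
}

void print_foo_addresses() {
    const FooAddresses addr = foo_addresses();
    // An object and a function never share an address.
    assert(addr.object != addr.code);
    std::cout << std::hex
              << "closure object: 0x" << addr.object << "\n"
              << "callable code:  0x" << addr.code << "\n"
              << std::dec;
}
```

Calling `print_foo_addresses()` from `main` prints the two addresses. Placing the variable `foo` in a named section therefore moves only the closure object; the code address is determined by where the compiler and linker put the generated function.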
A potential issue is that newer operating systems, in an effort to thwart viruses, refuse to run code from memory that is not marked executable. If the section holding foo() is declared with only read and write, calling into it can fault. In your earlier attempts the crash might have been resolved by adding "execute" to the section attributes: #pragma section(".my_section",read,write,execute)
Jim Dempsey
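The suggestion above can be sketched as follows, combining the `#pragma section`, `__declspec(allocate)`, and `__declspec(code_seg)` usage already shown in this thread with the added `execute` attribute. This is a sketch under assumptions, not a verified fix for the driver crash: the macro names `MY_SECTION_DATA` and `MY_SECTION_CODE` and the helper `check_foo` are hypothetical, the plain `foo` signature stands in for the real `CALLBACK`, and on compilers that do not define `_MSC_VER` the macros expand to nothing so the code still builds with the default layout.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER)
// MSVC-compatible compilers: declare the section readable, writable,
// and executable so that both data and code may live in it.
#pragma section(".my_section", read, write, execute)
#define MY_SECTION_DATA __declspec(allocate(".my_section"))
#define MY_SECTION_CODE __declspec(code_seg(".my_section"))
#else
// Other compilers: no section control in this sketch.
#define MY_SECTION_DATA
#define MY_SECTION_CODE
#endif

MY_SECTION_DATA static std::uint64_t a;
MY_SECTION_DATA static std::uint64_t b;
MY_SECTION_DATA static std::uint64_t c;
MY_SECTION_DATA static std::uint64_t d;

// Stand-in for the callback: updates and reads all four variables.
MY_SECTION_CODE static std::uint64_t foo(std::uint64_t x) {
    a += x;
    b += 2 * x;
    c += 3 * x;
    d += 4 * x;
    return a + b + c + d;
}

// Hypothetical self-check: resets the data and calls foo once.
void check_foo() {
    a = b = c = d = 0;
    const std::uint64_t sum = foo(1);
    assert(sum == 10);  // 1 + 2 + 3 + 4
    (void)sum;          // silence unused warning when NDEBUG is defined
}
```

Two caveats on this design: a section that is both writable and executable weakens the protection the execute attribute exists to provide, and nothing in this sketch guarantees the order in which the linker places `a`, `b`, `c`, `d`, and `foo` within the section.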
I see this point, it makes sense if it really helps... thank you
-Roberto
That's a good point made by Jim. Adding the "execute" attribute to the section helps (in the earlier code snippet).
--Rahul
Actually the address offset does not change, but it could make a difference at runtime in terms of cache performance. Thank you.
-R
Hi,
Yes, certainly. Let us know if we can close the thread.
--Rahul
Sure, we can close the thread, thank you all!
-Roberto
You're welcome, Roberto.
Thanks for the confirmation.
