Are they incompatible? My program compiles, but crashes at run-time when the /Qparallel switch is thrown.
Perhaps I should clarify what I mean by incompatible. Is this code legal?
    #pragma omp parallel for
    for (int i = 0; i < num_iter; i++) {
        __asm
        {
            // inline assembly block
        }
    }
If by "legal" you mean "defined by the OpenMP standard," or even "compatible with Microsoft x64 C," evidently not. You probably didn't mean that. At the very least, however, you would need some definition of private data, or shared arrays that can be partitioned by the index i, as well as a compiler that supports inline asm in your style.
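To illustrate the point about private data and shared arrays partitioned by index, here is a minimal sketch in plain C++ with no inline assembly. The names `transform` and `scale_and_offset` are hypothetical stand-ins, not anything from the original program. Each iteration touches only element i of the shared arrays, and the temporary is declared inside the loop body, so every thread gets its own copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-iteration work (not from the thread).
static int scale_and_offset(int x) { return 2 * x + 1; }

// Iteration i reads in[i] and writes out[i] only, so the shared arrays
// are partitioned by the loop index and no two threads touch the same slot.
void transform(const std::vector<int>& in, std::vector<int>& out) {
    assert(in.size() == out.size());
    const int n = static_cast<int>(in.size());
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int tmp = in[i];               // declared inside the loop: private per thread
        out[i] = scale_and_offset(tmp);
    }
}
```

If the compiler is not run with OpenMP enabled, the pragma is ignored and the loop runs serially with the same result, which makes it a convenient baseline to compare against.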
If I use code segment one, the program runs fine. If I use code segment two, the program crashes. This seems like really odd behavior. For specifics, I am using the Intel C++ 10.1.022 compiler.
Code Segment 1:
    void asm_func() {
        __asm
        {
        }
    }
    #pragma omp parallel for
    for (int i = 0; i < num_iter; i++) {
        asm_func();
    }
Code Segment 2:
    #pragma omp parallel for
    for (int i = 0; i < num_iter; i++) {
        __asm
        {
        }
    }
Each thread runs with its own stack space. Your asm code may have been written under the assumption that the stack outside the for loop (the scope of the OpenMP section) is the same as the stack inside it; for example, the assumption that EBP points to the stack frame of the code outside the parallel loop. When you wrote the code in the style of asm_func();, I would guess it was more along the lines of asm_func(arg1[,arg2[,arg3[...]]]);, meaning the argument mapping required no assumptions about the stack frame (the call copies the arguments to a new frame).
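Consider experimenting with
    #pragma omp parallel for
    for (int i = 0; i < num_iter; i++)
    {
        // Copy the outer values into locals that live on this thread's own stack
        int IntArg = OuterIntArg;
        struct SomeStruct* pSomeStruct = &OuterSomeStruct;
        __asm
        {
            mov reg,IntArg   // reference the loop-local copy
            ...
        }
    }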
Also, walking through the disassembly might help you remove unnecessary statements.
Jim Dempsey
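The function-call pattern described above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `asm_func` here takes arguments (the original post showed it with none), the `bias` field of `SomeStruct` and the `run` wrapper are hypothetical, and the arithmetic stands in for whatever the `__asm` block would do. The point is that everything the body needs arrives as an argument, so the callee makes no assumption about the frame of the code outside the parallel loop.

```cpp
#include <cassert>
#include <vector>

struct SomeStruct { int bias; };  // hypothetical layout, for illustration only

// Stand-in for a function whose body would hold the __asm block.
// All inputs arrive as arguments, which the call places in the callee's
// own frame on the calling thread's stack.
static int asm_func(int int_arg, const SomeStruct* p, int i) {
    assert(p != nullptr);
    return int_arg * i + p->bias;
}

void run(int outer_int_arg, const SomeStruct& outer, std::vector<int>& results) {
    const int num_iter = static_cast<int>(results.size());
    const SomeStruct* p_outer = &outer;
    #pragma omp parallel for
    for (int i = 0; i < num_iter; i++) {
        // Each thread passes the outer values explicitly and writes only slot i.
        results[i] = asm_func(outer_int_arg, p_outer, i);
    }
}
```

Calling `run(3, s, r)` with `s.bias` set to 5 and a four-element `r` fills it with 5, 8, 11, 14 whether or not OpenMP is enabled.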