Hello, I've got several huge loops structured as follows:
for (unsigned int k = 1; k < ns_1; k++)
{
    for (unsigned int j = 1; j < ny_1; j++)
    {
        for (unsigned int i = 0; i < nx_1; i++)
        {
            *_C(UPtr) = quat_dtDivdx * (
                *_C(u_Ptr) + *_R(u_Ptr) + *_C(uPtr) + *_R(uPtr));
            THROW_COURANT(*_C(UPtr)); UPtr++;
            uPtr++; u_Ptr++;
        }
        uPtr++; u_Ptr++;
    }
    uPtr += nx2; u_Ptr += nx2;
}
Here _C( ) and _R( ) are macros for the numerical stencil: they select the central point and the right-hand point, respectively. The Ptr variables are sliders that move over one-dimensional arrays. So this is a very common kind of loop for a computational algorithm.
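For readers who have not seen this style, here is a minimal sketch of the slider-plus-macro pattern. The names CENTER, RIGHT and sum_with_right are hypothetical stand-ins invented for illustration; the expansions (p) and (p + 1) are taken from the preprocessed source posted later in the thread, and the workload is simplified to a single input array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the thread's _C( ) and _R( ) macros:
// the point under the slider and its right-hand neighbor.
#define CENTER(p) (p)
#define RIGHT(p)  ((p) + 1)

// Adds each element to its right-hand neighbor while walking two pointer
// "sliders" along one-dimensional arrays. The result has in.size() - 1 values.
std::vector<double> sum_with_right(const std::vector<double>& in)
{
    assert(in.size() >= 2);
    std::vector<double> out(in.size() - 1);
    const double* src = in.data();
    double* dst = out.data();
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        *CENTER(dst) = *CENTER(src) + *RIGHT(src);
        ++src; ++dst;   // advance both sliders by one point
    }
    return out;
}
```

Calling sum_with_right on a three-element array yields two sums, one per pair of neighbors.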
Say I'd like to add OpenMP support here. I do the following:
#ifdef _OPENMP
#pragma omp parallel for shared(UPtr, uPtr, u_Ptr)
for (int k = 0; k < ns_2; k++)
{
    // Thread-localize data sliders.
    double
        *loc_UPtr = UPtr + k * nx_1 * ny_2,
        *loc_uPtr = uPtr + k * np,
        *loc_u_Ptr = u_Ptr + k * np;
    // Redefine data sliders.
#define UPtr loc_UPtr
#define uPtr loc_uPtr
#define u_Ptr loc_u_Ptr
#else
for (unsigned int k = 1; k < ns_1; k++)
{
#endif
    for (unsigned int j = 1; j < ny_1; j++)
    {
        for (unsigned int i = 0; i < nx_1; i++)
        {
            *_C(UPtr) = quat_dtDivdx * (
                *_C(u_Ptr) + *_R(u_Ptr) + *_C(uPtr) + *_R(uPtr));
            THROW_COURANT(*_C(UPtr)); UPtr++;
            uPtr++; u_Ptr++;
        }
        uPtr++; u_Ptr++;
    }
    uPtr += nx2; u_Ptr += nx2;
#ifdef _OPENMP
    // Redefine data sliders.
#undef UPtr
#undef uPtr
#undef u_Ptr
#endif
}
This simple idea came from looking at basic OpenMP examples:
1) put the pragma on the outer loop
2) for every slider, create an independent thread-local copy using preprocessor definitions
OK, now let me ask the question: why does the threading extension described above bring absolutely NO benefit on a dual-core machine? I mean, there is no speedup; the timings (I use the clock() function) are almost equal. However, in Task Manager I can see that with _OPENMP both cores are kept busy by the program's process. What is the reason?
Thanks.
From your description, I'm not certain if you have possible dependency problems, where one thread uses data which are updated by the other.
Hello, Tim,
Thanks for the reply.
> From your description, I'm not certain if you have possible dependency problems, where one thread uses data which are updated by the other.
I suppose my threads are not data-dependent. The general formula is U = F(u, u_), where u and u_ are read-only and U does not depend on itself. To make it clear, let me provide the preprocessed source:
double
    *UPtr = this->get_TopLevel()->get_Values(),
    *u_Ptr = uFlow->levels[uFlow->levelsCount - 1]->get_Values() +
        nx + np,
    *uPtr = uFlow->get_TopLevel()->get_Values() + nx + np;
#pragma omp parallel for shared(UPtr, uPtr, u_Ptr)
for (int k = 0; k < ns_2; k++)
{
    double
        *loc_UPtr = UPtr + k * nx_1 * ny_2,
        *loc_uPtr = uPtr + k * np,
        *loc_u_Ptr = u_Ptr + k * np;
    for (unsigned int j = 1; j < ny_1; j++)
    {
        for (unsigned int i = 0; i < nx_1; i++)
        {
            *((loc_UPtr)) = quat_dtDivdx * (
                *((loc_u_Ptr)) + *((loc_u_Ptr + 1)) + *((loc_uPtr)) + *((loc_uPtr + 1)));
            if (abs(*((loc_UPtr))) > 1e0) throw *((loc_UPtr));; loc_UPtr++;
            loc_uPtr++; loc_u_Ptr++;
        }
        loc_uPtr++; loc_u_Ptr++;
    }
    loc_uPtr += nx2; loc_u_Ptr += nx2;
}
}
So here, in the parallel version, I'm trying to give each k-iteration its own independent copies of the sliders (the names starting with loc_) together with the corresponding offsets.
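One way to check that the per-k offsets really make the iterations independent is to compare the parallel result against a serial run on a small grid. The sketch below is only an illustration under assumptions the thread does not confirm: nx_1 = nx - 1, ny_2 = ny - 2, ns_2 = ns - 2, np = nx * ny and nx2 = 2 * nx. The Grid struct and the function names are hypothetical, and the Courant check is left out. Without an OpenMP compiler flag the pragma is ignored and both functions run serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: inputs hold ns slabs of ny rows of nx values; the output
// holds (ns - 2) slabs of (ny - 2) rows of (nx - 1) values.
struct Grid { int nx, ny, ns; };

// Serial reference: the sliders are advanced exactly as in the original loop.
std::vector<double> kernel_serial(const Grid& g, const std::vector<double>& u,
                                  const std::vector<double>& u_prev, double coef)
{
    assert(g.nx >= 2 && g.ny >= 3 && g.ns >= 3);
    const int nx_1 = g.nx - 1, ny_2 = g.ny - 2, ns_2 = g.ns - 2, np = g.nx * g.ny;
    std::vector<double> U(static_cast<std::size_t>(nx_1) * ny_2 * ns_2);
    double* UPtr = U.data();
    const double* uPtr = u.data() + g.nx + np;
    const double* u_Ptr = u_prev.data() + g.nx + np;
    for (int k = 0; k < ns_2; k++)
    {
        for (int j = 0; j < ny_2; j++)
        {
            for (int i = 0; i < nx_1; i++)
            {
                *UPtr = coef * (*u_Ptr + *(u_Ptr + 1) + *uPtr + *(uPtr + 1));
                UPtr++; uPtr++; u_Ptr++;
            }
            uPtr++; u_Ptr++;                  // skip the last column of the row
        }
        uPtr += 2 * g.nx; u_Ptr += 2 * g.nx;  // skip the two boundary rows
    }
    return U;
}

// Parallel version: every k-iteration derives its own sliders from k,
// so no slider state is carried from one iteration to the next.
std::vector<double> kernel_parallel(const Grid& g, const std::vector<double>& u,
                                    const std::vector<double>& u_prev, double coef)
{
    assert(g.nx >= 2 && g.ny >= 3 && g.ns >= 3);
    const int nx_1 = g.nx - 1, ny_2 = g.ny - 2, ns_2 = g.ns - 2, np = g.nx * g.ny;
    std::vector<double> U(static_cast<std::size_t>(nx_1) * ny_2 * ns_2);
    double* const UBase = U.data();
    const double* const uBase = u.data() + g.nx + np;
    const double* const u_Base = u_prev.data() + g.nx + np;
    #pragma omp parallel for
    for (int k = 0; k < ns_2; k++)
    {
        double* loc_U = UBase + static_cast<std::ptrdiff_t>(k) * nx_1 * ny_2;
        const double* loc_u = uBase + static_cast<std::ptrdiff_t>(k) * np;
        const double* loc_u_ = u_Base + static_cast<std::ptrdiff_t>(k) * np;
        for (int j = 0; j < ny_2; j++)
        {
            for (int i = 0; i < nx_1; i++)
            {
                *loc_U = coef * (*loc_u_ + *(loc_u_ + 1) + *loc_u + *(loc_u + 1));
                loc_U++; loc_u++; loc_u_++;
            }
            loc_u++; loc_u_++;
        }
    }
    return U;
}
```

Each k-iteration writes only to its own slab of U and only reads u and u_prev, so under these layout assumptions the two functions should return identical arrays.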
Now, about timing. When I enclose the loop above in clock() calls, the result varies from 0.0149 to 0.016 sec, the same for the serial and parallel versions. If I change the clock() calls to omp_get_wtime(), the result varies from 0.0156 to 0.018 sec for serial and from 0.0110 to 0.118 sec for parallel. These timings differ a little from test to test; anyway, with omp_get_wtime() the parallel version seems to be 30% faster than the serial one.
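A note on the measurement itself: the C standard describes clock() as reporting processor time used by the program, while omp_get_wtime() returns elapsed wall-clock time, so the two need not agree for a multithreaded run. Intervals of around 15 ms may also be close to the timer granularity on some systems. A hedged way to get steadier numbers is to repeat the kernel many times and divide by the repeat count. In the sketch below, wall_seconds, average_seconds and parallel_sum are hypothetical names, and the summation is just a stand-in workload, not the loop from this thread.

```cpp
#include <cassert>
#include <chrono>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Wall-clock time in seconds: omp_get_wtime() when OpenMP is enabled,
// std::chrono::steady_clock otherwise.
inline double wall_seconds()
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using steady = std::chrono::steady_clock;
    return std::chrono::duration<double>(steady::now().time_since_epoch()).count();
#endif
}

// Runs `work` `reps` times and returns the average wall-clock seconds per run.
template <typename Work>
double average_seconds(Work work, int reps)
{
    assert(reps > 0);
    const double start = wall_seconds();
    for (int r = 0; r < reps; r++) work();
    return (wall_seconds() - start) / reps;
}

// Stand-in workload (an assumption for illustration): a parallel sum.
double parallel_sum(const std::vector<double>& v)
{
    double sum = 0.0;
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; i++) sum += v[i];
    return sum;
}
```

Timing the same callable once with the OpenMP flag and once without it gives a serial and a parallel average that are measured the same way.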