Hi,
I used parallel_for in my Visual C++ program. When I ran the program, the Windows Task Manager showed that the total number of threads increased, while the CPU usage did not change much (still about 15%).
Do you have any suggestions as to why this happens? My PC has an Intel Core i7 CPU.
Thanks,
Ying
4 Replies
Most likely something is wrong with your program.
What exactly? There are far too many possibilities to enumerate them all.
You might try cutting your example down to something that you can post as an attachment in this forum, and see if anyone has ideas. Often when I'm cutting down a problematic example, the root problem dawns on me before I'm even through cutting.
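One way such a cut-down test could look is sketched below. It is only an illustration under stated assumptions: `fakeCalc` is a hypothetical stand-in for the real `Calc` call, `runSlices` and `timeRun` are made-up helper names, and plain `std::thread` is used instead of TBB so that the test has no external dependencies. If the stand-in speeds up with more threads but the real call does not, the called library becomes the main suspect.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real Calc(j, index): pure CPU work, no shared state.
double fakeCalc(int j, int index) {
    double acc = 0.0;
    for (int k = 1; k <= 20000; ++k)
        acc += std::sin(0.001 * k * (j + 1)) / (index + 1);
    return acc;
}

// Fill result[0..n) with nThreads workers, each handling one contiguous slice.
void runSlices(std::vector<double>& result, int index, unsigned nThreads) {
    assert(nThreads > 0);
    const long long n = static_cast<long long>(result.size());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nThreads; ++t) {
        const int begin = static_cast<int>(n * t / nThreads);
        const int end = static_cast<int>(n * (t + 1) / nThreads);
        workers.emplace_back([&result, index, begin, end] {
            for (int j = begin; j < end; ++j)
                result[j] = fakeCalc(j, index);
        });
    }
    for (auto& w : workers) w.join();
}

// Wall-clock seconds taken by one runSlices call.
double timeRun(std::vector<double>& result, int index, unsigned nThreads) {
    const auto start = std::chrono::steady_clock::now();
    runSlices(result, index, nThreads);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Comparing `timeRun(result, 0, 1)` with `timeRun(result, 0, std::thread::hardware_concurrency())` on a vector of a few thousand elements gives a rough idea of whether the loop itself can scale; swapping the real call in for `fakeCalc` then shows whether the library is what holds it back.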
Thanks.
My code is similar to the following. The structure is simple, but the function pParent->Calc is complicated and calls a commercial library that we bought without source code. Any suggestions are welcome and appreciated.
class ApplyCalc {
    Parent *pParent;
    int index;
    double *result;
public:
    // Each task fills its own slice [r.begin(), r.end()) of result.
    void operator()( const blocked_range<int>& r ) const {
        for (int j = r.begin(); j != r.end(); ++j) {
            result[j] = pParent->Calc(j, index);
        }
    }
    ApplyCalc(Parent *p, int i, double *r) :
        pParent(p), index(i), result(r) { }
};
void calcResult(double **allResult) {
    for (i = 0; i < ...; ++i) {  // loop bound missing from the original post
        Parent *pParent;
        int index;
        pParent = getParentFromChildID(pAllChildren->getID(),
                                       pAllParents,
                                       numParents,
                                       &index);
        double *result = new double[NUMRUN];
        // Only the inner loop over NUMRUN runs in parallel here.
        parallel_for(blocked_range<int>(0, NUMRUN),
                     ApplyCalc(pParent, index, result));
        for (j = 0; j < NUMRUN; ++j) {
            allResult[i][j] = result[j];
        }
        delete[] result;
    }
}
Try to apply parallel_for to the *outer* loop.
You call pParent->Calc() in parallel. Even if it is thread-safe, it most likely uses mutexes, which kills scalability.
Parallelizing the outer loop is generally preferable. It also increases granularity and locality.
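A minimal sketch of what moving the parallelism to the outer loop could look like follows. The names are assumptions made for illustration: `calcStandIn` is a hypothetical replacement for `pParent->Calc`, `calcResultOuter` is a made-up function name, and the row index is reused as `index` instead of looking it up. Plain `std::thread` stands in for TBB to keep the example dependency-free; with TBB the same split would be a `parallel_for` over a `blocked_range<int>` covering the outer index.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for pParent->Calc(j, index).
double calcStandIn(int j, int index) {
    return 0.5 * j + index;
}

// Split the OUTER index i across workers; each worker runs the whole
// inner j loop serially for the rows it owns and writes them in place.
void calcResultOuter(std::vector<std::vector<double>>& allResult,
                     unsigned nThreads) {
    assert(nThreads > 0);
    const long long numRows = static_cast<long long>(allResult.size());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nThreads; ++t) {
        const int begin = static_cast<int>(numRows * t / nThreads);
        const int end = static_cast<int>(numRows * (t + 1) / nThreads);
        workers.emplace_back([&allResult, begin, end] {
            for (int i = begin; i < end; ++i) {
                std::vector<double>& row = allResult[i];
                for (int j = 0; j < static_cast<int>(row.size()); ++j)
                    row[j] = calcStandIn(j, i);  // row index doubles as `index`
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

Each worker now handles whole rows, so there are fewer and larger tasks, and no temporary buffer has to be allocated and copied per outer iteration. This only pays off if the real Calc can actually run concurrently; if it serializes internally on a lock, the outer-loop version will hit the same limit.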