Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Quad-Core - How are two bottlenecks possible?

Is it possible to have a process that is CPU bound on a single thread, but still completes more slowly when a second instance is run?

Example 1:
Run process A: it takes 14 seconds to complete, and one core is maxed out the entire time.

Example 2:
Run two instances of process A at the same time: they take 23 seconds to complete, and two cores are maxed out the entire time.

So how can example 2 take much longer?

If the processes were IO bound, then how can the 2 CPU cores be maxed out the whole time?

I understand there are many things that prevent perfect 2x scaling, but I don't understand how the 2 CPU cores can be maxed if there were something else preventing better scaling.

Any ideas appreciated.
3 Replies
Sure, that's possible, depending on your code. Think of the case where your program has locks that both CPUs are competing for. Or maybe your program's bottleneck is really memory bandwidth, in which case using two CPUs will not help at all. Or maybe your program has bad locality and your two cores are constantly invalidating each other's caches. There are many possible reasons for the behaviour you describe, and in all of these cases your CPUs will appear to be maxed out, because a core that is stalled waiting on memory or spinning on a lock still counts as busy in the operating system's utilization figures.
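
One rough way to check whether a shared resource such as memory bandwidth is behind this kind of slowdown is to time a deliberately memory-heavy workload run once, then run as two simultaneous instances. The sketch below is only an illustration under stated assumptions: `memory_bound_work` and `time_instances` are hypothetical names, the 16 MiB buffer is assumed to be larger than the last-level cache, and the measured time includes process start-up overhead. It is not a substitute for a real profiler.

```python
import multiprocessing as mp
import time


def memory_bound_work(n_bytes=16 * 1024 * 1024, passes=4):
    """Copy a large buffer repeatedly and return the number of bytes copied.

    Assumption: n_bytes exceeds the last-level cache, so each pass
    streams data through main memory instead of staying in cache.
    """
    buf = bytearray(n_bytes)
    total = 0
    for _ in range(passes):
        copy = bytes(buf)  # reads n_bytes and writes n_bytes
        total += len(copy)
    return total


def time_instances(count, work=memory_bound_work):
    """Return wall-clock seconds to run `count` copies of `work` at once."""
    procs = [mp.Process(target=work) for _ in range(count)]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    one = time_instances(1)
    two = time_instances(2)
    # A ratio near 1.0 suggests the instances do not interfere;
    # a ratio well above 1.0 suggests they compete for something shared.
    print(f"1 instance: {one:.2f}s, 2 instances: {two:.2f}s, ratio: {two / one:.2f}")
```

If the ratio stays close to 1.0 for this workload but not for the real program, memory bandwidth is probably not the culprit and lock contention or cache effects are worth a closer look.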
Hi Michael,

Thank you for your response; it was very helpful.

I now need to look for the proper techniques to measure lock contention, memory bandwidth usage, and other possibilities that are not easy to detect just by using SysInternals utilities.

I have no idea how to check on these yet, but that's what Google is for so I'm going to find out.

Try Intel's VTune. They have a trial version.

They also have other utility programs.

By examining where your program spends its execution time, you may shed some light on the problem. VTune will detect a fair number of your problems (once you get the hang of using it).

I do not use VTune myself, as I have an AMD-based system; for that I use CodeAnalyst, which has similar functionality.

Jim Dempsey