taskset -c 0-9 testApp -threads 10
taskset -c 10-19 testApp -threads 10
What I'm seeing is that the testApp instance running on cores 0-9 takes about 10 times longer than the instance running on cores 10-19.
Using perf stat -ddd, I saw that some of the cache loads (L1, LLC) were a lot slower for the testApp on cores 0-9, and as a result its instructions per cycle were a lot lower.
Also the slowness only starts to appear with 4 or more threads - at least it's only noticeable with the timing metrics I'm using.
We've also seen that rebooting the node can clear the problem, so it seems to be intermittent. We have three other identical systems and this has been observed on one of the others as well.
Any ideas what could cause this or what I could look at to determine what's happening?
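One thing that might be worth ruling out is whether the two core ranges differ in socket placement or clock frequency at the time of the slowdown. The following is only a rough sketch, assuming a Linux system that exposes the usual sysfs entries under /sys/devices/system/cpu (the cpufreq files may be absent, in which case the value is reported as None). The helper names parse_cpu_list and describe_cpus are made up for illustration and are not part of any tool.

```python
from pathlib import Path


def parse_cpu_list(spec):
    """Expand a taskset-style CPU list such as '0-9' or '0-3,8' into ints."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def describe_cpus(spec, root="/sys/devices/system/cpu"):
    """Collect socket id and frequency info for every CPU in the list."""
    rows = []
    for cpu in parse_cpu_list(spec):
        base = f"{root}/cpu{cpu}"
        rows.append({
            "cpu": cpu,
            "package": read_sysfs(f"{base}/topology/physical_package_id"),
            "cur_khz": read_sysfs(f"{base}/cpufreq/scaling_cur_freq"),
            "max_khz": read_sysfs(f"{base}/cpufreq/scaling_max_freq"),
            "governor": read_sysfs(f"{base}/cpufreq/scaling_governor"),
        })
    return rows


if __name__ == "__main__":
    # The two ranges are the ones passed to taskset in the original post.
    for spec in ("0-9", "10-19"):
        print(f"CPUs {spec}:")
        for row in describe_cpus(spec):
            print(row)
```

Running this while both testApp instances are active, and again after a reboot when the problem is gone, would show whether cores 0-9 report a different package id or a lower current frequency than cores 10-19. It does not explain the cache behaviour by itself, but it can narrow down where to look next.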
Jaspers95, thank you for posting in the Intel® Communities Support.
In reference to this scenario and in order for us to provide the most accurate assistance on this matter, I will move your thread to the proper department so they can further assist you with this topic.
Intel Customer Support Technician
A Contingent Worker at Intel
Thank you for contacting Intel Embedded Community.
We suggest confirming this situation with the Intel® Processor Diagnostic Tool. You can find it on the following website:
We would also like to ask the following questions to get a better idea of the situation:
Could you please provide full details of the tool that was used to identify the reported issue, as well as the Linux distribution and its flavor?
Could you please clarify if the affected design has been developed by you or by a third-party vendor? Could you please let us know the part number, model, and name of the manufacturer if it is a third-party design?
We are waiting for your answer to these questions.