I have been benchmarking a cluster with two MIC cards per node and noticed unusual behavior. Performance has always varied somewhat between nodes, but the second MIC card has never achieved the expected 760 GFLOPS for the DGEMM benchmark. All runs were done in native mode and separately for each card. I have attached a plot that shows the average performance for a subset of nodes. According to the system administrator, all nodes have the same configuration and settings. Can anybody explain this behavior?
Unfortunately, we do not have VTune.
Did you try to swap the PCIe slots of the two cards?
Did you try running the command "/opt/intel/mic/bin/micinfo" to check for any differences between the two coprocessors?
I completed more tests and the issue seems to be related to the number of threads used by the application. Using the maximum 240 threads resulted in inconsistent and often poor performance. If I instead used 236 threads and avoided the OS core (logical threads 0, 237, 238, and 239), then I consistently achieved 760 GFLOPS on both cards. I had not expected this behavior because I am running in native mode and not accessing any files or generating any output files. I suppose the solution is to not use the OS core.
I still consider it unusual that the two MIC cards would behave differently. During my testing, MIC card 1 never achieved anything approaching the expected 760 GFLOPS on any of the 45 nodes.
Yes, I had already verified that all cards are identical using the micinfo utility. I also ran the SHOC benchmark and did not notice any issues.
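For readers who want to reproduce the 236-thread setup described above, here is a minimal sketch of one way to pin threads away from the OS core. It assumes the logical CPU numbering reported in this thread (the OS core owns logical CPUs 0, 237, 238, and 239 on a 240-thread card) and builds an explicit `KMP_AFFINITY` proclist string for the Intel OpenMP runtime. The helper names are hypothetical, and the numbering should be checked on your own cards before relying on it.

```python
def usable_logical_cpus(total_cpus=240, os_cpus=(0, 237, 238, 239)):
    """Return logical CPU ids that do not belong to the OS core.

    The defaults follow the numbering reported in this thread; they are
    an assumption for any other card or software stack.
    """
    reserved = set(os_cpus)
    return [cpu for cpu in range(total_cpus) if cpu not in reserved]


def compress_ranges(cpus):
    """Collapse a list of ids into range notation, e.g. [1,2,3,7] -> '1-3,7'."""
    ranges = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def explicit_affinity(cpus):
    """Build a KMP_AFFINITY value that binds threads to the given CPUs."""
    return f"granularity=fine,proclist=[{compress_ranges(cpus)}],explicit"


if __name__ == "__main__":
    cpus = usable_logical_cpus()
    print(f"export OMP_NUM_THREADS={len(cpus)}")
    print(f'export KMP_AFFINITY="{explicit_affinity(cpus)}"')
```

The printed lines can be pasted into the shell on the card before launching the benchmark. Simply lowering the thread count without pinning may still leave threads on the OS core, depending on the affinity setting in use.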
I suggest you run on MIC1 first and then on MIC0, to see whether the readings change.
I had already tried running first on MIC card 1 and nothing changed.
As I said, the solution seems to be to never run anything on the so-called OS core.
any update?
Can you try it with
KMP_AFFINITY=balanced
If I did not set this environment variable, performance never reached 700+ GFLOPS on my 5000-series cards.
Also, which version of the MPSS stack are you using? Currently only two versions are still supported by Intel:
- 3.4.5
- 3.5.2
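To compare the settings discussed in this thread side by side (240 versus 236 threads, with and without `KMP_AFFINITY=balanced`), a small sweep can help. The sketch below only generates the command lines. The binary name `./dgemm_bench` is a placeholder for whatever DGEMM benchmark you run natively on the card, and the function name is hypothetical.

```python
from itertools import product


def sweep_commands(binary="./dgemm_bench",
                   thread_counts=(240, 236),
                   affinities=(None, "balanced")):
    """Return one shell command line per (thread count, affinity) pair.

    `binary` is a placeholder name; None means KMP_AFFINITY is left unset.
    """
    commands = []
    for threads, affinity in product(thread_counts, affinities):
        env = [f"OMP_NUM_THREADS={threads}"]
        if affinity is not None:
            env.append(f"KMP_AFFINITY={affinity}")
        commands.append(" ".join(env + [binary]))
    return commands


if __name__ == "__main__":
    for command in sweep_commands():
        print(command)
```

Running each generated line several times on both cards and recording the GFLOPS should show whether the gap comes from the thread count, the affinity setting, or both.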
Kumar, Mithun wrote: I currently use an HD 5850 setup in CrossFire. I've never had issues with performance or driver problems, but there are some underlying issues with multi-card setups.
Very good explanation, thanks.