I recently installed a 7120P in one of my servers. It seems to be working fine, but I noticed that it always runs at the lowest available frequency. Even when I run the benchmark application that comes with the Intel compiler, the frequency stays at 0.57 GHz.
Any idea what could cause this?
Here is some information about my machine:
System Info
HOST OS : Linux
OS Version : 2.6.32-504.8.1.el6.x86_64
Driver Version : 3.4.2-1
MPSS Version : 3.4.2
Host Physical Memory : 65939 MB
Device No: 0, Device Name: mic0
------------------------------------------------------------------------------------
Version
Flash Version : 2.1.02.0390
SMC Firmware Version : 1.16.5078
SMC Boot Loader Version : 1.8.4326
uOS Version : 2.6.38.8+mpss3.4.2
Device Serial Number : ADKC44700618
Board
Vendor ID : 0x8086
Device ID : 0x225c
Subsystem ID : 0x7d95
Coprocessor Stepping ID : 2
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : C0
Board SKU : C0PRQ-7120 P/A/X/D
ECC Mode : Enabled
SMC HW Revision : Product 300W Passive CS
Cores
Total No of Active Cores : 61
Voltage : 995000 uV
Frequency : 571428 kHz
Thermal
Fan Speed Control : N/A
Fan RPM : N/A
Fan PWM : N/A
Die Temp : 43 C
GDDR
GDDR Vendor : Samsung
GDDR Version : 0x6
GDDR Density : 4096 Mb
GDDR Size : 15872 MB
GDDR Technology : GDDR5
GDDR Speed : 5.500000 GT/s
GDDR Frequency : 2750000 kHz
GDDR Voltage : 1501000 uV
------------------------------------------------------------------------------------
micsmc -f
mic0 (freq):
Core Frequency: .......... 0.57 GHz
Total Power: ............. 100.00 Watts
Low Power Limit: ......... 315.00 Watts
High Power Limit: ........ 375.00 Watts
Physical Power Limit: .... 395.00 Watts
------------------------------------------------------------------------------------
Hi Hongwei,
Can you please verify whether all the cores are up and running while the benchmark executes? You can check this by taking a snapshot of the "micsmc" Core Histogram View while the benchmark is running. The power states snapshot you attached looks normal.
Thanks
Hello Sunny,
It seems that all cores are running. I ran two benchmarks on this card: one is the Linpack benchmark that comes with icc, and the other is helloflops3 from the book Intel Xeon Phi Coprocessor High-Performance Programming. I attached snapshots for both benchmarks. The Linpack benchmark gives me up to 400+ GFLOPS, which is about one sixth of the theoretical value. helloflops3 is supposed to give 2+ TFLOPS according to the book, but it gives me 100+ GFLOPS in this test. I have seen the 2+ TFLOPS result with the same code on a 5110P before, on a different machine.
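For readers who want to sanity-check these numbers, here is a rough sketch of the usual peak-throughput arithmetic (cores x clock x floating-point operations per cycle per core). The core count and observed clock come from the micinfo output above. The nominal clock and the per-cycle values are assumptions based on the general architecture of this coprocessor family and are not stated anywhere in this thread.

```python
def peak_gflops(cores: int, freq_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS: cores x clock (GHz) x flops per cycle per core."""
    return cores * freq_ghz * flops_per_cycle


CORES = 61                # "Total No of Active Cores" in the micinfo output
OBSERVED_GHZ = 0.571428   # "Frequency : 571428 kHz" in the micinfo output
NOMINAL_GHZ = 1.238       # ASSUMPTION: rated 7120P clock, not given in the thread
DP_FLOPS_PER_CYCLE = 16   # ASSUMPTION: 8 double-precision lanes x 2 (fused multiply-add)
SP_FLOPS_PER_CYCLE = 32   # ASSUMPTION: 16 single-precision lanes x 2 (fused multiply-add)

if __name__ == "__main__":
    for label, ghz in (("observed", OBSERVED_GHZ), ("nominal (assumed)", NOMINAL_GHZ)):
        dp = peak_gflops(CORES, ghz, DP_FLOPS_PER_CYCLE)
        sp = peak_gflops(CORES, ghz, SP_FLOPS_PER_CYCLE)
        print(f"{label:>18}: DP peak {dp:7.1f} GFLOPS, SP peak {sp:7.1f} GFLOPS")
```

Under these assumptions the double-precision ceiling at the observed 0.57 GHz clock is about 558 GFLOPS, against about 1208 GFLOPS at the assumed nominal clock, so benchmark results should be compared against the ceiling for the clock the card is actually running at.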
Hi
Can you please share the output of the following commands:
micsmc -f
micsmc --tthrottle mic0
micsmc --pthrottle mic0
The coprocessor frequency is forced down to the minimum supported value (~600 MHz) when the thermal sensors on the coprocessor detect a temperature above 105 °C. Once the temperature drops below 105 °C, the coprocessor should return to its normal high frequency. Similar behavior is also seen when the power level (PL0) is reached.
Thanks
Hi Sunny,
Here is the information when the device is idle:
----------------------------------------------------------------
micsmc -f
mic0 (freq):
Core Frequency: .......... 0.57 GHz
Total Power: ............. 98.00 Watts
Low Power Limit: ......... 315.00 Watts
High Power Limit: ........ 375.00 Watts
Physical Power Limit: .... 395.00 Watts
----------------------------------------------------------------
micsmc --tthrottle mic0
mic0 (tthrottle):
Throttle state: ......... inactive
Current throttle time: .. 0 msec
Throttle event count: ... 0
Total throttle time: .... 0 msec
----------------------------------------------------------------
micsmc --pthrottle mic0
mic0 (pthrottle):
Throttle state: ......... inactive
Current throttle time: .. 0 msec
Throttle event count: ... 0
Total throttle time: .... 0 msec
----------------------------------------------------------------
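If you want to log these readings while a benchmark runs rather than copying them by hand, a small parser for the dotted "name: .... value" layout shown above might look like the sketch below. It only handles text in the format pasted in this thread. The function names are made up for illustration, and treating any state other than "inactive" as throttling is an assumption, since only "inactive" appears here.

```python
import re

# Matches lines such as "Throttle state: ......... inactive".
# Header lines like "mic0 (tthrottle):" have no dot leader and are skipped.
_FIELD = re.compile(r"^\s*(?P<key>[^:]+):\s*\.+\s*(?P<value>.+?)\s*$")


def parse_micsmc(text: str) -> dict:
    """Turn dotted 'key: .... value' lines into a {key: value} dict of strings."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group("key").strip()] = match.group("value")
    return fields


def is_throttling(fields: dict) -> bool:
    """ASSUMPTION: any reported state other than 'inactive' means throttling."""
    state = fields.get("Throttle state")
    return state is not None and state.lower() != "inactive"


SAMPLE = """\
micsmc --tthrottle mic0
mic0 (tthrottle):
Throttle state: ......... inactive
Current throttle time: .. 0 msec
Throttle event count: ... 0
Total throttle time: .... 0 msec
"""

if __name__ == "__main__":
    parsed = parse_micsmc(SAMPLE)
    print(parsed)
    print("throttling:", is_throttling(parsed))
```

The same function also works on the `micsmc -f` output, since it uses the same dotted layout.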
Hello Hongwei,
Thanks. The output looks OK to me. Can you please also verify that all the temperatures are OK? You can do this using
micsmc -t
//Temperatures should be in the following ranges
mic0 (temp):
Cpu Temp: ................ 59.00 C
Memory Temp: ............. 37.00 C
Fan-In Temp: ............. 30.00 C
Fan-Out Temp: ............ 37.00 C
Core Rail Temp: .......... 36.00 C
Uncore Rail Temp: ........ 37.00 C
Memory Rail Temp: ........ 37.00 C
Thanks
Hi Hongwei,
Also, if the above temperatures turn out to be OK, I would recommend updating the flash. This would help rule out any microkernel issues that might be forcing the coprocessor to run at the lowest frequency (~600 MHz). You can find the steps for updating the flash in the latest version of the System Administration Guide for Intel® Xeon Phi™ Coprocessors.
Thanks
Hi Sunny,
Thank you very much for the help. Here are the temperatures
micsmc -t
mic0 (temp):
Cpu Temp: ................ 38.00 C
Memory Temp: ............. 26.00 C
Fan-In Temp: ............. 22.00 C
Fan-Out Temp: ............ 29.00 C
Core Rail Temp: .......... 0.00 C (SMC reports sensor read invalid)
Uncore Rail Temp: ........ 30.00 C
Memory Rail Temp: ........ 30.00 C
It seems that the Core Rail Temp is not quite right.
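A reading flagged like this can be told apart from a genuine 0 C value by checking the note that follows the temperature. The sketch below parses the `micsmc -t` text as pasted above and lists the sensors whose line carries an "invalid" note. The function names are hypothetical and the matching is based only on the wording seen in this thread.

```python
import re

# Matches lines such as "Cpu Temp: ................ 38.00 C" and keeps any
# trailing note, e.g. "(SMC reports sensor read invalid)".
_TEMP = re.compile(
    r"^\s*(?P<sensor>[^:]+):\s*\.+\s*(?P<temp>-?\d+(?:\.\d+)?)\s*C\b(?P<note>.*)$"
)


def parse_temps(text: str) -> dict:
    """Return {sensor: (temperature_in_C, is_valid)} for each temperature line."""
    readings = {}
    for line in text.splitlines():
        match = _TEMP.match(line)
        if match:
            # ASSUMPTION: a note containing "invalid" marks an unusable reading.
            valid = "invalid" not in match.group("note").lower()
            readings[match.group("sensor").strip()] = (float(match.group("temp")), valid)
    return readings


def invalid_sensors(readings: dict) -> list:
    """Names of sensors whose last read was flagged invalid."""
    return [name for name, (_, valid) in readings.items() if not valid]


SAMPLE = """\
mic0 (temp):
Cpu Temp: ................ 38.00 C
Memory Temp: ............. 26.00 C
Core Rail Temp: .......... 0.00 C (SMC reports sensor read invalid)
Uncore Rail Temp: ........ 30.00 C
"""

if __name__ == "__main__":
    temps = parse_temps(SAMPLE)
    print("invalid sensors:", invalid_sensors(temps))
```

If a sensor shows up in that list, repeating the command a few times is a reasonable first step before drawing conclusions from a single read.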
Hi Hongwei,
I am not sure that is the cause of your problem. Do you get the same output when you run "micsmc -t" multiple times? The error you see above only states that the SMC sensor read was invalid; it doesn't report any temperature issue.
I would still recommend updating the flash.
Thanks
Hi Sunny,
When I repeat the command 'micsmc -t', I get reasonable output:
mic0 (temp):
Cpu Temp: ................ 52.00 C
Memory Temp: ............. 40.00 C
Fan-In Temp: ............. 32.00 C
Fan-Out Temp: ............ 45.00 C
Core Rail Temp: .......... 45.00 C
Uncore Rail Temp: ........ 46.00 C
Memory Rail Temp: ........ 46.00 C
I updated the flash as suggested. Nothing seems to have changed after the system reboot: the benchmarks still report the same GFLOPS. I watched the micsmc GUI while the benchmarks were running, and the temperature never goes higher than 65 C, even though the device utilization sometimes reaches 100%.
Hi Hongwei,
I would like to touch base on the issue you were facing with the Intel Xeon Phi coprocessor running at low frequency. Were you able to resolve it?
Thanks
Hello Sunny,
Sorry for the late reply. I did not notice your message.
I have not resolved this problem. It is quite strange: everything looks fine, but the card just does not run at full power.
Thank you.
Hi Hongwei,
Did you manage to resolve the problem? If yes, would you please share what the resolution was?
Thank you!
Edwin G.
Hi Edwin,
I did not resolve the problem. I plan to contact the manufacturer later. I will share any news I get on this problem.
One reason the coprocessor may be forced to run at the lowest frequency is that it is not getting enough power. Can you please verify again that the coprocessor is properly seated in the PCIe slot? Also, please verify that all the cables (to/from the coprocessor) are intact.
Thanks