
What temperature will cause a Xeon W-2175 to downclock?

SGiar
Beginner

I'm trying to figure out the operating temperature limits of the Xeon W-2175. I've read through the datasheets but I can't seem to find any operating temperature information.

At what temperature will the processor's thermal control kick in and start downclocking?

What is the maximum operating temperature of the chip?

Is there a T_CASE_MAX or TJ_MAX?

I know some of these values are based on the characteristics of the individual chip, but is there a ballpark number that someone can give?

Or is there a tool that I can run on Ubuntu to see these values/characteristics, or maybe monitor the clock speeds in real time so that I can run a stress test and see when the chip begins to downclock?

Any info would be very helpful! Thank you!

Emeth_O_Intel
Moderator

Hi Stephen,

Thank you for contacting us about this concern.

According to the product specification on ARK.Intel.com, the maximum Tcase of the Xeon W-2175 is 65ºC.

So, if the temperature of the processor reaches that value, your processor is overheating. We recommend using a cooling solution rated for a Thermal Design Power (TDP) of 120W.

We also have the Intel® Processor Diagnostic Tool; however, it is compatible with Microsoft Windows only. The diagnostic tool checks for brand identification, verifies the processor operating frequency, tests specific processor features, and performs a stress test on the processor.

Since you are using Linux, here are some commands you can use to check the temperatures of your system in real time.

 

Please run the following commands:

If you are using Ubuntu:

  1. sudo apt-get install lm-sensors
  2. sudo sensors-detect
  3. sudo service kmod start
  4. sensors

If you are using CentOS/RedHat:

  1. sudo yum install lm_sensors
  2. sudo sensors-detect
  3. sensors
  4. watch sensors
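
For logging during a stress test, the same readings that `sensors` prints can usually be read straight from the kernel's hwmon sysfs files. The following is a minimal Python sketch, not an Intel tool; it assumes the `coretemp` driver is loaded and exposes the usual `temp*_label`, `temp*_input`, `temp*_max` and `temp*_crit` files, and the function name `read_coretemp` is made up for illustration.

```python
from pathlib import Path


def read_coretemp(hwmon_root="/sys/class/hwmon"):
    """Return {label: {"input": C, "max": C, "crit": C}} for coretemp sensors.

    Assumes a single-socket system; on multi-socket systems the labels
    of different packages could collide.
    """
    readings = {}
    for hwmon in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = hwmon / "name"
        if not name_file.is_file() or name_file.read_text().strip() != "coretemp":
            continue  # skip sensor chips that are not the CPU
        for label_file in sorted(hwmon.glob("temp*_label")):
            prefix = label_file.name[: -len("_label")]  # e.g. "temp1"
            entry = {}
            for field in ("input", "max", "crit"):
                value_file = hwmon / f"{prefix}_{field}"
                if value_file.is_file():
                    # sysfs reports millidegrees Celsius
                    entry[field] = int(value_file.read_text()) / 1000.0
            readings[label_file.read_text().strip()] = entry
    return readings
```

Calling `read_coretemp()` with no arguments reads the live system. Where present, the `max` and `crit` values are the thresholds the driver reports for that particular chip, which may be more useful for comparison than a generic number.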

 

I hope this information helps. If you have any other questions, please do not hesitate to let me know and I will be happy to assist you.

 

Regards,

 

Emeth O.

Intel Customer Support Technician

Under Contract to Intel Corporation

SGiar
Beginner

So just to clarify, does the thermal control kick in when the processor hits 65C? More concretely, does the processor downclock itself when above 65C? We see temps running up to 85-90C (sometimes spiking to 95C) and the processor still seems to be running all cores at around 3.3GHz, which is less than the max boost clock of 4.3GHz but still above the base clock of 2.5GHz. We've been using lm-sensors and psensor to get temps, and lscpu to get the clock speed of each core, though I'm not sure whether lscpu is accurate.
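
One way to cross-check the lscpu numbers is to read each logical CPU's current frequency from the cpufreq sysfs files. This is a rough Python sketch under the assumption that the kernel exposes `scaling_cur_freq` (in kHz) for each CPU; the helper names `read_core_mhz` and `classify` are hypothetical, and the 2.5GHz and 4.3GHz constants are simply the figures quoted in this thread.

```python
import re
from pathlib import Path

BASE_MHZ = 2500   # base clock quoted in this thread
BOOST_MHZ = 4300  # max boost clock quoted in this thread


def read_core_mhz(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_index: MHz} from each logical CPU's scaling_cur_freq."""
    freqs = {}
    for freq_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        # The directory two levels up is named like "cpu12"
        index = int(re.sub(r"\D", "", freq_file.parent.parent.name))
        freqs[index] = int(freq_file.read_text()) / 1000.0  # kHz -> MHz
    return dict(sorted(freqs.items()))


def classify(mhz, base=BASE_MHZ, boost=BOOST_MHZ):
    """Label a frequency relative to the base and max boost clocks."""
    if mhz < base:
        return "below base"
    if mhz < boost:
        return "between base and max boost"
    return "at max boost"
```

A reading between base and max boost does not by itself indicate thermal throttling, since the turbo frequency available with all cores loaded is generally lower than the single-core maximum.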

Emeth_O_Intel
Moderator

Hi Stephen,

Could you please provide us with the following information:

A) Please provide the output of sensors so we can verify the temperatures.

B) Are you seeing these temperatures when running a specific application or job, or does it happen at any time?

C) When did this issue start? Did it begin after a specific change to the system or a specific update?

E) Have you changed the cooling system? Have you checked that the cooling system is rated for the processor's Thermal Design Power (TDP)?

D) Also, which server board are you using? Have you confirmed that it is running the latest BIOS version?

 

 

Regards,

 

Emeth O.

Intel Customer Support Technician

Under Contract to Intel Corporation

 

SGiar
Beginner

A) Please provide us the output of the sensors in order to verify the temperatures.

psensor is showing individual core temperatures that range from 85-95C, and the package temperature is usually the same as the highest core temp. For example, one snapshot we have is:

Package ID 0: 90C

Core 0: 86C Core 1: 90C Core 2: 87C Core 3: 77C Core 4: 87C Core 5: 84C Core 6: 83C

Core 7: 89C Core 8: 87C Core 9:  89C Core 10: 85C Core 11: 88C Core 12: 89C Core 13: 88C

CPU_temps.PNG

lscpu reported that all cores were running at 3300MHz +/- 2MHz

 

B) Are you getting this temperatures when you are running an specific application or running an specific job? or it is happening at any time?

So this happens when we run a stress test. We are building a custom compute box for an outdoor application - it's a very unique case. We are attempting to verify our cooling solution. Our basic testing right now is to run a stress test at 50% load (14 threads) and 100% load (28 threads). The above reading is after running a 50% (14-thread) stress test for 10 minutes.
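
If temperature and frequency are logged together during such a run, the point where the chip begins to downclock can be picked out afterwards. The sketch below is only illustrative: the sample format `(seconds, temp_c, mhz)`, the function name `find_downclock_onset`, and the 5% drop threshold are all assumptions, not anything specified by Intel.

```python
def find_downclock_onset(samples, drop_fraction=0.05, warmup=3):
    """Return the first (seconds, temp_c, mhz) sample whose frequency falls
    more than drop_fraction below the mean of the first `warmup` samples,
    or None if no such drop occurs."""
    if len(samples) <= warmup:
        return None  # not enough data to establish a baseline
    baseline = sum(s[2] for s in samples[:warmup]) / warmup
    threshold = baseline * (1.0 - drop_fraction)
    for sample in samples[warmup:]:
        if sample[2] < threshold:
            return sample
    return None


# Hypothetical log, for illustration only (not measured data)
log = [
    (0, 60, 3300),
    (10, 70, 3300),
    (20, 80, 3301),
    (30, 90, 3299),
    (40, 97, 2900),
]
onset = find_downclock_onset(log)
```

With the made-up log above, `onset` is the last sample, since it is the first one whose frequency drops well below the early baseline. The temperature in that sample would then be a rough empirical estimate of where downclocking begins for that cooling setup.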

 

C) When did this issue start? It occurs after an specific change on the system or any specific update?

The issue has always been around as we are developing a custom compute solution.

 

E) Have you changed the cooling system? Did you already check that the cooling system is providing the correct Thermal Design Power (TDP) to the processor?

That is exactly what we are trying to determine - these tests are from a custom heat pipe solution and we are also experimenting with a custom liquid cooling solution. We are encountering these issues as we are trying to verify the cooling solutions.

 

D) On the other hand, which ServerBoard are you using? did you already confirm it is using the latest BIOS Version?

We are using an ASUS WS C422 SAGE/10G. I believe the BIOS is at its latest version - I will verify this today.

 

Overall, I'm not trying to understand why this is happening - we know that the cooling solution is an issue. What I'm really trying to figure out is the characteristics of the processor. I'm trying to figure out the temperature limits of the processor and what happens when we exceed them. I know that at a certain temperature - the processor will engage some thermal throttling to protect itself - what is this temperature threshold?

 

As you said T_case is 65C but we are seeing temps up into the 80-90C range. So what is the processor doing when we exceed this 65C limit - is it limiting the boost clock? Is it throttling the base clock to a lower value? Is it engaging any kind of thermal protection? Or is it just operating normally - and we are just decreasing the life of the processor?

 

 

Emeth_O_Intel
Moderator

Hi Stephen,

Thank you for the information provided.

High temperatures are expected while the system is running a stress test.

Please check the system temperatures without running the stress test and send us the output so we can verify whether the issue still persists.

 

Regards,

 

Emeth O.

Intel Customer Support Technician

Under Contract to Intel Corporation

SGiar
Beginner

I appreciate your assistance with this, but the temperature is not the problem. I am not concerned about why the CPU is so hot - we know it is hot because we are running it at load, in a custom enclosure, outdoors. What I am concerned about is what happens when the CPU is this hot.

We are trying to learn about the characteristics of the CPU - the questions that I really want answered are:

A.) You mentioned that T_case is 65C - what happens when we exceed this temperature?

B.) Does it have a thermal control circuit that limits the clock speed to keep it from reaching dangerous temperatures?

C.) If so, at what temperature is this thermal control triggered?

D.) What are the characteristics of the thermal control? Does it limit base frequency? Does it just limit the maximum boost frequency? Is the behavior dependent on the BIOS settings?

E.) From what we can tell every core is running at 3.3GHz even with temperatures up to 95C - does this mean that the processor is not engaging thermal throttling?

F.) Is there any definitive way to tell if the CPU is throttling down for thermal protection on a Linux PC?

 

Thanks!

Emeth_O_Intel
Moderator

Hi Stephen,

Thank you for the information provided.

If your processor shows high temperatures when running one specific application, or even without running an application, just after loading the system, that is something we need to look into.

Usually, when a processor exceeds its normal temperature range, the system will start to show blue screens and random shutdowns, and its performance will be affected as well.

Correct, the processor has a Thermal Control Circuit; please see the following guide for more details:

https://www.intel.com/content/dam/www/public/us/en/documents/guides/xeon-scalable-thermal-guide.pdf

The Thermal Control Circuit usually reduces the die temperature by using clock modulation and/or adjusting the operating frequency and input voltage when the die temperature is very near its operating limits. So yes, it will limit the maximum boost frequency.
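
On Linux, the temperature at which the Thermal Control Circuit activates on a given chip can usually be read from the processor's model-specific registers. The sketch below is an unofficial illustration: it assumes the register layout described in the Intel Software Developer's Manual (`MSR_TEMPERATURE_TARGET` at 0x1A2 with the activation temperature in bits 23:16, and `IA32_THERM_STATUS` at 0x19C), that the `msr` kernel module is loaded, and that it runs as root. The function names are made up for this example.

```python
import os
import struct

IA32_THERM_STATUS = 0x19C
MSR_TEMPERATURE_TARGET = 0x1A2


def decode_temperature_target(value):
    """TCC activation temperature in C (bits 23:16 of MSR_TEMPERATURE_TARGET)."""
    return (value >> 16) & 0xFF


def decode_therm_status(value):
    """Decode selected fields of IA32_THERM_STATUS."""
    return {
        "throttling_now": bool(value & 0x1),           # bit 0: thermal status
        "throttled_since_clear": bool(value & 0x2),    # bit 1: sticky log bit
        "degrees_below_tcc": (value >> 16) & 0x7F,     # bits 22:16: digital readout
        "reading_valid": bool((value >> 31) & 0x1),    # bit 31: readout is valid
    }


def read_msr(register, cpu=0):
    """Read a 64-bit MSR via /dev/cpu/N/msr (needs root and `modprobe msr`)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)
```

For example, `decode_temperature_target(read_msr(MSR_TEMPERATURE_TARGET))` would give the activation temperature reported by that specific processor, and `decode_therm_status(read_msr(IA32_THERM_STATUS))` would show whether the thermal status bit is currently set and how many degrees remain before activation.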

 

Finally, there is no official tool from Intel® that can monitor whether the CPU is throttling on Linux.
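
Although it is not an Intel tool, the Linux kernel itself typically keeps per-CPU thermal throttle counters in sysfs on Intel systems. A minimal Python sketch, assuming the `thermal_throttle` directory with `core_throttle_count` and `package_throttle_count` exists under each CPU (the helper names here are hypothetical), is to snapshot the counters before and after a stress test and compare them:

```python
from pathlib import Path


def read_throttle_counts(cpu_root="/sys/devices/system/cpu"):
    """Return {"cpuN": {counter_name: count}} for each CPU that has counters."""
    counts = {}
    for directory in Path(cpu_root).glob("cpu[0-9]*/thermal_throttle"):
        entry = {}
        for name in ("core_throttle_count", "package_throttle_count"):
            counter = directory / name
            if counter.is_file():
                entry[name] = int(counter.read_text())
        counts[directory.parent.name] = entry
    return counts


def throttle_events(before, after):
    """Return the counters that increased between two snapshots."""
    changed = {}
    for cpu, counters in after.items():
        previous = before.get(cpu, {})
        delta = {k: v - previous.get(k, 0) for k, v in counters.items()}
        delta = {k: d for k, d in delta.items() if d > 0}
        if delta:
            changed[cpu] = delta
    return changed
```

If `throttle_events(before, after)` comes back empty after a run, the kernel did not record any thermal throttle events during that interval; a non-empty result shows which CPUs recorded them and how many.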

 

Regards,

 

Emeth O.

Intel Customer Support Technician

Under Contract to Intel Corporation

 

Emeth_O_Intel
Moderator

Hi Stephen,

I noticed that you created a new case for the same question and that the information was provided there.

I am going to close this thread so that we can continue assisting you on the other one.

 

Regards,

 

Emeth O.

Intel Customer Support Technician

Under Contract to Intel Corporation
