
Solution for 5110P overheating?

kankamuso
Beginner

Hi all,

It seems I finally found the problem with my Phi card freezing: it was overheating. Why? Because the provider mounted a 5110P (which is passively cooled) in a workstation without forced cooling, whereas the 5110P is designed for rack-only systems with powerful forced airflow.

So I would like to know whether there is any solution for this out there. I have found something like this:

http://www.asetek.com/press-room/news/375-asetekr-liquid-cools-intelr-xeon-phi-coprocessors.html

But I cannot find anywhere to purchase it...

Thanks in advance!

Jose.

PS: I would suggest that Intel explicitly advertise that the 5110P and all passive cards are intended for rack-only mounting with forced airflow.

9 Replies
TimP
Honored Contributor III

The ad does say that servers with this feature, supporting passively cooled cards, are available from Boxx and HP.

To the extent that Intel has a list of which combinations of server and Intel(R) Xeon Phi(TM) cards are recommended, it's kept internal to Intel, in part because the vendors ought to be able to give better information about their own products. Unfortunately, this hasn't always proven to be a valid assumption.

NATHAN_S_Intel
Employee

For what it's worth, cooling requirements for the Intel(R) Xeon Phi(TM) coprocessor 5110P are publicly available here: http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-coprocessor-datasheet.html?wapkw=intel+xeon+phi+datasheet

Who was the system provider?

TimP
Honored Contributor III

The bad stories come from people acquiring the server and the coprocessor separately, even when consulting vendors who should have given correct advice. Common possibilities include getting a server designed for a low-profile, passively cooled coprocessor along with an actively fan-cooled coprocessor, or vice versa, or a host for which no suitable BIOS is available.

I have one of those BIOS-orphaned boxes myself. It won't even run the CentOS 6.2 kernel (6.1 is fine), nor will it consistently display the GRUB menu when hooked to a KVM. No, it's not a current production box, but it's physically compatible with an actively cooled coprocessor.

kankamuso
Beginner

Well, I would prefer not to make the name of the vendor public... but I should. I purchased an 8000 € machine just because I needed to run on the Xeon Phi card, and now I cannot. I did not purchase the items separately. They offer the Xeon Phi as an option in a Dell-like "configure it yourself" interface, so it is their responsibility to guarantee hardware compatibility. I think I will ask for a full reimbursement.

Thanks all!

mjc
Beginner

What is considered a normal running temperature for the coprocessor? Mine seems to run at 97 C and over, which seems a little high unless I want to boil water. Is this normal?

TimP
Honored Contributor III

My fan-cooled KNC, with this month's BIOS and MPSS updates, idles at 100 W and 52-54 C, and I have never seen it reach 80 C. Early pre-production models which ran as hot as you mention tended to be short-lived or to require 30-minute power-off periods to restore functioning (not to mention swapping out of the box!).

In another case, a vendor's sales organization did not realize that their server (designed for KNC), which became available before the coprocessors did, would not be compatible with coprocessors that differed from the ones they had planned to introduce.

mjc
Beginner

I think our box is not set up properly. We need to send it back so they can put an appropriate fan in the appropriate place.

Georg_V_
Beginner

Our KNC, which is installed in a standard PC located in an air-cooled compute center, usually runs between 72 and 85 degC, idles at 128 Watts and draws 230 Watts max (as reported by micsmc). The MPSS version is 2.1.4982-15.

Georg

Robert_F_2
Beginner

With the promotional price of a Knights Corner for developers to play with, someone (Intel?) needs to ask Asetek to get with the program...

I'm looking at picking up a couple of cards for my dev box in the basement, and if I have to DIY a water cooling solution for the cards, it isn't that much harder to get batches of waterblocks CNC'ed and enter the business. The best way to keep us out of the hardware business is to throw hardware at us so we can stay distracted with our shiny toys.
