Intel Modular Server MFSYS25V2 - Goes Deaf

DThom24
Beginner

Hi all,

I am brand new to this forum and am looking for some help, hopefully quickly. We recently acquired an Intel Modular Server chassis V2 with 3 drives in a RAID 5, using 2 nodes and external iSCSI storage. The nodes themselves are powered by Proxmox.

We have updated the firmware to the latest version as of April 2012:

11.6.100.20120307.34736

Everything runs fine for a couple of weeks, but after that we notice the fans on the server getting louder and louder over the course of a few days, and then the chassis admin interface over Ethernet goes deaf. No response to pings, no response to HTTPS requests, nothing.

After we bring down the nodes gracefully, we force down the chassis. Once we bring it back up and online and look through the events for any issues, we don't see anything that stands out.

Has anyone seen this before? If so, do you have a workaround for it, or is this a known issue? I would think that a server of this nature should be designed for 100% uptime outside of firmware / hardware updates or issues.

I am looking for a solution as we are planning to purchase more nodes shortly, but with this (lack of) reliability right now, we don't want to make any more purchases until it's either straightened out or we look at a different hardware vendor for our solution.

Thanks for any help you can provide me.

4 Replies
Edward_Z_Intel
Employee

It seems the CMM hung for some unknown reason. I'm not aware of any known issue like this, and I can't tell more without reading the log files.

Note that the absence of the CMM doesn't affect the operation of other components such as the compute modules, switch, and storage module. To ensure proper cooling, all fans run at full speed while the CMM is down. To reset the CMM, you can simply unplug it and plug it back in while the system is running. There is no need to bring down the compute modules (nodes).
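
To help pin down when the CMM stops responding (and which part of the event log to look at), a small watchdog run from another machine on the management network can record each up/down transition. The following is only a rough sketch, not an Intel tool: the address `192.0.2.10` is a placeholder, the function names are made up for illustration, and it assumes the CMM web interface listens on the standard HTTPS port 443.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False


def watch(host, port=443, interval=60.0, checks=None,
          probe=is_reachable, log=print):
    """Poll host:port and log only state changes.

    checks=None polls forever; otherwise stop after that many probes.
    Returns the list of (timestamp, state) transitions that were logged.
    """
    transitions = []
    last = None
    done = 0
    while checks is None or done < checks:
        state = probe(host, port)
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {host}:{port} {'UP' if state else 'DOWN'}")
            transitions.append((stamp, state))
            last = state
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)
    return transitions


# Example (placeholder address, runs until interrupted):
# watch("192.0.2.10", interval=60.0)
```

The timestamp of the first DOWN line can then be compared against the chassis event log and the point at which the fans ramp up.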

idata
Employee

I am seeing a very similar problem. I can connect to the CMM through the GUI all day with no problem. Then when I come in the next day, I cannot connect. I cannot ping it or reach it through a browser, and there is no link light on the back of it.

To get it working again, I have to connect a crossover cable between it and my laptop. Then the lights start blinking, and I can plug it back into the switch and connect to it with no problem.

It will work all day until I come in the following morning and the same thing happens. This has happened three days in a row.

Any help would be appreciated.

Daniel_O_Intel
Employee

I've seen excessive network traffic cause this.

If you have dual switches, disable the Admin link between them:

From the CMM GUI, choose System | Switches | Advanced Configuration

Mouse over any port and choose Port Configuration

Select Port | 10G.XC | Admin Status | Down

and Port | 10G.SC | Admin Status | Down

DThom24
Beginner

In my case, I don't have dual switches, only the switch originally supplied with the chassis. Should this still be done? If so, I'm more than happy to try it, and I can reply with the results in a week or so, around when the chassis would normally go deaf.

Also, in our case I would not say there is excessive traffic at this point: only 3 VMs running Windows 2008 for about 6 users, and 3 virtual containers running Debian Linux.

Thanks,
