Beginner
4,209 Views

JBOD2312S2SP Power Supply Issue?

I have a JBOD2312S2SP Intel RAID enclosure (http://www.intel.com/content/www/xr/en/server-systems/storage-system-jbod.html Intel® Storage Systems JBOD Family) at my house and woke up to screaming fan noise. The warning light was blinking and the console showed this:

ID = 206
SEQUENCE NUMBER = 164362
TIME = 29-06-2013 03:13:39
LOCALIZED MESSAGE = Controller ID: 0 Power supply cable removed on enclosure: 1 Power Supply 1

Note that I only have one power supply, so if the cable had been removed I'm not sure how the unit was still running. Right after that warning I received the messages below, copied from the logs, so the top one is the latest:

 

ID = 243

 

SEQUENCE NUMBER = 164371

 

TIME = 29-06-2013 03:13:44

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 3

ID = 243

 

SEQUENCE NUMBER = 164370

 

TIME = 29-06-2013 03:13:44

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 2

ID = 243

 

SEQUENCE NUMBER = 164369

 

TIME = 29-06-2013 03:13:44

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 1

ID = 243

 

SEQUENCE NUMBER = 164368

 

TIME = 29-06-2013 03:13:42

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 3

ID = 243

 

SEQUENCE NUMBER = 164367

 

TIME = 29-06-2013 03:13:42

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 2

ID = 243

 

SEQUENCE NUMBER = 164366

 

TIME = 29-06-2013 03:13:42

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 1

ID = 243

 

SEQUENCE NUMBER = 164365

 

TIME = 29-06-2013 03:13:40

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 3

ID = 243

 

SEQUENCE NUMBER = 164364

 

TIME = 29-06-2013 03:13:40

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 2

ID = 243

 

SEQUENCE NUMBER = 164363

 

TIME = 29-06-2013 03:13:40

 

LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 1

That was the last message until, 6 minutes later, I powered up a computer to remote in and shut down the system. I pulled the power plug, put it back in, powered everything up, and it's running right now without issue. Thoughts?

Edit: As a side note, the Intel RAID Web Console shows this drive cage with 2 power supplies. It always did. I only have one. So I'm not sure if that's just how many it can hold or if it's just not intelligent enough to detect the number of drives.
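
For anyone who wants to sift a longer export of these records, here is a minimal sketch in Python. It assumes the log has been saved as plain text in the same KEY = VALUE layout as the records posted above, with dates in day-month-year order. The function names and the 10-second window are illustrative assumptions, not part of RWC2.

```python
import re
from datetime import datetime, timedelta

# Two records copied from the post, in the same KEY = VALUE layout.
SAMPLE = """\
ID = 243
SEQUENCE NUMBER = 164363
TIME = 29-06-2013 03:13:40
LOCALIZED MESSAGE = Controller ID: 0 Fan speed changed on enclosure: 1 Fan 1

ID = 206
SEQUENCE NUMBER = 164362
TIME = 29-06-2013 03:13:39
LOCALIZED MESSAGE = Controller ID: 0 Power supply cable removed on enclosure: 1 Power Supply 1
"""

FIELD = re.compile(r"^(ID|SEQUENCE NUMBER|TIME|LOCALIZED MESSAGE)\s*=\s*(.*)$")


def parse_events(text):
    """Return a list of event dicts, sorted by sequence number (oldest first)."""
    events, current = [], {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # skip blank or unrelated lines
        key, value = match.groups()
        if key == "ID" and current:
            # An ID line starts a new record; store the previous one.
            events.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        events.append(current)
    for ev in events:
        ev["ID"] = int(ev["ID"])
        ev["SEQUENCE NUMBER"] = int(ev["SEQUENCE NUMBER"])
        ev["TIME"] = datetime.strptime(ev["TIME"], "%d-%m-%Y %H:%M:%S")
    return sorted(events, key=lambda ev: ev["SEQUENCE NUMBER"])


def fan_changes_after_power_event(events, window_seconds=10):
    """Pair each 'Power supply cable removed' event with the fan events
    logged within window_seconds after it (the window is an assumption)."""
    window = timedelta(seconds=window_seconds)
    result = []
    for ev in events:
        if "Power supply cable removed" not in ev["LOCALIZED MESSAGE"]:
            continue
        followers = [
            other for other in events
            if "Fan speed changed" in other["LOCALIZED MESSAGE"]
            and timedelta(0) <= other["TIME"] - ev["TIME"] <= window
        ]
        result.append((ev, followers))
    return result
```

Calling `fan_changes_after_power_event(parse_events(text))` on a full export should show whether every fan ramp is preceded by a power supply message, which is the pattern described in this post.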

Thanks.

Steve

0 Kudos
41 Replies
53 Views

Steve,

You mention that you have RWC2 installed, correct? If you click the "physical" tab and then click "graphical view" (in the right-side pane), what speed/RPM does it show for your fans when running at normal speed and also when running at the faster speed? FYI, the fan speeds are monitored by a temp sensor located in the front panel. Is your JBOD located in an area where the front panel could be operating at a higher than expected temperature?

0 Kudos
Beginner
53 Views

Since my last post I have had runaway fans twice; the last time was this morning. The physical tab in RWC2 (latest version) always says power cord removed when the fans run up from normal to high speed. The unit is in a very small closet with a built-in air conditioner, and even in the summer on hot days I have never seen the room temp above 62 degrees F. Glad the servers are in the garage rather than my office. It appears, as I have said before, that any time the UPSs intervene the fans go crazy. It would be nice to have a fix, as we are in talks with a client to put one of these in an office for storing backup images.

0 Kudos
53 Views

Steve,

So it sounds like your UPS is kicking in because of a power failure, correct? If so, it almost sounds like the JBOD is rebooting. Do you have 1 power supply or 2 power supplies in the JBOD?

 

I've copied a technical marketing engineer in a separate email, asking if he's had other customers with this same fan issue. When I hear something from him I'll let you know.
0 Kudos
Beginner
53 Views

Almost; I just happened to be right next to the closet, which is in my garage, when the lights flickered. The 3 servers kept running because of the UPSs, but the fans on the JBOD ran back up to high speed. None of the servers shut down because the UPSs are set to shut things down after 5 minutes, not after one second. The RWC log showed that both power cords had been removed and power supply status was unknown. I did the same thing I always do: shut down the server that the JBOD is connected to, shut down the JBOD and restarted it, and then restarted the server. I originally had only one power supply, then bought a second one, and both are installed now. I have had them connected to the same UPS and connected to different UPSs (two UPSs in the rack). Once I thought I had it figured out because the RWC log would show one cord had been removed, so I put both power supplies on the UPS that had not come on line, but that did not fix it either. The event logs in the server never show an unexpected shutdown or disk issues. The server is currently shut down (lab with 3 VMs). Going to be going up to Server 2012 R2 over the weekend, I hope.

Message was edited by: Phillip Funk

Seems this issue has dropped off the radar. It now looks like the system is fine when the UPS takes over; however, when regular power is restored the fans run up to max, and RWC again says that both power cords have been removed. Thinking about trying to reopen a support ticket, as I don't really want to buy a new server system with the 12 bays just because the JBOD has issues. Has anyone found a fix?

0 Kudos
Beginner
53 Views

Sorry for my absence. Besides having some other priorities, and not being alerted of all the new posts from people offering help or input, I was working with John on this thread, trying the Intel-recommended part replacement, which unfortunately didn't change anything, and I am out $25 in advance replacement costs.

Let me try and answer the questions that were asked. Concerned this thread didn't alert me of new posts.

In RWC2 right now the fans are showing 6270 RPM with a speed code of Intermediate Speed, running at what I would consider normal speed and noise levels. I don't know what they were at when it acted up, other than the original logs in my first post where it says the speed is being changed. I can check next time it happens. Regarding the comment on the heat sensor: the front of the enclosure is in the open and the system is in a very cool place with no real change in climate.

wrevans, for your inquiry, the problem does seem to happen with fluctuations in power, whether connected to the APC or not. No, I do not have a second power supply installed; however, I have tried moving the existing one between slots 1 and 2 in case there was a problem there, but that didn't help.

Thoughts?

Steve

0 Kudos
Beginner
53 Views

I have two power supplies and two UPSs, with one power supply plugged into each. It does not need a power failure, just a flicker, and the logs and alerts say both power cords have been removed and the fans run up to 12000 RPM. I have 4 physical servers on the same UPSs and they don't even blink; no alerts except those generated by RWC. Wanted to see if there were any new fixes before I spent $115 on a new power distribution board.

0 Kudos
Community Manager
53 Views

All,

I recommend contacting Intel Technical Support to open service tickets on these issues. It's difficult to follow and escalate properly via the communities.

Sorry for the difficulty.

Regards,

John

0 Kudos
Beginner
53 Views

Curiosity question: Does the UPS being used have sine wave output?

SMT1500RM2U and SURTA1500RMXL2U do.

With the management card installed the setup can be tweaked.

Under Configuration --> Power Settings one can set the voltage thresholds and sensitivity. I suspect that the PSUs in the JBOD are just a tad bit too sensitive to incoming power fluctuations. The UPS settings should be able to be fine tuned to your given environment. Ours is terrible. We have lots of power fluctuations and usually two or three power outages due to someone putting a shovel through a line somewhere.
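
To make the threshold idea concrete, here is a rough sketch in Python of how widening the transfer window changes how often a UPS would leave mains. This is a simplified model, not APC's actual logic: the function names, the voltage samples, and both threshold pairs are hypothetical, and a real unit also applies sensitivity and timing rules and may regulate voltage without going to battery.

```python
def classify_samples(voltages, low_transfer, high_transfer):
    """Label each input-voltage sample as 'low', 'ok' or 'high'."""
    labels = []
    for v in voltages:
        if v < low_transfer:
            labels.append("low")
        elif v > high_transfer:
            labels.append("high")
        else:
            labels.append("ok")
    return labels


def transfer_count(labels):
    """Count how many times the input goes from 'ok' to out of range."""
    count = 0
    previous = "ok"
    for label in labels:
        if label != "ok" and previous == "ok":
            count += 1
        previous = label
    return count


# Hypothetical mains readings in volts, including two short sags.
HYPOTHETICAL_SAMPLES = [120, 118, 104, 119, 121, 98, 97, 120]

# Hypothetical narrow and wide threshold pairs (low, high), in volts.
NARROW = (106, 127)
WIDE = (92, 139)

narrow_transfers = transfer_count(classify_samples(HYPOTHETICAL_SAMPLES, *NARROW))
wide_transfers = transfer_count(classify_samples(HYPOTHETICAL_SAMPLES, *WIDE))
```

With these made-up numbers the narrow window sees both sags as events and the wide window sees none, which is the trade-off to weigh when adjusting the settings for a noisy supply.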

0 Kudos
Beginner
53 Views

Both UPSs are sine wave. I have not done too much tweaking in them yet, and am on the verge of ordering a power distribution board to see if that fixes the issue, but will turn down the sensitivity all the way first.

Thanks,

Phil

0 Kudos
Beginner
53 Views

Don't bother with the power distribution board replacement. I replaced mine and paid the $25 for them to advance ship me a new one. Didn't do any good. Same deal. Intel support was out of ideas. They sent me a form to fill out for an engineer to look at. Haven't gotten to it yet. Don't have the availability to be around to troubleshoot since this is at my house and I work all day.

Steve

0 Kudos
Moderator
53 Views

I would recommend contacting http://www.intel.com/p/en_US/support/contactsupport Intel Customer Support for proper escalation process to expedite the resolution of this issue. You may refer to this thread when contacting them.

0 Kudos
Beginner
53 Views

I took mine out of service as it appears there will be no solution. Have not checked here for a while. We still have a customer who needs a lot of storage (20+TB) but have to consider other options since this problem would not be acceptable for them.

0 Kudos
Beginner
53 Views

I have 2 of these in service that have done this from day 1. So far support has been unable to provide a fix. I have been working with them for almost 3 months now. Here is what they have done so far:

Replace fan controller board

Update all firmware

Replace RAID card

Replace JBOD expander card

They seem too stubborn or unconcerned to think that it could be a problem with the firmware. I'm now waiting to hear back from them now that they have replaced everything.

I would advise anyone considering a JBOD chassis to stay away from this one.

0 Kudos
Moderator
53 Views

I am sorry to hear that this issue has not been resolved to your satisfaction. Let me get in touch with the http://www.intel.com/p/en_US/support/contactsupport Intel Customer Support team for proper follow up of your case.

0 Kudos
Beginner
53 Views

I have received word from support that a new firmware update is going to be released. I will update once I receive it and run it for a while.

0 Kudos
Beginner
53 Views

Looks like new firmware was recently released. One of the two updates sounds theoretically promising:

-Fixed fan ramp to 100% when AC removed from one power supply.

Haven't tried it yet. Has it helped anyone else? I only have one power supply and never removed the power when this happened. The box just thinks the power was removed, as it doesn't properly deal with power fluctuations when plugged into the wall, or with the change in sine wave when the APC takes over in case of power failure.

Steve

0 Kudos
Beginner
53 Views

I updated to the new firmware as directed by the support case. It took a little over a week this time, but the fans are now at highest speed again.

The log says things like "Power state change failed", "Predictive failure" and "Unexpected Sense" over and over again.

I stick with my recommendation to stay away from these.

Alex

0 Kudos
Beginner
53 Views

Maybe the exact situation hasn't happened to me again; however, I updated to the latest firmware and had a power fluctuation one day while the unit was connected to my Powerware battery backup with 12 brand-new batteries installed (yes, they were fully charged), and the fans didn't go to 100%. The logs showed the power supply was removed and the error light on the front of the unit was red, but the fans stayed as they were. So maybe those weren't the right conditions, but conceptually it seems like they partly masked it. I don't really think that's a solution, but I guess it's something. I still have to shut down and restart to get that error off the display.

Another issue, in case other people see it: I sent Intel a massive log of how my fans normally just speed up and slow down, over and over. Every couple of seconds they switch between the low and middle speed settings. They said their unit in the test lab does the same thing, so they can't help me. I keep asking what the fan thresholds are, or whether they can be adjusted, but they won't acknowledge that request. You can't tell me that having the fan voltages go up and down every couple of seconds is good for anything. Not to mention the annoyance of the sound fluctuating all day.
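
One way to put a number on that kind of oscillation is to count speed-change events per fan inside a fixed time window. The Python below is only a sketch: it assumes the events have already been reduced to (timestamp, fan name) pairs, and the function names, sample timestamps, and the limit of 3 changes per 10 seconds are all made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_changes_per_fan(events, start, window):
    """Count speed-change events per fan with start <= time < start + window."""
    end = start + window
    return Counter(fan for when, fan in events if start <= when < end)


def flapping_fans(events, start, window, max_changes):
    """Return the fans that changed speed more than max_changes times."""
    counts = count_changes_per_fan(events, start, window)
    return sorted(fan for fan, n in counts.items() if n > max_changes)


# Hypothetical events: Fan 1 changes speed every 2 seconds, Fan 2 only once.
T0 = datetime(2015, 1, 1, 12, 0, 0)
SAMPLE_EVENTS = [
    (T0, "Fan 1"),
    (T0 + timedelta(seconds=1), "Fan 2"),
    (T0 + timedelta(seconds=2), "Fan 1"),
    (T0 + timedelta(seconds=4), "Fan 1"),
    (T0 + timedelta(seconds=6), "Fan 1"),
]

noisy = flapping_fans(SAMPLE_EVENTS, T0, timedelta(seconds=10), max_changes=3)
```

A per-fan count like this is easier to hand to support than a raw log, since it shows how often each fan changes state rather than just that it does.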

When I said I wanted to return it and get my money back, they told me, after two years of fighting to get this thing to work, to go to the original vendor, as Intel doesn't deal with full units directly, only parts. They are going to have to be one heck of a vendor to take something back after almost two years, even if Intel more or less told me the vendor should take care of it, not them.

I would replace the fans with ones that run silent even at full speed. However, unlike Dell's 4-pin PWM fans, where they alter the pin layout, these have even more wires, enough that they're very non-standard, and I didn't see anything on the net about them. At this point, if it can't be returned, it's either throw it in the corner or void the warranty replacing the fans, which I don't think is even possible.

Steve

0 Kudos
Beginner
53 Views

Needless to say, the vendor came back scratching their head about why a product more than 30 days past purchase should be their problem to return or replace. I couldn't argue with them, so I asked the support address I have an open ticket with how I can return this for a refund.

Steve

0 Kudos
Beginner
53 Views

Sad to say that I purchased two of these enclosures before I noticed this support issue which remains UNRESOLVED BY INTEL AFTER 2 YEARS.

I had both units installed and running for a week -- when all of a sudden BOTH units started having the fans scream to 100%. It is deafening! I rebooted the enclosure (which is a huge pain because they're part of a multi-unit block storage system) and the fans calmed down. 5 days later -- they started screaming again. Contacted Intel -- and you would think they didn't know about this issue at all. We were told it was the firmware on our LSI controller and were run through the wringer for 5 days updating this and that. Then after that failed to resolve it -- they started wanting to replace parts, much as is mentioned in this thread. It was about that time that I found this thread and mentioned it to the support, who acted SHOCKED that this was happening elsewhere.

They're now telling me that a firmware update will be available to resolve this problem by the end of Q1 2016 -- almost 3 full years after this was reported originally by Steve5623. I can't believe that INTEL is actually still selling these units with such a well-known system defect -- I purchased the brand name BECAUSE I thought I'd be getting a better quality product and better support, but that is FAR from the case. Has anyone else found a resolution to this issue?

0 Kudos