
Vanishing SATA devices

idata
Employee
1,678 Views

I have an S5520SC-based system using BMC (software) based RAID 5. It was all running fine until I installed a new graphics card, a Radeon 6990. Now the system will not boot.

After almost 8 hours of troubleshooting it is still a mystery, but here is what I know so far:

  1. As far as I can tell the graphics card is working because the display all seems to work fine.
  2. According to the BIOS, I have no bootable SATA devices.
  3. I cannot even boot from my Optical Drive any more.
  4. The only boot devices the BIOS knows of are USB, EFI and network.
  5. When I power on my system I can hear the disks chattering away as normal. It sounds like the software RAID is initializing as normal. The LED on the optical drive blinks as normal.
  6. I updated all my firmware to the latest version, but that made no difference.
  7. I ran the EFI Platform Confidence Test (see attached) but nothing really seems wrong as far as I can tell.
  8. I ran the EFI System Information Retrieval Utility (see attached) and again nothing really seems wrong as far as I can tell.
  9. The BIOS can sort of see the SATA devices when I switch the mode from software to extended, but it does not show any details on the devices as it normally did. It seems like something in the SATA system or IOH is hosed, but I cannot think of what or why, unless there is some weird PCI Express interaction going on with this new graphics card. Yes, I have rechecked all the cables and connections.
  10. My next step is to try putting the previous card back in, but that is an incredibly stressful process because things are so tight in the chassis.

Does anyone have any suggestions on what else I can try?

Cheers, Eric

5 Replies
Daniel_O_Intel
Employee
386 Views

The only clue I see is the last line in the SEL:

POST Err Sensor reports PCI out of resources error.

Vendor 1002 is ATI, so

00 04 00 00 ==> Display Controller - VGA/8514 controller
Vendor 1002 Device 671D Prog Interface 0

00 04 00 01 ==> Multimedia Device - UNDEFINED
Vendor 1002 Device AA80 Prog Interface 0

00 05 00 00 ==> Display Controller - Other display controller
Vendor 1002 Device 671D Prog Interface 0

00 05 00 01 ==> Multimedia Device - UNDEFINED
Vendor 1002 Device AA80 Prog Interface 0

all belong to that card. I know 671D is the Radeon 6990, but I don't recognize AA80 at all. Is the Radeon using some HDMI audio onboard?
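To check that reading of the listing mechanically, here is a minimal Python sketch. It assumes each record is a four-byte location line followed by a vendor/device line. It also assumes, as a guess the SEL output does not confirm, that the four bytes are segment, bus, device and function. The names `parse_records`, `LOCATION` and `IDS` are made up for this example.

```python
import re

# Device records copied from the listing above.
LISTING = """\
00 04 00 00 ==> Display Controller - VGA/8514 controller
Vendor 1002 Device 671D Prog Interface 0
00 04 00 01 ==> Multimedia Device - UNDEFINED
Vendor 1002 Device AA80 Prog Interface 0
00 05 00 00 ==> Display Controller - Other display controller
Vendor 1002 Device 671D Prog Interface 0
00 05 00 01 ==> Multimedia Device - UNDEFINED
Vendor 1002 Device AA80 Prog Interface 0
"""

# Four hex bytes, then "==>", then the class description.
LOCATION = re.compile(r"^((?:[0-9A-Fa-f]{2} ){3}[0-9A-Fa-f]{2}) ==> (.+)$")
# "Vendor XXXX Device YYYY ..." with four hex digits each.
IDS = re.compile(r"^Vendor ([0-9A-Fa-f]{4}) Device ([0-9A-Fa-f]{4})")


def parse_records(text):
    """Pair each location line with the vendor/device line that follows it."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        loc = LOCATION.match(line)
        if loc:
            # ASSUMPTION: byte order is segment, bus, device, function.
            seg, bus, dev, fn = (int(b, 16) for b in loc.group(1).split())
            current = {"segment": seg, "bus": bus, "device": dev,
                       "function": fn, "class": loc.group(2)}
            continue
        ids = IDS.match(line)
        if ids and current is not None:
            current["vendor"] = int(ids.group(1), 16)
            current["device_id"] = int(ids.group(2), 16)
            records.append(current)
            current = None
    return records


records = parse_records(LISTING)
for r in records:
    print(f"bus {r['bus']:02x} fn {r['function']}: "
          f"{r['vendor']:04x}:{r['device_id']:04x}  {r['class']}")
```

Under that assumed byte order, the four records are two functions on each of two buses, all with vendor ID 1002. That would be consistent with a dual-GPU card where each GPU exposes a display function plus a second function.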

idata
Employee
386 Views

What does "PCI out of resources" mean?

The Radeon 6990 does have HDMI 1.4a support. I'm not sure whether you access it from the DVI or DisplayPort connectors. There is one DVI connector and four DisplayPort connectors.

Is there any reason this graphics card should interfere with the SATA system on the S5520SC, other than there not being enough PCI resources left for the SATA devices?

The 6990 is a PCI Express 2.1 device, while the S5520SC has PCI Express 2.0 slots - but generally that should not be a problem for PCI Express, unless the firmware does not support it for some reason.

Is it possible there is some compatibility problem that could be resolved with a firmware fix, or is this likely an actual hardware incompatibility for which there is no fix?

I don't see any hardware jumpers on the card, or any other external means to change the configuration.

After putting my Radeon 4870 back in, the system works fine.

Cheers, Eric

idata
Employee
386 Views

I currently have a Radeon 4870 running in my system and it works fine, but I guess that's close enough to a 4850.

I wish I knew what resources the PCI Express system is running out of.

Cheers, Eric

idata
Employee
386 Views

OK, in case anyone here is interested in the story...

So it turns out that the Radeon 6990 requires 375 watts. It gets 300 watts from the two 8-pin power connectors, and 75 watts from the PCI Express connector. If you put it in overclock mode it can consume more than 450 watts, but I'm not doing that.

Anyway, the S5520SC is only designed to support up to 300 watts in a graphics card (or cards). This is not a power supply limitation in the PCI Express hardware; rather, it is a design decision by the S5520SC design team. Looking at the Technical Product Specification, section 3.11.3, the best I can tell is that this limitation is a consideration for the amount of heat dissipation in an Intel chassis (which I am not using). Never mind that this section also assumes that the graphics card exhausts the heat to the rear of the chassis, so it doesn't really matter how much heat the graphics card puts out anyway. I wasn't on the design team, so I can't figure out what they were really thinking to design in such an arbitrary limitation (in a way that cannot be changed with a firmware update).
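The arithmetic behind the mismatch, as a small Python sketch. The wattages are the ones quoted in this post; the constant and function names are made up for illustration.

```python
# Figures quoted in the post; names are illustrative only.
AUX_CONNECTORS_W = 300   # combined draw from the two 8-pin connectors
SLOT_W = 75              # draw through the PCI Express slot itself
PLATFORM_LIMIT_W = 300   # S5520SC graphics card limit


def board_power(aux_w, slot_w):
    """Total power the card declares: auxiliary connectors plus slot."""
    return aux_w + slot_w


def excess_over_limit(total_w, limit_w):
    """Watts above the platform limit, or 0 if within it."""
    return max(0, total_w - limit_w)


total = board_power(AUX_CONNECTORS_W, SLOT_W)
excess = excess_over_limit(total, PLATFORM_LIMIT_W)
print(f"declared total: {total} W, over limit by: {excess} W")
```

So the card's declared total exceeds the 300 watt figure by exactly the 75 watts that come through the slot, even though the slot draw on its own is well under the limit.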

As far as I can tell, during POST the PCI host sends a Slot Power Limit message to the 6990 during Link Training. The 6990 sets its device registers to say it needs 375 watts, but that it also gets 300 watts from external sources. The S5520SC firmware decides 375 watts is too much (even though it really isn't) and records a "PCI out of resources" warning in the SEL. Even though this is reported as a warning, the POST fails to complete normally, so the SATA subsystem never gets initialized properly. Consequently, when the system gets to the BIOS or EFI shell, it cannot see any SATA devices.
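For anyone curious how a slot power limit is represented, the PCI Express Slot Capabilities register carries it as an 8-bit value (bits 14:7) and a 2-bit scale (bits 16:15), where the scale selects a multiplier of 1.0, 0.1, 0.01 or 0.001. The Python sketch below shows only that basic encoding. It ignores the special high-value encodings added in later spec revisions, and it is not S5520SC firmware code; the function names are made up for this example.

```python
def encode_slot_power_limit(value, scale):
    """Pack the value (0-255) and scale (0-3) fields into their register bits."""
    return ((value & 0xFF) << 7) | ((scale & 0x3) << 15)


def decode_slot_power_limit(slot_cap):
    """Return the slot power limit in watts from a Slot Capabilities value."""
    value = (slot_cap >> 7) & 0xFF    # Slot Power Limit Value, bits 14:7
    scale = (slot_cap >> 15) & 0x3    # Slot Power Limit Scale, bits 16:15
    # Scale 0 -> x1.0, 1 -> x0.1, 2 -> x0.01, 3 -> x0.001
    return value / (10 ** scale)


# A slot advertising 75 W, and one advertising 25 W via the 0.1 multiplier.
print(decode_slot_power_limit(encode_slot_power_limit(75, 0)))
print(decode_slot_power_limit(encode_slot_power_limit(250, 1)))
```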

Personally, if I had been on the design team I would have coded the firmware to treat this as a warning, log it in the SEL (and elsewhere), and let the POST and SATA initialization complete normally.

Anyway, according to Intel technical support, they consider this a hardware limitation that cannot be rectified by a firmware update. Presumably the firmware that does the PCI power negotiation does not live in the BIOS, BMC, ME or FRUSDR and cannot be changed in the field. Again, not the way I would have designed it. The only way to rectify the limitation would be in a future product release of the S5520SC which they cannot commit to, nor can they discuss any plans for future product releases/revisions.

A better design for the firmware would have been to look in the FRUSDR, and if the user configured the chassis to be 'custom' then it should assume the user knows what they are doing, and the system should not impose such a limitation. An even better solution would be to have a user-settable power allowance in the FRUSDR, and then it is really clear whether the user knows what they are doing or not. In general it is shortsighted design to build such assumptions into 'hardware', as it unnecessarily restricts the versatility of the product.
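To make that proposal concrete, here is a rough Python sketch of the policy. Everything in it is hypothetical: the names `PowerDecision` and `decide_power_policy`, the `"custom"` chassis string, and the idea of a stored user allowance are my illustration of the suggestion, not how the actual firmware or FRUSDR works.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_LIMIT_W = 300  # the limit described in this thread


@dataclass
class PowerDecision:
    continue_post: bool      # whether POST should carry on
    log_warning: bool        # whether to record a warning in the SEL
    limit_w: Optional[int]   # limit applied, or None if no limit


def decide_power_policy(requested_w, chassis, user_allowance_w=None):
    """Hypothetical policy: warn on excess power, but never block POST."""
    if user_allowance_w is not None:
        limit = user_allowance_w   # explicit user setting wins
    elif chassis == "custom":
        limit = None               # assume the user knows what they are doing
    else:
        limit = DEFAULT_LIMIT_W    # stock chassis keeps the default
    over = limit is not None and requested_w > limit
    # The excess is only logged; POST and SATA initialization still complete.
    return PowerDecision(continue_post=True, log_warning=over, limit_w=limit)


print(decide_power_policy(375, "intel"))
print(decide_power_policy(375, "custom"))
print(decide_power_policy(375, "intel", user_allowance_w=400))
```

With a stock chassis the 375 watt card would produce a logged warning but still boot. With a custom chassis or a 400 watt allowance there would be no warning at all.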

Cheers, Eric
