
Intel S2600GZ Risers Poor Performance - 3 x8 vs x16 x8


Hello all,

I have an S2600GZ 2U server with dual Xeon E5-2643v2 and 384GB of RAM.

I've installed a GPU in one of the CPU2 x8 slots, and while performance is good I'd like to use the full bandwidth of an x16 slot for the GPU.

I purchased an x16 x8 riser kit (Intel product code A2UL16RISER2). With it installed, the GPU is detected and functions normally, and GPU-Z shows it running at PCIe 3.0 x16, so as far as I can tell the riser is working properly.

The problem appears when I run my CUDA app: performance on the x16 riser is terrible. The GPU completes the task in 8-10 minutes in the x8 slot but takes upwards of an hour in the x16 slot.

I have run some benchmarks and see that device-to-host throughput is very low on the x16 slot.

In the x8 slot, throughput is roughly 5-6 GB/s in either direction (device to host or host to device). In the x16 slot, it is 10-12 GB/s host to device but only 350-600 MB/s device to host.
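For reference, here is a rough sketch of how these figures compare with the theoretical PCIe 3.0 ceiling. It assumes the standard PCIe 3.0 numbers (8 GT/s per lane, 128b/130b encoding) and ignores packet and protocol overhead, so real transfers will always land somewhat below the peak. The function names are just illustrative.

```python
# Theoretical PCIe 3.0 throughput per direction, before protocol overhead.
GT_PER_S = 8.0        # PCIe 3.0 signalling rate per lane, in GT/s
ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def pcie3_peak_gb_per_s(lanes: int) -> float:
    """Peak one-way bandwidth in GB/s for a PCIe 3.0 link of the given width."""
    return GT_PER_S * ENCODING * lanes / 8  # divide by 8 bits per byte


def efficiency(measured_gb_per_s: float, lanes: int) -> float:
    """Fraction of the theoretical peak that a measurement reaches."""
    return measured_gb_per_s / pcie3_peak_gb_per_s(lanes)


# Figures reported in the post (low and high end of each range), in GB/s.
measurements = [
    ("x8, either direction", 8, 5.0, 6.0),
    ("x16, host to device", 16, 10.0, 12.0),
    ("x16, device to host", 16, 0.35, 0.60),
]

for label, lanes, low, high in measurements:
    peak = pcie3_peak_gb_per_s(lanes)
    print(f"{label}: peak {peak:.2f} GB/s, "
          f"measured {efficiency(low, lanes):.1%} to {efficiency(high, lanes):.1%} of peak")
```

The peak works out to a little under 8 GB/s for x8 and a little under 16 GB/s for x16. The x8 numbers and the x16 host-to-device numbers both sit at a similar fraction of their peak, while x16 device to host reaches only a few percent of it, so only that one direction looks broken rather than the link as a whole.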


I have seen some discussion linking poor device-to-host performance in CUDA to NUMA topology, but with either riser all the lanes should connect to CPU2. I have tried switching riser slots so that the x16 slot is connected to CPU1, and this does improve performance, but only marginally.
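One way to double-check which CPU the slot really hangs off is sketched below. It assumes the server can be booted into Linux, where the kernel exposes each PCI device's NUMA node and negotiated link under sysfs. The device address used here is made up; the real one would come from something like lspci. The helper name is hypothetical.

```python
from pathlib import Path


def pci_link_info(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Return NUMA node and negotiated link speed/width for one PCI device.

    bdf is the device address in domain:bus:device.function form.
    Attributes that are not present are returned as None.
    """
    dev = Path(sysfs_root) / bdf
    info = {}
    for name in ("numa_node", "current_link_speed", "current_link_width"):
        attr = dev / name
        info[name] = attr.read_text().strip() if attr.exists() else None
    return info


if __name__ == "__main__":
    # "0000:81:00.0" is a placeholder address, not taken from the post.
    print(pci_link_info("0000:81:00.0"))
```

A numa_node of -1 means the kernel has no NUMA information for the device. Otherwise the value can be compared against the node the benchmark process and its host buffers are running on, for example by pinning the process to that node and re-running the transfer test.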


Any ideas?



1 Reply

Hello, zcomputerwiz,

Because the Intel® Server Board S2600GZ has been discontinued, Intel Customer Service no longer supports inquiries for it, but fellow community members may have the knowledge to jump in and help. You may also find the Discontinued Products website helpful to address your request:

Thank you for understanding. Best regards,

