Brunol
Beginner
265 Views

Help: Another CATERR issue (IPMI log) using VROC on Supermicro

I'm having the same issue as this post: community.intel.com/t5/Software-Storage-Technologies/CATERR-issue-IPMI-LOG-using-VROC/m-p/1215608#M1553.

Did you find a solution for that problem?

My system is:

QTY             | Product_Name                                   | Notes/Remarks
2               | Supermicro SuperServer 1029U-TN10RT            | 7 x NVMe bays populated
4               | Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz    |
12 each server  | 96 GB REG ECC RAM Micron MTA9ASF1G72PZ-2G9E1   |
1               | AOC-VROCINTMOD                                 |
6               | NVMe INTEL SSDPE2KX01                          | RAID 10
1               | INTEL SSDPELKX010T8                            | OS Drive (no RAID)

All drives are on the latest firmware.

The VROC software is the latest version.

The servers run MS Windows Server 2019 Standard with Hyper-V, with virtual machines replicating to the other server.

The problem is that when we run a couple of virtual machines, the server crashes and reboots. The IPMI log says “processor, CATERR issue”.

We have been in contact with our supplier and Supermicro, but they have no idea what is causing the issue.

They have already replaced all CPUs with new ones, but the problem is still there.

Please let me know if you need more information.

5 Replies
BrusC_Intel
Moderator
245 Views

Hello, Brunol.


Thank you for posting in the Intel Community Support forum.


I received your ticket regarding this Intel VROC error message, and I will be glad to assist you.


Please allow me to review the information, and I will get back to you as soon as possible.


Best regards,


Bruce C.

Intel Customer Support Technician


BrusC_Intel
Moderator
232 Views

Hello, Brunol.

 

Thank you for waiting.

 

There are some details we would like to confirm: 

  • Are the drives spanning VMDs or are they on the same VMD? 
  • Is this happening in different RAID configurations?
  • You mentioned that the configuration is RAID 10, which should be a 4-drive configuration, but your configuration lists 6 drives. In addition, the report shows RAID 5.

[Attachment: configurations.jpg]

 

I was not able to find errors in the reports, but please see the following details/recommendations for Windows:

 

A single Intel VROC Driver instance is loaded on top of all VMD domains. This Intel VROC Driver handles communication with SSDs connected to each domain and has the capability to merge those SSDs into RAID volumes. RAID logic is implemented on the software side and is realized by the Intel VROC RAID Engine. In Windows, the current Intel VROC RAID Engine architecture has limited efficiency and, in some cases, can limit theoretical maximum RAID performance.

 

At the theoretical maximum, when millions of operations per second occur, the performance of the Intel VROC driver is negatively impacted by the number of shared structures that cannot be accessed in parallel. This results in increased lock contention and limited performance scalability.

 

Due to the efficiency of the Intel VROC RAID Engine, IOPS numbers are currently limited to: 

  • ~1M IOPS for single RAID volume
  • ~1.4M IOPS for multiple RAID volumes
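
To put those ceilings in perspective, here is a rough sketch that converts an IOPS figure into approximate bandwidth. The IOPS values are the ones quoted above; the block sizes (4 KiB and 8 KiB) are only assumptions for illustration, and real workloads will differ.

```python
# Convert an IOPS ceiling into approximate bandwidth for a given block size.
# The IOPS ceilings come from the list above; the block sizes are assumptions.

SINGLE_VOLUME_IOPS = 1_000_000   # ~1M IOPS, single RAID volume
MULTI_VOLUME_IOPS = 1_400_000    # ~1.4M IOPS, multiple RAID volumes


def bandwidth_gb_per_s(iops: int, block_size_bytes: int) -> float:
    """Return throughput in decimal GB/s for an IOPS rate and block size."""
    return iops * block_size_bytes / 1e9


if __name__ == "__main__":
    for label, iops in (("single volume", SINGLE_VOLUME_IOPS),
                        ("multiple volumes", MULTI_VOLUME_IOPS)):
        for block in (4096, 8192):  # assumed block sizes in bytes
            print(f"{label}, {block} B blocks: "
                  f"{bandwidth_gb_per_s(iops, block):.2f} GB/s")
```

If a workload stays well below these figures, the RAID Engine ceiling is unlikely to be what it is running into.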

 

Intel VROC RAID ENGINE Adjustments:

 

The Intel VROC RAID Engine is multithreaded; by default, there are 10 active threads. The optimal number of Intel RAID Engine threads may vary for different setups and different workloads. This value is adjustable and can be changed by modifying the following registry key:

  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\iaVROC\Parameters\Device\Threads
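
As a minimal sketch, the change could be prepared as a `reg add` command. This assumes that `Threads` is a value name under the `...\Parameters\Device` key and that its type is REG_DWORD; neither is confirmed here, so check the existing entry in regedit first. The helper name `reg_add_command` is hypothetical, and it only builds the command for review rather than running it.

```python
from typing import List

# Registry location quoted above. Splitting it into a key path and a value
# name ("Threads") is an assumption, as is the REG_DWORD type.
KEY_PATH = r"HKLM\SYSTEM\CurrentControlSet\Services\iaVROC\Parameters\Device"
VALUE_NAME = "Threads"


def reg_add_command(threads: int) -> List[str]:
    """Build (but do not run) a `reg add` command that sets the thread count."""
    if threads < 1:
        raise ValueError("thread count must be a positive integer")
    return [
        "reg", "add", KEY_PATH,
        "/v", VALUE_NAME,
        "/t", "REG_DWORD",  # assumed value type
        "/d", str(threads),
        "/f",               # overwrite the existing value without prompting
    ]


if __name__ == "__main__":
    # Print the command for review; it would be run from an elevated prompt.
    print(" ".join(reg_add_command(10)))
```

Exporting the key before changing it makes it easy to roll back.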

 

Reducing the number of active threads will limit maximum performance. Increasing the number of active threads may improve overall maximum performance, especially for workloads generated to multiple RAID volumes in parallel.

 

Users should not set the thread count to a value higher than the number of physical cores on the first socket of the target platform.
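
A quick way to sanity-check a candidate value against that rule is sketched below. The function name and the 10-core figure are hypothetical placeholders; substitute the physical core count of the first socket on your own platform.

```python
def is_valid_thread_count(threads: int, cores_first_socket: int) -> bool:
    """True if threads is positive and within the first socket's physical cores."""
    return 1 <= threads <= cores_first_socket


if __name__ == "__main__":
    cores = 10  # hypothetical: physical cores on the first socket
    for candidate in (8, 10, 20):
        verdict = "ok" if is_valid_thread_count(candidate, cores) else "too high"
        print(f"{candidate} threads on a {cores}-core socket: {verdict}")
```

Note that the limit refers to physical cores, not logical processors, so Hyper-Threading does not raise it.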

 

A platform reboot is required to apply the changes.

 

If you have any questions, please let us know. I will follow up on May 4th, or we can schedule a different date if necessary.

 

Best regards,

 

Bruce C.

Intel Customer Support Technician

Brunol
Beginner
215 Views

Hi Bruce,

Here are the answers to your questions:

 

Are the drives spanning VMDs or are they on the same VMD? 

Yes, we enabled VMD controller spanning.

Is this happening in different RAID configurations?

Yes, I tried RAID 5 with only 3 drives and still got errors and a crash.

You mentioned that the configuration is RAID 10, which should be a 4-drive configuration, but the configuration lists 6 drives. In addition, the report shows RAID 5.

Sorry, my configuration is RAID 5 with a spare drive for hot swap.

I made the adjustments you suggested and I'm still having the same issue. I changed the registry value to 20.

 

Let me know if you need more information from me.

Regards,

Bruno Lopes

 

 

BrusC_Intel
Moderator
200 Views

Hello, Brunol.


Good day,


Thank you for the details and screenshots.


Please allow us to review this, and I will get back to you as soon as possible.


Best regards,


Bruce C.

Intel Customer Support Technician


BrusC_Intel
Moderator
117 Views

Hello, Brunol.


Good day,


Thank you very much for waiting.


We could not determine the exact reason for this error message, nor confirm that it is strictly related to VROC, the configuration, or the hardware.


I completely understand you already tried working with the system manufacturer on this, and since this is a platform feature, the only recommendation right now would be to get back to them for assistance on the debug process. If the OEM determines it is related to Intel® VROC, they can work with us (Intel®) directly on a possible fix.


With that recommendation shared, the thread will now be closed, as there is no further assistance we can provide. If you require any assistance in the future, you can always contact us by opening a new thread or via any of the other support methods:

- https://www.intel.com/content/www/us/en/support/contact-support.html

- Remember to select your appropriate location.


Best regards,


Bruce C.

Intel Customer Support Technician

