Issue with using Data Streaming Accelerator

Johnnyjax
Beginner

Hello, I am having trouble with both dsa-perf-micros and the accel-config test. The problem seems to be related to missing PASID support.

Running "$ ./scripts/setup_dsa.sh configs/4e1w-d.conf" in dsa-perf-micros gives this error:

disabled dsa0
enabled 1 device(s) out of 1
failed in dsa0/wq0.0
enabled 0 wq(s) out of 1
Error[0x80110000] dsa0/wq0.0: Unknown error

 

Running "accel-config test" after building gives this output, which reports no PASID support:

run test_libaccfg
__accfg_test_skip: explicit skip test_libaccfg:928
device has no pasid support, skipping tests
test-libaccfg: SKIP
libaccfg: accfg_unref: context 0x560f2c1682a0 released

 

cat /proc/cmdline gives

"BOOT_IMAGE=/vmlinuz-6.5.0-14-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro intel_iommu=on,sm_on"

 

cat /boot/config-6.5.0-14-generic | grep CONFIG_INTEL_IOMMU gives:

CONFIG_INTEL_IOMMU=y
CONFIG_INTEL_IOMMU_SVM=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
# CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_PERF_EVENTS=y

and

cat /boot/config-6.5.0-14-generic | grep CONFIG_INTEL_IDXD gives

CONFIG_INTEL_IDXD_BUS=m
CONFIG_INTEL_IDXD=m
# CONFIG_INTEL_IDXD_COMPAT is not set
CONFIG_INTEL_IDXD_SVM=y
CONFIG_INTEL_IDXD_PERFMON=y
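
A kernel config dump like the ones above can be checked mechanically. The following Python sketch parses the `CONFIG_*` lines and reports which options of interest are missing. The function names and the `WANTED` list are illustrative assumptions, not part of accel-config or the idxd driver, and the sample text is a subset of the output quoted in this post.

```python
def parse_kconfig(text):
    """Map each CONFIG_* option in a kernel config dump to its value.

    Options written as '# CONFIG_FOO is not set' are recorded as 'n'.
    """
    suffix = " is not set"
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


# Assumption of this sketch: these are the options worth checking for
# SVA/PASID use with idxd, based only on what the thread itself greps for.
WANTED = ("CONFIG_INTEL_IOMMU", "CONFIG_INTEL_IOMMU_SVM",
          "CONFIG_INTEL_IDXD", "CONFIG_INTEL_IDXD_SVM")


def missing_options(options, wanted=WANTED):
    """Return the wanted options that are neither built in (y) nor modular (m)."""
    return [name for name in wanted if options.get(name) not in ("y", "m")]


sample = """\
CONFIG_INTEL_IOMMU=y
CONFIG_INTEL_IOMMU_SVM=y
# CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set
CONFIG_INTEL_IDXD=m
CONFIG_INTEL_IDXD_SVM=y
"""
opts = parse_kconfig(sample)
print(missing_options(opts))
print(opts["CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON"])
```

With this sample nothing is reported missing. The scalable-mode default is recorded as `n`, which is consistent with `sm_on` having to be passed on the kernel command line.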

My server is a Sapphire Rapids server. Here is the baseboard configuration:

Base Board Information
Manufacturer: Supermicro
Product Name: X13DEM
Version: 1.02A
Serial Number: HM22BS019943
Asset Tag: Base Board Asset Tag
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Part Component
Chassis Handle: 0x0003
Type: Motherboard
Contained Object Handles: 0

 

Here's the CPU information:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Silver 4416+
CPU family: 6
Model: 143
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
Stepping: 8
CPU max MHz: 3900.0000
CPU min MHz: 800.0000

 

Here is the relevant part of the output of sudo lspci -vvv -s 75:01.0:

Capabilities: [220 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable+, Smallest Translation Unit: 00
Capabilities: [230 v1] Process Address Space ID (PASID)
PASIDCap: Exec- Priv+, Max PASID Width: 14
PASIDCtl: Enable+ Exec- Priv+
Capabilities: [240 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 00000200, Page Request Allocation: 00000200
Kernel driver in use: idxd
Kernel modules: idxd
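
To compare lspci output like this before and after a configuration change, the enable flags can be pulled out of the `ATSCtl`, `PASIDCtl`, and `PRICtl` lines. The Python sketch below does that with a regular expression. The function name is hypothetical and the sample is copied from the output above. It only reads the PCI capability flags and says nothing about whether the idxd driver actually turned on SVA.

```python
import re

# Matches control lines such as "PASIDCtl: Enable+ Exec- Priv+" and captures
# the capability prefix (ATS, PASID, PRI) and the sign after "Enable".
CTL_RE = re.compile(r"\b(ATS|PASID|PRI)Ctl:\s*Enable([+-])")


def capability_state(lspci_text):
    """Return a dict such as {'ATS': True, 'PRI': False} for the lines found."""
    return {m.group(1): m.group(2) == "+" for m in CTL_RE.finditer(lspci_text)}


sample = """\
Capabilities: [220 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable+, Smallest Translation Unit: 00
Capabilities: [230 v1] Process Address Space ID (PASID)
PASIDCap: Exec- Priv+, Max PASID Width: 14
PASIDCtl: Enable+ Exec- Priv+
Capabilities: [240 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
"""
state = capability_state(sample)
print(state)  # {'ATS': True, 'PASID': True, 'PRI': False}
```

For the output in this post, ATS and PASID are reported as enabled while PRI is not (`PRICtl: Enable-`).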

 

Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Please note that we are reviewing the reported query and will get back to you with an update at the earliest opportunity.

 

Regards,

Megha K


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Please note that we have reviewed the reported issue and request that you help us with the details below.

 

Kernel Boot Parameters:

Run grep -i vtd /etc/default/grub and grep -i vtx /etc/default/grub

Look for options like intel_vtd.active=off or intel_vtd_cmt_mode=off that might disable VT-d functionalities.

 

VT-d Status:

Run grep -i vtd /proc/cpuinfo and grep -i vtx /proc/cpuinfo

Look for entries like "VT-d: active" or "VT-d: disabled" to confirm VT-d activation.

 

Driver Status:

Run dmesg | grep -i prictl or journalctl | grep -i prictl

Look for any error messages or information about PRICtl driver loading or initialization issues.

If VT-d is disabled, investigate the cause through BIOS settings or kernel boot parameters.

 

In the BIOS settings, please make sure that "Opt-Out Illegal MSI Mitigation" is enabled, "PRS Capability for PCIe" is enabled, and "Limit CPU PA to 46 bits" is disabled.

 

Please consult the system or motherboard documentation for specific BIOS settings related to VT-d and VT-x, and make sure no additional configuration is needed at the BIOS level.

 

Also, the Intel® Data Streaming Accelerator User Guide (https://www.intel.com/content/www/us/en/content-details/759709/intel-data-streaming-accelerator-user-guide.html?DocID=759709), page 15, describes how to check the DSA device from the OS:

 

Run lspci -vvv -s 6a:01.0

 

The output should be like the following:

 

Capabilities: [220 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable+, Smallest Translation Unit: 00

 

 

Run dmesg | grep -i idxd. If you see “Unable to turn on SVA feature”, VT-d scalable mode may not be enabled by default. In that case, reboot with “intel_iommu=on,sm_on” added to the kernel command line to enable VT-d scalable mode.

Run dmesg | grep idxd

Confirm whether SVA is enabled. If SVA cannot be turned on:

  • Add this grub config command in /etc/default/grub file:

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on,sm_on iommu=on"

  • Run command:

update-grub

  • Reboot the system.
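
After the reboot, it is worth confirming that the options actually reached the running kernel. A minimal Python sketch of that check, assuming the `intel_iommu=` value is a comma-separated list as in the thread; the function names are hypothetical and the sample string is the `/proc/cmdline` content quoted in the first post:

```python
def intel_iommu_options(cmdline):
    """Return the set of comma-separated values passed to intel_iommu=."""
    values = set()
    for token in cmdline.split():
        if token.startswith("intel_iommu="):
            values.update(v for v in token.split("=", 1)[1].split(",") if v)
    return values


def scalable_mode_requested(cmdline):
    """True when both 'on' and 'sm_on' were passed to intel_iommu=."""
    return {"on", "sm_on"} <= intel_iommu_options(cmdline)


# Sample taken from the thread; on a live system, read /proc/cmdline instead.
cmdline = ("BOOT_IMAGE=/vmlinuz-6.5.0-14-generic "
           "root=/dev/mapper/ubuntu--vg-ubuntu--lv ro intel_iommu=on,sm_on")
print(scalable_mode_requested(cmdline))
```

This only shows that scalable mode was requested on the command line. Whether the IOMMU and the idxd driver then enable SVA still has to be confirmed from dmesg.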

 

If the commands above do not reveal the issue, please report back their output so that we can review further.

 

Regards,

Megha K


Johnnyjax
Beginner

Hello,

Thank you for your continued support. I've made several changes to our system based on your recommendations, but I'm still encountering some issues.

  1. BIOS Settings Adjustments:

    • Enabled "Opt-Out Illegal MSI Mitigation."
    • Could not find "PRS Capability for PCIe" or "Limit CPU PA to 46 bits." in the BIOS settings
    • Confirmed that both VT-d and VT-x are enabled in the BIOS
  2. GRUB Configuration:

    • Updated GRUB_CMDLINE_LINUX_DEFAULT to "intel_iommu=on,sm_on iommu=on" in the GRUB configuration file.
  3. System Changes Observed:

    • After these changes, the command sudo lspci -vvv -s xx:xx.x shows that the ATS, PASID, and PRI capabilities are enabled, with the following output:
      Capabilities: [220 v1] Address Translation Service (ATS)
      ATSCap: Invalidate Queue Depth: 00
      ATSCtl: Enable+, Smallest Translation Unit: 00
      Capabilities: [230 v1] Process Address Space ID (PASID)
      PASIDCap: Exec- Priv+, Max PASID Width: 14
      PASIDCtl: Enable+ Exec- Priv+
      Capabilities: [240 v1] Page Request Interface (PRI)
      PRICtl: Enable+ Reset-
      PRISta: RF- UPRGI- Stopped+
      Page Request Capacity: 00000200, Page Request Allocation: 00000200
      Kernel driver in use: idxd
      Kernel modules: idxd
  4. Issues with Work Queues:

    • Able to enable Dedicated Work Queues (DWQs) but not Shared Work Queues.
    • Even with DWQs enabled, PASID does not appear to be enabled (cat /sys/bus/dsa/devices/dsa0/pasid_enabled returns 0), although the lspci output above shows it as enabled.

      Running "dmesg | grep idxd" gives:

      [ 10.399537] idxd 0000:75:01.0: enabling device (0144 -> 0146)
      [ 10.399586] idxd 0000:75:01.0: Unable to turn on user SVA feature.
      [ 10.469665] idxd 0000:75:01.0: Intel(R) Accelerator Device (v100)
      [ 10.470148] idxd 0000:75:02.0: enabling device (0140 -> 0142)
      [ 10.470217] idxd 0000:75:02.0: Unable to turn on user SVA feature.
      [ 10.633482] idxd 0000:75:02.0: Intel(R) Accelerator Device (v100)
      [ 10.633717] idxd 0000:f2:01.0: enabling device (0144 -> 0146)
      [ 10.633797] idxd 0000:f2:01.0: Unable to turn on user SVA feature.
      [ 10.648503] idxd 0000:f2:01.0: Intel(R) Accelerator Device (v100)
      [ 10.648605] idxd 0000:f2:02.0: enabling device (0140 -> 0142)
      [ 10.648629] idxd 0000:f2:02.0: Unable to turn on user SVA feature.
      [ 10.656282] idxd 0000:f2:02.0: Intel(R) Accelerator Device (v100)
      [ 2029.559277] idxd 0000:75:01.0: No shared wq support but configured.
      [ 2052.000604] idxd 0000:f2:01.0: No shared wq support but configured.
      [ 2609.360100] idxd 0000:75:01.0: No shared wq support but configured.

  5. Errors Encountered:

    • Attempting to run dsa-perf-micros test with DWQs led to errors related to a timeout polling the completion records. The dmesg logs show repeated instances of the following errors:
      [15371.702030] dmar12: Invalid page request: 75100001 7f940eeff0b7
      [15371.702031] idxd 0000:75:02.0: err[2]: 0x00007f940eefe000
      [15371.702035] IOMMU: dmar12: Page request without PASID
  6. Additional Checks:

    • Commands like grep -i vtd /etc/default/grub, grep -i vtx /etc/default/grub, grep -i vtd /proc/cpuinfo, grep -i vtx /proc/cpuinfo, dmesg | grep -i prictl, and journalctl | grep -i prictl yielded no results.

I would greatly appreciate any further assistance or insights you can provide to help resolve these issues.

Thank you.
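
The dmesg output in this post can be summarized per device, which makes it easier to see that every idxd device reports the same SVA failure. The Python sketch below groups idxd messages by PCI address and lists the devices that logged "Unable to turn on user SVA feature". The function names are hypothetical and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Matches e.g. "[ 10.399586] idxd 0000:75:01.0: Unable to turn on ..." and
# captures the PCI address (domain:bus:device.function) and the message.
LINE_RE = re.compile(r"idxd ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): (.*)")


def idxd_messages(dmesg_text):
    """Group idxd driver messages by PCI address."""
    by_device = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            by_device[match.group(1)].append(match.group(2).strip())
    return dict(by_device)


def devices_without_sva(by_device):
    """Return the devices whose log contains the user SVA failure message."""
    return sorted(dev for dev, msgs in by_device.items()
                  if any("Unable to turn on user SVA" in msg for msg in msgs))


sample = """\
[ 10.399537] idxd 0000:75:01.0: enabling device (0144 -> 0146)
[ 10.399586] idxd 0000:75:01.0: Unable to turn on user SVA feature.
[ 10.469665] idxd 0000:75:01.0: Intel(R) Accelerator Device (v100)
[ 10.633797] idxd 0000:f2:01.0: Unable to turn on user SVA feature.
[ 2029.559277] idxd 0000:75:01.0: No shared wq support but configured.
"""
messages = idxd_messages(sample)
print(devices_without_sva(messages))
```

On a live system the input could come from the output of `dmesg`, which usually needs root privileges to read.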

Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Please note that we are reviewing the reported query and will get back to you with an update at the earliest opportunity.

 

Regards,

Megha K


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Since the commands we shared did not reveal the issue, please share their output so that we can review further.

 

Regards,

Megha k


Johnnyjax
Beginner

Hi Meghak,

The issue is that there is no shared WQ support and PASID still does not appear to be enabled, despite the BIOS and GRUB changes described in my previous post. The lspci, pasid_enabled, and dmesg details are the same as in that post.

I would greatly appreciate any further assistance or insights you can provide to help resolve these issues.

Thank you.

Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Please note we are currently reviewing the issue reported and will provide you with a solution as soon as possible.

 

Regards,

Megha k


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

For the reported issue, please try loading the vfio-pci driver into the kernel and then run the “dmesg | tail” command to see whether the driver loads successfully.

 

Please let us know the outcome.

 

Regards,

Megha K


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

This is a follow-up post asking you to confirm the details requested in the previous post.

 

We await your response to assist you further.

 

Regards,

Megha K


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

We have not received a reply from you, and as such, we will be closing your case.

 

If you want to continue support, please reply to this post and we will reopen your case or create a new one so that we can continue to support you.

 

Regards,

Megha K


Johnnyjax
Beginner

Hello,

 

Sorry for the late reply. I loaded the vfio-pci driver into the kernel and then ran “dmesg | tail”, which showed that the vfio-pci driver was enabled: "[56273.148644] vfio-pci 0000:f3:02.0: enabling device (0000 -> 0002)". Despite this, I still get the same errors as before: "cat /sys/bus/dsa/devices/dsa0/pasid_enabled" returns 0, which indicates that there is no PASID support, and "dmesg | grep idxd" still shows "idxd 0000:75:01.0: No shared wq support but configured."

I've been able to use the DSA device within DPDK using the vfio-pci driver, but I still can't use it through the kernel idxd driver outside of DPDK. It still shows that PASID is not enabled and that there is no shared work queue support.

 

Thanks.
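
Since the same DSA devices are used with vfio-pci under DPDK and with idxd otherwise, it helps to check which driver each device is currently bound to before reading `pasid_enabled`. The Python sketch below reads both from sysfs. The function names and the `sysfs_root` parameter (added so the code can be tried against a test directory) are assumptions of this sketch; the sysfs paths are the ones used in this thread plus the standard per-device `driver` symlink under `/sys/bus/pci/devices`.

```python
import os


def bound_driver(bdf, sysfs_root="/sys"):
    """Return the kernel driver bound to PCI device `bdf`, or None if unbound."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    if not os.path.islink(link):
        return None
    # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
    return os.path.basename(os.readlink(link))


def pasid_enabled(device="dsa0", sysfs_root="/sys"):
    """Return True/False from the idxd pasid_enabled attribute, None if absent."""
    path = os.path.join(sysfs_root, "bus", "dsa", "devices", device,
                        "pasid_enabled")
    try:
        with open(path) as handle:
            return handle.read().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    # PCI addresses taken from the dmesg output earlier in the thread.
    for bdf in ("0000:75:01.0", "0000:f2:01.0"):
        print(bdf, bound_driver(bdf))
    print("dsa0 pasid_enabled:", pasid_enabled("dsa0"))
```

A device bound to vfio-pci will not show up under `/sys/bus/dsa/devices`, in which case `pasid_enabled` returns `None` rather than `False`.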

Meghak
Employee

Hi Johnnyjax,

 

Thank you for confirming the details.

 

Please note we are currently reviewing the issue reported and will provide you with a solution as soon as possible.

 

Regards,

Megha k



Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel community.

 

Please accept our sincere apologies for the delay in response.

 

Please note that we are still waiting for an update from our engineering team, which is checking the usage of DSA outside the DPDK environment.


We will provide you an update as soon as possible.

 

Regards,

Megha k


Meghak
Employee

Hi Johnnyjax,

 

Thank you for posting on Intel Community.

 

Please note we have reviewed your query and at this point we do not have support for DSA outside the DPDK environment.

 

DSA (and other accelerators) were designed to utilize DPDK by providing a user-space programming environment for high-performance packet processing applications.

 

DPDK (Data Plane Development Kit) is a set of libraries and drivers that provides high-performance packet processing frameworks for building software-defined applications.

 

Please let us know if we may proceed with case closure if there are no further queries.

 

Regards,

Megha K


Meghak
Employee

Hi Johnnyjax,

 

Greetings for the day!

 

Please note that we have not received a reply from you, and as such, we will be closing your case.

 

If you want to continue support, please reply to this post and we will reopen your case or create a new one so that we can continue to support you.

 

Best Regards,

Megha k

 

