
On Ubuntu 24.04 on an SRP EE server, the DSA driver idxd reports "Unable to turn on user SVA feature"

lihengzh
Employee
1,935 Views

Hi,

I encountered a blocking issue when enabling a DSA work queue (wq) on Ubuntu 24.04 on an SRP EE server. Could any expert offer some help?

Thanks a lot!

 

root@npg-l2-1uspree-01:~# accel-config enable-wq dsa0/wq0.0
failed in dsa0/wq0.0
enabled 0 wq(s) out of 1
Error[0x80110000] dsa0/wq0.0: Unknown error

root@npg-l2-1uspree-01:~# dmesg | grep idxd
[ 8.738647] idxd 0000:f6:01.0: Unable to turn on user SVA feature.
[ 8.814550] idxd 0000:f6:01.0: Intel(R) Accelerator Device (v100)

 

Here is the kernel version:

root@npg-l2-1uspree-01:~# cat /proc/version
Linux version 6.8.1-1002-realtime (buildd@lcy02-amd64-007) (x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-23ubuntu4) 13.2.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #2-Ubuntu SMP PREEMPT_RT Tue May 21 21:13:36 UTC 2024

Here is the OS version:

root@npg-l2-1uspree-01:~# cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
root@npg-l2-1uspree-01:~#

 

Other DMA devices
=================
0000:f6:01.0 'Device 0b25' unused=idxd,igb_uio,vfio-pci

 

0 Kudos
10 Replies
MACM
Employee
1,868 Views

Hi Liheng,


Greetings from Intel.


Hope you are doing well.


Could you please let us know the current firmware versions, so that we can look into the issue and proceed accordingly?


Best regards,

Mohammed Ali


0 Kudos
MACM
Employee
1,841 Views

Hello Liheng,

 

Thank you for contacting Intel.

 

This is the first follow-up regarding the issue you reported to us.


We wanted to inquire whether you had the opportunity to review the plan of action we provided.

 

Feel free to reply to this email, and we'll be more than happy to assist you further.

 

Regards,

Ali

Intel Customer Support


0 Kudos
Sachinks
Employee
1,819 Views

Hello lihengzh,


Greetings!


Please provide the system details below so that we can check this further.


1) The motherboard you are using (model name)

2) The processor details


You can also refer to the guide below for troubleshooting this issue:


Issue: when testing the Intel® Data Streaming Accelerator with dsa-perf-micros and the accel-config test, the following error output is shown.


Running ./scripts/setup_dsa.sh configs/4e1w-d.conf in dsa-perf-micros gives this error:


disabled dsa0
enabled 1 device(s) out of 1
failed in dsa0/wq0.0
enabled 0 wq(s) out of 1
Error[0x80110000] dsa0/wq0.0: Unknown error


We need to analyze the information from DSA and Linux commands together and look for inconsistencies or missing elements related to VT-d and PRICtl functionalities.


Kernel Boot Parameters: 


Run grep -i vtd /etc/default/grub and grep -i vtx /etc/default/grub


Look for options like intel_vtd.active=off or intel_vtd_cmt_mode=off that might disable VT-d functionalities.


VT-d Status: 


Run grep -i vtd /proc/cpuinfo and grep -i vtx /proc/cpuinfo


Look for entries like "VT-d: active" or "VT-d: disabled" to confirm VT-d activation.
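As a complementary check (a minimal sketch, not part of the guide quoted above): on mainline Linux, VT-d is controlled by the intel_iommu= boot parameter, and its status is normally reported in the kernel log rather than in /proc/cpuinfo. The helper name iommu_enabled_in_log is hypothetical, and the exact log wording "DMAR: IOMMU enabled" is an assumption that may differ between kernel versions.

```shell
# iommu_enabled_in_log (hypothetical helper): read kernel log text on stdin
# and succeed if it contains the Intel IOMMU enablement message.
# Assumption: the kernel logs "DMAR: IOMMU enabled" when intel_iommu=on is set.
iommu_enabled_in_log() {
    grep -q 'DMAR: IOMMU enabled'
}

# Usage on a live system (reading the kernel log usually requires root):
#   dmesg | iommu_enabled_in_log && echo "VT-d reported as enabled"
```

If the message is missing, that only shows the log line was not found; it does not by itself prove that VT-d is disabled in the BIOS.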


Driver Status: 


Run dmesg | grep -i prictl or journalctl | grep -i prictl


Look for any error messages or information about PRICtl driver loading or initialization issues. If VT-d is disabled, investigate the cause through BIOS settings or kernel boot parameters.


In the BIOS settings, please make sure the following options are set correctly:


Enable "Opt-Out Illegal MSI Mitigation", enable "PRS Capability for PCIe", and disable "Limit CPU PA to 46 bits".


Please also consult the system or motherboard documentation for BIOS settings related to VT-d and VT-x, and make sure no additional configuration is needed at the BIOS level.
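One way to cross-check the "Limit CPU PA to 46 bits" option from the OS side is to read the physical address width that lscpu reports. The sketch below only extracts the number; the helper name phys_addr_bits is hypothetical, and the interpretation (that a value of exactly 46 may mean the limit is still in effect on this platform) is an assumption, not something the user guide states.

```shell
# phys_addr_bits (hypothetical helper): read `lscpu` output on stdin and
# print the number of physical address bits from the "Address sizes" line.
phys_addr_bits() {
    sed -n 's/^[[:space:]]*Address sizes:[[:space:]]*\([0-9][0-9]*\) bits physical.*/\1/p'
}

# Usage on a live system:
#   lscpu | phys_addr_bits
```

Comparing the value before and after changing the BIOS option is a more reliable signal than the absolute number.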


Page 15 of the Intel® Data Streaming Accelerator User Guide describes how to check the DSA from the OS:

User guide: https://www.intel.com/content/www/us/en/content-details/759709/intel-data-streaming-accelerator-user-guide.html?DocID=759709


Run lspci -vvv -s 6a:01.0


The output should look like the following:


Capabilities: [220 v1] Address Translation Service (ATS)
        ATSCap: Invalidate Queue Depth: 00
        ATSCtl: Enable+, Smallest Translation Unit: 00
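The same lspci output can be checked mechanically. The sketch below tests whether a given control field shows "Enable+"; the helper name cap_enabled is hypothetical, while ATSCtl, PASIDCtl and PRICtl are field names printed by lspci -vvv for the ATS, PASID and PRI capabilities.

```shell
# cap_enabled (hypothetical helper): read `lspci -vvv` output on stdin and
# succeed if the named control field (e.g. ATSCtl, PASIDCtl, PRICtl)
# is shown with "Enable+".
cap_enabled() {
    grep -Eq "^[[:space:]]*$1:[[:space:]]*Enable\+"
}

# Usage on a live system (adjust the PCI address to your DSA device):
#   lspci -vvv -s 6a:01.0 | cap_enabled ATSCtl && echo "ATS enabled"
```

A field shown as "Enable-" makes the helper fail, which can help narrow down which capability was not turned on.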


Run dmesg | grep -i idxd. If you see “Unable to turn on SVA feature”, VT-d scalable mode may not be enabled by default. Reboot with “intel_iommu=on,sm_on” added to the kernel command line to enable VT-d scalable mode.


For example, if dmesg | grep idxd reports "Unable to turn on SVA", add this setting to the /etc/default/grub file:

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on,sm_on iommu=on"

Then run update-grub and reboot the system.
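After the reboot, it is worth confirming that the option actually reached the running kernel. The following is a minimal sketch; the helper name has_sm_on is hypothetical, and it only checks that sm_on appears in the intel_iommu= option list, not that the hardware or firmware honoured it.

```shell
# has_sm_on (hypothetical helper): succeed if the kernel command line passed
# as $1 carries an intel_iommu= option whose comma-separated list has sm_on.
has_sm_on() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        grep -Eq '^intel_iommu=(.*,)?sm_on(,.*)?$'
}

# Usage on a live system:
#   has_sm_on "$(cat /proc/cmdline)" && echo "scalable mode requested"
```

If the check passes and idxd still reports the SVA error, the cause is likely elsewhere (BIOS settings, kernel configuration, or driver support).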


If the steps above do not reveal the cause, please collect the OS logs and the output of the commands described in this article so that we can check this further.




You can also refer to our resource and documentation center to find more information regarding this. Please note that you need an NDA to access the RDC premier account.


RDC link : https://www.intel.com/content/www/us/en/resources-documentation/developer.html#gs.5k1ay6


How to Apply for an Intel® Resource and Documentation Center (RDC) and/or Intel® Developer Zone (Intel® DevZone) Account:

https://www.intel.com/content/www/us/en/support/articles/000058073/programs/resource-and-documentation-center.html


Please provide the system details requested and try the troubleshooting steps above to see whether they resolve the issue.


Regards,

Sachin KS


0 Kudos
lihengzh
Employee
1,751 Views

@MACM @Sachinks 

Thank you very much for the great help!

I didn’t find any more useful information with the following 13 commands. Please let me know if you need more information or traces.

  1. lscpu
  2. modinfo idxd
  3. modinfo vfio
  4. modinfo vfio-pci
  5. cat /proc/cmdline
  6. grep -i vtd /etc/default/grub
  7. grep -i vtx /etc/default/grub
  8. grep -i vtd /proc/cpuinfo
  9. grep -i vtx /proc/cpuinfo
  10. dmesg | grep -i prictl
  11. journalctl | grep -i prictl
  12. dmesg | grep idxd
  13. lspci -vvv -s f6:01.0

Here is the detailed info:

root@npg-l2-1uspree-01:~# lscpu

Architecture:             x86_64

  CPU op-mode(s):         32-bit, 64-bit

  Address sizes:          46 bits physical, 57 bits virtual

  Byte Order:             Little Endian

CPU(s):                   64

  On-line CPU(s) list:    0-63

Vendor ID:                GenuineIntel

  BIOS Vendor ID:         Intel

  Model name:             Intel(R) Xeon(R) Gold 6433N

    BIOS Model name:      Intel(R) Xeon(R) Gold 6433N  CPU @ 1.4GHz

    BIOS CPU family:      179

    CPU family:           6

    Model:                143

    Thread(s) per core:   2

    Core(s) per socket:   32

    Socket(s):            1

    Stepping:             8

    CPU(s) scaling MHz:   69%

    CPU max MHz:          3600.0000

    CPU min MHz:          800.0000

    BogoMIPS:             2800.00

    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect user_shstk avx_vnni avx512_bf16 wbnoinvd dtherm ida arat pln pts hfi vnmi avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk pconfig arch_lbr ibt avx512_fp16 flush_l1d arch_capabilities

Virtualization features:

  Virtualization:         VT-x

Caches (sum of all):

  L1d:                    1.5 MiB (32 instances)

  L1i:                    1 MiB (32 instances)

  L2:                     64 MiB (32 instances)

  L3:                     60 MiB (1 instance)

NUMA:

  NUMA node(s):           1

  NUMA node0 CPU(s):      0-63

Vulnerabilities:

  Gather data sampling:   Not affected

  Itlb multihit:          Not affected

  L1tf:                   Not affected

  Mds:                    Not affected

  Meltdown:               Not affected

  Mmio stale data:        Not affected

  Reg file data sampling: Not affected

  Retbleed:               Not affected

  Spec rstack overflow:   Not affected

  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl

  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization

  Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI BHI_DIS_S

  Srbds:                  Not affected

  Tsx async abort:        Not affected

root@npg-l2-1uspree-01:~# modinfo idxd

filename:       /lib/modules/6.8.1-1002-realtime/kernel/drivers/dma/idxd/idxd.ko.zst

import_ns:      IDXD

author:         Intel Corporation

license:        GPL v2

version:        1.00

srcversion:     23680C20EC237C4A2D5D095

alias:          pci:v00008086d00000CFEsv*sd*bc*sc*i*

alias:          pci:v00008086d00000B25sv*sd*bc*sc*i*

depends:        idxd_bus

retpoline:      Y

intree:         Y

name:           idxd

vermagic:       6.8.1-1002-realtime SMP preempt mod_unload modversions

sig_id:         PKCS#7

signer:         Build time autogenerated kernel key

sig_key:        0E:B3:06:BA:A8:77:54:C7:BE:FE:3C:2C:A6:8C:47:19:47:29:65:3A

sig_hashalgo:   sha512

signature:      97:73:5E:81:DF:45:FE:8E:8C:74:75:9C:7B:A3:A5:6F:77:B9:65:D9:

                85:7E:2D:3B:26:B8:44:5B:D6:E9:B6:AC:4C:15:AB:7A:CF:05:7A:69:

                B1:20:0E:63:84:12:2D:B5:3B:1E:D4:E5:D4:6A:9A:B6:E0:4E:5B:7E:

                AD:15:48:FD:81:31:CE:9A:03:BA:F2:85:0B:E1:D4:FC:D2:47:4C:AC:

                E1:0F:EE:4C:48:DC:46:0B:8E:F8:88:7A:6E:C0:BC:51:0F:4B:DE:1A:

                90:66:94:94:01:C2:E8:33:60:92:17:EF:E1:6B:9E:58:14:86:78:A4:

                44:83:20:0F:79:C2:05:94:F7:F3:FA:5E:A0:12:92:CE:70:65:B1:5A:

                B1:26:61:97:11:C1:5A:7D:60:37:02:37:7D:54:EB:DC:F0:1C:12:67:

                F0:B9:D0:93:54:A2:78:7D:8E:88:E4:58:96:CF:70:03:C0:72:31:B5:

                BA:BC:A7:87:F4:FD:91:36:DE:41:69:78:9C:EF:45:38:E0:68:C7:B4:

                86:78:61:2B:F1:A8:5C:4F:F2:0B:97:C0:C1:53:54:61:DA:93:7F:73:

                36:EE:BD:91:26:16:C7:FA:69:21:BA:17:11:65:D7:07:64:FC:45:E6:

                16:AA:BB:C6:83:EA:78:74:02:7E:A1:7A:9B:04:D8:F0:E3:D2:B8:E9:

                27:7F:63:B8:11:30:0B:34:B7:8E:75:35:06:74:3A:25:E7:35:0D:3F:

                99:E9:34:9D:51:9F:86:A6:7F:BB:0D:2F:E1:27:EE:B3:88:5C:BD:D9:

                7E:79:49:8D:E2:38:75:B4:98:69:C5:20:54:38:EC:D5:21:B4:E8:C7:

                99:0D:90:8C:15:A7:05:EA:B5:4D:DD:40:44:83:4A:98:D9:16:1D:F4:

                F4:61:A1:E5:27:A8:91:CA:53:D7:D8:D9:AD:89:CB:09:A5:4D:39:71:

                FF:F5:99:97:27:10:A6:A1:37:E6:DF:83:7A:9F:E4:B5:01:C9:58:63:

                4B:3A:01:4A:2F:3C:43:B2:2E:14:27:32:FF:8D:43:84:8D:72:E9:7B:

                1F:FF:43:66:01:A6:8B:40:A8:1E:11:B6:EF:FC:8D:B2:43:48:C5:16:

                46:65:7C:05:CE:66:81:48:CC:B5:A1:7C:67:C4:69:62:81:1F:A7:C2:

                CB:65:E6:A9:36:F9:60:37:A8:08:F3:AE:B5:63:6F:E1:D9:4F:BF:B3:

                77:22:3D:D0:BF:77:99:08:17:22:C5:7C:49:24:ED:AF:30:41:E6:BC:

                02:B2:D0:2E:B9:22:7E:4E:8D:70:4B:2F:E8:49:28:93:CB:E0:56:70:

                D6:6A:66:C7:C3:69:36:83:5B:80:83:B4

parm:           sva:Toggle SVA support on/off (bool)

parm:           tc_override:Override traffic class defaults (bool)

root@npg-l2-1uspree-01:~# modinfo vfio

filename:       /lib/modules/6.8.1-1002-realtime/kernel/drivers/vfio/vfio.ko.zst

softdep:        post: vfio_iommu_type1 vfio_iommu_spapr_tce

description:    VFIO - User Level meta-driver

author:         Alex Williamson alex.williamson@redhat.com

license:        GPL v2

version:        0.3

import_ns:      IOMMUFD

alias:          devname:vfio/vfio

alias:          char-major-10-196

import_ns:      IOMMUFD_VFIO

import_ns:      IOMMUFD

srcversion:     71E8940B20D02E0F6F5D499

depends:        iommufd

retpoline:      Y

intree:         Y

name:           vfio

vermagic:       6.8.1-1002-realtime SMP preempt mod_unload modversions

sig_id:         PKCS#7

signer:         Build time autogenerated kernel key

sig_key:        0E:B3:06:BA:A8:77:54:C7:BE:FE:3C:2C:A6:8C:47:19:47:29:65:3A

sig_hashalgo:   sha512

signature:      2C:99:F2:F9:4D:20:4E:F4:0D:74:1C:B2:A1:CD:34:DF:3E:0A:83:39:

                0F:B5:53:DC:0E:84:50:9A:37:D6:32:7F:13:83:3A:CB:7D:08:8F:86:

                D3:B1:16:73:43:61:BF:B7:47:1E:8C:58:B4:22:48:56:57:3C:E7:81:

                F0:C4:CF:6C:9F:CF:06:2E:C2:60:86:96:2F:E4:3F:55:FF:81:DD:6E:

                16:C9:8C:BA:CE:8E:46:9C:75:F6:81:3D:17:F2:C1:64:53:06:5E:3D:

                B4:1C:51:71:C5:DB:83:D7:8D:FD:6C:02:94:92:8C:84:44:53:15:2A:

                F0:A2:48:9B:96:B3:6B:55:5C:0D:95:1A:32:36:2F:74:3A:0E:03:A9:

                4D:29:CD:EA:F0:B6:3D:D6:8C:B3:6F:14:40:4B:A0:A6:4D:B8:F5:38:

                9B:DA:15:8C:E5:55:21:24:93:11:59:21:6A:A5:7D:61:23:8D:F8:CF:

                F2:D1:C8:3A:D1:C6:75:7F:36:34:ED:1B:B8:E6:C4:4F:1A:B1:0E:35:

                29:F7:39:E2:97:45:50:D4:3C:3F:63:56:03:7A:01:53:5C:D8:53:CF:

                D8:DC:3C:CF:5C:A9:83:C1:95:60:86:7D:BA:41:27:11:E0:BA:9A:08:

                42:E0:03:EC:44:63:5D:EB:D4:E6:84:4C:10:59:0A:E9:9B:AB:3C:4A:

                30:06:24:02:1B:3B:3A:C2:8D:66:85:F8:1A:F4:05:E0:AF:A3:51:CF:

                A3:ED:73:41:3E:28:F2:2F:DA:4A:91:4A:68:90:58:E7:D9:21:D7:AF:

                C5:5A:8A:A1:79:E4:79:A9:1C:A5:43:85:DA:33:2F:C5:5E:02:3A:AC:

                09:04:FA:E1:5A:C2:3D:C6:C9:83:05:D8:F1:44:4D:5D:06:CD:AA:F3:

                3A:B9:79:F7:BA:71:5C:EE:7C:25:4D:C1:F2:55:CA:24:B8:36:E2:FE:

                48:E3:29:35:45:0C:4B:D0:98:3C:45:2E:20:6F:41:0E:06:08:02:08:

                01:97:DF:DF:AF:2B:C1:42:02:76:EB:B4:A0:6F:EB:1A:B4:FF:8B:52:

                36:D2:04:F8:EB:72:48:A6:D6:B4:19:50:99:16:4A:36:95:C6:4E:1C:

                00:3A:16:52:09:49:FF:35:C1:E7:CA:E1:8D:84:FE:24:E5:D8:B4:7D:

                38:F7:07:83:85:FF:B9:29:4F:45:7C:56:C0:22:99:3C:B2:59:49:7A:

                85:66:75:0C:C1:69:02:2B:85:01:58:12:E7:D0:10:E2:8B:7E:0A:1C:

                6E:9B:4A:AD:7F:32:B6:1F:C8:5C:3E:A2:B7:E6:FA:A6:D1:65:A4:DA:

                C6:91:73:70:A8:CD:2D:B9:53:C9:D6:49

parm:           enable_unsafe_noiommu_mode:Enable UNSAFE, no-IOMMU mode.  This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel.  If you do not know what this is for, step away. (default: false) (bool)

root@npg-l2-1uspree-01:~# modinfo vfio-pci

filename:       /lib/modules/6.8.1-1002-realtime/kernel/drivers/vfio/pci/vfio-pci.ko.zst

description:    VFIO PCI - User Level meta-driver

author:         Alex Williamson alex.williamson@redhat.com

license:        GPL v2

srcversion:     2961A59104CE4DB12621164

alias:          vfio_pci:v*d*sv*sd*bc*sc*i*

depends:        vfio-pci-core,vfio

retpoline:      Y

intree:         Y

name:           vfio_pci

vermagic:       6.8.1-1002-realtime SMP preempt mod_unload modversions

sig_id:         PKCS#7

signer:         Build time autogenerated kernel key

sig_key:        0E:B3:06:BA:A8:77:54:C7:BE:FE:3C:2C:A6:8C:47:19:47:29:65:3A

sig_hashalgo:   sha512

signature:      63:15:9B:7D:A0:D1:49:1A:46:CF:47:BB:66:73:AB:62:09:74:06:BC:

                03:96:F5:90:C7:B2:04:AB:75:4A:16:A6:98:DB:52:D0:5C:CF:A7:02:

                8E:29:5E:5B:E5:47:C5:67:05:92:62:C0:79:88:03:A6:0A:58:10:29:

                09:7C:6F:44:4C:30:29:97:E7:0A:AB:35:17:BE:4D:32:47:AF:47:61:

                E5:AE:42:18:BC:E4:C2:FD:7B:C5:6C:1D:D7:0B:A0:A1:21:C5:91:99:

                2E:B9:E7:08:0C:27:0A:87:19:BC:D2:93:0C:2F:34:A2:EB:E7:3D:FE:

                03:4E:FB:EC:9A:C0:51:22:AD:41:48:72:BD:0D:27:63:9D:A6:7A:EC:

                D2:1B:64:69:1E:C7:6E:38:54:17:34:16:72:2D:E4:1C:6C:39:B7:E9:

                10:78:97:A8:19:A6:FE:E4:47:D4:A4:96:CA:62:40:01:59:EE:AD:82:

                60:AB:2F:13:E0:97:37:E7:C4:21:77:73:B0:6C:8B:B1:47:F0:23:75:

                79:C1:AA:6A:FA:58:4D:28:9E:FE:08:74:48:7C:69:1D:5C:D3:5D:B8:

                53:2B:0D:B3:F6:5C:26:FE:50:7A:5C:69:D8:15:E0:ED:E3:B2:B0:25:

                3C:E1:27:62:6C:F1:71:06:F1:82:62:F7:72:CC:1E:FC:B6:C1:56:E6:

                F6:F6:F3:D7:2A:9C:3A:F2:99:17:FF:C0:3C:27:40:B7:58:15:34:B3:

                02:D6:A0:7A:4B:5F:71:3C:BF:22:62:BD:3F:66:C0:55:79:E7:56:50:

                EB:AC:64:2C:71:70:88:BA:5C:97:32:15:CD:30:F2:53:13:A3:AF:5B:

                FB:0C:B8:30:B0:13:2F:EB:74:67:8C:71:43:23:C3:91:51:46:FE:92:

                EE:3A:F4:6F:87:DB:67:1E:D4:F8:F3:51:2D:25:95:3E:8F:19:0C:12:

                6E:6F:18:F2:C0:25:2B:7A:A8:E4:37:DB:C6:39:9F:D6:9C:C8:B4:AC:

                38:B6:50:A8:33:30:34:BF:A1:39:8E:89:E2:C2:92:4B:50:F3:3F:D3:

                96:86:E5:73:61:BB:B7:96:CF:2E:34:73:88:7B:81:F3:F0:C7:FF:EA:

                0B:8B:15:F4:F2:48:E6:7A:02:E3:A1:A6:AD:97:91:2C:C1:92:39:27:

                81:A0:9C:0E:05:B7:51:38:54:D0:15:34:2E:28:E1:CB:7E:4E:82:FB:

                F1:FD:6A:87:0D:7C:0E:00:DF:BF:E7:1C:B9:6C:47:9F:FF:D0:CB:E7:

                60:37:D7:B0:69:6F:05:7A:83:66:7D:9D:95:A2:26:D3:5F:6F:63:58:

                54:97:DB:DE:69:3E:6B:34:D9:24:52:06

parm:           ids:Initial PCI IDs to add to the vfio driver, format is "vendor:device[:subvendor[:subdevice[:class[:class_mask]]]]" and multiple comma separated entries can be specified (string)

parm:           nointxmask:Disable support for PCI 2.3 style INTx masking.  If this resolves problems for specific devices, report lspci -vvvxxx to linux-pci@vger.kernel.org so the device can be fixed automatically via the broken_intx_masking flag. (bool)

parm:           disable_vga:Disable VGA resource access through vfio-pci (bool)

parm:           disable_idle_d3:Disable using the PCI D3 low power state for idle, unused devices (bool)

parm:           enable_sriov:Enable support for SR-IOV configuration.  Enabling SR-IOV on a PF typically requires support of the userspace PF driver, enabling VFs without such support may result in non-functional VFs or PF. (bool)

parm:           disable_denylist:Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users. (bool)

root@npg-l2-1uspree-01:~# cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-6.8.1-1002-realtime root=/dev/mapper/ubuntu--vg--00-ubuntu--lv ro intel_iommu=on,sm_on iommu=on vfio_pci.enable_sriov=1 vfio_pci.disable_idle_d3=1 usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=0 crashkernel=auto softlockup_panic=0 audit=0 mce=off hugepagesz=1G hugepages=50 hugepagesz=2M hugepages=0 default_hugepagesz=1G kthread_cpus=0 irqaffinity=0,31,32,63 rcu_nocb_poll nohz=on skew_tick=1 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 isolcpus=managed_irq,domain,1-30,33-62 nohz_full=1-30,33-62 rcu_nocbs=1-30,33-62 nosoftlockup tsc=nowatchdog

root@npg-l2-1uspree-01:~# grep -i vtd /etc/default/grub

root@npg-l2-1uspree-01:~# grep -i vtx /etc/default/grub

root@npg-l2-1uspree-01:~# grep -i vtd /proc/cpuinfo

root@npg-l2-1uspree-01:~# grep -i vtx /proc/cpuinfo

root@npg-l2-1uspree-01:~# dmesg | grep -i prictl

root@npg-l2-1uspree-01:~# journalctl | grep -i prictl

root@npg-l2-1uspree-01:~# dmesg | grep idxd

[    7.356870] idxd 0000:f6:01.0: Unable to turn on user SVA feature.

[    7.522017] idxd 0000:f6:01.0: Intel(R) Accelerator Device (v100)

root@npg-l2-1uspree-01:~# lspci -vvv -s f6:01.0

f6:01.0 System peripheral: Intel Corporation Device 0b25

        Subsystem: Intel Corporation Device 0000

        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-

        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Latency: 0

        NUMA node: 0

        IOMMU group: 1

        Region 0: Memory at 26fffff20000 (64-bit, prefetchable) [size=64K]

        Region 2: Memory at 26fffff00000 (64-bit, prefetchable) [size=128K]

        Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00

                DevCap: MaxPayload 512 bytes, PhantFunc 0

                        ExtTag+ RBE+ FLReset+

                DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+

                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-

                        MaxPayload 512 bytes, MaxReadReq 4096 bytes

                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-

                DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR+

                         10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1

                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-

                         FRS-

                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-

                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+ LTR- 10BitTagReq+ OBFF Disabled,

                         AtomicOpsCtl: ReqEn-

        Capabilities: [80] MSI-X: Enable+ Count=9 Masked-

                Vector table: BAR=0 offset=00002000

                PBA: BAR=0 offset=00003000

        Capabilities: [90] Power Management version 3

                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

        Capabilities: [100 v2] Advanced Error Reporting

                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

                UESvrt: DLP- SDES- TLP+ FCP- CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq- ACSViol-

                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-

                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+

                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-

                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-

                HeaderLog: 00000000 00000000 00000000 00000000

        Capabilities: [150 v1] Latency Tolerance Reporting

                Max snoop latency: 0ns

                Max no snoop latency: 0ns

        Capabilities: [160 v1] Transaction Processing Hints

                Device specific mode supported

                Steering table in TPH capability structure

        Capabilities: [170 v1] Virtual Channel

                Caps:   LPEVC=1 RefClk=100ns PATEntryBits=1

                Arb:    Fixed+ WRR32- WRR64- WRR128-

                Ctrl:   ArbSelect=Fixed

                Status: InProgress-

                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-

                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-

                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=fd

                        Status: NegoPending- InProgress-

                VC1:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-

                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-

                        Ctrl:   Enable+ ID=1 ArbSelect=Fixed TC/VC=02

                        Status: NegoPending- InProgress-

        Capabilities: [200 v1] Designated Vendor-Specific: Vendor=8086 ID=0005 Rev=0 Len=24 <?>

        Capabilities: [220 v1] Address Translation Service (ATS)

                ATSCap: Invalidate Queue Depth: 00

                ATSCtl: Enable+, Smallest Translation Unit: 00

        Capabilities: [230 v1] Process Address Space ID (PASID)

                PASIDCap: Exec- Priv+, Max PASID Width: 14

                PASIDCtl: Enable+ Exec- Priv+

        Capabilities: [240 v1] Page Request Interface (PRI)

                PRICtl: Enable- Reset-

                PRISta: RF- UPRGI- Stopped+

                Page Request Capacity: 00000200, Page Request Allocation: 00000200

        Kernel driver in use: idxd

        Kernel modules: idxd

0 Kudos
Ragulan_Intel
Employee
1,730 Views

Hi MACM,


Greetings!


Thanks for the update. Kindly allow us some time to check on this and we will keep you posted once there is an update.


Regards,

Ragulan_Intel


0 Kudos
MACM
Employee
1,687 Views

Hi lihengzh


Greetings from Intel.


Hope you are doing well.


Please find the information below.


1 - Could you please test using Ubuntu 22.10? I have reached out to the BU, and the expert advised testing on the 22.10 distribution first.

2 - Can you go over the following commands?

Double-check: BIOS:- EDKII -> Socket Configuration -> IIO Configuration -> Intel VT for Directed I/O (VT-d)

Option: No -> Yes


Install ubuntu 21.10

vi /etc/default/grub

GRUB_CMDLINE_LINUX="iommu=pt intel_iommu=on,sm_on idxd.dyndbg idxd.legacy_cdev_load=1 modprobe.blacklist=idxd_uacce"


Check and verify driver loaded

lsmod | grep dxd

idxd 98304 0

idxd_bus 20480 1 idxd


dmesg | grep dxd

Verify idxd driver loaded properly


download

https://github.com/intel/idxd-config/archive/refs/tags/accel-config-v3.4.4.zip


apt-get update

apt install build-essential

apt install autoconf automake autotools-dev libtool pkgconf asciidoc xmlto

apt install uuid-dev libjson-c-dev libkeyutils-dev

cd idxd-config-accel-config-v3.4.4

./autogen.sh

./configure CFLAGS='-g -O2' --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib64 --disable-docs --enable-test=yes

make

make check

make install

cd test/configs

cp 2g2q_user_1.conf dedicated_2g2q_user_1.conf

sed -i.bak 's/"mode":"shared"/"mode":"dedicated"/g' dedicated_2g2q_user_1.conf

sed -i.bak 's/"threshold":15/"threshold":0/g' dedicated_2g2q_user_1.conf
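Before editing the file in place, the two substitutions can be tried on a sample line. This is a minimal sketch: the helper name to_dedicated is hypothetical, and the patterns only match when the configuration is written exactly as "mode":"shared" and "threshold":15, with no spaces around the colon.

```shell
# to_dedicated (hypothetical helper): apply the same two substitutions to
# text on stdin instead of editing a file in place.
to_dedicated() {
    sed -e 's/"mode":"shared"/"mode":"dedicated"/g' \
        -e 's/"threshold":15/"threshold":0/g'
}

# Usage, as an alternative to the cp + sed -i steps:
#   to_dedicated < 2g2q_user_1.conf > dedicated_2g2q_user_1.conf
```

If the output is identical to the input, the patterns did not match and the work queues would still be configured as shared.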


reboot


# accel-config load-config -c configs/dedicated_2g2q_user_1.conf

# accel-config enable-device dsa0

enabled 1 device(s) out of 1

# accel-config enable-wq dsa0/wq0.0

enabled 1 wq(s) out of 1

# accel-config enable-wq dsa0/wq0.1

enabled 1 wq(s) out of 1
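When scripting these steps, the summary lines can be checked instead of read by eye. The sketch below assumes the output format shown above ("enabled N ... out of M"); the helper name all_enabled is hypothetical.

```shell
# all_enabled (hypothetical helper): read accel-config output on stdin and
# succeed only if at least one "enabled N ... out of M" line was seen and
# every such line has N equal to M.
all_enabled() {
    awk '/^enabled [0-9]+ .* out of [0-9]+$/ {
             seen = 1
             if ($2 != $NF) bad = 1
         }
         END {
             if (seen && !bad) exit 0
             exit 1
         }'
}

# Usage on a live system:
#   accel-config enable-wq dsa0/wq0.0 2>&1 | all_enabled || echo "wq not enabled"
```

A line such as "enabled 0 wq(s) out of 1" makes the helper fail, which matches the error reported at the start of this thread.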


./dsa_test -w 0 -f 0x0 -v

./dsa_test -w 0 -l 4096 -o 0x3 -f 0x0 -t200 -v

./dsa_test -w 0 -l 2097152 -o 0x3 -f 0x0 -t200 -v

./dsa_test -w 0 -l 1073741824 -o 0x3 -f 0x0 -t200 -v

The test passes if no error is reported.


Best Regards,

Ali


0 Kudos
MACM
Employee
1,550 Views

Hi lihengzh


Greetings from Intel.


Hope you are doing great.


After some research and testing, and based on other cases showing the same behavior:

1 - The CPU (Intel(R) Xeon(R) Gold 6433N) shows the DSA device as present on the system, per the message “idxd 0000:f6:01.0: Intel(R) Accelerator Device (v100)”. There are no issues with the hardware.

2 - Please go over the kernel and drivers for DSA to make sure everything is installed correctly on the OS, following the Intel® Data Streaming Accelerator User Guide. Make sure you are using one of the supported operating systems described in the guide (page 15): SUSE Linux Enterprise Server (SLES) 15 SP4, Red Hat Enterprise Linux (RHEL) 8.7 and 9.1, or Ubuntu 22.10. We do not provide OS support; please contact the OS vendor.

3 - We suggest using a kernel newer than 6.6, as it contains all the drivers needed.
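The kernel version suggestion can be checked with a small comparison. This is a minimal sketch: the helper name kernel_at_least is hypothetical, it compares only the major and minor numbers, and it treats 6.6 itself as acceptable, which is an assumption since the wording above does not say whether 6.6 qualifies.

```shell
# kernel_at_least (hypothetical helper): succeed if the kernel release in $2
# (e.g. the output of `uname -r`) is at least the MAJOR.MINOR given in $1.
kernel_at_least() {
    kal_want_major=${1%%.*}
    kal_want_minor=${1#*.}
    kal_want_minor=${kal_want_minor%%[!0-9]*}
    kal_have_major=${2%%.*}
    kal_have_minor=${2#*.}
    kal_have_minor=${kal_have_minor%%[!0-9]*}
    [ "$kal_have_major" -gt "$kal_want_major" ] ||
        { [ "$kal_have_major" -eq "$kal_want_major" ] &&
          [ "$kal_have_minor" -ge "$kal_want_minor" ]; }
}

# Usage on a live system:
#   kernel_at_least 6.6 "$(uname -r)" && echo "kernel is 6.6 or newer"
```

The 6.8.1-1002-realtime kernel reported earlier in this thread satisfies this check, so the kernel version alone does not explain the error.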

4 - We suggest the following troubleshooting:

a - Could you please test using Ubuntu 22.10? I have reached out to the BU, and the expert advised testing on the 22.10 distribution first.

b - Can you go over the following commands?

Double-check: BIOS:- EDKII -> Socket Configuration -> IIO Configuration -> Intel VT for Directed I/O (VT-d)

Option: No -> Yes

 

Install ubuntu 21.10

vi /etc/default/grub

GRUB_CMDLINE_LINUX="iommu=pt intel_iommu=on,sm_on idxd.dyndbg idxd.legacy_cdev_load=1 modprobe.blacklist=idxd_uacce"

 

Check and verify driver loaded

lsmod | grep dxd

idxd 98304 0

idxd_bus 20480 1 idxd

 

dmesg | grep dxd

Verify idxd driver loaded properly

 

download

https://github.com/intel/idxd-config/archive/refs/tags/accel-config-v3.4.4.zip

 

apt-get update

apt install build-essential

apt install autoconf automake autotools-dev libtool pkgconf asciidoc xmlto

apt install uuid-dev libjson-c-dev libkeyutils-dev

cd idxd-config-accel-config-v3.4.4

./autogen.sh

./configure CFLAGS='-g -O2' --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib64 --disable-docs --enable-test=yes

make

make check

make install

cd test/configs

cp 2g2q_user_1.conf dedicated_2g2q_user_1.conf

sed -i.bak 's/"mode":"shared"/"mode":"dedicated"/g' dedicated_2g2q_user_1.conf

sed -i.bak 's/"threshold":15/"threshold":0/g' dedicated_2g2q_user_1.conf

 

reboot

 

# accel-config load-config -c configs/dedicated_2g2q_user_1.conf

# accel-config enable-device dsa0

enabled 1 device(s) out of 1

# accel-config enable-wq dsa0/wq0.0

enabled 1 wq(s) out of 1

# accel-config enable-wq dsa0/wq0.1

enabled 1 wq(s) out of 1

 

./dsa_test -w 0 -f 0x0 -v

./dsa_test -w 0 -l 4096 -o 0x3 -f 0x0 -t200 -v

./dsa_test -w 0 -l 2097152 -o 0x3 -f 0x0 -t200 -v

./dsa_test -w 0 -l 1073741824 -o 0x3 -f 0x0 -t200 -v

The test passes if no error is reported.

 

For more details about OS driver enablement, please check with the OS vendor.


Best regards,

Ali


0 Kudos
MACM
Employee
1,223 Views

Hello lihengzh

 

Thank you for contacting Intel.

 

This is the first follow-up regarding the issue you reported to us.


We wanted to inquire whether you had the opportunity to review the plan of action we provided.

 

Feel free to reply to this email, and we'll be more than happy to assist you further.

 

Regards,

Ali

Intel Customer Support



0 Kudos
MACM
Employee
1,076 Views

Hello lihengzh,

 

Thank you for contacting Intel.

 

This is the second follow-up regarding the reported issue. We're eager to ensure a swift resolution and would appreciate any updates or additional information you can provide.

 

Please feel free to respond to this email at your earliest convenience.

 

Best regards,

Ali

Intel Customer Support


0 Kudos
MACM
Employee
953 Views

Hello lihengzh,

 

Thank you for contacting Intel.

 

This is the third follow-up regarding the reported issue. We're committed to ensuring a swift resolution and would greatly appreciate any updates or additional information you can provide.

 

If we don't hear back from you soon, we'll assume the issue has been resolved and will proceed to close the case.

 

Please feel free to respond to this email at your earliest convenience.

 

Best regards,

Mohammed Ali CM

Intel Customer Support


0 Kudos