[Intel C625 QAT driver] userMemAlloc() can allocate memory but with wrong alignment

Alfago12
New Contributor I
2,769 Views

Hello Team,

userMemAlloc:380 can allocate memory, but the allocation has the wrong alignment. What is the likely root cause? Is it still caused by USDM failing to allocate a 2MB slab? Many thanks in advance.

Also, if the wrong alignment is caused by USDM failing to allocate a 2MB slab, can I use kvzalloc() to replace kmalloc() in the QAT driver/USDM? Does the QAT driver/USDM require "physically contiguous memory in the kernel's own address space"?
https://lwn.net/Articles/711653/

 

Version information:
QAT driver: QAT1.7.Upstream.L.1.0.3-42
QAT engine: QAT_Engine-0.5.40, uses qat_contig_mem as memory driver.

1. The Intel QAT driver uses USDM to allocate memory. "The USDM Memory Driver works by allocating 2MB slabs and dividing them up for individual allocations. The failures you are seeing in your log are from not being able to allocate a 2MB slab of contiguous memory from the kernel."

The device's memory is highly fragmented: there are few free chunks of 2MB or larger.
cat /proc/buddyinfo
Node 0, zone DMA 2 1 0 1 3 2 2 1 2 0 0 0
Node 0, zone DMA32 3721 455 163 120 41 15 7 3 2 7 5 1
Node 0, zone Normal 273 237 187 128 66 32 28 5 4 0 0 0
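
A minimal sketch of how the listing above can be read, assuming 4 KiB base pages so that an order-9 block is 2 MiB (the size of one USDM slab). Each column of /proc/buddyinfo is the number of free blocks of one order, starting at order 0. The function and constant names are illustrative only and are not part of the QAT driver or USDM.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB base pages (typical on x86-64)
SLAB_ORDER = 9     # 2**9 pages * 4 KiB = 2 MiB, the USDM slab size

# Snapshot copied from the post above.
BUDDYINFO = """\
Node 0, zone DMA 2 1 0 1 3 2 2 1 2 0 0 0
Node 0, zone DMA32 3721 455 163 120 41 15 7 3 2 7 5 1
Node 0, zone Normal 273 237 187 128 66 32 28 5 4 0 0 0
"""


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, from order 0]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <count0> <count1> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        zones[(int(fields[1]), fields[3])] = [int(x) for x in fields[4:]]
    return zones


def blocks_at_least(counts, order):
    """Count free blocks whose order is >= the requested order."""
    return sum(counts[order:])


zones = parse_buddyinfo(BUDDYINFO)
for (node, zone), counts in zones.items():
    n = blocks_at_least(counts, SLAB_ORDER)
    print(f"node {node} zone {zone}: {n} free block(s) of 2 MiB or more")
```

For this snapshot the Normal and DMA zones have no free block of order 9 or higher, while DMA32 has 13. To inspect a live system, the same parser can be fed the contents of /proc/buddyinfo instead of the pasted string.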

Nov 26 13:01:27 user.err kernel: [4999514.599784] userMemAlloc:380 Unable to allocate memory slab or wrong alignment: 000000006faebeda
Nov 26 13:01:27 user.err kernel: [4999514.707096] dev_mem_alloc:566 userMemAlloc failed
Nov 26 13:01:27 user.info kernel: [4999514.765474] xxxxx[15568]: segfault at 40 ip 00007f8ffa1938d0 sp 00007ffc35618518 error 4 in libjemalloc.so.2[7f8ffa13b000+76000]
Nov 26 13:01:27 user.info kernel: [4999514.765478] Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 06 48 8d 15 06 15 02 00 48 c1 e8 12 0f b6 c8 48 8b 04 ca c3 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 46 40 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 89 56 40 c3 66
Nov 26 13:01:29 user.warn kernel: [4999516.600569] warn_alloc: 4 callbacks suppressed
Nov 26 13:01:29 user.warn kernel: [4999516.600570] xxxxx: page allocation failure: order:9, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
Nov 26 13:01:29 user.debug kernel: [4999516.600572] CPU: 2 PID: 15572 Comm: xxxxx Kdump: loaded Tainted: G W O 5.4.0 1
Nov 26 13:01:29 user.debug kernel: [4999516.600573] Hardware name: GIGABYTE MN32-EC2-F5/MN32-EC2-F5, BIOS F03 06/04/2019
Nov 26 13:01:29 user.debug kernel: [4999516.600573] Call Trace:
Nov 26 13:01:29 user.debug kernel: [4999516.600577] dump_stack+0x50/0x70
Nov 26 13:01:29 user.debug kernel: [4999516.600578] warn_alloc.cold+0x73/0xd7
Nov 26 13:01:29 user.debug kernel: [4999516.600580] __alloc_pages_slowpath+0x8e3/0xaa0
Nov 26 13:01:29 user.debug kernel: [4999516.600581] ? cdev_put.part.0+0x20/0x20
Nov 26 13:01:29 user.debug kernel: [4999516.600582] __alloc_pages_nodemask+0x222/0x250
Nov 26 13:01:29 user.debug kernel: [4999516.600584] kmalloc_large_node+0x40/0xa0
Nov 26 13:01:29 user.debug kernel: [4999516.600585] __kmalloc_node+0x12b/0x290
Nov 26 13:01:29 user.debug kernel: [4999516.600587] 0xffffffffa02848cd
Nov 26 13:01:29 user.debug kernel: [4999516.600588] 0xffffffffa02849cf
Nov 26 13:01:29 user.debug kernel: [4999516.600589] do_vfs_ioctl+0x3e4/0x640
Nov 26 13:01:29 user.debug kernel: [4999516.600590] ksys_ioctl+0x3a/0x70
Nov 26 13:01:29 user.debug kernel: [4999516.600592] __x64_sys_ioctl+0x16/0x20
Nov 26 13:01:29 user.debug kernel: [4999516.600593] do_syscall_64+0x68/0x3c0
Nov 26 13:01:29 user.debug kernel: [4999516.600594] ? __do_page_fault+0x23d/0x480
Nov 26 13:01:29 user.debug kernel: [4999516.600596] entry_SYSCALL_64_after_hwframe+0x44/0xa9
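
The "order:9" field in the page allocation failure above is the base-2 logarithm of the number of contiguous pages requested. A small sketch of the arithmetic, assuming 4 KiB pages; the helper name is hypothetical:

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB base pages

# Message text copied from the kernel log above (syslog prefix omitted).
LOG_LINE = ("xxxxx: page allocation failure: order:9, "
            "mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null)")


def failed_allocation_bytes(line, page_size=PAGE_SIZE):
    """Return the size in bytes of the failed request, or None if no match."""
    match = re.search(r"page allocation failure: order:(\d+)", line)
    if match is None:
        return None
    return (1 << int(match.group(1))) * page_size


size = failed_allocation_bytes(LOG_LINE)
print(f"{size} bytes = {size // (1024 * 1024)} MiB")
```

Under that assumption the failed request is 512 contiguous pages, that is, 2 MiB, which matches the USDM slab size quoted earlier in the post.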

2. I searched the dmesg output for the regular expression "^.*started .* acceleration engines\n".
In the configuration file:
[SHIM]
NumberCyInstances = 22

But it seems that only 16 Crypto instances are obtained each time.
Line 887: Sep 12 03:39:16 user.info kernel: [ 49.046552] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 893: Sep 12 03:39:16 user.info kernel: [ 50.768547] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 1818: Sep 25 04:41:17 user.info kernel: [ 48.279127] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 1824: Sep 25 04:41:17 user.info kernel: [ 50.000120] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 3075: Sep 29 17:17:18 user.info kernel: [ 48.391143] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 3081: Sep 29 17:17:18 user.info kernel: [ 50.114146] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines

0 Kudos
14 Replies
Victor_G_Intel
Employee
2,732 Views

Hello Alfago12,


Thank you for posting on the Intel® communities.


Please let me review this information internally, and kindly wait for an update.


Once we have more information to share, we will post it on this thread.


Best regards,


Victor G.

Intel Technical Support Technician  


0 Kudos
Victor_G_Intel
Employee
2,719 Views

Hello Alfago12,


Thank you for your patience.


Our strongest recommendation based on your scenario is for you to check the following programmer's guide:


https://www.intel.com/content/www/us/en/content-details/710060/intel-quickassist-technology-software-for-linux-programmer-s-guide-hw-version-1-7.html?DocID=710060


For your first issue, you can try to enable huge pages as noted in section 3.16. As for the second issue, you can check section 4.3.3.1 for guidance on configuration parameters.


Additionally, please bear in mind that there are newer versions of the QAT driver and QAT Engine than the ones you are currently using; we recommend getting the latest versions.


Best regards,


Victor G.

Intel Technical Support Technician  


Victor_G_Intel
Employee
2,704 Views

Hello Demi,


I hope this message finds you well.


I would like to know if you need further assistance.


Regards,


Victor G.

Intel Technical Support Technician


0 Kudos
Alfago12
New Contributor I
2,701 Views

Hi Victor,

 

1. For the first issue, I did not try huge pages because the available memory on the device is very limited.

 

The device's memory is highly fragmented: there are few free chunks of 2MB or larger.
cat /proc/buddyinfo
Node 0, zone DMA 2 1 0 1 3 2 2 1 2 0 0 0
Node 0, zone DMA32 3721 455 163 120 41 15 7 3 2 7 5 1
Node 0, zone Normal 273 237 187 128 66 32 28 5 4 0 0 0

 

Section 3.16 mentions that:

"Use of this capability assumes that a sufficient number of huge pages are allocated in the operating system for the particular use case and configuration."
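
Whether any huge pages are currently allocated in the operating system can be read from /proc/meminfo. The sketch below parses the HugePages_* counters; the sample text is hypothetical and only stands in for a real /proc/meminfo, and the function name is illustrative.

```python
# Hypothetical sample, not taken from the device discussed in this thread.
SAMPLE_MEMINFO = """\
MemTotal:        8046444 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""


def parse_hugepages(text):
    """Return the huge page counters from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # Counters are plain integers; Hugepagesize is given in kB.
            info[key] = int(rest.split()[0])
    return info


info = parse_hugepages(SAMPLE_MEMINFO)
reserved_kb = info["HugePages_Total"] * info["Hugepagesize"]
print(f"{info['HugePages_Total']} huge pages allocated ({reserved_kb} kB)")
```

On a live system the function can be called with the contents of /proc/meminfo. Memory set aside as huge pages is generally not available for ordinary allocations, which is the trade-off behind the memory concern raised in point 1.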

 

2. For the second issue, I checked my configuration against section 4.3.3.1, and it appears to be correct.

 

Regards,

 

0 Kudos
Victor_G_Intel
Employee
2,696 Views

Hello Alfago12,


Thank you for your response.


Please allow us some more time to look into this. We will get back to you as soon as possible.


Best regards,


Victor G.

Intel Technical Support Technician


0 Kudos
Victor_G_Intel
Employee
2,675 Views

Hello Alfago12,


Thank you so much for waiting.


Regarding your previous response, QAT requires physically contiguous and DMAable memory for the hardware acceleration to work. The very limited amount of available memory in your system seems to be the reason for the error message: QAT is not able to allocate enough memory to perform the operations.


Additionally, regarding the number of crypto instances: if the configuration file is correct, then you should see the 22 instances set in the configuration file. What you previously shared in your post refers to the acceleration engines of the QAT Endpoint, of which there are 8 in your QAT device. There will always be 8, as that is how the hardware was built, but they are not the instances used. It's kind of unclear to us; however, how you got the number of 16 crypto instances.


Best regards,


Victor G.

Intel Technical Support Technician  


0 Kudos
Alfago12
New Contributor I
2,666 Views

 Hi Victor,

 

Thanks for your reply.

 

1. "QAT requires physically contiguous and DMAable memory for the hardware acceleration to work."

So can I use kvzalloc() to replace kmalloc() in the QAT driver/USDM?

 

2. "It's kind of unclear to us; however, how you got the number of 16 crypto instances."

In the following log, I mistook acceleration engines for crypto instances. The number of crypto instances should be obtained through the API ENGINE_ctrl_cmd("GET_NUM_CRYPTO_INSTANCES").

Line 887: Sep 12 03:39:16 user.info kernel: [ 49.046552] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 893: Sep 12 03:39:16 user.info kernel: [ 50.768547] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 1818: Sep 25 04:41:17 user.info kernel: [ 48.279127] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 1824: Sep 25 04:41:17 user.info kernel: [ 50.000120] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 3075: Sep 29 17:17:18 user.info kernel: [ 48.391143] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Line 3081: Sep 29 17:17:18 user.info kernel: [ 50.114146] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
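
A small sketch of what these log lines do and do not say. It extracts the device name and engine count from each message. One possible reading of the earlier figure of 16, which the thread does not confirm, is that the two messages per boot with 8 engines each were added together. The pattern and names below are illustrative only.

```python
import re

# Two messages from one boot, copied from the log above.
DMESG = """\
Sep 12 03:39:16 user.info kernel: [ 49.046552] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
Sep 12 03:39:16 user.info kernel: [ 50.768547] c6xx 0000:53:00.0: qat_dev0 started 8 acceleration engines
"""

PATTERN = re.compile(
    r"(?P<dev>qat_dev\d+) started (?P<count>\d+) acceleration engines"
)


def engines_per_message(text):
    """Return a (device, engine_count) pair for every matching log message."""
    return [(m.group("dev"), int(m.group("count")))
            for m in PATTERN.finditer(text)]


messages = engines_per_message(DMESG)
naive_total = sum(count for _, count in messages)    # adds repeated messages
per_device = {dev: count for dev, count in messages}  # one entry per device

print("sum over messages:", naive_total)
print("engines per device:", per_device)
```

Both messages refer to the same device, qat_dev0, so the per-device view reports 8 acceleration engines. Neither number is a crypto instance count; that is set by NumberCyInstances in the configuration file and queried through the engine, as noted above.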

 

Regards,

 

0 Kudos
JoseH_Intel
Moderator
2,355 Views

Hello Alfago12,


We will check into this and will get back to you as soon as we have updates.


Regards


Jose A.

Intel Customer Support Technician

For firmware updates and troubleshooting tips, visit:

https://intel.com/support/serverbios


0 Kudos
Victor_G_Intel
Employee
2,320 Views

Hello Alfago12,


Thank you so much for waiting.


Regarding your previous post, we don't recommend making changes to the QAT driver. You can try using kvzalloc(), but this is something that we neither recommend nor validate.


Additionally, an easy way to confirm how many instances are being used is by running the QAT sample application cpa_sample_code.


Best regards,


Victor G.

Intel Technical Support Technician  


0 Kudos
Alfago12
New Contributor I
2,313 Views

Hi Victor,

 

Thanks so much for your help.

I made the following modifications to the QAT driver, which does not seem to run properly.

Before          After
kmalloc         kvmalloc
kzalloc         kvzalloc
kmalloc_node    kvmalloc_node
kzalloc_node    kvzalloc_node
kmalloc_array   kvmalloc_array
kfree           kvfree

 

I'll try cpa_sample_code to get the number of instances.

 

Regards,

0 Kudos
Victor_G_Intel
Employee
2,245 Views

Hello Alfago12,


Please let me know if you still need further assistance or if it's okay for us to close the thread.


Regards,


Victor G.

Intel Technical Support Technician


0 Kudos
Alfago12
New Contributor I
2,239 Views

Hi Victor,

 

Please close this thread. Thanks.

 

Regards,

0 Kudos