Hi,
I am trying to replicate Intel's test from https://fast.dpdk.org/doc/perf/DPDK_21_11_Intel_crypto_performance_report.pdf
on x86, but I am having trouble matching the results. The best I can get is 7.9 Gbps on a Xeon(R) Gold 6338N CPU @ 2.20GHz system from SuperMicro. I followed the BIOS and kernel configuration instructions in your report.
The VFs were bound to the vfio-pci driver, not QAT's VF driver.
Is there anything I could be missing?
Thanks in advance.
JC
Hi Ronny
For the crypto_qat device type, here is the published Intel result:
AES-CBC-128/SHA1-HMAC (Gbps) crypto_qat | AES-CBC-128/SHA2-256-HMAC (Gbps) crypto_qat | AES-GCM-128 (Gbps) crypto_qat |
3.90 | 3.89 | 3.35 |
7.72 | 7.68 | 6.66 |
15.06 | 14.92 | 13.10 |
28.31 | 27.99 | 24.69 |
45.60 | 46.67 | 39.58 |
52.70 | 52.45 | 49.85 |
On our lab machine (Gold 6338N CPU @ 2.20GHz), the results are:
AES-CBC-128/SHA1-HMAC (Gbps) crypto_qat | AES-CBC-128/SHA2-256-HMAC (Gbps) crypto_qat | AES-GCM-128 (Gbps) crypto_qat |
0.8138 | 0.8104 | 0.7035 |
1.6104 | 1.6004 | 1.4020 |
3.1418 | 3.1145 | 2.7618 |
5.9390 | 5.8802 | 5.2268 |
9.2248 | 9.0837 | 8.2348 |
10.9657 | 10.8057 | 10.3737 |
We are trying to bring up a system based on Sapphire Rapids to run the tests again.
Are we going to run tests on crypto_scheduler?
Thanks
JCK
Hi JCK1,
We need some additional clarification. Can you please provide the command that you are running for the Cryptodev QAT PMD performance test (the test we are concentrating on)?
Are these the results?
AES-CBC-128/SHA1-HMAC (Gbps) crypto_qat | AES-CBC-128/SHA2-256-HMAC (Gbps) crypto_qat | AES-GCM-128 (Gbps) crypto_qat |
3.90 | 3.89 | 3.35 |
7.72 | 7.68 | 6.66 |
15.06 | 14.92 | 13.10 |
28.31 | 27.99 | 24.69 |
45.60 | 46.67 | 39.58 |
52.70 | 52.45 | 49.85 |
on our lab machine Gold 6338N CPU @ 2.20GHz, the results are:
AES-CBC-128/SHA1-HMAC (Gbps) crypto_qat | AES-CBC-128/SHA2-256-HMAC (Gbps) crypto_qat | AES-GCM-128 (Gbps) crypto_qat |
0.8138 | 0.8104 | 0.7035 |
1.6104 | 1.6004 | 1.4020 |
3.1418 | 3.1145 | 2.7618 |
5.9390 | 5.8802 | 5.2268 |
9.2248 | 9.0837 | 8.2348 |
10.9657 | 10.8057 | 10.3737 |
Thanks,
Ronny G
Hi Ronny
Yes, the first table is Intel's published result. The second table is from my test on our x86 machine.
Thanks
JCK
Thanks JCK1, can you please provide the exact command that you are running to obtain these results?
Regards,
Ronny G
Hi Ronny
Yes, run this:
./intel-cryptodev-qat-tests.sh [0|1|2] for
0 - AES-CBC-128/SHA1-HMAC
1 - AES-CBC-128/SHA2-256-HMAC
2 - AES-GCM-128
thanks
JCK
Hi JCK1,
I really need your help with the full command that you are running.
We want to confirm that you are using scheduler PMD with QAT workers in round-robin.
Can you please provide the full command?
Thanks,
Ronny G
Hi Ronny,
I tried to post a reply here, but your system complained:
Your post has been changed because invalid HTML was found in the message body. The invalid HTML has been removed. Please review the message and submit the message when you are satisfied.
So I put everything into a txt file and attached it here.
JCK
Hi Ronny
Also, Intel's Sapphire Rapids has QAT integrated into the SoC. How do we test QAT performance on Sapphire Rapids? Do you have any information you can share with us to conduct our evaluation? How is that supported in DPDK?
Thanks
JCK
By the way, JCK1, I have provided the DPDK team with your update. Thank you.
Hi JCK1,
Thanks for the information, the .txt you provided me with has been shared with the DPDK team.
Thanks,
Ronny G
Hi JCK1,
The test run and report below focus only on Test Case 3: the multi-core QAT test with 4 VFs, no scheduler PMD, using AES-CBC-128/SHA1-HMAC.
This corresponds to the first test you ran:
sudo $DPDK_TEST_CRYPTO_PERF/dpdk-test-crypto-perf \
--socket-mem 2048,0 --legacy-mem \
-a ${QAT_PF0}.0 -a ${QAT_PF0}.1 -a ${QAT_PF0}.2 -a ${QAT_PF0}.3 \
-l 4,5,13,6,14 -n 4 \
-- --buffer-sz 64,128,256,512,1024,2048 \
--optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 \
--devtype crypto_qat --cipher-iv-sz 16 --auth-op generate --burst-sz 32 \
--total-ops 30000000 --digest-sz 20 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt
We ran this test with a similar command.
We changed --socket-mem because our QAT device is on socket 1 of our system.
We also changed the lcores to ones on socket 1 (these are listed in isolcpus in our configuration, so this should match what you are using).
Please check that your QAT device's socket matches the socket of the lcores you use.
The QAT socket is shown in the DPDK application output: EAL: Probe PCI driver: qat (8086:37c9) device: 0000:b5:01.1 (socket 1)
The lcore sockets can be checked with the DPDK script: ./usertools/cpu_layout.py
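The socket check described above can be sketched in Python. This is only an illustration and not part of DPDK: the helper names `parse_cpulist` and `lcores_on_node` are hypothetical, and the CPU list string is the "NUMA node1 CPU(s)" line from the lscpu output of the Intel test platform in this post. On a live Linux system the same kind of string can normally be read from /sys/devices/system/node/node<N>/cpulist, and a PCI device's node from /sys/bus/pci/devices/<BDF>/numa_node.

```python
from typing import Iterable, Set


def parse_cpulist(cpulist: str) -> Set[int]:
    """Expand a kernel-style CPU list such as "28-55,84-111" into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def lcores_on_node(lcores: Iterable[int], node_cpulist: str) -> bool:
    """Return True if every lcore belongs to the given NUMA node's CPU list."""
    return set(lcores) <= parse_cpulist(node_cpulist)


# "NUMA node1 CPU(s)" from the lscpu output of the Intel test platform.
NODE1_CPUS = "28-55,84-111"

# lcores passed with -l in the Intel command (QAT VFs reported on socket 1).
print(lcores_on_node([37, 38, 39, 40, 41], NODE1_CPUS))

# lcores from the original poster's command, checked against the SAME layout
# purely for illustration; their machine's CPU layout is not given in the thread.
print(lcores_on_node([4, 5, 13, 6, 14], NODE1_CPUS))
```

If the check fails for the node that the QAT device reports, the lcores and the device are on different sockets, which is the mismatch the advice above is meant to rule out.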
Intel Command:
./build/app/dpdk-test-crypto-perf \
--socket-mem 2048,2048 --legacy-mem \
-a 0000:b5:01.1 -a 0000:b5:01.2 -a 0000:b5:01.3 -a 0000:b5:01.4 \
-l 37,38,39,40,41 -n 4 \
-- --buffer-sz 64,128,256,512,1024,2048 \
--optype cipher-then-auth --ptest throughput --auth-key-sz 64 --cipher-key-sz 16 \
--devtype crypto_qat --cipher-iv-sz 16 --auth-op generate --burst-sz 32 \
--total-ops 30000000 --digest-sz 20 --auth-algo sha1-hmac --cipher-algo aes-cbc --cipher-op encrypt
Intel Results:
lcore id Buf Size Burst Size Enqueued Dequeued Failed Enq Failed Deq MOps Gbps Cycles/Buf
41 64 32 30000000 30000000 391959011 374040103 1.5700 0.8039 1719.71
40 64 32 30000000 30000000 393476553 375387075 1.5700 0.8038 1719.78
39 64 32 30000000 30000000 396301663 378362533 1.5700 0.8038 1719.79
38 64 32 30000000 30000000 395045310 377272628 1.5699 0.8038 1719.83
39 128 32 30000000 30000000 398887225 381068282 1.5573 1.5947 1733.72
40 128 32 30000000 30000000 396297972 378294301 1.5573 1.5946 1733.81
38 128 32 30000000 30000000 397938626 380308357 1.5569 1.5943 1734.23
41 128 32 30000000 30000000 394682012 376872827 1.5572 1.5945 1733.92
41 256 32 30000000 30000000 405075976 387162514 1.5187 3.1103 1777.86
40 256 32 30000000 30000000 406794016 388691199 1.5188 3.1105 1777.74
38 256 32 30000000 30000000 408170888 390414980 1.5185 3.1099 1778.06
39 256 32 30000000 30000000 409437937 391469812 1.5187 3.1103 1777.84
38 512 32 30000000 30000000 435607109 417513101 1.4310 5.8615 1886.74
39 512 32 30000000 30000000 437163787 418859493 1.4307 5.8600 1887.23
41 512 32 30000000 30000000 432472880 414204093 1.4308 5.8605 18876
40 512 32 30000000 30000000 434418130 415985247 1.4305 5.8594 1887.42
40 1024 32 30000000 30000000 583552510 564226360 1.1165 9.1464 2418.26
41 1024 32 30000000 30000000 582614586 563287134 1.1157 9.1396 2420.06
39 1024 32 30000000 30000000 586490123 567304136 1.1154 9.1377 2420.56
38 1024 32 30000000 30000000 585874264 566681859 1.1154 9.1370 2420.74
38 2048 32 30000000 30000000 1094437097 1073467007 0.6575 10.7730 4106.27
40 2048 32 30000000 30000000 1091084015 1070100515 0.6575 10.7724 4106.49
39 2048 32 30000000 30000000 1096021229 1075066115 0.6575 10.7729 4106.32
41 2048 32 30000000 30000000 1053719015 1033070689 0.6575 10.7733 4106.17
Now, as stated in the performance report, the multi-core results shown in the report are the sum of each core's results for that buffer size.
So for buffer size 64, I have: 0.8039 + 0.8038 + 0.8038 + 0.8038 = 3.2153
That value isn't far off the reported value of 3.90.
The other values are just short of the report results too.
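The per-core summation can be sketched as a small Python script. This is a minimal illustration, not a DPDK tool: the function name `sum_gbps_by_buffer` is hypothetical, the rows are the 64- and 128-byte lines copied from the Intel results above, and the column positions are assumed from the header line of that output.

```python
from collections import defaultdict
from typing import Dict

# Rows for the 64- and 128-byte buffers, copied from the Intel results.
# Assumed columns: lcore, buf size, burst size, enqueued, dequeued,
#                  failed enq, failed deq, MOps, Gbps, cycles/buf
ROWS = """\
41 64 32 30000000 30000000 391959011 374040103 1.5700 0.8039 1719.71
40 64 32 30000000 30000000 393476553 375387075 1.5700 0.8038 1719.78
39 64 32 30000000 30000000 396301663 378362533 1.5700 0.8038 1719.79
38 64 32 30000000 30000000 395045310 377272628 1.5699 0.8038 1719.83
39 128 32 30000000 30000000 398887225 381068282 1.5573 1.5947 1733.72
40 128 32 30000000 30000000 396297972 378294301 1.5573 1.5946 1733.81
38 128 32 30000000 30000000 397938626 380308357 1.5569 1.5943 1734.23
41 128 32 30000000 30000000 394682012 376872827 1.5572 1.5945 1733.92
"""


def sum_gbps_by_buffer(rows: str) -> Dict[int, float]:
    """Sum the per-lcore Gbps column for each buffer size."""
    totals: Dict[int, float] = defaultdict(float)
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # skip blank or malformed lines
        buf_size = int(fields[1])
        totals[buf_size] += float(fields[8])
    return dict(totals)


totals = sum_gbps_by_buffer(ROWS)
for buf_size in sorted(totals):
    print(f"{buf_size:>5} bytes: {totals[buf_size]:.4f} Gbps")
```

Pasting the remaining rows into `ROWS` gives the aggregate for the other buffer sizes in the same way.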
From your previous community messages, you mentioned results of around 0.8 Gbps for this. Are you looking at just one core's result, or was that the sum of all cores?
Platform used for testing:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 112
On-line CPU(s) list: 0-111
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz
Stepping: 7
Frequency boost: enabled
CPU MHz: 3536.342
CPU max MHz: 2701.0000
CPU min MHz: 1000.0000
BogoMIPS: 5400.00
Virtualization: VT-x
L1d cache: 1.8 MiB
L1i cache: 1.8 MiB
L2 cache: 56 MiB
L3 cache: 77 MiB
NUMA node0 CPU(s): 0-27,56-83
NUMA node1 CPU(s): 28-55,84-111
To sum up, there doesn't seem to be a significant discrepancy between our performance results and those that you reported; they are quite similar. Your reported results closely match ours for a single core.
I hope this helps.
Regards,
Ronny G
Hi Ronny,
Thanks for your reply.
I put your results into a spreadsheet and compared them with Intel's official numbers. The difference is about 17%; sorry, but I would not call that "not far off".
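As a quick check of that figure, here is a minimal sketch of the calculation for the 64-byte AES-CBC-128/SHA1-HMAC case, using the two numbers quoted earlier in the thread (3.2153 Gbps summed over four cores, 3.90 Gbps in the report). The function name `shortfall_percent` is only illustrative.

```python
def shortfall_percent(measured: float, published: float) -> float:
    """Relative shortfall of a measured value against a published one, in percent."""
    return (published - measured) / published * 100.0


# 64-byte buffer, AES-CBC-128/SHA1-HMAC: sum of four cores vs. the published report.
measured_gbps = 3.2153
published_gbps = 3.90

gap = shortfall_percent(measured_gbps, published_gbps)
print(f"{gap:.1f}% below the published value")
```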
Thanks
JCK
Hi JCK1,
I recognize that the approximately 17% variance between the official DPDK test report and our test outcomes might seem substantial. However, it's important to remember that official tests are conducted on optimized platforms, which can include specialized hardware, tailored BIOS settings, and even operating system optimization for the sole purpose of maximizing the result of the test being performed. Additionally, our hardware setup is not an exact match to the one used in the official tests. Official test reports should be viewed as a benchmark indicating the potential performance of a system under extremely controlled conditions and with a high degree of customization.
Please let me know if there is anything else I can help you with.
Regards,
Ronny G
Hi Ronny
I think it is important to have a third party verify Intel's official results independently. I understand that many settings and configurations at many levels need to be correct in order to reproduce such results, but Intel should do its best to help people do so easily and quickly (look at how long this has taken).
Thank you for your support and best regards
JCK
And we haven't touched Intel test case #1, the QAT scheduler PMD case. The discrepancy in results there is even bigger.
Hi JCK1,
I acknowledge your remarks and will convey your feedback to the DPDK team. I also concur that this matter has been ongoing for quite some time, and unfortunately, the outcomes have not met your expectations. I apologize for any inconvenience this may have caused.
I would agree that the outcomes for test #1 involving the QAT scheduler PMD might differ from the official DPDK performance test results for the same reasons I previously outlined.
Let me know how you want to proceed with this issue. I don't really have additional recommendations at this point.
Regards,
Ronny G
Hi Ronny
I am not sure how much more Intel can do at this point. Based on what we can achieve, I will accept the current situation as is and convey it to our management and to the partner who wants to use QAT in their product. It is up to them to make the final decision.
thank you
JCK
Hi Ronny
Thanks for your support on this matter. One last favor I would ask is to have your DPDK expert run all the remaining tests (such as the QAT scheduler PMD test) on their existing setup to see how much we can get out of it. I would like to include these results in my final report to my management as a reference point. Could you help with this?
Thanks
JCK
Hi JCK1,
I have conveyed your feedback to the DPDK team for their consideration, and we will be exploring ways to make it easier for customers to replicate the outcomes detailed in the Performance Reports. Achieving results that are extremely close to ours can be challenging, as we cannot ensure identical outcomes if the system differs. It's important to note that only systems that are exactly the same can produce results that are very close to those reported, and even then, they may not be exactly identical.
I regret any inconvenience this may have caused and the delay in resolving this matter. Please inform me if there's anything more I can assist you with. If there are no further issues, I will proceed to close the internal ticket I have opened regarding this concern.
Regards,
Ronny G
Hi JCK1,
I am concluding the internal case I opened for this matter. Please don't hesitate to initiate a new thread if you require further help.
Thanks,
Ronny G