DPDK line rate

YXu21

I'm new to DPDK. When running the testpmd app from the SDK 1.5 tools on CentOS 6.4, I get only 400 Mb/s on 1G NICs. I tried various parameters (number of cores, burst size, rxd size, etc.), but saw no improvement. The two NIC ports are connected with a direct cable. I noticed 100% CPU on the selected cores once the test started (I guess this may be normal). I know my NIC, the 82571EB, is not officially supported by DPDK, but it seems to work fine. I wonder if I missed something in the BIOS settings, or some other parameter such as enabling NUMA, that is needed to achieve 1G line rate. Please help me.

Here is my configuration:

Version: Intel(R) Xeon(R) CPU E5450 @ 3.00GHz

Version: Intel(R) Xeon(R) CPU E5450 @ 3.00GHz

Description: LAN 3 of Gilgal (Intel 82571EB)

Description: LAN 4 of Gilgal (Intel 82571EB)

Option: 12

Enter hex bitmask of cores to execute testpmd app on

Example: to execute app on cores 0 to 7, enter 0xff

bitmask: 0xff

Launching app

EAL: Cannot read numa node link for lcore 0 - using physical package id instead

EAL: Detected lcore 0 as core 0 on socket 0

EAL: Cannot read numa node link for lcore 1 - using physical package id instead

EAL: Detected lcore 1 as core 0 on socket 1

EAL: Cannot read numa node link for lcore 2 - using physical package id instead

EAL: Detected lcore 2 as core 1 on socket 0

EAL: Cannot read numa node link for lcore 3 - using physical package id instead

EAL: Detected lcore 3 as core 1 on socket 1

EAL: Cannot read numa node link for lcore 4 - using physical package id instead

EAL: Detected lcore 4 as core 2 on socket 0

EAL: Cannot read numa node link for lcore 5 - using physical package id instead

EAL: Detected lcore 5 as core 2 on socket 1

EAL: Cannot read numa node link for lcore 6 - using physical package id instead

EAL: Detected lcore 6 as core 3 on socket 0

EAL: Cannot read numa node link for lcore 7 - using physical package id instead

EAL: Detected lcore 7 as core 3 on socket 1

EAL: Skip lcore 8 (not detected)

EAL: Skip lcore 9 (not detected)

... ...

 

EAL: Skip lcore 62 (not detected)

EAL: Skip lcore 63 (not detected)

EAL: Setting up memory...

EAL: Ask a virtual area of 0x2097152 bytes

EAL: Virtual area found at 0x7f1916200000 (size = 0x200000)

EAL: Ask a virtual area of 0x2097152 bytes

EAL: Virtual area found at 0x7f1915e00000 (size = 0x200000)

EAL: Ask a virtual area of 0x2097152000 bytes

EAL: Virtual area found at 0x7f1898c00000 (size = 0x7d000000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1898600000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1898000000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1897a00000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1897400000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1896e00000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1896800000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1896200000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1895c00000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at 0x7f1895600000 (size = 0x400000)

EAL: Ask a virtual area of 0x4194304 bytes

EAL: Virtual area found at...

Muthurajan_J_Intel

 

Dear Customer,

Thank you for using Intel DPDK.

Sorry for the inconvenience.

A couple of observations and questions:

1) Please verify that you are using huge pages. In the following document, page 20 describes option 8, which configures huge pages.

http://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-dpdk-getting-started-guide.pdf

2) The output of your

testpmd> show config fwd

command (below) shows that lcore 1 is on socket 1 while the ports are on socket 0.

That means they are far apart in terms of locality. Could you please place both the cores and the ports on the same socket and repeat the measurement with that configuration?

io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support disabled

Logical Core 1 (socket 1) forwards packets on 2 streams:

RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01

RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
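
As a rough illustration of the locality check described above, the following Python sketch scans `show config fwd` output and reports every stream whose RX or TX queue sits on a different socket than the forwarding lcore. It is not part of DPDK; the function name `find_cross_socket_streams` and the regular expressions are assumptions based only on the output format quoted in this thread.

```python
import re

# Output copied from the "show config fwd" listing quoted above.
FWD_CONFIG = """\
Logical Core 1 (socket 1) forwards packets on 2 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
"""

CORE_RE = re.compile(r"Logical Core (\d+) \(socket (\d+)\)")
STREAM_RE = re.compile(
    r"RX P=(\d+)/Q=(\d+) \(socket (\d+)\) -> TX P=(\d+)/Q=(\d+) \(socket (\d+)\)"
)


def find_cross_socket_streams(text):
    """Return (lcore, rx_port, tx_port) for each stream whose RX or TX
    queue is on a different socket than the lcore that forwards it."""
    mismatches = []
    core = core_socket = None
    for line in text.splitlines():
        m = CORE_RE.search(line)
        if m:
            core, core_socket = int(m.group(1)), int(m.group(2))
            continue
        m = STREAM_RE.search(line)
        if m and core is not None:
            rx_port, _, rx_sock, tx_port, _, tx_sock = map(int, m.groups())
            if rx_sock != core_socket or tx_sock != core_socket:
                mismatches.append((core, rx_port, tx_port))
    return mismatches


print(find_cross_socket_streams(FWD_CONFIG))
```

For the listing above, both streams are reported because lcore 1 is on socket 1 while all queues are on socket 0. An empty list would mean every stream is local to its lcore.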

YXu21

Thanks for the reply.

I have configured huge pages, as shown by option 13 of the testpmd app.

Option: 13

AnonHugePages: 282624 kB

HugePages_Total: 1024

HugePages_Free: 0

HugePages_Rsvd: 0

HugePages_Surp: 0

Hugepagesize: 2048 kB
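
To make the numbers above easier to read, here is a minimal Python sketch that converts the huge page counters into megabytes. The helper names `parse_meminfo` and `hugepage_summary` are hypothetical, and the input is simply the text printed by option 13 above (the same fields appear in /proc/meminfo on Linux).

```python
# Text copied from the option 13 output quoted above.
MEMINFO = """\
AnonHugePages: 282624 kB
HugePages_Total: 1024
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
"""


def parse_meminfo(text):
    """Map each 'Key: value [unit]' line to an integer value."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        parts = value.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def hugepage_summary(fields):
    """Return (reserved_mb, free_mb) for the huge page pool."""
    page_kb = fields["Hugepagesize"]
    total_mb = fields["HugePages_Total"] * page_kb // 1024
    free_mb = fields["HugePages_Free"] * page_kb // 1024
    return total_mb, free_mb


total_mb, free_mb = hugepage_summary(parse_meminfo(MEMINFO))
print(f"reserved: {total_mb} MB, free: {free_mb} MB")
```

With 1024 pages of 2048 kB each, the pool is 2048 MB. A free count of 0 only says that every page is currently in use; it does not show which socket the pages were allocated from.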

 

I have now pinned lcore 1 to socket 0, so that it is on the same socket as the port queues.

 

testpmd> show config fwd

io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support disabled

Logical Core 1 (socket 0) forwards packets on 2 streams:

RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01

RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

Also, I replaced CentOS 6.4 with Fedora 18.

[root]# uname -a

Linux rcs8 3.11.9-100.fc18.x86_64 #1 SMP Wed Nov 20 21:22:39 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

However, I still get only 500 Mbps bidirectional between my directly connected 1G NICs (Intel 82571EB).

Are there any kernel parameters or BIOS parameters I need to set? I did not change any default kernel or BIOS settings. My server has DMA disabled in the BIOS. Should it be enabled?

Please help. Many thanks!

YXu21

The problem has not been solved yet. Why is the question marked as "Assumed Answered"? How can I change the status?

Muthurajan_J_Intel

Dear Customer,

Thank you.

You are correct about the BIOS setting for NUMA.

The configuration output shows "NUMA support disabled".

1) Please enable NUMA in the BIOS first and find out what rate you achieve.

When NUMA is enabled, contiguous memory addresses are assigned from the same node, whereas when NUMA is disabled, successive addresses alternate between sockets. So enabling NUMA is the right thing to do.

When you do enable NUMA, the print out for your option

Please measure the performance again with NUMA enabled.

In addition, please feel free to use the following information.

2) Please also post the EAL output showing the lcore-to-socket mapping that EAL detects, as well as how the huge pages are allocated across the two sockets.

I see you have pinned lcore 1 to socket 0. By default, EAL detects lcore 1 on socket 1 and lcore 2 on socket 0. [lcore 0 is not available for forwarding, so the choice for forwarding is between lcore 1 and lcore 2.]

Since EAL detects even-numbered lcores on socket 0 and odd-numbered lcores on socket 1, you can use lcore 2 instead of lcore 1 for your purpose. The core mask can then be 1111 1101, which is 0xFD.

3) You mentioned enabling DMA. Please feel free to enable it.
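
The core mask in point 2 is a plain bitmask in which bit N selects lcore N. A minimal Python sketch of that arithmetic follows; the helper names `coremask` and `lcores_in` are hypothetical and only illustrate how 0xFD is derived.

```python
def coremask(lcores):
    """Build a core mask with bit N set for every lcore N in `lcores`."""
    mask = 0
    for lcore in lcores:
        mask |= 1 << lcore
    return mask


def lcores_in(mask):
    """List the lcore numbers selected by a core mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# All of lcores 0-7 except lcore 1, as suggested in point 2.
mask = coremask([0, 2, 3, 4, 5, 6, 7])
print(hex(mask), format(mask, "08b"))
print(lcores_in(0xFD))
```

Leaving bit 1 clear is what excludes lcore 1, so lcore 2 becomes the first lcore available for forwarding after lcore 0.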

Thanks

 

YXu21

Thanks for the reply.

I checked the server (SUN FIRE X4150, Intel CPU E5450, BIOS version 1ADQW060), and could not find NUMA in BIOS settings.

I enabled Crystal Beach / DMA in BIOS, but did not see any improvement on packet rate.

Option: 12

Enter hex bitmask of cores to execute testpmd app on

Example: to execute app on cores 0 to 7, enter 0xff

bitmask: 0xfd

Launching app

EAL: Detected lcore 0 as core 0 on socket 0

EAL: Detected lcore 1 as core 0 on socket 0

EAL: Detected lcore 2 as core 1 on socket 0

EAL: Detected lcore 3 as core 1 on socket 0

EAL: Detected lcore 4 as core 2 on socket 0

EAL: Detected lcore 5 as core 2 on socket 0

EAL: Detected lcore 6 as core 3 on socket 0

EAL: Detected lcore 7 as core 3 on socket 0

EAL: Skip lcore 8 (not detected)

EAL: Skip lcore 9 (not detected)

EAL: Skip lcore 10 (not detected)

EAL: Skip lcore 11 (not detected)

EAL: Skip lcore 12 (not detected)

EAL: Skip lcore 13 (not detected)

EAL: Skip lcore 14 (not detected)

EAL: Skip lcore 15 (not detected)

EAL: Skip lcore 16 (not detected)

EAL: Skip lcore 17 (not detected)

EAL: Skip lcore 18 (not detected)

EAL: Skip lcore 19 (not detected)

EAL: Skip lcore 20 (not detected)

EAL: Skip lcore 21 (not detected)

EAL: Skip lcore 22 (not detected)

EAL: Skip lcore 23 (not detected)

EAL: Skip lcore 24 (not detected)

EAL: Skip lcore 25 (not detected)

EAL: Skip lcore 26 (not detected)

EAL: Skip lcore 27 (not detected)

EAL: Skip lcore 28 (not detected)

EAL: Skip lcore 29 (not detected)

EAL: Skip lcore 30 (not detected)

EAL: Skip lcore 31 (not detected)

EAL: Skip lcore 32 (not detected)

EAL: Skip lcore 33 (not detected)

EAL: Skip lcore 34 (not detected)

EAL: Skip lcore 35 (not detected)

EAL: Skip lcore 36 (not detected)

EAL: Skip lcore 37 (not detected)

EAL: Skip lcore 38 (not detected)

EAL: Skip lcore 39 (not detected)

EAL: Skip lcore 40 (not detected)

EAL: Skip lcore 41 (not detected)

EAL: Skip lcore 42 (not detected)

EAL: Skip lcore 43 (not detected)

EAL: Skip lcore 44 (not detected)

EAL: Skip lcore 45 (not detected)

EAL: Skip lcore 46 (not detected)

EAL: Skip lcore 47 (not detected)

EAL: Skip lcore 48 (not detected)

EAL: Skip lcore 49 (not detected)

EAL: Skip lcore 50 (not detected)

EAL: Skip lcore 51 (not detected)

EAL: Skip lcore 52 (not detected)

EAL: Skip lcore 53 (not detected)

EAL: Skip lcore 54 (not detected)

EAL: Skip lcore 55 (not detected)

EAL: Skip lcore 56 (not detected)

EAL: Skip lcore 57 (not detected)

EAL: Skip lcore 58 (not detected)

EAL: Skip lcore 59 (not detected)

EAL: Skip lcore 60 (not detected)

EAL: Skip lcore 61 (not detected)

EAL: Skip lcore 62 (not...

Muthurajan_J_Intel

Dear Customer,

Thank you.

Looking at the system, I have two thoughts to share: 1) about the current system and 2) about the possibility of upgrading the system.

I just looked at the CPU model in the printout.

Judging by the CPU, this is a front-side-bus-based system. I have attached a Word document titled Processor Generation that shows the benefit DPDK gets from the integrated memory controller and integrated PCI controller available in later CPU generations.

Please note that the CPU you are using is listed in the graph under the year 2008.

1) About the current system:

A) What is the packet size? Is it 64-byte packets?

B) Is the PCI slot that holds the NIC connected directly to the memory controller, or to the I/O controller? Slots connected to the memory controller will give better performance.

C) Could you please use a packet size of 256 bytes, with the NIC in a slot connected to the memory controller, and see how much the performance improves?

D) Also, when launched from the script, testpmd uses the default configuration. By running testpmd directly from the command line you can pass various parameters. For example, the default configuration the script uses is 1 core receiving from 2 ports. From the command line you can have 2 cores receive from 2 ports, and thereby scale. [The testpmd documentation is available at http://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-dpdk-testpmd-application-user-guide.pdf and Section 4.2 lists the testpmd command-line options.]

2) Is there any way you could use a later processor generation, Sandy Bridge for example?

In that case you will get better system performance.

Thank you

YXu21

Thanks for the suggestions and the performance benchmark, which seems to explain the poor result on my outdated server.

Based on the stats printed by the testpmd app, the packet size is 68 bytes. I tried the command

testpmd> port config all max-pkt-len 256

and ran the test, but the results still show 68-byte packets. I also started testpmd from the command line with 2 cores, but saw no improvement in the rate either.

I will get a new server, and will run more tests on that.

One thing I want to verify is that my result is not CPU bound. With the Linux top command, CPU utilization is shown as 100% all the time on the cores under DPDK control. Is there any command that can show me the true CPU utilization with DPDK?

thanks!

Muthurajan_J_Intel

Dear customer,

Thank you for planning to get a new server.

How are you injecting packets into the NIC ports? [With equipment such as IXIA or SPIRENT, or something of your own?]

By configuring the packet size on that equipment, you will be able to use bigger packets.

The command you ran defines the MAXIMUM packet size, which by default is 1518.

Since the core is polling the ports all the time, you are seeing 100% utilization. This is normal given the nature of polling.
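
To illustrate why a polling core always looks fully busy, here is a toy Python simulation. It is not DPDK code and not a DPDK facility: `poll_loop` and `make_rx_burst` are hypothetical names, and the traffic source is invented for the example. The loop never sleeps, so the operating system sees 100% CPU regardless of load. One possible proxy for the real load, under these assumptions, is the fraction of polls that actually return packets.

```python
def make_rx_burst(period):
    """Hypothetical traffic source: returns 1 packet on every
    `period`-th poll and 0 packets otherwise."""
    counter = 0

    def rx_burst():
        nonlocal counter
        counter += 1
        return 1 if counter % period == 0 else 0

    return rx_burst


def poll_loop(rx_burst, iterations):
    """Busy-poll `rx_burst` without sleeping and return the fraction
    of polls that found at least one packet."""
    busy = 0
    for _ in range(iterations):
        if rx_burst() > 0:  # non-empty poll: useful work was done
            busy += 1
    return busy / iterations


print(f"useful polls: {poll_loop(make_rx_burst(4), 1000):.0%}")
```

In this simulation only a quarter of the polls find a packet, yet the loop itself spins continuously, which is what top reports.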

Thanks,

YXu21

I connect two NICs with a direct cable and run "start tx_first" in testpmd to kick off a burst of traffic. I hope this test setup gives results close to those of a real traffic generator. I also plan to use pktgen when needed.

I calculated the packet size as RX-bytes/RX-packets, taken from the "show port stats all" output. It seems testpmd generates 68-byte packets. I care more about small packets because that's where the Linux kernel performs poorly.
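
For context on what "line rate" means with small packets, the following Python sketch works out the theoretical ceiling on a 1 Gb/s Ethernet link. Each frame on the wire carries 20 extra bytes (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), so the byte counters can never add up to the full 1 Gb/s. The function names are hypothetical, and treating the 68-byte figure as the full frame size including CRC is an assumption, since the thread does not say how testpmd counts bytes.

```python
# Per-frame wire overhead on Ethernet: preamble (7) + SFD (1) + inter-frame gap (12).
ETH_OVERHEAD_BYTES = 20


def max_pps(link_bps, frame_bytes):
    """Theoretical maximum frames per second for frames of
    `frame_bytes` bytes (assumed to include the CRC)."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


def max_frame_throughput_bps(link_bps, frame_bytes):
    """Bits per second of frame data at line rate, i.e. what a
    bytes-per-second counter would show at the theoretical maximum."""
    return max_pps(link_bps, frame_bytes) * frame_bytes * 8


for size in (64, 68, 256, 1518):
    pps = max_pps(1e9, size)
    mbps = max_frame_throughput_bps(1e9, size) / 1e6
    print(f"{size:5d} bytes: {pps:12.0f} pps, {mbps:7.1f} Mb/s of frame data")
```

Under these assumptions, 64-byte frames top out at about 1.488 million packets per second, and 68-byte frames carry roughly 773 Mb/s of frame data at full line rate, which is the figure a measured 400 to 500 Mb/s should be compared against.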

Thanks for your excellent support!

Muthurajan_J_Intel

Dear Customer,

Thank you.

You are correct about the packet size generated by testpmd. It is fixed. Using an external pktgen gives more flexibility.

Once you get a Sandy Bridge system [a system with an on-chip (on the CPU) memory controller and an integrated (on the CPU) PCI controller], you will see a performance boost.

Thank you

 
