Silvan
Beginner
684 Views

Is there a bug in the Ethernet MAC of Arria 10 SoC devices?

Hi all,

I have an Arria 10 SoC device running embedded Linux (4.14). It uses EMAC1 and an external PHY (Micrel KSZ9031RNX) connected through RGMII.

After transferring a large amount of data from the SoC to a PC over the Gigabit Ethernet interface, I observe the following:

  • High communication latency: pinging the device takes around 1 s.
  • The EMAC1 gmacgrp_debug register (0xFF802024) reads 0x120.

It seems that something is wrong with the FIFO state in the MAC, or with the FIFO flush mechanism.
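
To make the 0x120 reading easier to interpret, here is a minimal Python sketch that splits a gmacgrp_debug value into its status fields. The field positions and the two state tables are an assumption: they follow the GMAC debug register layout as defined in the Linux stmmac driver (dwmac1000.h) and should be checked against the Arria 10 HPS register map before drawing conclusions. The function names are hypothetical.

```python
# Sketch: decode the DesignWare GMAC debug register (offset 0x24).
# ASSUMPTION: field positions follow the Linux stmmac dwmac1000.h
# definitions; verify against the Arria 10 HPS register address map.
GMAC_DEBUG_FIELDS = {
    # name: (least significant bit, width in bits)
    "RPESTS": (0, 1),      # MAC receive protocol engine active
    "RFCFCSTS": (1, 2),    # MAC receive frame controller FIFO status
    "RWCSTS": (4, 1),      # MTL Rx FIFO write controller active
    "RRCSTS": (5, 2),      # MTL Rx FIFO read controller state
    "RXFSTS": (8, 2),      # MTL Rx FIFO fill level
    "TPESTS": (16, 1),     # MAC transmit protocol engine active
    "TFCSTS": (17, 2),     # MAC transmit frame controller state
    "TXPAUSED": (19, 1),   # MAC transmitter in pause
    "TRCSTS": (20, 2),     # MTL Tx FIFO read controller state
    "TWCSTS": (22, 1),     # MTL Tx FIFO write controller active
    "TXFSTS": (24, 1),     # MTL Tx FIFO not empty
    "TXSTSFSTS": (25, 1),  # MTL Tx status FIFO full
}

# ASSUMPTION: state meanings as given in the same driver header.
RRCSTS_STATES = ["idle", "reading frame data", "reading frame status", "flushing"]
RXFSTS_STATES = [
    "empty",
    "below flow-control deactivate threshold",
    "above flow-control activate threshold",
    "full",
]


def decode_gmac_debug(value):
    """Return a dict mapping each field name to its raw integer value."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in GMAC_DEBUG_FIELDS.items()
    }


def describe_rx_fifo(value):
    """Return a short text description of the Rx FIFO related fields."""
    fields = decode_gmac_debug(value)
    return "Rx FIFO: {}; Rx read controller: {}".format(
        RXFSTS_STATES[fields["RXFSTS"]], RRCSTS_STATES[fields["RRCSTS"]]
    )
```

If this layout is correct, 0x120 has only RXFSTS and RRCSTS set (both to 1), with every transmit-side field at zero. That would point at the receive FIFO holding data while its read controller sits in the "reading frame data" state, rather than at the transmit path. On the target, the register could be read with a tool such as busybox devmem, if it is present in the image.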

The data transfer was done with the iperf3 tool, with the PC as server and the SoC as client.

Does anyone know about an issue in the Ethernet drivers (MAC or PHY) when a lot of data is transferred? Or what could be the next step to solve the issue?

By the way: the high-latency state can be cleared by reinitializing the network interface on the SoC with ifconfig eth0 down followed by ifconfig eth0 up.
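
Until the root cause is known, the down/up workaround could be automated. The following is only a sketch under stated assumptions: it assumes ping prints replies containing "time=<value> ms" (as iputils and busybox ping do), and the 500 ms threshold, the function names, and the sample count are hypothetical choices, not values from this thread.

```python
import re
import statistics
import subprocess

# Matches the per-reply round-trip time, e.g. "time=1003 ms" or "time=0.41 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts_ms(ping_output):
    """Extract all per-reply round-trip times (in ms) from ping output."""
    return [float(m) for m in RTT_RE.findall(ping_output)]


def needs_recovery(rtts_ms, threshold_ms=500.0):
    """True if no reply arrived or the median RTT exceeds the threshold."""
    return not rtts_ms or statistics.median(rtts_ms) > threshold_ms


def bounce_interface(ifname):
    """Apply the workaround from the post: take the interface down, then up."""
    subprocess.run(["ifconfig", ifname, "down"], check=True)
    subprocess.run(["ifconfig", ifname, "up"], check=True)


def check_and_recover(host, ifname="eth0", count=5):
    """Ping the host; bounce the interface if latency looks degraded."""
    proc = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, timeout=60,
    )
    degraded = needs_recovery(parse_rtts_ms(proc.stdout))
    if degraded:
        bounce_interface(ifname)
    return degraded
```

Calling check_and_recover with the PC's address from a periodic job on the SoC would log how often the state occurs, which may also help when narrowing down a reproducible setup. It hides the symptom only and does not fix the underlying problem.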

Thanks for any hints on a solution or on how to debug this further.

15 Replies
609 Views

Hi,

I have not seen this issue before. Did you change any U-Boot or device tree configuration?

Also, you mentioned you are using EMAC1, but you run ifconfig eth0 down and ifconfig eth0 up to clear the high latency. Why do you do this on eth0 instead of eth1?

Can you check if you have set the correct EMAC* in the device tree:

"Ubootdirectory\arch\arm\dts"

Silvan
Beginner
592 Views

Hi Eberlazare,

Thanks for your feedback.

 

This issue is not observed in U-Boot but on a running Linux system. Our hardware has only one Ethernet connection, which uses EMAC1 (EMAC0 is disabled). In the Linux device tree, the EMAC1 node is assigned to ethernet0 in the aliases section, which results in the interface name eth0 on Linux.

 

Below is the Linux device tree entry for the Ethernet configuration.

Do you think there is something wrong? Or what could be the reason for this high Ethernet latency?

 

hps_gmac: ethernet@ff802000 {
	compatible = "altr,socfpga-stmmac", "snps,dwmac-3.72a", "snps,dwmac";
	altr,sysmgr-syscon = <&sysmgr 0x48 8>;
	reg = <0xff802000 0x2000>;
	interrupts = <0 93 4>;
	interrupt-names = "macirq";
	/* Filled in by bootloader */
	mac-address = [00 00 00 00 00 00];
	snps,multicast-filter-bins = <256>;
	snps,perfect-filter-entries = <128>;
	snps,axi-config = <&socfpga_axi_setup>;
	tx-fifo-depth = <4096>;
	rx-fifo-depth = <16384>;

	clocks = <&l4_mp_clk>, <&peri_emac_ptp_clk>;
	clock-names = "stmmaceth", "ptp_ref";

	resets = <&rst 33>, <&rst 41>;
	reset-names = "stmmaceth", "stmmaceth-ocp";
		
	phy-mode = "rgmii-id";
	max-frame-size = <3800>;
	/* probe for phy addr */
	phy-addr = <0xffffffff>;

	txd0-skew-ps = <420>; /* 0ps */
	txd1-skew-ps = <420>; /* 0ps */
	txd2-skew-ps = <420>; /* 0ps */
	txd3-skew-ps = <420>; /* 0ps */
	rxd0-skew-ps = <420>; /* 0ps */
	rxd1-skew-ps = <420>; /* 0ps */
	rxd2-skew-ps = <420>; /* 0ps */
	rxd3-skew-ps = <420>; /* 0ps */
	txen-skew-ps = <420>; /* 0ps */
	rxdv-skew-ps = <420>; /* 0ps */
	txc-skew-ps = <1440>; /* 540ps */
	rxc-skew-ps = <1680>; /* 780ps */

	status = "okay";
};
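
The comments next to the skew properties above (420 marked as 0ps, 1440 as 540ps, 1680 as 780ps) match the KSZ9031 pad-skew encoding as described in the Linux micrel-ksz90x1 device tree binding: 60 ps steps, with the neutral point at 420 for data and control pads and at 900 for clock pads. The following Python sketch reproduces that mapping so the values can be checked. The constants are taken from that binding as an assumption, and the function name is hypothetical.

```python
# Sketch: convert KSZ9031 "*-skew-ps" device tree values into the register
# code and the effective delay relative to the neutral setting.
# ASSUMPTION: encoding per the Linux micrel-ksz90x1 binding:
#   data/control pads: 0..900 ps in 60 ps steps, neutral at 420
#   clock pads:        0..1860 ps in 60 ps steps, neutral at 900
STEP_PS = 60
DATA_NEUTRAL_PS, DATA_MAX_PS = 420, 900
CLOCK_NEUTRAL_PS, CLOCK_MAX_PS = 900, 1860
CLOCK_PROPS = ("txc-skew-ps", "rxc-skew-ps")


def effective_skew(prop, dt_value_ps):
    """Return (register_code, effective_delay_ps) for one skew property."""
    if prop in CLOCK_PROPS:
        neutral, maximum = CLOCK_NEUTRAL_PS, CLOCK_MAX_PS
    else:
        neutral, maximum = DATA_NEUTRAL_PS, DATA_MAX_PS
    if not 0 <= dt_value_ps <= maximum:
        raise ValueError(f"{prop}: {dt_value_ps} ps is outside 0..{maximum}")
    if dt_value_ps % STEP_PS:
        raise ValueError(f"{prop}: {dt_value_ps} ps is not a multiple of {STEP_PS}")
    return dt_value_ps // STEP_PS, dt_value_ps - neutral


# Values from the device tree node above.
for prop, value in [("txd0-skew-ps", 420), ("txc-skew-ps", 1440), ("rxc-skew-ps", 1680)]:
    code, delay = effective_skew(prop, value)
    print(f"{prop} = <{value}>: register code {code}, effective {delay:+d} ps")
```

Under these assumptions the data and control pads sit at the neutral setting, and only the two clock pads add delay.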
576 Views

Hi,

How did you define the clock skews? Are they the defaults? Where did you get them?

Have you tried a different Linux version, preferably 5.4?

Silvan
Beginner
566 Views

Hi Eberlazare,

The skews are calculated from the trace lengths and FPGA pin delays of our custom board and firmware.

I also tried it once with a very early version of kernel 5.4, and at least in the MAC driver I have not seen any changes since then.

Maybe it is possible to reproduce the issue with an Arria 10 development board? Unfortunately, our development board still has an ES2 device, which is not available in Quartus for building the GHRD. If you have an image for this development board, I could also try to reproduce the issue on that hardware.

552 Views

Hi,

Yes, you may want to reproduce using the default settings from the GHRD. 

Can you share the device part number and the Quartus version you are working on?

Silvan
Beginner
531 Views

We use the device 10AS057K4F40E3SG and Quartus 19.1.0.

488 Views

Hi,

The GHRD used for the 19.1 Arria 10 SoC Dev Kit can be found here:

https://rocketboards.org/foswiki/Documentation/GSRD191ReleaseNotes#Release_Contents

http://releases.rocketboards.org/release/2019.04/gsrd/a10_gsrd/sdimage.tar.gz

The EMAC used in the GHRD is EMAC0/gmac0.

 

 

412 Views

Hello,

Is there any update from your side?

Silvan
Beginner
406 Views

Hi, Thanks for asking.

I ordered the current version of the Arria 10 development board, which allows me to use the provided image directly. I will try to reproduce the issue on this development kit and send you a step-by-step description, so that you can reproduce and fix it on your side.

The development kit is expected to arrive at the end of next week. As soon as I have any news, I will let you know.

Thanks

376 Views

Hi,

My recommendation is to use our GHRD and the latest version of U-Boot with its default device tree settings. I previously tested with the dev kit and the latest U-Boot and kernel from the link below, and I did not see the high latency you mentioned:

https://rocketboards.org/foswiki/Documentation/BuildingBootloader#Arria_10_SoC

 

262 Views

Hi,

Any update on the issue?

Thanks.

Silvan
Beginner
225 Views

Hi Eberlazare,

It seems that the issue is still present in GSRD release 2020.11.

It occurs less often than before, but it can still be observed. Right now, I do not understand the exact root cause. I am trying to find a setup in which the issue is reproducible.

Best regards,

Silvan

162 Views

Hi,

If possible, could you share the setup and test steps needed to reproduce the error using the GHRD on our dev kit?

 

Silvan
Beginner
123 Views

Hi Eberlazare,

The hardware setup is quite simple. The DevKit is directly connected to a PC. Both have a static network configuration, and the data traffic is generated with the tool "iperf3". The PC acts as server and the tool is started there with the command “iperf3 -s”. The DevKit is the client and runs the command “iperf3 -c <PC-IP>”. The attached file "Overview.pdf" shows the setup in more detail.

I created a Python test script that runs on the DevKit for additional tests. The attached archive “TestSequence.zip” contains the script "TestSequence.py". You will find additional test documentation in the script. Right now, I am working with this script and trying to figure out how to reproduce the issue. In some test runs, I see errors (ErrorCnt variable). A general observation is that many “Retransmits” occur, which I don't understand.
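
The attached TestSequence.py is not reproduced in this thread. As an independent illustration of how the retransmit count could be tracked per run, here is a minimal Python sketch that starts the iperf3 client with JSON output (the -J option) and extracts throughput and retransmits from the sender summary. The function names and the 10 second duration are hypothetical, and the sketch assumes a TCP test, where iperf3 reports retransmits under end.sum_sent.

```python
import json
import subprocess


def run_iperf3_client(server_ip, seconds=10):
    """Run one iperf3 TCP client test and return the parsed JSON report."""
    proc = subprocess.run(
        ["iperf3", "-c", server_ip, "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def summarize_iperf3(report):
    """Extract sender throughput (Mbit/s) and TCP retransmits from a report."""
    sent = report["end"]["sum_sent"]
    return {
        "mbit_per_s": sent["bits_per_second"] / 1e6,
        # The field may be missing on platforms that do not report it.
        "retransmits": sent.get("retransmits", 0),
    }
```

Logging summarize_iperf3(run_iperf3_client("<PC-IP>")) for each run, together with a ping measurement taken after the run, would show whether runs with many retransmits are the ones followed by the high-latency state.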

Thanks for your support, Silvan

60 Views

Hi Silvan,

This is an uncommon issue. I will try to reproduce it using our GHRD this week.
