I am using a Cyclone V with dual-Arm processors in an SOC design where we use the HPS MAC to receive packets from a server. We have seen a problem where, using high packet-per-second rates, we see the last packet getting stuck in the MAC on RX.
The way it happens is this:
- Run iperf3 to send 100-byte TCP packets at the highest rate possible (about 77K packets per second) to the Cyclone board
- This runs for 1-2 minutes, with no issues.
- The transmission is stopped
- Wireshark is run/cleared using wiretap devices on the wires going into and out of our board (we use two HPS MACs)
- The transmission is restarted
- The very first packet we see coming out, via Wireshark, is the last packet we sent in the previous transmission. The new stream's packets then start arriving after some delay, caused by the retransmissions and such that the bogus first packet triggers.
The first packet is obviously from the previous run: the port number changes for each run of iperf, and the port number in that packet is from the previous run. The packets after it all have the proper port numbers for this run.
We know the packet is stuck in the MAC because we can set breakpoints in our code, watch the first DMA from the MAC after we restart, and see that it's the bad packet from the previous run. That is, it's not stuck in our system in our own packet buffers or anything like that.
Any ideas on how this could happen? Any possible MAC/DMA configuration that could cause packets to not get pushed out of the MAC at the end of a stream?
I have an update. After doing a run, I check the debug register (0xff700024 for EMAC0) and see that when it's zero, there is no stuck packet and things work as expected. But when it's non-zero, it's always 0x00000120 which indicates that a) The FIFO is not empty and b) the RX state is "receiving frame data."
But at this point, I'm not sending anything. There is nothing on the wire (I use Wireshark to confirm this). The register will not change no matter how long I wait.
So... what could cause the RX state machine to hang like this? It seems as though it doesn't see the EOF.
We have only seen this on EMAC0. Running the same traffic into EMAC1 has never shown an issue, but that was not a terribly robust test: it's so easy to see this on EMAC0 that after trying for an hour on EMAC1 we gave up. And to be clear, we see this most easily by sending small packets (100 bytes) at high rates. "Normal" sized packets at typical rates have no problems.
The "solution" seems to be to set the DFF bit in the DMA operation mode register (0xff701018 for EMAC0). The documentation on this is unclear as to what it really does ("When this bit is set, the Rx DMA does not flush any frames because of the unavailability of receive descriptors or buffers as it does normally when this bit is reset."), but the problem is gone when we set it. We see no new bad behavior, but since it's not clear what flushing the DMA means/does, we don't know what lurks ahead.
Anyone know what this bit really does, internally?
Regarding the transmission stopping: since it's TCP, if the RX DMA isn't working then the responses stop as well, so that behavior looks normal. Flushing the DMA means cleaning out its state in case of overflow; it's there to make sure the system won't hang because of an overflow.
Can you please check the RX buffer size in the Linux driver and increase it to improve RX performance?
Thanks for the information. But...
This happens most quickly with UDP, by the way. No number of buffers would solve this problem (we currently use 64 RX descriptors, and I've tried 256 with the same results); we can send packets into the system much faster than we can process them. And this is bare metal, not Linux (though it happens with Linux, too).
Is it normal that once buffers are exhausted, a packet will sometimes get stuck in the DMA engine forever? That's what I see. It doesn't happen every time all descriptors are filled, but sometimes. And there being no way to flush it seems crazy. The DFF bit seems to fix it, but what does that bit really do? It's not clear how it changes the way the DMA engine works, and if there is no downside to having this bit set, why would that not be the default?
The DMA is made flexible for your use; I don't have an answer for why this is not the default. It isn't normal to have a packet stuck in the descriptors if everything has been set correctly. Can you tell me how you used the DFF bit to fix this? Is everything working fine for you now?