
PCIe freezing

Altera_Forum
Honored Contributor II

Hi, 

 

I have a strange problem with the PCIe IP. I have a custom board with an Arria GX, and the PCIe core is instantiated using the MegaCore flow with the Avalon-ST interface. PCIe works (several DMAs are running, target access works, everything is fine), but occasionally the PC freezes when accessing my device. The design is quite complicated, so I started to minimize it and ended up with a simple PCIe interface to on-chip memory and a SW application that only writes to the device as fast as possible. In this configuration the system freezes after a few seconds. 

 

Using SignalTap I see the following: 

- writes are arriving before the freeze - only rx_xxx activity, nothing else. 

- no interrupt is signalled 

- rx_ready and tx_ready are continuously asserted 

- tx_credits are nonzero 

- when the system freezes, no transactions come out of the IP at all, even though my application is ready to receive. 

 

So I configured the IP to expose the test_out (512 bit) interface to see more, and I found this: 

- at the moment the packets stop coming, bits 313 and 314 are set (received erroneous TLP, packet with wrong sequence number), and they keep being set repeatedly after that moment 

- at the same times bit 85 is also set (data link layer error - TLP error) 

- in reaction to the bad TLP, it seems to me that the IP sends a NAK to the root port (just a few ticks after the first erroneous TLP is received) 

- some ticks after the NAK is sent, the ACK DLLPs stop arriving for some time (I suppose the IP tries to retrain the link) 

 

And that's all - I do not see anything more coming. 
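For anyone who wants to scan a long SignalTap capture for the first error, the bit checks above can be sketched in a few lines of Python. This is only an illustration: the bit positions and their meanings are copied from the post (313 and 314 are assumed to map to the two descriptions in the order given), not verified against the IP user guide, and the input format (one 512-bit sample per hex string, as exported from a capture) and the function names are assumptions.

```python
# Bit positions as reported in the post; NOT checked against the IP documentation.
TEST_OUT_FLAGS = {
    85: "data link layer error: TLP error",
    313: "received erroneous TLP",
    314: "TLP with wrong sequence number",
}


def decode_test_out(sample_hex):
    """Return the names of the error flags set in one 512-bit test_out sample."""
    value = int(sample_hex, 16)
    return [name for bit, name in sorted(TEST_OUT_FLAGS.items())
            if (value >> bit) & 1]


def first_error_sample(samples):
    """Return (index, flags) of the first sample with any flag set, or None."""
    for index, sample_hex in enumerate(samples):
        flags = decode_test_out(sample_hex)
        if flags:
            return index, flags
    return None


if __name__ == "__main__":
    clean = "0" * 128                                  # 512 bits, nothing set
    faulty = format((1 << 313) | (1 << 85), "0128x")   # bits 85 and 313 set
    print(first_error_sample([clean, clean, faulty]))
```

Running the search from the start of the capture, rather than from the freeze, helps to check whether the first flagged sample really is the one seen in the waveform.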

 

At first I thought that it could be some weird problem with the TLP sequencing, but then I discovered that from time to time exactly the same situation occurs (bad TLP sequence number, NAK, re-training of the link) and the IP recovers from that error without any problem - the packets start coming again. 
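As background on why a bad sequence number followed by a NAK is normally recoverable, here is a simplified model of the receiver-side sequence check in the PCIe data link layer. It is a sketch of the general rule (12-bit sequence numbers, a next-expected counter, one outstanding NAK at a time), not of this IP's implementation; the class and method names are made up for illustration, and LCRC checking is reduced to a boolean.

```python
SEQ_MOD = 4096  # TLP sequence numbers are 12 bits wide


class DllReceiver:
    """Simplified receiver-side sequence check (illustrative model only)."""

    def __init__(self):
        self.next_rcv_seq = 0       # sequence number expected next
        self.nak_scheduled = False  # a NAK is outstanding, waiting for the replay

    def receive(self, seq, crc_ok=True):
        """Return (action, seq_in_dllp) for one received TLP."""
        last_good = (self.next_rcv_seq - 1) % SEQ_MOD
        if crc_ok and seq == self.next_rcv_seq:
            # In-order TLP: accept it; this also ends any pending NAK state.
            self.next_rcv_seq = (self.next_rcv_seq + 1) % SEQ_MOD
            self.nak_scheduled = False
            return ("accept", seq)
        if crc_ok and (self.next_rcv_seq - seq) % SEQ_MOD <= SEQ_MOD // 2:
            # Already received earlier (a replayed copy): discard and ACK again.
            return ("duplicate", last_good)
        # Bad TLP: CRC failure or a sequence number ahead of the expected one.
        if self.nak_scheduled:
            return ("drop", last_good)  # NAK already sent, keep waiting
        self.nak_scheduled = True
        return ("nak", last_good)


if __name__ == "__main__":
    rx = DllReceiver()
    for seq in (0, 2, 3, 1, 2):     # TLP 1 is lost, then replayed
        print(seq, rx.receive(seq))
```

In this model the link recovers as soon as the transmitter replays from the TLP after the NAKed sequence number. A hang would therefore point at the replay never arriving or never being accepted, which may be worth checking in the capture.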

 

Do you have any ideas where I should look, or what could cause this problem? Maybe I did not find the real moment where the IP freezes. How can I track/debug it? 

 

Martin
23 Replies
Altera_Forum
Honored Contributor II

Martin, 

 

Have you tried playing with the parameters of the PCIe Soft IP? Like maximum payload size and performance levels? Is your clocking clean?
Altera_Forum
Honored Contributor II

Hey, did you ever figure out what was wrong? I think I have a similar scenario. 

 

 

--- Quote Start ---  

Hi Matthias, 

 

Right now I have simplified everything that is possible - the FPGA design now consists only of the PCIe IP and some virtual pins - rx_ready is '1', rx_mask is '0', there is some reset logic, interrupt and msi are '0', and the tx side is also grounded. 

 

The PC detects my device and I am able to write to it. The driver is now just a "skeleton driver" that only connects to my device and then writes to it, nothing else. After the write sequence is started, the PC freezes - sometimes immediately, sometimes after a few seconds. 

 

I will add some logic to detect non-posted requests, in case some request needs an answer, and I will also create a detector for the error condition described at the very beginning of this thread. But after that I am stuck. 

 

This is already the simplest design that I can make - there is nothing left to remove and it still freezes... Do you have any simple design that has only the Soft PCIe and some driver that communicates with it that you could share with me? Or do you know of any that I could use? Maybe there is some problem with the hardware... 

 

Martin 

--- Quote End ---  
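Regarding the non-posted request detector mentioned in the quote: the request class can be read from the first header DW of each received TLP. Below is a rough Python model of that decode, handy for checking captured headers offline before writing the HDL. It assumes the Gen1/Gen2 header layout (Fmt in bits 30:29, Type in bits 28:24) and that the caller has already assembled DW0 with header byte 0 in bits 31:24; how that maps onto rx_data depends on the IP's Avalon-ST byte ordering, which is not covered here. The function name is made up.

```python
def classify_tlp(dw0):
    """Classify a TLP from its first header DW (illustrative decode only)."""
    fmt = (dw0 >> 29) & 0x3        # bit 1 of Fmt set: TLP carries data
    tlp_type = (dw0 >> 24) & 0x1F
    has_data = bool(fmt & 0x2)

    if tlp_type in (0b01010, 0b01011):        # Cpl/CplD, CplLk/CplDLk
        return "completion"
    if tlp_type == 0b00000:                   # memory request
        return "posted" if has_data else "non-posted"   # MWr vs MRd
    if tlp_type == 0b00001 and not has_data:  # MRdLk
        return "non-posted"
    if tlp_type in (0b00010, 0b00100, 0b00101):  # IO, Cfg type 0, Cfg type 1
        return "non-posted"
    if (tlp_type >> 3) == 0b10:               # Msg/MsgD
        return "posted"
    return "unknown"


if __name__ == "__main__":
    for name, dw0 in (("MWr", 0x40000001), ("MRd", 0x00000001),
                      ("CfgRd0", 0x04000001), ("CplD", 0x4A000001)):
        print(name, classify_tlp(dw0))
```

Any header classified as non-posted needs a completion from the endpoint. With the tx side grounded, a single memory read or configuration request reaching the application layer would go unanswered, so it may be worth ruling that out first.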

Altera_Forum
Honored Contributor II

Hi, unfortunately no. I tried to check the design here and also with Altera support, and nobody was able to figure out where the problem could be. 

 

 

Martin