Re: request timeouts in PCIE

Altera_Forum · ‎01-07-2012

Hi,

I am developing a DMA engine for PCIE and want to know how to implement timeouts for PCIE requests. I have some queries regarding timeouts in PCIE:

1. Do we implement them through counters to wait for a specific period to get completion of a read request?

2. Does Altera's PCIe core implement timeout counters internally?

Please share some ideas or useful links about how we can account for request timeouts of PCIE.

Thanks

Altera_Forum · ‎01-08-2012

I implemented a timeout counter and return 0xDEADBEEF as data when the timeout occurs.

Without that, my machine hangs when the PCIe read is not completed, and I have to power-cycle it.

Altera_Forum · ‎01-08-2012

Thanks for the reply, Hbair.

Did you implement separate counters for each of the n supported back-to-back read requests?

And a timeout counter is nothing but a simple counter, right?

Altera_Forum · ‎01-08-2012

I handle the tags and timeout functionality centrally: All blocks/entities that can issue non-posted (read) requests have to ask for a tag from my tag management block.

Upon issuing a non-posted request, the tag management block remembers (roughly) when you issued the request, and every now and then – i.e. with low priority – I check the table for requests that timed out. In my tables, I store two bits per pending request:

00 → Issued between 0 and 12.5 ms,

01 → Issued between 12.5 and 25 ms,

10 → Issued between 25 and 37.5 ms,

11 → Issued between 37.5 and 50 ms.

Say you have a request that was issued in timeout group “10”: you would drop it in the sweep that runs while the current timeout counter is at “01”. At that point the request is at least 25 ms and less than 50 ms old.
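The two-bit scheme above can be sketched as a small software model. This is only an illustration, not the poster's actual logic: the class and method names are hypothetical, the tag count of 32 is an assumption, and the sweep is assumed to run right at each phase change (the post describes a low-priority sweep that may run later within the phase).

```python
PHASE_MS = 12.5   # length of one timeout group, taken from the post
NUM_PHASES = 4    # two bits per pending request

class TagTable:
    """Hypothetical central tag manager with a 2-bit issue stamp per tag."""

    def __init__(self, num_tags=32):       # 32 tags is an assumption
        self.free = list(range(num_tags))
        self.pending = {}                   # tag -> phase (0..3) at issue time
        self.phase = 0                      # free-running 2-bit phase counter

    def allocate(self):
        """Hand out a tag stamped with the current phase; None if none is free."""
        if not self.free:
            return None
        tag = self.free.pop(0)
        self.pending[tag] = self.phase
        return tag

    def complete(self, tag):
        """The final completion arrived in time: release the tag."""
        if self.pending.pop(tag, None) is not None:
            self.free.append(tag)

    def tick(self):
        """Call once per PHASE_MS: advance the phase, then sweep for timeouts."""
        self.phase = (self.phase + 1) % NUM_PHASES
        # A request stamped "10" expires when the counter reaches "01",
        # i.e. when the counter is three phases past the stamp.
        expired = [t for t, p in self.pending.items()
                   if (self.phase - p) % NUM_PHASES == NUM_PHASES - 1]
        for tag in expired:
            del self.pending[tag]
            self.free.append(tag)
        return expired
```

The list returned by `tick()` stands in for the separate timeout port mentioned in the post: each expired tag would be reported to the block that issued it.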

The blocks that issue read requests surely read the AST RX port to see read data completions, but with respect to the timeout functionality, they now need a separate port that receives timeout events of such requests.

A different approach might be beneficial if the blocks that issue read requests handle the used tags on their own, as is done for the descriptor table in the SGDMA example, which uses a specific, constant reserved tag for all non-posted requests. In this case, all the tag management is handled in a distributed manner, so one could have a central trigger that goes high once per 12.5 milliseconds, and each block could simply count these ticks and drop the request when it has received three of them.
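A minimal sketch of the distributed variant, assuming each block has at most one request in flight and receives the shared 12.5 ms tick. The class and method names are hypothetical. With three ticks, a request is dropped when it is between 25 ms and 37.5 ms old.

```python
class ReadRequester:
    """Hypothetical block that owns its tag and counts shared ticks itself."""

    TIMEOUT_TICKS = 3   # from the post: drop after three ticks

    def __init__(self):
        self.busy = False
        self.ticks_seen = 0

    def issue(self):
        """A read request goes out: start counting ticks from zero."""
        self.busy = True
        self.ticks_seen = 0

    def on_final_completion(self):
        """All data arrived in time: stop counting."""
        self.busy = False

    def on_tick(self):
        """Called on every central tick; returns True if the request timed out."""
        if not self.busy:
            return False
        self.ticks_seen += 1
        if self.ticks_seen >= self.TIMEOUT_TICKS:
            self.busy = False
            return True
        return False
```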

This was the technical way of detecting and distributing a timeout event properly. The second step is to design your application in such a way that these timeout events can be handled gracefully in terms of your application, like re-issuing the requests, indicating errors or just dropping higher-level transactions. This is very application-dependent, and sometimes it’s the harder part to design properly, e.g. flushing a queue that already received part of the full read request response when a timeout is detected.

– Matthias

Altera_Forum · ‎01-08-2012

No.

Yes, a simple counter.

Altera_Forum · ‎01-09-2012

Remember that the timeout does not stop when some data is received, and it is not retriggered when a matching completion is received.

The timeout starts when the request is issued, and it is only stopped if all data was received in time, i.e. after the final completion.
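As a rough illustration of this rule, the sketch below fixes the deadline when the request is issued and never moves it; only the final completion ends the wait. The names are hypothetical, and the 50 ms default is an assumption borrowed from the upper bound mentioned earlier in the thread.

```python
class PendingRead:
    """Hypothetical record for one outstanding read request."""

    def __init__(self, length, issued_at_ms, timeout_ms=50.0):
        self.remaining = length                       # bytes still expected
        self.deadline_ms = issued_at_ms + timeout_ms  # fixed at issue time

    def on_completion(self, nbytes):
        """Account for received data. The deadline is deliberately not touched.

        Returns True once the final completion has arrived.
        """
        self.remaining -= nbytes
        return self.remaining <= 0

    def timed_out(self, now_ms):
        """True if data is still missing once the original deadline has passed."""
        return self.remaining > 0 and now_ms >= self.deadline_ms
```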

If there is not a high degree of request–completion parallelism in your application, say, if you have at most one request in flight at any time, you'll surely be better off implementing a single timeout counter.

– Matthias

Altera_Forum · ‎01-09-2012

ali_umair21,

Have you looked at the PCIE Capabilities register? There's a Completion Timeout field that you can program. It's also programmable through the PCIE core instantiation GUI. Just look up completion timeout in the PCIe Compiler Guide. Hope this helps.

Altera_Forum · ‎01-09-2012

Thanks to all of you....

@alybruin: Yes, I did see that, but as I understand it, that is only for Gen2.

@Matthias Wächter: Thanks for sharing some great ideas....

Now I am planning for out-of-order completions. I want to ask: is it possible to get split completions for a read request whose requested size is <= the maximum supported payload size?

Actually, my idea is to split the DMA read into n read requests of the maximum supported TLP size, each with its own tag. For example, if a 4 KB DMA read is to be performed and the maximum payload size is 128 bytes, I will generate 32 read requests (128 bytes requested each) with separate tag values. If the completion for each read request cannot be split further, I can use the tag to sort the out-of-order completions. Once all the completions are placed in the correct order in a buffer, the task is done.

Please share your ideas about how we can handle out-of-order completions more efficiently.

Thanks

Altera_Forum · ‎01-09-2012

Yes, that’s possible. But be warned: not only do you have to split your 4k read into maximum-payload-size requests, but the completer – typically the north bridge – can split its responses again at 64-byte boundaries!

Remember that the completions for different requests – i.e. different tag ids – can be received out of order, but the partial completions for the same request will be received in order.
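The ordering rules above suggest a simple reassembly scheme: keep one write cursor per tag and append each partial completion at that cursor. The sketch below is a software model under those assumptions; `split_read` and `Reassembler` are hypothetical names, and the 4 KB / 128-byte figures in the usage note come from the question above.

```python
def split_read(total_len, max_req):
    """Return {tag: (offset, length)} for a read split into max_req-byte requests."""
    return {tag: (off, min(max_req, total_len - off))
            for tag, off in enumerate(range(0, total_len, max_req))}

class Reassembler:
    """Hypothetical buffer that places completions by tag."""

    def __init__(self, total_len, max_req):
        self.buf = bytearray(total_len)
        self.reqs = split_read(total_len, max_req)
        # Per tag: next write position and number of bytes still expected.
        self.cursor = {t: off for t, (off, _) in self.reqs.items()}
        self.left = {t: n for t, (_, n) in self.reqs.items()}

    def on_completion(self, tag, data):
        """Store one (possibly partial) completion for the given tag.

        Relies on partial completions of the same tag arriving in order.
        Returns True once every request has been fully completed.
        """
        pos = self.cursor[tag]
        self.buf[pos:pos + len(data)] = data
        self.cursor[tag] = pos + len(data)
        self.left[tag] -= len(data)
        if self.left[tag] == 0:
            del self.left[tag]     # this tag is done and could be reused
        return not self.left
```

With `Reassembler(4096, 128)` this yields 32 requests with tags 0 to 31. Completions may then arrive in any tag order, and each may come as two 64-byte pieces, without any sorting step at the end.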

One of the most important aspects of high performance DMA read transfers is to perform them in an interleaved way, i.e. issue new read requests while old read requests are still pending. This way you can reach the maximum PCIe transfer rate with little CPU overhead.

Note that the PCIe hard/soft IP tells you the maximum allowed read request size in one of the PCI(e) configuration space registers that are repeatedly distributed on the tl_* signal outputs. But as an educated guess, you could choose a maximum of 128 bytes and so avoid this optimization path.

Remember that once you intend to have multiple read requests pending, you should take good care of managing your read request credit, so as not to overrun your completion reception buffers and not to overrun the system by monopolizing the ‘bus’, i.e. the PCIe switches and the north bridge. My recommended reading on this subject is still http://www.xilinx.com/support/documentation/user_guides/v6_pcie_ug517.pdf, Appendix E. And, as the subject is still about request timeouts, you have to watch your requests closely and time them out properly. Handling the read request credit properly in this case is a pitfall, though.
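A very reduced sketch of the buffer side of read request credit: a new read is only issued if its completion data is guaranteed to fit in the receive buffer. The class name and the byte-granular accounting are assumptions for illustration. What to do with the credit of a timed-out request is deliberately left out, since that is exactly the pitfall mentioned above and depends on the application.

```python
class ReadCredit:
    """Hypothetical gate that limits outstanding read data to the buffer size."""

    def __init__(self, buffer_bytes):
        self.capacity = buffer_bytes   # size of the completion receive buffer
        self.outstanding = 0           # bytes requested but not yet drained

    def try_issue(self, length):
        """Reserve space for a read of `length` bytes; False if it would not fit."""
        if self.outstanding + length > self.capacity:
            return False
        self.outstanding += length
        return True

    def on_data_drained(self, nbytes):
        """Completion data has left the receive buffer: return its credit."""
        self.outstanding -= nbytes
```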

– Matthias

Altera_Forum · ‎07-23-2012

A -> B -> C, where B is a normal PCIe switch.

I have a question: if A sends a read request to C and, for some reason, C does not respond and the TLP stays in B, then after a while A raises a completion timeout. What happens to the read request TLP in B? Should B discard the TLP? And who can notify B to do so?

thanks a lot