I have an SoC (see attachments). Write speed exceeds read speed (to/from internal RAM and external DDR) by up to 6 times. I compiled the project in Quartus 11.1 and 12.1 and loaded it onto Cyclone IV and Arria II devices, with the same result in every case. In SignalTap I can see that the read cycle on the local bus (internal Avalon) completes very quickly, but the response from the PCIe compiler is delayed by 2 us (>20 cycles at 100 MHz; I measured the response using the 'test_out' signals). None of the compiler options help. How can I increase the read speed over PCIe without using DMA?
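
To give a rough sense of scale, here is a back-of-the-envelope sketch of the read throughput that a 2 us response delay would allow. It assumes each read is a single 32-bit (4-byte) request and that the next read is not issued until the previous response has arrived; both are assumptions for illustration, not something measured in SignalTap. The function and constant names are hypothetical.

```python
def blocking_read_throughput(latency_s, bytes_per_read):
    """Upper bound on throughput, in bytes per second, when every read
    waits for its response before the next read is issued."""
    return bytes_per_read / latency_s


LATENCY_S = 2e-6      # response delay reported in the post (2 us)
BYTES_PER_READ = 4    # ASSUMPTION: one 32-bit word per read request

rate = blocking_read_throughput(LATENCY_S, BYTES_PER_READ)
print(f"{rate / 1e6:.1f} MB/s")
```

Under these assumptions the delay alone limits reads to about 2 MB/s, regardless of how quickly the Avalon side finishes its read cycle.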