
A design based on the PCIe DMA transfer example design for Arria 10 device.

Sijith
New Contributor I

Hi,

This is an extension of the discussions at https://community.intel.com/t5/Programmable-Devices/Modifying-the-PCIe-DMA-transfer-example-design-for-Arria-10/m-p/1484799#M90698 and https://community.intel.com/t5/FPGA-Intellectual-Property/API-calls-failed-while-running-PCIe-DMA-transfer-example-design/m-p/1520494#M28014, where we could not get a solution to the problem we are facing.

We also tried to get Intel Premium help (we are based at a university in the USA, but purchased the FPGA and Intel Quartus Prime Pro at the normal rate, NOT through the reduced university rate). But that was rejected by Intel, who explained that since we are from a university, we are not eligible for Premium help. It would be great if you could suggest a way to resolve the issue.

Story in short: We are working on a project to use the FPGA to increase the data throughput in an experiment. We plan to use the FPGA as an intermediate stage in signal transmission over an optical fiber from the electronic readout to the data acquisition system. So we need a way to transfer the signal through the FPGA (input through the QSFP+ port and output through PCIe). We plan to develop a design based on the PCIe DMA transfer example design (which uses the DDR4 memory to store data). In the example design, data is created on a host computer and written into the DDR4 memory by a DMA write over PCIe. It is then read back to the host computer to verify that the received data is the same as the sent data.

What we need in our project is to stream data from QSFP+ to the DDR4 memory and then DMA-read it over PCIe to the host computer using the API provided.

As an initial step (to test the FIFO + PCIe DMA transfer example design, since our final aim is to get the data from QSFP+ through a FIFO into the DDR4), we connected a custom data counter IP that counts up to 1000 (started by a trigger from switch SW[0]) to an Avalon FIFO IP. This design was then integrated into the PCIe DMA transfer example design using Platform Designer. The idea is to stream the data created by the counter through the FIFO to the DDR4 memory, and then DMA-read it over PCIe to a host computer. But when we do a DMA read from the host computer, we do not see the counter outputs (which we are supposed to get?).
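One way to make "we do not see the counter outputs" concrete on the host side is to scan the buffer returned by the DMA read for places where the incrementing pattern breaks. The following is only a minimal Python sketch: the function name, the 32-bit word width, and the little-endian byte order are assumptions and are not taken from the example design or its API. The buffers here are stand-in data, not real DMA output.

```python
import struct


def find_sequence_breaks(buf: bytes, word_bytes: int = 4):
    """Return (index, previous, current) for every word that is not previous + 1."""
    codes = {4: "I", 8: "Q"}  # unsigned 32-bit or 64-bit words
    n_words = len(buf) // word_bytes
    # "<" means little-endian; this byte order is an assumption about the design.
    words = struct.unpack_from("<%d%s" % (n_words, codes[word_bytes]), buf)
    return [
        (i, words[i - 1], words[i])
        for i in range(1, n_words)
        if words[i] != words[i - 1] + 1
    ]


# Stand-in data: in a real test, the buffer would come from the DMA read call.
good = struct.pack("<8I", *range(8))          # words 0, 1, 2, ... 7
junk = bytearray(good)
struct.pack_into("<I", junk, 8, 0xDEADBEEF)   # overwrite word 2 with junk

print(find_sequence_breaks(good))             # no breaks in a clean counter ramp
print(find_sequence_breaks(bytes(junk)))      # breaks entering and leaving word 2
```

If the whole buffer is junk, nearly every word is reported; if only the start or a periodic stride is wrong, the reported indices hint at an offset or alignment problem rather than a missing data path.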

It would be great if you could help us out. Any suggestion on where to get help to make our design work is highly appreciated.

 

33 Replies
FvM
Valued Contributor III
Hi,
I reviewed the previous thread, but I didn't understand a simple point.
Intel provides a complete DMA example design with a test application on the host side. Did you try to run the test framework, and did it work for you?
Sijith
New Contributor I

Yup, the example design provided by Intel works fine (even though I had some problems initially, a kind of random crash. I assume it is somehow related to the physical DDR4 memory element: when I re-insert the physical card, the problem resolves).

The real issue is when I modify the design. When I added a FIFO + counter to the design (I used Platform Designer to integrate the FIFO + counter IP into the DMA transfer example design), the test application (I removed the write part from the test application and retained the read part) simply fails. By "fails" I mean that I can read data, but all of it is junk. A screen capture of running the API application from the host computer is attached.

Platform Designer view of the modified system (DMA transfer example + FIFO + counter): https://community.intel.com/t5/Programmable-Devices/How-to-read-data-from-the-DDR4-memory-of-a-Modified-PCIe-DMA/m-p/1468923#M90078 Please go through the screenshots (PNG files) attached to my first message in that thread.

VenTingT
Employee

Hi @Sijith,


Thanks for reaching out to the Intel Community Forum.


Can you check on the counter and FIFO to ensure that they're working properly by using Signal Tap?


Thanks.

Best Regards,

VenTing_Intel


Sijith
New Contributor I

Hi,

I simulated the counter and FIFO separately using ModelSim, and the simulation looks all good. Just curious, would you still like me to try Signal Tap?

FvM
Valued Contributor III

Hi,
using Signal Tap to trace the data sent to the DMA interface would also be my suggestion. Find out if and where it is corrupted in the framework.

I have used it successfully to locate a data alignment problem in the PCIe hprxm_master: https://community.intel.com/t5/FPGA-Intellectual-Property/PCIe-hprxm-master-doesn-t-handle-unaligned-reads-correctly/m-p/1498943

FvM
Valued Contributor III

I reviewed the modified design with counter and FIFO, but I can't even detect a valid data path from the FIFO to dma_wr_master. Data seems to run through the DDR interface, although the DDR is operated at a different clock. No idea why it's connected this way.

Sijith
New Contributor I

Hi FvM,

Thank you very much for looking into the design. I am new to hardware design and come from a basic science background, so any suggestion to improve/correct the modified design is highly appreciated (this is what I am looking for too).

My present design is based on my assumption that the counter-FIFO data can be written directly to the DDR4 memory without involving a DMA write (I thought a DMA write is only involved when transferring data from the host computer to the DDR4, as in the example design, where we create data on the host computer and write it to DDR4). Please correct my understanding if this is not possible in this case.

What I really want is to stream the data generated by the counter (which is switched on/off with an external switch) through a FIFO (I used the Intel Avalon MM FIFO IP, with streaming input and MM output). The data should then go to the PCIe DMA transfer example design and be written to the DDR4, and then be read from the host computer through the API functions provided.

Any suggestion for this is highly appreciated.

 

 

 

VenTingT
Employee

Hi @Sijith,


Thanks for your feedback.


Yes, please try Signal Tap as well, to test the real-time signal behavior of the counter and FIFO.

Besides, have you tried to run the example design without modification? Can the data be read correctly?


Thanks.

Best Regards,

VenTing_Intel


Sijith
New Contributor I

Sure! It would be great if you could have a look at my design (the signal connections from the counter FIFO to the PCIe DMA example design) to make sure the connections make sense (please see the PNG files attached at the link in my first message). If they do not, any suggestions to improve them are highly appreciated. (FYI: What I really want is to stream the data generated by the counter (which is switched on/off with an external switch) through a FIFO (I used the Intel Avalon MM FIFO IP, with streaming input and MM output). The data should then go to the PCIe DMA transfer example design and be written to the DDR4, and then be read from the host computer through the API functions provided.)

Also, I have tried running the example design, and it ran fine at that time.

VenTingT
Employee

Hi @Sijith,


As per your modification, you want to use the counter to generate data and stream the data to DDR4 through the FIFO. But when looking into your qsys file (in the attached screenshot), I found that the Avalon FIFO is not connected to the DDR4. May I know why the Avalon FIFO is connected directly to the DMA instead of the DDR4? The data from the FIFO should first be stored in the DDR4, right?


You may try to generate the Example Design from the External Memory Interfaces Intel Arria 10 IP to view the connections between the DDR4 and the data generator.

To keep your EMIF IP configuration settings, open the .qsys of your current design in Platform Designer, click the EMIF IP, and click Generate Example Design in the Parameters tab.


Since you were able to run the example design without modification previously, the DMA is working. Next, we need to check the data transmitted from the data generator (counter) to ensure it is correctly stored in the DDR4. Can you please run Signal Tap to check the DDR4 as well?


Thanks.

Best Regards,

VenTing_Intel


VenTingT
Employee

FIFO Connection in Qsys.png

Sijith
New Contributor I

Hi, 

Thank you very much for your reply. May I know whether you are talking about the .qsys file from the attached ModifiedDesign.zip (community.intel.com/t5/Programmable-Devices/Modifying-the-PCIe-DMA-transfer-example-design-for-Arria-10/m-p/1477669#M90418) or some other screenshot attachments?

Actually, my modification is the first of a couple of modifications (a step-by-step approach). After the current step (testing the counter-FIFO write to DDR4 and then a DMA read of the DDR4 data from a host computer), I would like to replace the counter data with an input data stream injected from an external source through the QSFP+ port into the FIFO (the input data rate is around 1.5 GB/s).

So involving the DMA controller to write data to DDR4 at this step should help to efficiently handle writing the high-speed input data (1.5 GB/s in the next step of my project) to DDR4, followed by a DMA read from DDR4 by the host computer, without any data loss. Does that make sense to you? If not, please let me know. It would also be great if you could give me any ideas/suggestions that could help.
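As a rough sanity check on the 1.5 GB/s figure, the input rate can be compared with the raw PCIe link rate and with how long a DDR4 buffer lasts if the host drains it more slowly than it fills. This is a back-of-the-envelope Python sketch under stated assumptions: a Gen3 x8 link and a 2 GiB buffer are hypothetical values chosen for illustration, the function names are made up, and the raw link rate ignores packet overhead, so real DMA throughput will be lower.

```python
def pcie_gen3_raw_bytes_per_s(lanes: int) -> float:
    """Raw PCIe Gen3 link rate: 8 GT/s per lane with 128b/130b encoding."""
    return lanes * 8e9 * (128 / 130) / 8


def seconds_until_full(buffer_bytes: float, in_rate: float, out_rate: float) -> float:
    """Time until a buffer overflows when it fills faster than it drains."""
    net = in_rate - out_rate
    if net <= 0:
        return float("inf")  # the buffer never fills
    return buffer_bytes / net


INPUT_RATE = 1.5e9           # bytes/s, the input rate quoted in the post
LANES = 8                    # assumption: Gen3 x8 link
BUFFER_BYTES = 2 * 2**30     # assumption: hypothetical 2 GiB DDR4 buffer

link = pcie_gen3_raw_bytes_per_s(LANES)
print(f"raw link rate : {link / 1e9:.2f} GB/s")
print(f"input / link  : {INPUT_RATE / link:.1%}")
for drain in (2.0e9, 1.5e9, 1.0e9):  # hypothetical sustained host read rates
    t = seconds_until_full(BUFFER_BYTES, INPUT_RATE, drain)
    print(f"drain {drain / 1e9:.1f} GB/s -> buffer full after {t:.2f} s")
```

The point of the second function is that the DDR4 only absorbs bursts: if the sustained DMA read rate stays below the input rate, data loss is a matter of time regardless of buffer size.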

Sure, I will try generating the EMIF example design as you suggested (could you please elaborate a bit on how the EMIF example design will help us?).

VenTingT
Employee

Hi @Sijith,


Yes, I obtained the .qsys file from the attached ModifiedDesign.zip in the previous thread. May I know whether you have tried running Signal Tap to check the DDR? What did you observe?


The objective of generating the EMIF example design is to see how the connections are made so that the DDR stores the incoming data, because your qsys connections seemed off.


Since your next step is to replace the counter with an external source for the input data, I'd suggest you go directly to that step, and then we can debug from there, because replacing the counter with the external source may or may not create other issues.


But if you want to use the counter to generate data and stream the data to DDR4 through the FIFO, I think the counter should connect to the DDR4.

The unmodified example design, which you've run successfully, already performs the DMA read and write operations to DDR4.


May I know which protocol you used to feed the data stream to the FIFO and to the DDR?


Thanks.

Best Regards,

VenTing_Intel


Sijith
New Contributor I

Hi,

Thank you very much. I have not finished running Signal Tap yet. I will update you with the result very soon.

For feeding data from an external source, I first have to get that design and develop a data transfer strategy (QSFP+ to the FIFO and then to the DMA example design). So I thought using the counter to test the data transfer from the FIFO to the DMA transfer example design would be a quick validation I could do first.

In the reply above (community.intel.com/t5/FPGA-Intellectual-Property/A-design-based-on-the-PCIe-DMA-transfer-example-design-for-Arria/m-p/1560274#M28411), FvM mentioned the absence of a "valid data path from fifo to dma_wr_master". That is needed only when we connect the FIFO to the DMA, right? Or am I missing something?

The correct way to connect the FIFO to DDR4 is through the EMIF, right?

I have used Avalon Streaming from the counter to the FIFO input and Avalon MM from the FIFO output to the DDR4.

 

 

FvM
Valued Contributor III

Hi,
I didn't realize that you want to buffer counter data in DDR4 memory. But in any case, it seems that data is sent across clock domains without synchronization. The counter and FIFO are running in the PCIe (250 MHz) domain and the DDR4 in the emif_usr_clk domain. I also don't recognize at first sight how the DDR4 data path is multiplexed between the clock crossing bridge and the counter FIFO, and how write control for the counter is achieved.
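For background on why an unsynchronized crossing between two clock domains corrupts data: a multi-bit value sampled by an unrelated clock can be caught while several bits are changing at once. A common general remedy, not something taken from the design in this thread, is a dual-clock FIFO whose read and write pointers are Gray-coded, so that only one bit changes per increment. The Python sketch below only illustrates that one-bit property; the function names are illustrative.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray code back to binary by XOR-ing all right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Compare how many bits flip on each increment of a 4-bit pointer.
for i in range(6, 10):
    nxt = i + 1
    print(
        f"{i:2d}->{nxt:2d}  binary flips {bits_changed(i, nxt)} bit(s), "
        f"gray flips {bits_changed(bin_to_gray(i), bin_to_gray(nxt))} bit(s)"
    )
```

With a plain binary pointer, the step from 7 to 8 flips four bits, so a receiver sampling mid-transition can see any mix of old and new bits. With Gray coding, the receiver sees either the old or the new pointer value.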

VenTingT
Employee

Hi @Sijith,


Thanks for your update on the Signal Tap progress. 


Yes, the FIFO connects to the DDR4 through the EMIF IP, as the EMIF IP is used to interface with the memory devices.


Thanks.

Best Regards,

VenTing_Intel


Sijith
New Contributor I

Thank you VenTing,

It would be great if you could have a look at the Platform Designer connections of the modified design to check that all connections make sense for my design purpose (which I mentioned earlier). In the meantime, I will update you on Signal Tap.

VenTingT
Employee

Hi @Sijith,


I've looked into your modified design Qsys connection in Platform Designer. 


Here are my findings and inquiries:


1. I'd like to ask where you store the counter value/source data. From which memory (RAM) can the Arria 10 Hard IP for PCIe DMA-read it?

For a normal DMA write operation, the Write Data Mover requires the source address from which the data is to be read. The write descriptor table requires the following information:

• Descriptor ID, ranging from 0-127

• Source address

• Destination address

• Size

You may refer to the user guide on the DMA Write operation: https://www.intel.com/content/www/us/en/docs/programmable/683425/18-0/datasheet.html


2. You can connect the FIFO to the EMIF IP to store the source data in the DDR4.

You may refer to the EMIF IP Example Design; I included the steps to create it in my previous reply.

There is a clear Qsys connection in the Example Design between the data generator (the Traffic Generator, tg, in the Example Design) and the EMIF IP, which you can use as a reference when connecting the data generator to the EMIF IP in your design.


3. After you confirm that the FIFO to EMIF connection is correct, you may test the DMA read/write. Before that, please check the DDR with Signal Tap to ensure that the data from the source data generator is correctly stored in the DDR.

If the DMA operation is not working, please compare your design with the PCIe reference design. In this reference design, there is a clear Qsys connection between the Arria 10 Hard IP for PCIe and the EMIF IP for you to refer to. The reference design also has DMA transfers, and it performs the same operation that you want in your modified design, which is DMA read/write from DDR.

Reference Design: https://www.intel.com/content/www/us/en/design-example/714949/intel-arria-10-fpga-pcie-3-0-x8-dma-design-example.html

Reference Design User Guide: https://www.intel.com/content/www/us/en/docs/programmable/683554/18-0/an-829-mm-dma-reference-design.html
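The descriptor fields listed in point 1 (descriptor ID, source address, destination address, size) could be modeled on the host side roughly as follows. This is a minimal Python sketch and not the descriptor layout from the user guide: the class, the field names, `build_table`, the addresses, and the chunk size are all hypothetical, and the unit of `size` is deliberately left open. Only the four fields and the ID range 0-127 come from the reply above.

```python
from dataclasses import dataclass

MAX_DESCRIPTOR_ID = 127  # IDs range from 0 to 127, per the reply above


@dataclass(frozen=True)
class WriteDescriptor:
    """Hypothetical host-side model of one DMA write descriptor."""
    descriptor_id: int
    source_address: int       # where the Write Data Mover reads the data from
    destination_address: int  # where the data is written to
    size: int                 # unit not specified here; check the user guide

    def __post_init__(self):
        if not 0 <= self.descriptor_id <= MAX_DESCRIPTOR_ID:
            raise ValueError(f"descriptor ID {self.descriptor_id} outside 0-127")
        if self.source_address < 0 or self.destination_address < 0:
            raise ValueError("addresses must be non-negative")
        if self.size <= 0:
            raise ValueError("size must be positive")


def build_table(src_base: int, dst_base: int, total: int, chunk: int):
    """Split one transfer of `total` units into descriptors of at most `chunk`."""
    table = []
    offset = 0
    while offset < total:
        n = min(chunk, total - offset)
        table.append(WriteDescriptor(len(table), src_base + offset, dst_base + offset, n))
        offset += n
    return table


# Hypothetical addresses and sizes, for illustration only.
for d in build_table(src_base=0x0000_0000, dst_base=0x1000_0000, total=10, chunk=4):
    print(d)
```

The sketch makes the question in point 1 explicit: every descriptor needs a source address, so the counter data must already sit at a known address in some memory before a DMA write can move it.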


Lastly, if you're still failing to get the correct data value from DDR using DMA after following the connections in the above-mentioned Design Example and Reference Design, please send the Signal Tap result to clarify whether the issue is coming from the PCIe to DDR, or counter to DDR. Then, we can continue to debug from there.


I hope the above findings help you proceed with your project.


Thanks.

Best Regards,

VenTing_Intel


VenTingT
Employee

Hi @Sijith,


May I know if you have further questions on this case?


Thanks.

Best Regards,

VenTing_Intel


Sijith
New Contributor I

Hi VenTing,

Thank you very much. Actually, I am facing some issues getting Signal Tap running. I would like to go over the procedure I followed with you. Please correct me if I went wrong in some step or am missing something.

1) Opened my counter + FIFO + DMA transfer example design (basically the modified design we have been talking about so far) in Quartus Prime Pro.

2) Opened Platform Designer, added the Signal Tap IP, and connected it to the design. Then validated system integrity and generated the HDL (by clicking "Generate HDL" in the GUI), then compiled the whole design in Quartus Prime Pro.

3) Opened the Quartus Prime Signal Tap Logic Analyzer and saved the .stp file. In the Signal Tap GUI, searched for nodes, but could not see any.

Am I missing some step? Do I need to add some info to the .stp file manually?

 
