
How Can I set up DMA operation with my own PC software application?

Altera_Forum
Honored Contributor II
8,054 Views

Hi All: 

 

I want to reuse the pcie_highperformancedesign example provided with the Arria GX Development Kit, but I am confused by the PC software application altpcie_demo.exe.

I am trying to make the FPGA initiate DMA read and write operations from my own PC software application, just as altpcie_demo does, but so far without success.

 

First, I used Jungo WinDriver to generate a PCIe driver. With the API functions provided by the driver I can read and write the configuration registers, memory BAR 1:0 (the syncram), and BAR2 (the DMA control registers).

 

Second, I create a read descriptor table (a header plus 2 descriptors) and set the data (length, EP memory address, RC memory address) for each descriptor. The header has four DWORDs (DW0, DW1, DW2, DW3). For DMA read, I set DW0=0x00040002, DW1=0, DW2=address of the header, DW3=0x1. Then I write DW0 to BAR2+0x10, DW1 to BAR2+0x14, DW2 to BAR2+0x18, and DW3 to BAR2+0x1C.
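The header values quoted above can be packed with a small helper. This is only a sketch: the function name and struct are hypothetical, and the field layout (control bits in DW0[31:16], descriptor count in DW0[15:0], table address split across DW1/DW2, index of the last descriptor in DW3) is an assumption inferred from the values in this post, not a confirmed register map.

```c
#include <assert.h>
#include <stdint.h>

/* Four DWORDs of a chaining-DMA descriptor header, as described in the post. */
typedef struct {
    uint32_t dw[4];
} dma_header_t;

/* Hypothetical helper: pack the header DWORDs.
 * Assumed layout (inferred from the post, not from a datasheet):
 *   DW0 = control[31:16] | number of descriptors[15:0]
 *   DW1 = upper 32 bits of the descriptor table bus address
 *   DW2 = lower 32 bits of the descriptor table bus address
 *   DW3 = index of the last descriptor (RCLast) */
static dma_header_t make_dma_header(uint16_t control, uint16_t num_desc,
                                    uint64_t table_bus_addr)
{
    dma_header_t h;
    assert(num_desc > 0);
    h.dw[0] = ((uint32_t)control << 16) | num_desc;
    h.dw[1] = (uint32_t)(table_bus_addr >> 32);
    h.dw[2] = (uint32_t)(table_bus_addr & 0xFFFFFFFFu);
    h.dw[3] = (uint32_t)num_desc - 1u;
    return h;
}
```

With control 0x0004, two descriptors, and a table address below 4 GB, this reproduces DW0=0x00040002, DW1=0, and DW3=0x1 from the post.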

 

My first question: at which memory address can I poll the RCLast value to detect completion of the DMA read?

 

Third, I want to transfer the DMA read data back to the PC. I create a write descriptor table (a header plus 2 descriptors). In each descriptor, I set the PC memory address for the write-back data and the EP memory address. The header has four DWORDs (DW0-DW3). For DMA write, I set DW0=0x00050002, DW1=0, DW2=address of the header, DW3=0x1. Then I write DW0 to BAR2+0x0, DW1 to BAR2+0x4, DW2 to BAR2+0x8, and DW3 to BAR2+0xC.

 

Finally, I checked the write-back data and found that it is all zeros. It seems that the FPGA does nothing at all.

 

What are the detailed steps I should follow to set up the DMA operation correctly? I read the PCI Express Compiler documentation but did not find enough information about the software side.

 

Thanks a lot for any help.
33 Replies
Altera_Forum

 

--- Quote Start ---  

Am I correct that DMA operates on the End Point memory mapped to BAR[0], or is the memory involved in DMA located elsewhere?

 

Also, for individual DWORD read/writes to RC_SLAVE memory, must I set USE_RC_DIRECT_MEM to 1? 

 

altpcierd_rc_slave.vhd: 

 

USE_EP_MWR := 0;-- Allow EP to issue MemWr to RC on command 

USE_RC_MWR_MRD := 1; -- Allow RC access to EP MEM thru opcode regs 

USE_INIT_MEM: INTEGER := 0; 

USE_RC_DIRECT_MEM: INTEGER := 0;-- Allow RC direct access to EP MEM 

USE_EP_IO_RDWR:= 0; -- Allow EP to issue IO Rd/Wr to RC on command 

 

Thanks for any insights. 

--- Quote End ---  

 

 

I believe that with USE_RC_DIRECT_MEM set to 1 and RC_SLAVE set to 1, accesses to the EP memory mapped to BAR[0] reach the same memory that the DMA accesses. However, because the first 64 bytes of the BAR0 space have special functions, you need to add 64 when accessing the local memory the DMA sees. That is, to reach the memory the DMA accesses at its local address (DW1 of the descriptor), add 64 to that address when accessing the BAR[0]-mapped space.
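As a small illustration of the address adjustment described above, assuming the 64-byte reserved region is exactly as stated in this reply (the macro and function names are hypothetical):

```c
#include <stdint.h>

/* Assumption from the reply: the first 64 bytes of BAR0 have special
 * functions, so DMA-visible local memory starts 64 bytes into BAR0. */
#define BAR0_RESERVED_BYTES 64u

/* Convert a DMA local address (DW1 of a descriptor) into the byte
 * offset to use when accessing the same location through BAR0. */
static uint32_t bar0_offset_for_dma_addr(uint32_t dma_local_addr)
{
    return dma_local_addr + BAR0_RESERVED_BYTES;
}
```

For example, DMA local address 0x100 would be accessed at BAR0 offset 0x140.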
Altera_Forum

Hello Hey Hey, 

 

I'm trying to understand how everything ties together but cannot rely on the documentation alone, so thanks for looking at the actual implementation and explaining it.

 

After I found in the code how to reset the DMA controller (write 0x0000FFFF), I can now perform multiple DMA operations in a row. The open-source driver so far is here:

 

http://www.linuxdriverproject.org/twiki/bin/view/main/prj012 

 

I also found that the MSI and EPLAST_ENA bits in the descriptor header seem to be reversed in the documentation.

 

I am aware of the special first 64 bytes of BAR[0] and stay out of that region.

 

A 4096-byte DMA loopback copy fails somewhere around word 0x30.

 

Slowly getting there, thanks, 

 

Leon.
Altera_Forum

:( 

We tried to write a software application to manage DMA transfers, following AN456.

We configure the descriptor table correctly and start the DMA transfer.

All data is transferred correctly, with the right number of descriptor packets.

But it seems that EPLAST is never updated by the DMA, so we cannot detect the end of the transfer.

We tried setting the EPLAST_ENA bit in every descriptor header and in the DMA descriptor header register, but nothing changed.

We also tried programming the registers with the values from a transfer performed by altpcie_demo.exe, as reported in pcie_log.txt: the transfer and the data are OK, but when we read the EPLAST location (stored at DWORD offset 3 in our descriptor table), nothing changes.
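A bounded polling loop on that location might look like the sketch below. The function name is hypothetical, the DWORD index 3 is taken from this post, and the assumption that EPLAST occupies the low 16 bits of that DWORD should be checked against the user guide. The table pointer must refer to the DMA-coherent host buffer that the endpoint actually writes to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per the post: EPLAST lives at DWORD offset 3 of the descriptor table. */
#define EPLAST_DWORD_INDEX 3

/* Poll the EPLAST field until it equals the index of the last descriptor,
 * giving up after max_polls reads. Returns true on completion.
 * Assumption: EPLAST is held in the low 16 bits of the DWORD. */
static bool wait_for_eplast(const volatile uint32_t *table,
                            uint32_t last_desc_index,
                            unsigned long max_polls)
{
    for (unsigned long i = 0; i < max_polls; i++) {
        if ((table[EPLAST_DWORD_INDEX] & 0xFFFFu) == last_desc_index)
            return true;
    }
    return false;
}
```

Before starting a transfer it is probably wise to preset the EPLAST field to a value that cannot match (for example 0xFFFF), so that a stale value from a previous run is not mistaken for completion.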

Please help us, we are desperate ...
Altera_Forum

We are doing a DMA read

Altera_Forum

OK, we found the problem. We were resetting only the DMA read (the only one we use), but we need to reset the DMA write too. With both reset, it works fine.

Altera_Forum

The last problem we have: with the chaining DMA example generated in Verilog everything is OK, but when it is generated in VHDL the DMA transfers halt, using the same PC driver and API. :(

Any ideas?
Altera_Forum

I have published a Linux device driver: 

 

http://marc.info/?l=linux-kernel&m=122813921631142&w=2
Altera_Forum

 

--- Quote Start ---  

Hello Hey Hey, 

Thank you for your response, that was very informative and cleared up a lot. 

 

I have at least one more point of confusion. There are two Chaining DMA Descriptor Headers, at offsets 0x00 and 0x10: the first for write and the other for read. Why is there a Direction bit in the Control Fields (Table 7-4 of the PCI Express Compiler User Guide 8.0)? Is this redundant, or does the bit have some significance? I would assume the registers at 0x00 and 0x10 already specify the direction.

 

Thanks. 

--- Quote End ---  

 

 

 

I am a little confused by the post above. It says the descriptor headers are at offsets 0x00 and 0x10, whereas Table 4-9 in the compiler guide shows that 0x00 holds the device/vendor ID. Any help clarifying this is appreciated.
Altera_Forum

Well, I have figured it out: these are offsets within the BAR[2] register space.

Altera_Forum

I am having some difficulty generating multiple DMA transfers in the chaining DMA design example that is used in the PCI Express high performance reference design. After looking through the RTL, it appears to me that I must perform the following sequence prior to initiating a new transfer: 

 

[1] write the value 0x0000ffff to the control register DW0 (either read or write depending on the desired transfer) prior to initiating a new transfer 

[2] set up DW0, DW1, DW2, and DW3 with the desired transfer parameters

[3] be sure to write DW3 last as writing DW3 puts the DMA engine in motion 

 

From what I can tell, step[1] is required for each descriptor table that describes a transfer. Without step [1], it appears that a write to DW3 will not initiate a new transfer. 

 

There have been a few comments in this thread that seem to support this. However, I wanted to spell it out in detail and get clarification. Can anyone comment?
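The three steps above can be sketched in C as follows. This is an illustration only: the helper names are hypothetical, the BAR2 offsets (0x00 for the write header, 0x10 for the read header) are the ones quoted earlier in this thread, and `bar2` is assumed to be a pointer to the mapped BAR2 region (or, for testing, a plain array).

```c
#include <stdint.h>

#define DMA_WR_HEADER_OFF 0x00u        /* BAR2 byte offset, write header (from the thread) */
#define DMA_RD_HEADER_OFF 0x10u        /* BAR2 byte offset, read header (from the thread)  */
#define DMA_RESET_VALUE   0x0000FFFFu  /* reset value reported in the thread */

/* Write one 32-bit register; byte_off is a byte offset into BAR2. */
static void reg_write32(volatile uint32_t *bar2, uint32_t byte_off,
                        uint32_t value)
{
    bar2[byte_off / 4u] = value;
}

/* Reset one DMA engine, then program its header with DW3 written last. */
static void dma_restart(volatile uint32_t *bar2, uint32_t header_off,
                        const uint32_t dw[4])
{
    reg_write32(bar2, header_off + 0x0u, DMA_RESET_VALUE); /* [1] reset        */
    reg_write32(bar2, header_off + 0x0u, dw[0]);           /* [2] parameters   */
    reg_write32(bar2, header_off + 0x4u, dw[1]);
    reg_write32(bar2, header_off + 0x8u, dw[2]);
    reg_write32(bar2, header_off + 0xCu, dw[3]);           /* [3] DW3 last     */
}
```

A read transfer would then be started with `dma_restart(bar2, DMA_RD_HEADER_OFF, dw)`. An earlier reply in this thread reports that both the read and the write engine had to be reset even when only one was used, so calling the reset write for both offsets may be necessary. On real hardware the writes are posted, so a read-back or barrier may also be needed; that is not shown here.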
Altera_Forum

Brett, 

You are right. The first step (writing 0xffff to DW0) resets the DMA and the last step (writing to DW3) starts it. However, I am seeing an issue in the RTL with the write DMA's read address into endpoint memory: the address goes beyond the length specified in the descriptor. It looks OK with the read DMA. Has anybody observed the same and tested at a higher level?

 

Thanks 

Sharan
Altera_Forum

Hi, 

The extra read address to endpoint memory is not an issue during DMA write. Data from endpoint memory is written to a local FIFO, and only the exact size is read from the FIFO. The FIFO is cleared at the end of the transfer.

 

But when I disable the DMA read and enable only the DMA write in the root complex BFM driver (inside the chained_dma_test task), the second transfer gets stuck in the middle as tx_st_ready from the PCI core is asserted, and the test fails with a timeout error.

 

It works only if the DMA read is also enabled. In our application, we don't need the DMA read operation. Please help.

 

thanks.
Altera_Forum

Hi,  

I am trying to write PCI driver code that writes data to and reads data from kernel space.

 

When I read the data back I do not get the actual data, and sometimes I get a segmentation fault. What is the maximum amount of memory we can allocate using pci_alloc_consistent()?