Altera_Forum
Honored Contributor I
1,866 Views

Simple data transfer over PCIe for co-processing

I will briefly explain what I am trying to do. I have a host PC running Linux, connected to an Arria II GX over PCIe. I am trying to send some data to the FPGA, do some processing on it (say, a simple FFT), and then send the data back to the host PC. The idea is to explore the use of the FPGA as a co-processor. I am using the PCI Express Compiler hard IP on the FPGA.  

 

 

I am facing some issues :  

1. Are there any reference designs with Linux drivers that I can use to set up a simple data transfer between the host and the FPGA (no need for DMA right now)?  

 

2. When using the DMA driver in the Linux staging tree (altpciechdma), I get the following error:  

 

[ 4378.233120] BAR0 0xd0000000-0xd07fffff flags 0x0012120c 

[ 4378.233153] BAR2 0xff600000-0xff60ffff flags 0x00020200 

[ 4378.233276] BAR[0] mapped at 0xf81c0000 with length 32768(/8388608). 

[ 4378.233373] BAR[2] mapped at 0xf8140000 with length 256(/65536).  

[ 4378.233447] bar_tests(). 

[ 4378.233466] write_header = 0xf8140000. 

[ 4378.233488] read_header = 0xf8140010. 

[ 4378.233511] &write_header->w3 = 0xf814000c 

[ 4378.233534] &read_header->w3 = 0xf814001c 

[ 4378.233556] ape->table_virt = 0xf65a4000. 

[ 4378.233608] Allocated cache-coherent DMA buffer (virtual address = 0xfffffffff6150000, bus address = 0x0000000036150000). 

[ 4378.233747] Filled First descriptor , read  

[ 4378.233773] Descriptor Table (Read, in Root Complex Memory,# = 1) 

[ 4378.233804] 0xf65a4000/0x00: 0x00000000 

[ 4378.233829] 0xf65a4004/0x04: 0x00000000 

[ 4378.233852] 0xf65a4008/0x08: 0x00000000 

[ 4378.233875] 0xf65a400c/0x0c: 0x0000fade 

[ 4378.233900] 0xf65a4010/0x10: 0x00000800 

[ 4378.233924] 0xf65a4014/0x14: 0x00001000 

[ 4378.233947] 0xf65a4018/0x18: 0x00000000 

[ 4378.233971] 0xf65a401c/0x1c: 0x36150000 

[ 4378.233996] writing 0x00060001 to 0xf8140010 

[ 4378.234023] writing 0x(null) to 0xf8140014 

[ 4378.234049] writing 0x365a4000 to 0xf8140018 

[ 4378.234069] Flush posted writes 

[ 4378.234088]  

[ 4378.234097] Start DMA read 

[ 4378.234119] writing 0x00000000 to 0xf814001c 

[ 4378.234141] EPLAST = 64222 

[ 4378.234159] POLL FOR READ: 

[ 4378.234182] ape->table_virt->eplast (0xf65a400c) = 0x0000fade. 

[ 4378.234208] EPLAST = 64222, n = 0 

[ 4378.234242] ape->table_virt->eplast (0xf65a400c) = 0x0000fade. 

[ 4378.234268] EPLAST = 64222, n = 0 

[ 4378.234300] ape->table_virt->eplast (0xf65a400c) = 0x0000fade. 

[ 4378.234327] EPLAST = 64222, n = 0 

< repeats>  

 

[ 4378.240440] Descriptor Table (Write, in Root Complex Memory,# = 1) 

[ 4378.240469] 0xf65a4000/0x00: 0x00000000 

[ 4378.240495] 0xf65a4004/0x04: 0x00000000 

[ 4378.240521] 0xf65a4008/0x08: 0x00000000 

[ 4378.240546] 0xf65a400c/0x0c: 0x0000fade 

[ 4378.240570] 0xf65a4010/0x10: 0x00000800 

[ 4378.240596] 0xf65a4014/0x14: 0x00001000 

[ 4378.240621] 0xf65a4018/0x18: 0x00000000 

[ 4378.240646] 0xf65a401c/0x1c: 0x36152000 

[ 4378.240667]  

[ 4378.240676] Start DMA write 

[ 4378.240695] POLL FOR WRITE: 

[ 4378.240718] ape->table_virt->eplast (0xf65a400c) = 0x0000fade. 

[ 4378.240745] EPLAST = 64222, n = 0 

[ 4378.240780] ape->table_virt->eplast (0xf65a400c) = 0x0000fade. 

[ 4378.240806] EPLAST = 64222, n = 0 

<repeats>  

 

[ 4378.246908] COMPARE: 

[ 4378.246933] [f6150000] = 0xf6150000 != [f6152000] = 0x00000000 ?! 

[ 4378.246956] DMA loop back (CPU->FPGA->CPU) FAILED 

[ 4378.246975] DMA loop back test FAILED. 

 

 

 

Am I taking a very complicated look at a very simple issue? All I want is to be able to do some kind of data transfer between the two. I do not have any performance constraints as of now.  

 

Any help will be highly appreciated!
10 Replies
Altera_Forum
Honored Contributor I
218 Views

Hi , 

gpushkar 

 

1. There are no other Linux drivers for that design. You have to write your own kernel module. What do you mean by simple data transfer? You can modify this driver and, for example, test read/write speed from the BARs.  

 

2. The DMA test failed: the FPGA could not read from or write to host PC memory. Is the host PC 32-bit? 

 

Regards,  

Igor
Altera_Forum
Honored Contributor I
218 Views

For (1) I've used a simple driver that gets the virtual address from pci_iomap() and then, in response to read/write requests from the application (normally pread/pwrite), does a copy_to/from_user() directly from the I/O space. 

Remember to check the limit of the user copy! 

 

However, this will be somewhat slow. You will almost certainly need to use DMA (initiated at either end) in order to get reasonably sized PCIe requests. 

Possibly very careful use of the data cache will suffice.
Altera_Forum
Honored Contributor I
218 Views

 

--- Quote Start ---  

Hi , 

gpushkar 

 

1. There are no other Linux drivers for that design. You have to write your own kernel module. What do you mean by simple data transfer? You can modify this driver and, for example, test read/write speed from the BARs.  

 

2. The DMA test failed: the FPGA could not read from or write to host PC memory. Is the host PC 32-bit? 

 

Regards,  

Igor 

--- Quote End ---  

 

 

1. By simple data transfer I mean being able either to use the FPGA as mapped memory (already doing this) or to stream data to the FPGA (to do some DSP filtering, for example).  

 

2. Yes, the host PC is 32-bit.
Altera_Forum
Honored Contributor I
218 Views

1. That kind of data transfer is called DMA. You have a descriptor table. Each descriptor maps one buffer (several buffers need several descriptors). The DMA test application fills the descriptor table and then writes a command word to the DMA controller. The FPGA DMA controller then copies the host PC buffer to another preallocated host PC buffer. If the data in the copied buffer matches the data in the source buffer, the DMA test is successful. If you want to copy data to an FPGA buffer, you have to change the descriptor's destination field (I don't remember the exact field). The altpciechdma driver supports this. There is more description in the core manual. 

 

2. Maybe your DMA test fails because the DMA controller is configured for 64-bit addressing.
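
To make the descriptor-table description concrete, here is a minimal user-space sketch of a layout that would match the dump in the first post. The struct and function names (chdma_desc, chdma_table, chdma_fill, chdma_done) are hypothetical, and the field meanings are assumptions read off the log: a header whose fourth word is EPLAST (offset 0x0c, preset to 0xfade), followed by a four-word descriptor at 0x10. The first descriptor word (0x800) is interpreted as a length in 32-bit words, which would fit the 0x2000 spacing between the two host buffers. Check the core manual before relying on any of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One descriptor: four 32-bit words (offsets 0x10..0x1c in the log). */
struct chdma_desc {
    uint32_t w0;        /* assumed: transfer length in 32-bit words */
    uint32_t ep_addr;   /* assumed: address inside the FPGA endpoint */
    uint32_t rc_addr_h; /* upper 32 bits of the host (root complex) bus address */
    uint32_t rc_addr_l; /* lower 32 bits of the host bus address */
};

/* Table header followed by a single descriptor, as in the dump. */
struct chdma_table {
    uint32_t reserved[3];
    uint32_t eplast;    /* offset 0x0c, written back by the endpoint when done */
    struct chdma_desc desc[1];
};

#define EPLAST_SENTINEL 0xfadeu /* 64222, the value the log keeps reading back */

/* Fill the header and the first descriptor for one transfer. */
void chdma_fill(struct chdma_table *t, uint32_t len_bytes,
                uint32_t ep_addr, uint64_t bus_addr)
{
    assert((len_bytes & 3u) == 0); /* length must be a whole number of words */
    t->reserved[0] = t->reserved[1] = t->reserved[2] = 0;
    t->eplast = EPLAST_SENTINEL;
    t->desc[0].w0 = len_bytes / 4;
    t->desc[0].ep_addr = ep_addr;
    t->desc[0].rc_addr_h = (uint32_t)(bus_addr >> 32);
    t->desc[0].rc_addr_l = (uint32_t)bus_addr;
}

/* Nonzero once the endpoint has replaced the sentinel with last_index. */
int chdma_done(const volatile struct chdma_table *t, uint32_t last_index)
{
    return t->eplast == last_index;
}
```

Under these assumptions, an EPLAST that stays at 0xfade (as in the log above) means the endpoint never wrote its completion status back into host memory.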
Altera_Forum
Honored Contributor I
218 Views

Irrespective of any configuration bits, 64-bit PCIe TLP addressing may only be used if the address cannot be reached using 32 bits. The problem could only be the other way round: if you use the design in a 64-bit system and are not prepared for addresses beyond 32 bits, then you are in trouble.
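
The rule above can be written down in a few lines. This is only an illustrative sketch with hypothetical names (tlp_format_for, split_bus_addr), not code from the driver or the core: a bus address that fits in 32 bits uses the 32-bit format, and only an address above 4 GB needs the 64-bit one.

```c
#include <assert.h>
#include <stdint.h>

enum tlp_addr_format { TLP_ADDR_32 = 32, TLP_ADDR_64 = 64 };

/* Pick the address format: 64-bit only if the upper 32 bits are nonzero. */
enum tlp_addr_format tlp_format_for(uint64_t bus_addr)
{
    return (bus_addr >> 32) ? TLP_ADDR_64 : TLP_ADDR_32;
}

struct bus_addr_words { uint32_t hi, lo; };

/* Split a bus address into the high/low words a descriptor would carry. */
struct bus_addr_words split_bus_addr(uint64_t bus_addr)
{
    struct bus_addr_words w = { (uint32_t)(bus_addr >> 32), (uint32_t)bus_addr };
    assert((((uint64_t)w.hi << 32) | w.lo) == bus_addr); /* lossless split */
    return w;
}
```

The DMA buffer in the log sits at bus address 0x36150000, well below 4 GB, so it would fall in the 32-bit case here.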

Altera_Forum
Honored Contributor I
218 Views

Hi gpushkar, 

 

I have a host PC running Linux (Ubuntu 10.04 LTS) connected to a DE4 Stratix IV board over PCIe. I am also exploring the use of the FPGA as a co-processor. I tried to do the same thing with the DE4, and I get the same problem. Did you solve it?  

 

 

 

--- Quote Start ---  

 

... 

[ 4378.233120] BAR0 0xd0000000-0xd07fffff flags 0x0012120c 

[ 4378.233153] BAR2 0xff600000-0xff60ffff flags 0x00020200 

[ 4378.233276] BAR[0] mapped at 0xf81c0000 with length 32768(/8388608). 

[ 4378.233373] BAR[2] mapped at 0xf8140000 with length 256(/65536).  

[ 4378.233447] bar_tests(). 

... 

Any help will be highly appreciated! 

--- Quote End ---  

 

 

Your configuration uses 23 address bits for BAR[0] and 16 bits for BAR[2], but your driver has mapped only 15 bits for BAR 0 and 8 bits for BAR 2! Why did you change it in your "altpciechdma.c"?  
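
For anyone checking the bit counts, the arithmetic can be sketched as follows. The helper names (bar_len, addr_bits) are made up for illustration; the numbers come from the quoted log.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a BAR given its inclusive start and end addresses. */
uint64_t bar_len(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start + 1;
}

/* Smallest number of address bits that can index len bytes. */
unsigned addr_bits(uint64_t len)
{
    unsigned bits = 0;
    while (bits < 63 && ((uint64_t)1 << bits) < len)
        bits++;
    return bits;
}
```

BAR0 spans 0xd0000000-0xd07fffff, which is 8388608 bytes (23 bits), while the mapped length of 32768 bytes needs only 15 bits. BAR2 spans 65536 bytes (16 bits) against a mapped 256 bytes (8 bits).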

 

Best Regards
Altera_Forum
Honored Contributor I
218 Views

I could never solve the DMA problem. I was only able to write to and read from the BAR memory. It was a simple project, so I just did simple memory operations.

Altera_Forum
Honored Contributor I
218 Views

hi gpushkar, 

 

I got altpciechdma.c working fine here. Now I'd like to test my driver from user level with a simple write and read example, but I don't know how! 

I saw some material about ioctl, but had no success. 

Do you have any suggestions? 

 

Best Regards
Altera_Forum
Honored Contributor I
218 Views

You should be able to write a simple Linux driver that lets an application use pread() and pwrite() to read and write FPGA memory. 

 

Your driver read code probably needs to be something like: 

 

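static ssize_t foo_read(struct file *fp, char __user *u_buf, size_t len, loff_t *offp)
{
    foo_info_t *au = fp->private_data;
    loff_t loff = *offp;

    /* Reject requests that start or end beyond the mapped region. */
    if (loff > MAX_SIZE || loff + len > MAX_SIZE)
        return -EINVAL;

    /* au_base is the BAR mapping (presumably from pci_iomap()). */
    if (copy_to_user(u_buf, au->au_base + loff, len) != 0)
        return -EFAULT;

    /* Advance the file offset so sequential reads continue where they left off. */
    *offp = loff + len;
    return len;
}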

 

Updating 'offp' lets you use hexdump etc.
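
On the application side, a test could look roughly like the sketch below. It assumes the driver exposes a device node (something like /dev/foo0, a made-up name) and has a write handler mirroring the read one above. dev_write_readback is a hypothetical helper, not part of any existing driver package.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes at byte offset off, then read them back into out.
 * Returns 0 if both transfers completed in full, -1 otherwise.
 */
int dev_write_readback(const char *path, off_t off,
                       const void *in, void *out, size_t len)
{
    int fd, rc = -1;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* pwrite()/pread() carry the offset explicitly, so no lseek() is needed. */
    if (pwrite(fd, in, len, off) == (ssize_t)len &&
        pread(fd, out, len, off) == (ssize_t)len)
        rc = 0;

    close(fd);
    return rc;
}
```

A call such as dev_write_readback("/dev/foo0", 0x10, &in, &out, sizeof in) would then write one word at BAR offset 0x10 and read it back. Pointing the same helper at an ordinary file is a handy way to check the application logic before involving the hardware.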
Altera_Forum
Honored Contributor I
218 Views

thanks, dsl!
