
PCIe PIO Beginner Questions

Altera_Forum
Honored Contributor II
1,881 Views

I've been poring over the documentation, examples, and forums for information on PCIe HIP implementation and I'm still really struggling, so I'm hoping someone can help me in my first battle with IP cores. 

 

Board: Cyclone IV GX Transceiver Kit 

Goal: Have FPGA issue commands to a PCI device 

 

Basic Questions: 

Nios II library for PCIe PIO? 

PCIe required vs optional connections? 

Find PCI device by polling physical addresses? 

 

Note that partial answers are much appreciated, as well as pointers to useful documentation I might have missed. Below is the TL;DR version where I am explaining where my head is right now to see if I have just a gross misunderstanding. Anyone willing to tackle that entire thing would be an angel. 

 

Thanks all! 

 

So where I'm at: I think I want regular old PIO instead of DMA because the data transfers are very short (a single DW every once in a while). Truth be told, I am not sure what the best way to implement this is, but it's looking like Avalon-MM. I was thinking a Nios II would have some header file where I could just call something like write_pci(address, data), but such libraries are either hidden or nonexistent. 

If Nios II is the way (and I still think it might be), then I guess the plan of attack is to generate a Qsys project, connect it with some memory and the PCIe compiler (Quartus 12 doesn't have a HIP for Cyclone IV?), and then program the Nios. The number of ports on the PCIe compiler is rather intimidating to someone new to PCIe, so is there any indication of which ports you actually NEED versus the extras? 

Assuming my FPGA then gets built right, I need to program the Nios, which in my mind means doing configuration reads of the various physical addresses, finding the right ID, grabbing the BAR values, and then issuing memory writes (and waiting for ACKs before going again) to do the actual work. My concerns here are address translation, type 1 vs type 0 commands, and how to do the physical address polling. 

If Nios II is NOT the way to go here, then I guess a megafunction with an awesome .v state machine might work, but I feel like I would need to understand PCIe better for that, which is something I am not confident in at the moment. If you have read all of the above, many thanks for at least your consideration. Any and all wisdom or shared experience would be lovely.
6 Replies
Altera_Forum
Honored Contributor II
822 Views

Read this thread: 

 

http://www.alteraforum.com/forum/showthread.php?t=35678 

 

Download the PDF and zip file. You can port the design to your kit. 

 

If you want to read/write from the FPGA via PCIe to talk to another PCIe device all you need to do is 

 

a) Use the host PC to determine the PCIe address of the device you need to talk to. 

 

b) Configure the Qsys PCIe address remap registers (the 1MB region discussed in the document) so that it maps to the region of the device you want to control. 

 

c) Issue a read/write to the Avalon-MM slave interface of the Qsys PCIe component. That Avalon-MM transaction will map to a PCIe transaction that will then perform a read/write to the PCIe device. 

 

And that's it. Simple, eh :) 
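
The arithmetic behind steps (b) and (c) can be sketched as a small Python model. This is only an illustration under assumptions: the 1MB window size comes from the description above, while the slave base address, the example PCIe address, and the function names are hypothetical. The actual remap register names and offsets are in the PCIe IP documentation and are not modeled here.

```python
WINDOW_SIZE = 1 << 20  # 1 MB remap window, as described in the steps above


def split_pcie_address(pcie_addr, window_size=WINDOW_SIZE):
    """Split a PCIe address into (window_base, offset_in_window).

    window_base is what the remap registers would be programmed with;
    offset_in_window is added to the Avalon-MM slave base address.
    """
    if window_size <= 0 or window_size & (window_size - 1):
        raise ValueError("window size must be a power of two")
    offset = pcie_addr & (window_size - 1)
    return pcie_addr - offset, offset


def avalon_address(slave_base, pcie_addr, window_size=WINDOW_SIZE):
    """Avalon-MM address that reaches pcie_addr once the window is mapped."""
    _, offset = split_pcie_address(pcie_addr, window_size)
    return slave_base + offset


# Hypothetical values: a target register at PCIe address 0xFE2A1234 and an
# Avalon-MM slave window that starts at 0x10000000 in the Qsys address map.
base, offset = split_pcie_address(0xFE2A1234)
print(hex(base), hex(offset))
print(hex(avalon_address(0x10000000, 0xFE2A1234)))
```

In this model the remap registers would hold the window base, and the master (Nios II or custom logic) would access the slave base plus the offset.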

 

Cheers, 

Dave
Altera_Forum
Honored Contributor II
822 Views

Dave, 

 

Thanks for the quick reply, and for your incredibly explicit directions in the PDF. I feel a bit like an idiot asking more of you but, I did get stuck in your directions very close to the end at the "Source Constraints" step. I found the .tcl file, but I'm not sure how to source it. I assume that once I do that successfully the following line of code will work, but at the moment it does not for me. I also tried your autogeneration thing, but Quartus 12.1 threw a bit of a fit. I would rather step through it instead of auto-generating though, so I'm more interested in what you meant about those constraints. 

 

Some comments I noticed: 

Everything I've seen always has a DMA block connected to the PCIe block. Since my data transfers are relatively basic, I was wondering if that was really necessary (I've read DMA has a large overhead, and so is not efficient for small amounts of data). 

 

From a curiosity standpoint, is there a way to configure this thing without having to use the host PC? Isn't there a way to find the PCIe addresses without just finding it on the host PC? I thought my idea of polling the physical addresses wasn't the worst thing... 

 

And as a more general question, is the purpose of this whole thing to have something like a PCIe block that just has a more basic interface (tx/rx) kind of thing to be used in other designs? 

 

Thanks so much for your help thus far, 

Craig
Altera_Forum
Honored Contributor II
822 Views

 

--- Quote Start ---  

 

I feel a bit like an idiot asking more of you but ... 

 

--- Quote End ---  

 

Don't feel bad. The tools are complicated and the official documentation and examples often suck ... 

 

 

--- Quote Start ---  

 

I did get stuck in your directions very close to the end at the "Source Constraints" step. I found the .tcl file, but I'm not sure how to source it 

 

--- Quote End ---  

 

 

The word 'source' is Tcl programming terminology. 

 

In Quartus, you click the mouse in the Tcl console window (View->Utility Windows->Tcl console), and at the tcl prompt you type 

 

tcl> source $TUTORIAL/hdl/s4gxdk/share/scripts/constraints.tcl  

 

 

--- Quote Start ---  

 

I assume once I do that successfully the following line of code will work, but at the moment it does not for me. 

 

--- Quote End ---  

 

Let me know if it doesn't. 

 

 

--- Quote Start ---  

 

I also tried your autogeneration thing, but quartus 12.1 threw a bit of a fit. 

 

--- Quote End ---  

 

If you have trouble sourcing the constraints file, I'll re-run the test on 12.1 and fix the problem. 

 

 

--- Quote Start ---  

 

I would rather step through it instead of auto-generating though, so I'm more interested in what you meant about those constraints. 

 

--- Quote End ---  

 

Excellent attitude! :) 

 

 

--- Quote Start ---  

 

Everything I've seen always has a DMA block connected to the PCIe block. Since my data transfers are relatively basic, I was wondering if that was really necessary (I've read DMA has a large overhead, and so is not efficient for small amounts of data). 

 

--- Quote End ---  

 

 

It depends on what and when you consider programming the DMA controller 'overhead'. 

 

The naive way to think of using a DMA controller is that you use the processor to write the DMA controller registers to set up the transfer, and then wait for the interrupt to indicate the DMA transfer is done. That is overhead. 

 

The more typical way to use a DMA controller is that you use the processor to set up a scatter-gather buffer of "things to do", and then enable the DMA controller to do them completely independently of the CPU. The CPU may or may not have to service ISRs, depending on what your scatter-gather buffer is doing, e.g., reading a temperature sensor and DMAing the result to a 7-segment display can be done completely independently of the CPU. 

 

Anyway, that'll give you an idea of why a DMA controller might be useful. 
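
As a rough illustration of the scatter-gather idea, here is a small Python model of a descriptor chain. It is a sketch under assumptions, not the register layout of any Altera DMA core: the `Descriptor` fields, the `run_chain` function, and the memory contents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """One "thing to do": copy `length` bytes from `src` to `dst`."""
    src: int
    dst: int
    length: int
    irq_on_done: bool = False  # interrupt the CPU after this descriptor?


def run_chain(memory, chain):
    """Walk the chain the way a DMA engine would, with no CPU involvement.

    Returns the number of interrupts the CPU would have had to service.
    """
    interrupts = 0
    for d in chain:
        memory[d.dst:d.dst + d.length] = memory[d.src:d.src + d.length]
        if d.irq_on_done:
            interrupts += 1
    return interrupts


# The CPU builds the list once, then hands it to the DMA engine.
memory = bytearray(64)
memory[0:4] = b"\x11\x22\x33\x44"
memory[8:12] = b"\xaa\xbb\xcc\xdd"
chain = [
    Descriptor(src=0, dst=32, length=4),
    Descriptor(src=8, dst=36, length=4, irq_on_done=True),
]
irqs = run_chain(memory, chain)
print(memory[32:40].hex(), irqs)
```

The point of the model is that the CPU cost is paid once when the chain is built, and only descriptors flagged with `irq_on_done` ever involve the CPU again.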

 

With respect to PCIe transactions, the most efficient transactions are always implemented by a bus master issuing burst transactions. An x86 CPU or NIOS II CPU will not generate those sorts of transactions, so you use a DMA controller. 
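
To put rough numbers on why bursts are more efficient, the following Python sketch compares payload against per-packet overhead. The overhead figures are assumptions for a Gen1/Gen2 link with a 3-DW header and no ECRC (2 bytes of framing, 2 bytes of sequence number, 12 bytes of header, 4 bytes of LCRC); the sketch ignores flow-control traffic, 8b/10b encoding, and read completions, so treat the results as an upper bound.

```python
def tlp_efficiency(payload_bytes, header_bytes=12, framing_bytes=2,
                   seq_bytes=2, lcrc_bytes=4):
    """Fraction of the bytes on the link that are payload, for one TLP."""
    overhead = header_bytes + framing_bytes + seq_bytes + lcrc_bytes
    return payload_bytes / (payload_bytes + overhead)


# A single-DW write (what a CPU typically issues) versus burst payloads.
for payload in (4, 64, 128):
    print(f"{payload:4d} bytes: {tlp_efficiency(payload):.1%}")
```

Under these assumptions a single-DW write is about 17% efficient, while a 128-byte burst is about 86% efficient. For an occasional single DW, as in the original question, that inefficiency may simply not matter.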

 

 

--- Quote Start ---  

 

From a curiosity standpoint, is there a way to configure this thing without having to use the host PC? Isn't there a way to find the PCIe addresses without just finding it on the host PC? I thought my idea of polling the physical addresses wasn't the worst thing... 

 

--- Quote End ---  

 

 

You need a PCIe root complex to configure the PCIe address map. The root complex (normally the host PC) is generally the source of the PCIe reference clock and, in the old days of PCI, the host was where the interrupt lines routed and the bus arbiter lived. You can do this with an FPGA, but that is not how the development boards are wired. 
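
For reference, the "polling" described in the question is what the host normally does during enumeration: it issues configuration reads to every bus/device/function and checks the Vendor ID at offset 0x00, where 0xFFFF means nothing responded. The Python model below sketches that loop against a fake configuration space. The `scan` function, the fake device table, and the IDs in it are illustrative assumptions, and a real enumerator also handles bridges, multi-function detection, and BAR sizing, which are omitted here.

```python
NO_DEVICE = 0xFFFF  # value returned when no function responds


def scan(read_config16, buses=range(256)):
    """Return (bus, dev, fn, vendor_id, device_id) for every function found."""
    found = []
    for bus in buses:
        for dev in range(32):       # up to 32 devices per bus
            for fn in range(8):     # up to 8 functions per device
                vendor = read_config16(bus, dev, fn, 0x00)  # Vendor ID
                if vendor == NO_DEVICE:
                    continue
                device = read_config16(bus, dev, fn, 0x02)  # Device ID
                found.append((bus, dev, fn, vendor, device))
    return found


# Fake configuration space standing in for real hardware (illustrative IDs).
FAKE_CONFIG = {
    (0, 0, 0): {0x00: 0x8086, 0x02: 0x1234},
    (1, 0, 0): {0x00: 0x1172, 0x02: 0xE001},
}


def fake_read_config16(bus, dev, fn, offset):
    return FAKE_CONFIG.get((bus, dev, fn), {}).get(offset, NO_DEVICE)


found = scan(fake_read_config16, buses=range(2))
for bus, dev, fn, vendor, device in found:
    print(f"{bus:02x}:{dev:02x}.{fn} vendor={vendor:04x} device={device:04x}")
```

On real hardware the `read_config16` callback would have to generate configuration read TLPs, which is something a root port can do and an endpoint normally cannot.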

 

 

--- Quote Start ---  

 

And as a more general question, is the purpose of this whole thing to have something like a PCIe block that just has a more basic interface (tx/rx) kind of thing to be used in other designs? 

 

--- Quote End ---  

 

Which "whole thing" are you talking about? The Qsys PCIe block? The purpose of that block is "a black box that can speak PCIe" :) 

 

The alternative is implementing TLP packet parsing ... refer to the PCIe spec ... 

 

Cheers, 

Dave
Altera_Forum
Honored Contributor II
822 Views

Dave, 

First off, I want to thank you for your quick, exceedingly helpful responses, as well as your words of encouragement. I've been working on this trying to avoid asking another question (and skiing Lake Tahoe), but I think I have one last round of questions I still need to get through. 

 

First off, about that thread you linked in the beginning. The autocode generation did not work for me (12.1), but more importantly the step-by-step has not compiled either. The error it's throwing is "pcie_pipe_ext_gxb_powerdown does not exist in macrofunction 'u5'" (double-clicking this reveals it is having problems with the top-level .sv file). The same error pops up for ...pll_powerdown. I've been looking at this, and it might be that my knowledge of SystemVerilog is limited to the Altera training, but I can't figure out what's up. The line ".pcie_pipe...powerdown (somevar)," definitely exists. If I look at the qsys_system.v file, pcie_pipe...powerdown is not assigned as an output. In fact, the directions don't ask for the powerdown signals on the PCIe block to be exported. If you do export them, it generates "pcie_powerdown_gxb_powerdown" and not what is in the .sv file. I tried renaming the .sv file to match the system.v file, and tried deleting it all with and without exporting the powerdown group in Qsys, and I can't get the error to go away. Not sure where I've gone wrong, but I can't think of what else to try. 

 

Second question is more about building understanding. So in your program there is the RAM, PCIe, and DMA block... so how does it ever know to do anything? Maybe it doesn't since you were just using it to test timing, in which case how do I get it to follow my directions? My current state of mind is that if I make a happy state machine as a .v, I can apparently turn that into a custom avalon mm block. Naive to what that entails, I could then hook it up in qsys to.. something. In your first post it sounds like I can hook it directly up to the PCIe block, but don't I have to go through the DMA guy? I thought the whole point of the DMA fella was to be able to interact with the PCIe block. 

 

Thanks! 

PS: thank you so much for correcting my impression of DMA. That was an "ooOOoo" moment over here
Altera_Forum
Honored Contributor II
822 Views

 

--- Quote Start ---  

 

First off, about that thread you link in the beginning. The autocode generation did not work for me (12.1) but more importantly the step by step has not compiled either. 

 

--- Quote End ---  

 

That is a frustrating reflection of Altera's tools. Most of the time Altera's examples will not even recompile! 

 

Install the version I used to create the PCIe example designs, and start from there - at least then you have a working reference design. 

 

 

--- Quote Start ---  

 

The error it's throwing is "pcie_pipe_ext_gxb_powerdown does not exist in macrofunction 'u5'" (double-clicking this reveals it is having problems with the top-level .sv file). The same error pops up for ...pll_powerdown. I've been looking at this, and it might be that my knowledge of SystemVerilog is limited to the Altera training, but I can't figure out what's up. The line ".pcie_pipe...powerdown (somevar)," definitely exists. If I look at the qsys_system.v file, pcie_pipe...powerdown is not assigned as an output. In fact, the directions don't ask for the powerdown signals on the PCIe block to be exported. If you do export them, it generates "pcie_powerdown_gxb_powerdown" and not what is in the .sv file. I tried renaming the .sv file to match the system.v file, and tried deleting it all with and without exporting the powerdown group in Qsys, and I can't get the error to go away. Not sure where I've gone wrong, but I can't think of what else to try. 

 

--- Quote End ---  

 

The altgx and altgx_reconfig components describe these signals in detail. It's possible that the PCIe IP now internally instantiates the required component. I'd have to re-read the PCIe handbook and look at the code; however, I don't have time to do that right now, sorry. Think of it as a good exercise in learning for yourself :) 

 

 

--- Quote Start ---  

 

Second question is more about building understanding. So in your program there is the RAM, PCIe, and DMA block... so how does it ever know to do anything?  

 

--- Quote End ---  

 

The host PC can always write to registers on the board, so it can control "what to do".  

 

 

--- Quote Start ---  

 

how do I get it to follow my directions? My current state of mind is that if I make a happy state machine as a .v, I can apparently turn that into a custom avalon mm block.  

 

--- Quote End ---  

 

You can create an Avalon-MM master to initiate "what to do", or you can instantiate a Nios II core and have its software determine "what to do". You could even use your host PC to set up a DMA controller, and have the DMA controller perform a fixed sequence of tasks. 
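
For the custom Avalon-MM master option, the core of the state machine is small: drive address, writedata, and write, and hold them steady for as long as the slave asserts waitrequest. The Python model below sketches one write transfer cycle by cycle. It is a behavioral illustration only; the function name, the waitrequest pattern, and the address and data values are hypothetical, and the real thing would be written in Verilog.

```python
def avalon_write(waitrequest_pattern, address, data):
    """Model one Avalon-MM write transfer.

    waitrequest_pattern holds the slave's waitrequest value for each clock
    cycle. The master keeps address and data stable on every cycle until it
    samples waitrequest low, at which point the transfer is complete.
    Returns (cycles_used, per_cycle_trace).
    """
    trace = []
    for cycle, waitrequest in enumerate(waitrequest_pattern, start=1):
        trace.append((address, data))  # write asserted, signals held stable
        if not waitrequest:
            return cycle, trace        # slave accepted the write this cycle
    raise TimeoutError("slave never deasserted waitrequest")


# Hypothetical transfer: the slave stalls for two cycles, then accepts.
cycles, trace = avalon_write([True, True, False], 0x100A1234, 0xDEADBEEF)
print(cycles, [(hex(a), hex(d)) for a, d in trace])
```

A hardware version of this is typically a two- or three-state machine (idle, write pending, done) that is triggered by whatever event should cause the PCIe write.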

 

 

--- Quote Start ---  

 

I thought the whole point of the DMA fella was to be able to interact with the PCIe block. 

 

--- Quote End ---  

 

Nope, it's just there to generate burst transactions. 

 

Cheers, 

Dave 

 

PS. If you're driving back from Tahoe down 395, stop by and say hello :)
Altera_Forum
Honored Contributor II
822 Views

Hi all, 

I have to implement a DMA that reads data from a memory-mapped region and writes it to a PIO or other peripheral. 

Has anybody done this exercise? 

Please, I need your help. 

I've started the program, but I think my problem comes from my C program.