On-Chip Memory in Qsys/Quartus

Altera_Forum
Honored Contributor II
5,051 Views

I have a design based on an Arria II GX part in which Qsys is used to generate a system made up of many different interfaces, components and IP. This system implements a PCIe interface to an off-chip processor, and that interface has worked as required so far. I need to implement ROM with 12 bytes of information in the FPGA that is memory mapped and accessible through the PCIe bus. I have used Qsys to instantiate the On-Chip Memory and configured it as a ROM component with a 32-bit data width and 16 bytes of total memory. Both Qsys and Quartus accepted this; the tools indicated that this configuration has a minimum memory size, so the extra 4 bytes would have been padded anyway. The memory block type in Qsys is set to 'Auto'. The memory is connected to the PCIe core via the Avalon interconnect in Qsys, along with the appropriate clock and reset signals. The address is specified in the location the off chip processor is expecting.

 

Once the Qsys 'system' is generated, I edit the .hex file indicated in the Qsys component, enter my data in a 32 bit format, and save the file. No additional comments, warnings or errors are generated regarding this component during compile, synthesis and routing. The programming file is generated/converted and the part programmed. 

 

The external processor is then programmed to read the addresses for the information contained. The data that is read back by the processor is limited to the contents of bytes 8 - 11 only. I have attempted to read further addresses in the range and do not receive any of the other data that I entered in the hex file.  

 

I have inspected the .hex file to verify that its contents and format are correct. Apart from manually calculating the checksum in the Intel hex file, the contents are accurate. 
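Each Intel HEX record ends with a checksum byte, which is the two's complement of the low byte of the sum of all the other bytes in the record, so a hand-edited file can be checked mechanically instead of by hand. The following is a minimal sketch of that check; the function names are made up for illustration and are not part of any Quartus tool.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Compute the checksum byte for the part of a record between ':' and the
 * checksum itself, e.g. "0400000011223344" (length, address, type, data).
 * Returns 0..255, or -1 if the input is malformed.
 */
int ihex_checksum(const char *body)
{
    size_t len = strlen(body);
    unsigned sum = 0;

    if (len == 0 || len % 2 != 0)
        return -1;
    for (size_t i = 0; i < len; i += 2) {
        int hi = hex_digit(body[i]);
        int lo = hex_digit(body[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        sum += (unsigned)(hi * 16 + lo);
    }
    /* Two's complement of the low byte of the sum. */
    return (int)((0x100u - (sum & 0xFFu)) & 0xFFu);
}

/* Return 1 if a complete record such as ":00000001FF" has a valid checksum. */
int ihex_record_ok(const char *record)
{
    char body[600];
    size_t len;
    int expected, hi, lo;

    if (record[0] != ':')
        return 0;
    len = strlen(record + 1);
    if (len < 10 || len % 2 != 0 || len - 2 >= sizeof body)
        return 0;
    memcpy(body, record + 1, len - 2);
    body[len - 2] = '\0';
    expected = ihex_checksum(body);
    hi = hex_digit(record[len - 1]);   /* first checksum digit  */
    lo = hex_digit(record[len]);       /* second checksum digit */
    return expected >= 0 && hi >= 0 && lo >= 0 && expected == hi * 16 + lo;
}
```

For a 4-byte record at address 0 holding 11 22 33 44, ihex_checksum("0400000011223344") gives 0x52, so the full record would be ":040000001122334452". How the record addresses in the generated .hex file relate to the memory width is not covered by this sketch.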

 

Testing: 

In many circumstances and in many forums, there are recommendations to delete the db and incremental_db folders and recompile. I have performed this action without resolution. I have changed the format of the memory w.r.t width and depth to construct the same number of bits/bytes. 

 

What might be going on here? I believe I have ruled out endian-ness issues in dealing with the format of the data, but I don't believe that all of the data is in fact being loaded into the .sof or .pof files correctly. Is there any way to inspect these files for appropriate data? Is there something going on with the synthesis of the memory type or the information contained? Is there a known bug in using the on chip memory as ROM in Arria II devices? 

 

Thanks in advance to any who can offer assistance. 

Steve

Altera_Forum

 

--- Quote Start ---  

 

I need to implement ROM with 12 bytes of information in the FPGA that is memory mapped and accessible through the PCIe bus.  

 

--- Quote End ---  

Is this memory accessible in a PCIe BAR by itself, or with other registers? 

 

 

--- Quote Start ---  

 

The address is specified in the location the off chip processor is expecting. 

 

--- Quote End ---  

Which address? There is a Qsys address, and there is an offset-into-the-PCIe-BAR address. 

 

 

 

--- Quote Start ---  

 

The external processor is then programmed to read the addresses for the information contained. 

 

--- Quote End ---  

The external processor would first need to read the BAR address, then map that into its memory map. The granularity of the BAR size is 256 bytes; however, OSes like Linux will only map 4 kB pages, so you then need to determine the offset into the mapped region to find where your registers actually start. 

 

 

--- Quote Start ---  

 

What might be going on here? 

 

--- Quote End ---  

I suspect that you are accessing the registers at an alias address where the incomplete decoding of the Qsys fabric is allowing you to access some, but not all the registers at the alias address. 

 

Add an Avalon-MM BFM to your Qsys system, and perform read/write accesses on the memory. That'll convince you the memory works correctly. Then use a PCIe BFM to perform PCIe transactions to the same memory.  

 

If you don't have a simulation setup, then try using SignalTap II to probe the Avalon-MM master address that comes out of the PCIe-to-Qsys bridge. Check that those addresses are what you expect. You could also add a JTAG-to-Avalon-MM bridge to access your on-chip RAM directly, and that'll also help convince you the problem is PCIe related. 

 

Cheers, 

Dave

Altera_Forum

The BAR is by itself, if I understand the question properly. There are other registers that access the other modules. Those all work fine. This one is a point in the contiguous memory map that has been identified. For example, this register sits at 0x28000 and another module consumes 0x27000 - 0x27FFF in memory space. 

 

I assume that the OS, drivers, and the PCIe handles the BAR offsetting automatically. I did not write the OS drivers (Linux), but can forward questions to the team. 

 

With respect to the Avalon bridge diagnostic work you are speaking of, I am familiar with FPGA's and Altera devices from several years ago. I have not familiarized myself with the JTAG and Avalon-MM Bridge. I can give it a go and see what I find. 

 

Thanks for the help.

Altera_Forum

 

--- Quote Start ---  

The BAR is by itself, if I understand the question properly. There are other registers that access the other modules. Those all work fine. This one is a point in the contiguous memory map that has been identified. For example, this register sits at 0x28000 and another module consumes 0x27000 - 0x27FFF in memory space. 

 

--- Quote End ---  

There can be multiple address maps when dealing with PCIe devices. 

 

If you had an Avalon-MM master component in your Qsys system, then it would have an address map containing all the devices (Avalon-MM slaves) it connects to. 

 

Every PCIe BAR master also has an address map. Each BAR address map contains only the devices (Avalon-MM slaves) it connects to. 

 

If you only map a single Avalon-MM slave with a few registers into say BAR0, then Qsys should only create a BAR0 window big enough to see the registers, i.e., Linux lspci should show a BAR0 size of maybe 256-bytes, or perhaps 4kB, depending on how smart Qsys is. 

 

What does lspci indicate for the PCIe BAR you are using? Post the output of lspci -s <slot tuple> -vvv run as root. 

 

 

--- Quote Start ---  

 

I assume that the OS, drivers, and the PCIe handles the BAR offsetting automatically. I did not write the OS drivers (Linux), but can forward questions to the team. 

 

--- Quote End ---  

No, it will not. If you use Linux to mmap the region, it has a granularity of 4kB (actually the granularity is PAGE_SIZE). If the BAR is only 256-bytes, then the BAR can show up within the 4kB region, but not necessarily at offset 0. 
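The page-granularity point can be sketched with a little arithmetic: when a mapping is made in whole pages, the registers start at the BAR address's offset within its page. This is only an illustration; the function names and the 0xfd000300 example address are assumptions, not values from this thread.

```c
#include <stdint.h>

/* Start of the page containing phys_addr; page_size must be a power of two. */
uint64_t page_base(uint64_t phys_addr, uint64_t page_size)
{
    return phys_addr & ~(page_size - 1);
}

/* Byte offset of phys_addr inside its page. */
uint64_t page_offset(uint64_t phys_addr, uint64_t page_size)
{
    return phys_addr & (page_size - 1);
}
```

A hypothetical 256-byte BAR placed at 0xfd000300 would sit in the 4 kB page starting at 0xfd000000, at offset 0x300 within that page. A BAR whose address is already page aligned has offset 0.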

 

 

--- Quote Start ---  

 

With respect to the Avalon bridge diagnostic work you are speaking of, I am familiar with FPGA's and Altera devices from several years ago. I have not familiarized myself with the JTAG and Avalon-MM Bridge. I can give it a go and see what I find. 

 

--- Quote End ---  

Here's a tutorial I wrote: 

 

http://www.alterawiki.com/wiki/using_the_usb-blaster_as_an_sopc/qsys_avalon-mm_master_tutorial 

 

Cheers, 

Dave

Altera_Forum

Here is more context from the project itself: 

 

The PCI BAR is set to Avalon Base 0x00000000 with a size of 19. 

Maximum payload size: 256 Bytes 

Peripheral Mode set to Requester/Completer with CRA selected 

 

The Address translation is a fixed table, 1 address page and the size of the address page is 20 bits --> Does this conflict with the above? I inherited this project and am still coming up to speed on it. 

 

The Address Translation table contents for Page 0 is 0x0000000 and 0x0000000. I have passed the Linux question to the software team. 

 

In the Qsys system contents and connections, the PCIe has the MM Master set to IRQ range 0-15 and the Slave set to range 0x00000000 to 0x3FFF

Altera_Forum

 

--- Quote Start ---  

 

Here is more context from the project itself: 

 

The PCI BAR is set to Avalon Base 0x00000000 with a size of 19. 

 

--- Quote End ---  

19-bits is 512kB. lspci should show you this size. Note that this BAR size is way bigger than you need to display a few registers. Qsys is not very smart, in that it is using the absolute address map, not realizing that the MSBs are static, and hence the decode region can be reduced. If you move your registers to address zero, the BAR size will decrease. 
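The size arithmetic is just a power of two. As a small sketch (the helper names are illustrative only):

```c
#include <stdint.h>

/* Size in bytes of a BAR that decodes `bits` address bits. */
uint64_t bar_size_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Highest byte offset that falls inside the BAR window. */
uint64_t bar_last_offset(unsigned bits)
{
    return bar_size_bytes(bits) - 1;
}

/* Return 1 if `length` bytes starting at `offset` fit inside the window. */
int offset_in_bar(uint64_t offset, uint64_t length, unsigned bits)
{
    uint64_t size = bar_size_bytes(bits);
    return length <= size && offset <= size - length;
}
```

With bits = 19 this gives 524288 bytes (512 kB) and a last offset of 0x7FFFF, so a 16-byte block at offset 0x28000 lies inside the window.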

 

Edit: Qsys's defaults are not very smart. You can create a master-specific address map that reduces the BAR0 decode size. See the discussion further on in this thread. 

 

Does the size of the BAR matter? Yes. If it's too large, then the BIOS cannot allocate an address mapping to it. For example, the PCIe examples in the user guide have a 256MB region for BAR0. I found my EliteBook would not boot with such a large BAR defined. 

 

 

--- Quote Start ---  

 

The Address translation is a fixed table, 1 address page and the size of the address page is 20 bits --> Does this conflict with the above? I inherited this project and am still coming up to speed on it. 

 

--- Quote End ---  

This is for outgoing PCIe transactions, i.e., transactions your Avalon-MM masters make to access the PCIe bus. It's basically the opposite direction to PCIe masters accessing your Avalon-MM system via BAR0. 

 

Cheers, 

Dave

Altera_Forum

One thing we discovered is that the configured data is stored as three 32-bit words. The data we are receiving back is the content of the third word only, returned one byte per address and spread across four addresses. 

 

For example, suppose I preload the .hex file with the data 0x11223344, 0x55667788, 0x99AABBCC. 

 

The data that is read back from address offsets 0x00 - 0x0C when the read back data type is expected in 32 bit unsigned integer (word) form is: 

 

Address offset Data 

0x00 : 0x000000CC 

0x04 : 0x000000BB 

0x08 : 0x000000AA 

0x0C : 0x00000099 

 

The expected data is: 

Address offset Data 

0x00 : 0x11223344 

0x04 : 0x55667788 

0x08 : 0x99AABBCC 
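The pattern in the two tables can be checked mechanically: each 32-bit read appears to return one byte of a single stored word, least significant byte first. The sketch below only detects that pattern in a set of readbacks; it does not explain why it happens, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Extract byte `lane` (0 = least significant) of a 32-bit word. */
uint32_t byte_lane(uint32_t word, unsigned lane)
{
    return (word >> (8u * lane)) & 0xFFu;
}

/*
 * Look for a word in `expected` whose four bytes, least significant first,
 * equal four consecutive 32-bit readbacks. Returns the index of that word,
 * or -1 if no word matches.
 */
int find_byte_spread_word(const uint32_t *expected, size_t n_expected,
                          const uint32_t readback[4])
{
    for (size_t w = 0; w < n_expected; w++) {
        int match = 1;
        for (unsigned lane = 0; lane < 4; lane++) {
            if (readback[lane] != byte_lane(expected[w], lane)) {
                match = 0;
                break;
            }
        }
        if (match)
            return (int)w;
    }
    return -1;
}
```

For the values above, the readbacks 0xCC, 0xBB, 0xAA, 0x99 match index 2, i.e. the third word 0x99AABBCC.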

 

Again, I have attempted variations of the configuration of the memory in both Qsys and in Quartus as I established the On Chip RAM component and the .hex file. Is there some way that Quartus is configuring the memory differently than I am expecting? I have also attempted to read other addresses in both directions adjacent to this area without finding any of the 'lost' data. 

 

 

Here is what was received back from the lspci command: 

 

01:00.0 Unassigned class [ff00]: Altera Corporation Device e001 (rev 01)
    Subsystem: Altera Corporation Device 0004
    Physical Slot: 1
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 42
    Region 0: Memory at fd000000 (32-bit, non-prefetchable)
    Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0300c Data: 41a1
    Capabilities: [78] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [80] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port# 1, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 unlimited, L1 unlimited
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
    Capabilities: [100 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed+ WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Kernel driver in use: PV_PCIE_100461

 

 

 

I hope this helps. I'll check out the tutorial you posted. Thanks. 

Steve

Altera_Forum

 

--- Quote Start ---  

 

Here is what was received back from the lspci command: 

... 

Region 0: Memory at fd000000 (32-bit, non-prefetchable)  

 

--- Quote End ---  

Ok, so you have a window with an address range of 0 to 7FFFFh. 

 

 

--- Quote Start ---  

 

The data that is read back from address offsets 0x00 - 0x0C when the read back data type is expected in 32 bit unsigned integer (word) form is: 

 

Address offset Data 

0x00 : 0x000000CC 

0x04 : 0x000000BB 

0x08 : 0x000000AA 

0x0C : 0x00000099 

 

--- Quote End ---  

See, here's where the addressing is confusing. 

 

Qsys should have an address map per master, however, it does not. 

 

Edit: actually, it does have an address-map-per-master option; it's just not that obvious, as you cannot enter the per-master addresses under the 'System Contents' tab base/end columns. You have to enter them under the 'Address Map' tab. 

 

You indicate your registers were located at 0x28000. Can you please try accessing them at this offset into the BAR? 

 

By accessing them at offset 0, you are actually accessing an alias of the registers created by some incomplete decoding of the Qsys fabric. 
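For readers unfamiliar with the term, "incomplete decoding" means the address decoder compares only some of the address bits, so a slave also responds at addresses that differ only in the ignored bits. The sketch below is a generic model of that idea with made-up masks; it is not the logic Qsys actually generates.

```c
#include <stdint.h>

/* A slave is selected when the address bits covered by `mask` match `base`. */
struct decode_rule {
    uint32_t base;
    uint32_t mask;
};

/* Return 1 if `addr` selects the slave described by `rule`. */
int slave_selected(struct decode_rule rule, uint32_t addr)
{
    return (addr & rule.mask) == (rule.base & rule.mask);
}
```

With a full decode of a 16-byte slave at 0x28000 inside a 19-bit window (mask 0x7FFF0), only 0x28000 to 0x2800F select the slave. If the decoder compared only the upper bits (for example mask 0x7F000), 0x28010, 0x28020 and so on would select the same slave, which is what an alias is.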

 

Cheers, 

Dave

Altera_Forum

The software team reports that they are using the 0x28000 offset into the memory. They included this as the code excerpt: 

 

 

address = pci_resource_start( altera device id, BAR {0, 1, 2, etc… } ) 

ioread32( address + 0x28000 ) 

 

I believe they used BAR 0 in the top line.
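One thing the excerpt does not show is a mapping step. In a Linux driver, pci_resource_start() returns a physical address, which is normally mapped with ioremap() or pci_iomap() before ioread32() is used on it; whether the real driver does this cannot be told from the excerpt. As an independent cross-check from user space, a BAR can also be mapped through its sysfs resource file. The following is a rough sketch only: the helper names are made up, and a path such as /sys/bus/pci/devices/0000:01:00.0/resource0 assumes PCI domain 0000 and root access.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy `count` 32-bit words starting at byte `offset` from a mapped window.
 * `offset` is assumed to be a multiple of 4.
 */
void read_words(const volatile uint32_t *window, size_t offset,
                uint32_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = window[offset / 4 + i];
}

/*
 * Map `bar_size` bytes of a BAR read-only through a sysfs resource file.
 * Returns NULL on failure.
 */
const volatile uint32_t *map_bar(const char *resource_path, size_t bar_size)
{
    void *p;
    int fd = open(resource_path, O_RDONLY | O_SYNC);

    if (fd < 0)
        return NULL;
    p = mmap(NULL, bar_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (const volatile uint32_t *)p;
}
```

Mapping the 512 kB window with map_bar(path, 0x80000) and calling read_words(window, 0x28000, out, 3) would then read the three ROM words without going through the driver.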

Altera_Forum

 

--- Quote Start ---  

The software team reports that they are using the 0x28000 offset into the memory. 

--- Quote End ---  

 

Ok, that looks correct then. 

 

I'd suggest using Signal Tap II on the Avalon side of BAR0; that way you can see what the read commands look like (byte-enables etc) and the returned read data. 

 

Even better would be a simulation with a PCIe bus functional model, but Altera's support for the PCIe BFM was removed after version 11.0 

 

Cheers, 

Dave

Altera_Forum

I've just been playing with Qsys to see if it supports per-master address maps, and it appears it does; it's just not that obvious how to use it. 

 

Open up your Qsys design, and select the 'Address Map' tab. There should be a column per Avalon-MM master, e.g., one for your BAR0 control register PCIe slave (Qsys system master), and another for BAR1 or whatever you have in your system. You could also add a JTAG-to-Avalon-MM master or an Avalon-MM BFM master and connect them to some slaves to see that new columns are created. 

 

In the column for BAR0, your registers at address offset 0x28000 should be listed. Assuming that this is the only slave that decodes in that region, edit the address and set it to zero. This eliminates the MSBs from the BAR0 decode region. 
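To see why re-basing the slave shrinks the window, consider the number of address bits needed to cover the highest address in a master's map. This is a back-of-the-envelope sketch with an illustrative helper name; the PCIe core will also impose its own minimum BAR size, and the full address map of this design is not shown in the thread.

```c
#include <stdint.h>

/*
 * Smallest number of address bits n such that 2^n >= span_end, where
 * span_end is one past the highest byte address any slave in the map decodes.
 */
unsigned min_decode_bits(uint64_t span_end)
{
    unsigned bits = 0;

    while (bits < 64 && ((uint64_t)1 << bits) < span_end)
        bits++;
    return bits;
}
```

A 16-byte slave at 0x28000 ends at 0x28010 and needs 18 address bits on its own, whereas the same slave at address zero ends at 0x10 and needs only 4.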

 

Rebuild the system.  

 

Look under fitter->resource section->pci express hard ip blocks. It'll tell you what the new BAR0 size is (it should be smaller). 

 

Download the new design to the FPGA, and lspci should show the smaller BAR0. 

 

Does the decoding logic work now? 

 

It should have worked in the previous setup, so I'm just trying to give you an alternative (but better, since the BAR is smaller) implementation. 

 

Cheers, 

Dave 

 

PS. For details on per-master address maps in Qsys, see Introduction to Qsys (OQSYS1000) 

 

http://www.altera.com/education/training/courses/oqsys1000 

 

(specifically slides 44 and 45).

SJEYIN_T_Intel

Hi, may I know whether I can create a custom IP that reads from and writes to the on-chip memory directly?

I tried it, but the data cannot be read out properly. Is there a signal I am missing, or can anyone offer guidance on this?
