Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Bug! Quartus Pro 20.1.1, Cyclone V, utilizing PCIe example from 16.1.

BrianM
New Contributor I
3,703 Views
Problem Details
Error:
Internal Error: Sub-system: VPR20KMAIN, File: /quartus/fitter/vpr20k/altera_arch_common/altera_arch_place_anneal.c, Line: 2744
Internal Error
Stack Trace:
    0xdce7a: vpr_qi_jump_to_exit + 0x6f (fitter_vpr20kmain)
   0x797f83: vpr_exit_at_line + 0x53 (fitter_vpr20kmain)
 
   0x2ec700: l_initial_low_temp_moves + 0x1cb (fitter_vpr20kmain)
   0x6f97f1: l_thread_pool_do_work + 0x41 (fitter_vpr20kmain)
   0x2c74fb: l_thread_pool_fn + 0x4e (fitter_vpr20kmain)
    0xefe28: l_thread_start_wrapper(void*) + 0x29 (fitter_vpr20kmain)
     0x5acc: thr_final_wrapper + 0xc (ccl_thr)
    0x3eeef: msg_thread_wrapper(void* (*)(void*), void*) + 0x62 (ccl_msg)
     0x9f9c: mem_thread_wrapper(void* (*)(void*), void*) + 0x5c (ccl_mem)
     0x8b39: err_thread_wrapper(void* (*)(void*), void*) + 0x27 (ccl_err)
     0x5b0f: thr_thread_wrapper + 0x15 (ccl_thr)
     0x5df2: thr_thread_begin + 0x46 (ccl_thr)
     0x7f9e: start_thread + 0xde (pthread.so.0)
    0xfd0af: clone + 0x3f (c.so.6)
 
End-trace


Executable: quartus
Comment:
The device is very full. I'm trying to shoehorn the design in by forcing most RAM into M10K blocks. Reducing the PCIe DMA buffer from 256K to 128K got me the space I needed, but now this crash has occurred.

System Information
Platform: linux64
OS name: This is
OS version:

Quartus Prime Information
Address bits: 64
Version: 20.1.1
Build: 720
Edition: Standard Edition

 I'd love some support. I'm trying to get PCIe Root Port working in Cyclone V and so far have not found any examples that will fit and meet timing in the part I chose: 5CSXFC5D6F31C7.

I followed an example called: cv_soc_rp_simple_design and it won't fit with my logic.

I followed another example that didn't have bus syncs, where I tied everything from PCIe directly to the HPS and the xcvr_reconfig block, but that missed timing by up to 1.8ns on the 125MHz path to the main HPS DRAM.

This failure happens with the former, cv_soc_rp_simple_design/pcie_rp_ed_5csxfc6.qsys, which didn't fit until I trimmed all the unnecessary logic, including the jtag port, and shrank all the bus retimers to be as small as possible. The last change was shrinking the 256KB pcie DMA buffer to 128K, and then this error occurred.

If there's a simple way to send the database through the FAEs I've been working with, let me know.

Thanks in advance.

0 Kudos
1 Solution
BrianM
New Contributor I
3,362 Views

I got it working.

The last hurdle was the MSI interface. I'm not sure why it wasn't working, but restarting the design from scratch with my recently acquired knowledge got everything working.

I've attached my qsys file and socfpga.dtsi. Hopefully it will help others get a jump on things so they don't have to learn everything the hard way as I did.

This design is not optimized for speed, nor is it optimized for space.

NVMe read speed is around 80 MB/s.

NVMe write speed is around 50 MB/s.

Faster drives will do a little better, but even the four-lane part I have, which manages over a gigabyte per second in a PC, doesn't do much better than 110 MB/s read here. Bandwidth is limited by the ARM memory interface and by the fact that bursting logic at 125MHz causes timing violations. A burst length of one on the Txs interface has got to slow things down.

Performance is quoted with the 5.11 kernel; the 5.4 kernel is not quite as fast, but still fast enough for an embedded system.

Don't forget to enable the fpga, pcie and msi modules in your top level board dts file.

And don't forget to reserve the first 64K of DRAM as stated above. If you don't, you will get read errors.

There are probably better configurations, and eventually I'll probably try to optimize for size since my logic will need to get bigger on the next project. But for now, it finally works.

Good luck to you all.

View solution in original post

0 Kudos
17 Replies
SengKok_L_Intel
Moderator
3,687 Views

Hi,


I downloaded the CV simple design from the link below, opened it in the v20.1.1 Standard edition, changed the device to 5CSXFC5D6F31C7, and then re-generated the Platform Designer system. By doing so, it is able to compile successfully, and the logic utilization is 23%.

https://rocketboards.org/foswiki/Projects/PCIeRootPortWithMSI



Regards -SK


0 Kudos
BrianM
New Contributor I
3,682 Views

Thanks for the reply.

I ended up using this: https://releases.rocketboards.org/release/2015.10/pcie-ed/hw/cv_soc_rp_simple_design.tar.gz

I think that is the same one that you used.

When I included my design with it, it did not fit. I ended up reducing the PCIe module's functionality by reducing synchroniser memory usage and switching all my FIFOs to M10K to save MLAB space.

The error I posted above occurred when I changed all the resync buffers to 1 deep, switched all my FIFOs to M10K, and cut the 256K DMA buffer to 128K.

After I made the post, I cut the DMA buffer down to 60K and for the first time all the code fit and met timing.

Now that the FPGA seems like it might work, I need to get linux working with the PCIe port.

sopc2dts (the Java tool) does not generate a usable file. There are a few features I had to add to the source code to get it to run on the sopc file generated by 20.1.1 (the Java did not know about altera_pll, and there was an issue getting the Txs port address due to a bug). Even after those modifications, and even though the kernel accepts the result, it does not work and the kernel generates no error messages. I end up with hardware that is missing a CPU, an ethernet, and many of the other features, including PCIe.

If I could see a working example of the device tree entries for the pcie, msgdma, etc., then I'm certain I can get my hardware to work with the linux kernel using a manually created dts based on the socfpga device tree files found in the linux kernel.
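
For reference, this is the rough shape I'm expecting for the root port node, modeled on the upstream altera-pcie device tree binding example in the kernel documentation. The addresses, GIC interrupt numbers, and node labels below are placeholders, not values from my design, and would have to match the Platform Designer address map:

pcie_0: pcie@c0000000 {
	compatible = "altr,pcie-root-port-1.0";
	reg = <0xc0000000 0x20000000>,		/* Txs slave window (placeholder size) */
	      <0xff220000 0x00004000>;		/* Cra control/status registers */
	reg-names = "Txs", "Cra";
	interrupt-parent = <&intc>;
	interrupts = <0 40 4>;			/* placeholder GIC interrupt */
	interrupt-controller;
	#interrupt-cells = <1>;
	bus-range = <0x0 0xff>;
	device_type = "pci";
	#address-cells = <3>;
	#size-cells = <2>;
	interrupt-map-mask = <0 0 0 7>;
	interrupt-map = <0 0 0 1 &pcie_0 1>,
			<0 0 0 2 &pcie_0 2>,
			<0 0 0 3 &pcie_0 3>,
			<0 0 0 4 &pcie_0 4>;
	/* one outbound window: CPU 0xc0000000 -> PCI bus address 0x00000000 (placeholder size) */
	ranges = <0x82000000 0x00000000 0x00000000
		  0xc0000000 0x00000000 0x10000000>;
	msi-parent = <&msi_0>;			/* only if an MSI-to-GIC block is in the design */
};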

Do you have an example of the device tree entries for this design that function on kernel 5.4 that you can show me?

Thanks again for your help. I really appreciate it.

0 Kudos
SengKok_L_Intel
Moderator
3,675 Views

There is no additional design that I am aware of other than the design posted at RocketBoards.


0 Kudos
BrianM
New Contributor I
3,668 Views

I'm not going to be able to get anywhere unless I get some help creating a dts. Are there contractors with expertise in these matters?

More importantly, has anyone noticed that rocketboards.org has lost its domain?

0 Kudos
BrianM
New Contributor I
3,653 Views

I need help with the dts. Surely there is someone somewhere who knows how to do a dts with kernel 5.4.72 and Quartus Pro 20.1.1.

0 Kudos
EBERLAZARE_I_Intel
3,647 Views

Hi,

There is no Cyclone V design covering the device tree modification, but you may refer to how it is done for the Stratix 10 SoC's PCIe Root Port:

https://rocketboards.org/foswiki/Projects/Stratix10PCIeRootPortWithMSI#Compiling_Linux

 

0 Kudos
BrianM
New Contributor I
3,637 Views

That helped a lot!

I have a dts that is working, however, I'm having another problem that seems pretty typical.

Background info:

1. I'm using the cv_soc_rp_simple_design (pcie_rp_ed_5csxfc6.qsys) as the basis for this design.

2. It does not have the DMA module in it. (I was not able to use the version that does because it would not fit with my code in the part I have chosen).

3. My system ties X4 pcie to an NVMe board.

I have two issues at this point:

A. The board hangs after this message in the log: "Deasserting all peripheral resets," unless I remove the PCIe info from the dtb. It stops hanging once the board warms up: after the SOC heat sink has become warm to the touch, the system boots just fine.

B. The NVMe is timing out at boot:

...

[    0.612618] altera-pcie c0000000.pcie: host bridge /soc/bridge@c0000000/pcie@000000000 ranges:
[    0.612640] altera-pcie c0000000.pcie: Parsing ranges property...
[    0.612675] altera-pcie c0000000.pcie:   MEM 0xc0000000..0xdfffffff -> 0x00000000
[    0.612859] altera-pcie c0000000.pcie: PCI host bridge to bus 0000:00
[    0.612874] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.612888] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff] (bus address [0x00000000-0x1fffffff])
[    0.612899] pci_bus 0000:00: scanning bus
[    0.613082] pci 0000:00:00.0: [1172:e000] type 01 class 0x060400
[    0.616677] pci_bus 0000:00: fixups for bus
[    0.616772] PCI: bus0: Fast back to back transfers disabled
[    0.616800] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0
[    0.616810] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    0.616899] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
[    0.617399] pci_bus 0000:01: scanning bus
[    0.617636] pci 0000:01:00.0: [8086:2522] type 00 class 0x010802
[    0.618152] pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xc0003fff 64bit]
[    0.618519] pci 0000:01:00.0: reg 0x20: [mem 0xc0000000-0xc000ffff 64bit]
[    0.618746] pci 0000:01:00.0: enabling Extended Tags
[    0.620471] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x2 link at 0000:00:00.0 (capable of 15.752 Gb/s with 8 GT/s x2 link)
[    0.622632] pci_bus 0000:01: fixups for bus
[    0.622718] PCI: bus1: Fast back to back transfers disabled
[    0.622729] pci_bus 0000:01: bus scan returning with max=01
[    0.622743] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    0.622772] pci_bus 0000:00: bus scan returning with max=01
[    0.622799] pci 0000:00:00.0: BAR 8: assigned [mem 0xc0000000-0xc00fffff]
[    0.622816] pci 0000:01:00.0: BAR 4: assigned [mem 0xc0000000-0xc000ffff 64bit]
[    0.622954] pci 0000:01:00.0: BAR 0: assigned [mem 0xc0010000-0xc0013fff 64bit]
[    0.623089] pci 0000:00:00.0: PCI bridge to [bus 01]
[    0.623127] pci 0000:00:00.0:   bridge window [mem 0xc0000000-0xc00fffff]
[    0.623344] pcieport 0000:00:00.0: assign IRQ: got 132
[    0.623408] pcieport 0000:00:00.0: enabling device (0140 -> 0142)
[    0.623535] pcieport 0000:00:00.0: enabling bus mastering
[    0.624027] pcieport 0000:00:00.0: PME: Signaling with IRQ 133
...

[    1.459278] nvme 0000:01:00.0: assign IRQ: got 132
[    1.464339] nvme nvme0: pci function 0000:01:00.0
[    1.469107] nvme 0000:01:00.0: enabling device (0140 -> 0142)
...

[   64.471966] nvme nvme0: I/O 16 QID 0 timeout, disable controller
[   64.478147] nvme nvme0: Identify Controller failed (-4)
[   64.483380] nvme nvme0: Removing after probe failure status: -5

The delay is so long that sometimes the watchdog triggers and u-boot reinitializes; other times it boots normally and lspci shows this:

root@cyclone5:~# lspci -v
00:00.0 PCI bridge: Altera Corporation Device e000 (rev 01) (prog-if 00 [Normal decode])
       Flags: bus master, fast devsel, latency 0, IRQ 133
       Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
       I/O behind bridge: 00000000-00000fff [size=4K]
       Memory behind bridge: 00000000-000fffff [size=1M]
       Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
       Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
       Capabilities: [78] Power Management version 3
       Capabilities: [80] Express Root Port (Slot-), MSI 00
       Capabilities: [100] Virtual Channel
       Capabilities: [200] Vendor Specific Information: ID=a000 Rev=0 Len=044 <?>
       Kernel driver in use: pcieport
lspci: Unable to load libkmod resources: error -12

01:00.0 Non-Volatile memory controller: Intel Corporation Device 2522 (prog-if 02 [NVM Express])
       Subsystem: Intel Corporation Device 3810
       Flags: fast devsel, IRQ 132
       Memory at c0010000 (64-bit, non-prefetchable) [size=16K]
       [virtual] Memory at c0000000 (64-bit, non-prefetchable) [size=64K]
       Capabilities: [40] Power Management version 3
       Capabilities: [50] MSI-X: Enable- Count=9 Masked-
       Capabilities: [60] Express Endpoint, MSI 00
       Capabilities: [a0] MSI: Enable- Count=1/16 Maskable+ 64bit+
       Capabilities: [100] Advanced Error Reporting
       Capabilities: [150] Virtual Channel
       Capabilities: [180] Power Budgeting <?>
       Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
       Capabilities: [2a0] Secondary PCI Express <?>
       Capabilities: [2d0] Latency Tolerance Reporting
       Capabilities: [310] L1 PM Substates

I have read that issues with the SOC memory mapping are to blame for this; however, I haven't changed any of the addresses or remapping as far as I know, so they should be the same as in the example PCIe qsys file.

Does the pcie driver support implementations without the MSGDMA module installed?

Any clues would be appreciated.

Thanks.

0 Kudos
BrianM
New Contributor I
3,626 Views

Hello. Yesterday I figured out what the cold boot issue was and fixed it.

There is a bug in 19.1 where the tool allows you to do a pin swap on pcie_perst# without an error message. So when the layout guy asked me to swap it, I did, and since the tool didn't complain I thought it was okay.

20.1.1 caught the mistake at some point (not right away) and refused to build the FPGA until I put it back on PIN_W22.

I worked around the issue for now by putting a weak pullup on PIN_W22 and assigning the net that had been on it to a spare I/O.

I also managed to build a version of the FPGA with PCIe, the 256K SRAM buffer, and MSGDMA installed that fits and meets timing, although I'm seeing some hold-time violations periodically. The part is 87% ALM utilized and 55% SRAM utilized.

The only remaining issue is the fact that the nvme driver fails to put the NVMe device in bus master mode. It times out waiting for the busmaster to acknowledge.

To get the full design to fit in the FPGA with my logic, I had to set the "Single DW Completer" option. I think that should be fine because there is, and only ever will be, a single NVMe card on the PCIe slot.

I'm currently running a vanilla 5.7.10 kernel because I noticed there were changes to some of the altera drivers and I wanted to make sure there wasn't a change that would help my cause.

The PCIe driver reports the following:

[    0.781196] altera-pcie c0000000.pcie: host bridge /soc/bridge@c0000000/pcie@000000000 ranges: 
[    0.781225] altera-pcie c0000000.pcie: Parsing ranges property... 
[    0.781262] altera-pcie c0000000.pcie:      MEM 0x00c0000000..0x00dfffffff -> 0x0000000000 
[    0.781504] altera-pcie c0000000.pcie: PCI host bridge to bus 0000:00 
[    0.781528] pci_bus 0000:00: root bus resource [bus 00-ff] 
[    0.781545] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff] (bus address [0x00000000-0x1fffffff]) 
[    0.781557] pci_bus 0000:00: scanning bus 
[    0.781855] pci 0000:00:00.0: [1172:e000] type 01 class 0x060400 
[    0.785218] pci_bus 0000:00: fixups for bus 
[    0.785307] PCI: bus0: Fast back to back transfers disabled 
[    0.785338] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0 
[    0.785351] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring 
[    0.785439] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1 
[    0.785928] pci_bus 0000:01: scanning bus 
[    0.786399] pci 0000:01:00.0: [8086:2522] type 00 class 0x010802 
[    0.786709] pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xc0003fff 64bit] 
[    0.787092] pci 0000:01:00.0: reg 0x20: [mem 0xc0000000-0xc000ffff 64bit] 
[    0.787328] pci 0000:01:00.0: enabling Extended Tags 
[    0.789559] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x2 link at 0000:00:00.0 (capable of 15.752 Gb/s with 8.0 GT/s PCIe x2 link) 
[    0.791611] pci_bus 0000:01: fixups for bus 
[    0.791701] PCI: bus1: Fast back to back transfers disabled 
[    0.791714] pci_bus 0000:01: bus scan returning with max=01 
[    0.791729] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 
[    0.791761] pci_bus 0000:00: bus scan returning with max=01 
[    0.791787] pci 0000:00:00.0: BAR 8: assigned [mem 0xc0000000-0xc00fffff] 
[    0.791805] pci 0000:01:00.0: BAR 4: assigned [mem 0xc0000000-0xc000ffff 64bit] 
[    0.791950] pci 0000:01:00.0: BAR 0: assigned [mem 0xc0010000-0xc0013fff 64bit] 
[    0.792093] pci 0000:00:00.0: PCI bridge to [bus 01] 
[    0.792134] pci 0000:00:00.0:   bridge window [mem 0xc0000000-0xc00fffff] 
[    0.792353] pcieport 0000:00:00.0: assign IRQ: got 56 
[    0.792415] pcieport 0000:00:00.0: enabling device (0140 -> 0142) 
[    0.792543] pcieport 0000:00:00.0: enabling bus mastering 
[    0.793014] pcieport 0000:00:00.0: PME: Signaling with IRQ 57 
[    0.793261] pcieport 0000:00:00.0: saving config space at offset 0x0 (reading 0xe0001172) 
[    0.793292] pcieport 0000:00:00.0: saving config space at offset 0x4 (reading 0x100546) 
[    0.793320] pcieport 0000:00:00.0: saving config space at offset 0x8 (reading 0x6040001) 
[    0.793348] pcieport 0000:00:00.0: saving config space at offset 0xc (reading 0x10010) 
[    0.793359] pcieport 0000:00:00.0: saving config space at offset 0x10 (reading 0x0) 
[    0.793386] pcieport 0000:00:00.0: saving config space at offset 0x14 (reading 0x0) 
[    0.793412] pcieport 0000:00:00.0: saving config space at offset 0x18 (reading 0x10100) 
[    0.793439] pcieport 0000:00:00.0: saving config space at offset 0x1c (reading 0x0) 
[    0.793465] pcieport 0000:00:00.0: saving config space at offset 0x20 (reading 0x0) 
[    0.793492] pcieport 0000:00:00.0: saving config space at offset 0x24 (reading 0x0) 
[    0.793518] pcieport 0000:00:00.0: saving config space at offset 0x28 (reading 0x0) 
[    0.793544] pcieport 0000:00:00.0: saving config space at offset 0x2c (reading 0x0) 
[    0.793571] pcieport 0000:00:00.0: saving config space at offset 0x30 (reading 0x0) 
[    0.793597] pcieport 0000:00:00.0: saving config space at offset 0x34 (reading 0x50) 
[    0.793623] pcieport 0000:00:00.0: saving config space at offset 0x38 (reading 0x0) 
[    0.793649] pcieport 0000:00:00.0: saving config space at offset 0x3c (reading 0x30138) 
[    0.796920] dma-pl330 ffe01000.pdma: Loaded driver for PL330 DMAC-341330 
[    0.796940] dma-pl330 ffe01000.pdma:         DBUFF-512x8bytes Num_Chans-8 Num_Peri-32 Num_Events-8
 

The NVMe driver reports the following:

[    1.879231] nvme 0000:01:00.0: assign IRQ: got 56
[    1.884200] nvme nvme0: pci function 0000:01:00.0
[    1.889047] nvme 0000:01:00.0: enabling device (0140 -> 0142)
[    1.894865] nvme 0000:01:00.0: enabling bus mastering
[    1.904632] nvme 0000:01:00.0: saving config space at offset 0x0 (reading 0x25228086)
[    1.915605] nvme 0000:01:00.0: saving config space at offset 0x4 (reading 0x100146)
[    1.929796] nvme 0000:01:00.0: saving config space at offset 0x8 (reading 0x1080200)
[    1.943672] nvme 0000:01:00.0: saving config space at offset 0xc (reading 0x10)
[    1.956942] nvme 0000:01:00.0: saving config space at offset 0x10 (reading 0x10004)
[    1.964611] nvme 0000:01:00.0: saving config space at offset 0x14 (reading 0x0)
[    1.971942] nvme 0000:01:00.0: saving config space at offset 0x18 (reading 0x0)
[    1.985610] nvme 0000:01:00.0: saving config space at offset 0x1c (reading 0x0)
[    1.998116] nvme 0000:01:00.0: saving config space at offset 0x20 (reading 0x4)
[    2.012843] nvme 0000:01:00.0: saving config space at offset 0x24 (reading 0x0)
[    2.027572] nvme 0000:01:00.0: saving config space at offset 0x28 (reading 0x0)
[    2.040072] nvme 0000:01:00.0: saving config space at offset 0x2c (reading 0x38108086)
[    2.054023] nvme 0000:01:00.0: saving config space at offset 0x30 (reading 0x0)
[    2.069193] nvme 0000:01:00.0: saving config space at offset 0x34 (reading 0x40)
[    2.089687] nvme 0000:01:00.0: saving config space at offset 0x38 (reading 0x0)
[    2.096992] nvme 0000:01:00.0: saving config space at offset 0x3c (reading 0x138)

Then later, after a 62 second delay, the kernel reports:

[   64.498070] nvme nvme0: I/O 20 QID 0 timeout, disable controller
[   64.504259] nvme nvme0: Identify Controller failed (-4)
[   64.509492] nvme nvme0: Removing after probe failure status: -5

I have tried four brands of NVMe card: Intel, Samsung, Western Digital, and Greenliant. The Samsung drive is x4; the others are x2. All were properly identified.

Note: the output from linux-socfpga-5.4.74 is almost the same (just a little less verbose).

I'm hoping someone will have some idea where I need to look to figure out what is going wrong in the kernel.

Any help would be greatly appreciated.

0 Kudos
BrianM
New Contributor I
3,625 Views

I just read the documentation on "Single DW Completer". I can't use it. I'm not sure how I'll get the design to fit now... but I'll work on that today.

0 Kudos
SengKok_L_Intel
Moderator
3,605 Views

Hi

Apologies that not much helpful information regarding Root Port driver development can be offered here. Your understanding is much appreciated.


0 Kudos
SengKok_L_Intel
Moderator
3,592 Views

If further support is needed in this thread, please post a response within 15 days. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.


0 Kudos
BrianM
New Contributor I
3,589 Views

Let's keep this open until I figure it out. I'll update the thread with the solution.

0 Kudos
BrianM
New Contributor I
3,557 Views

I've made significant progress. Let me state some things I've learned so people don't have to go through the same stuff.

1. You must regenerate u-boot every time you change Platform Designer's qsys file. This took me two weeks to realize. Nothing worked because I simply hadn't understood that there is a huge amount of hardware configuration done in u-boot that the kernel cannot do, because once the DRAM is active those registers can't be changed.

2. In order to run at full speed, 125 MHz, you must use clock-crossing and pipeline bridges. The "auto constructor" software is not generally good enough to build connections that will run at speed.

3. All addresses listed in Platform Designer must be consistent across MM Masters. Period. The software assumes that and there is no way to work around it.

4. It does not appear (although I still have to confirm this) that the MSGDMA and embedded SRAM modules are used at all by the linux kernel. There is no point in adding them.

5. Read bandwidth from a two-lane NVMe is roughly 132 MB/s. But my config is not working yet, so this will need to be confirmed as well.

I think I'm stuck in the same way as the posters in this thread: https://forum.rocketboards.org/t/altera-pcie-driver-issue-with-ssd-devices/545

There is an issue with the way the Altera PCIe linux driver programs the Root Port to write DRAM, and I have not been able to get it to work. There are two proposed fixes in that thread (reported working on versions from 15.1 to 18.1). I have not tried the one that requires editing the verilog generated by Platform Designer; creating a flow based on code hacks is not appealing to me. I have tried the second fix, but either I have misinterpreted how they got it to work, or more recent software does not allow the hack. I have tried dozens of configurations, but all have failed. If I tamper with the offset address of Txs, Root Port DMA deadlocks every time, I assume because the data does not go where it is expected to be.

As documented in the thread, the issue occurs when reading from the PCIe device. One group was reading from a PCIe SATA controller and the other group from an NVMe like me. Reads fail. Zero data is returned instead of actual data.

According to the thread, the Root Port is trashing RAM, writing over kernel space or something like that; I don't fully understand the claims. One suggested solution involves adding an address span extender between the h2f HPS interface and the Txs port, offsetting the Txs by 64 MB, and also reserving the second 64 MB of RAM for Root Port DMA space. The offset is magically added to the DRAM destination address, thus preventing the Root Port from destroying DRAM. The other solution shifts the Root Port DMA out of DRAM altogether by hacking the generated code to add an offset of 0x50000000 to the Txs address. I don't see how that can work, since bit 30 is not utilized on the h2f interface and the Root Port needs to write to physical DRAM to bring data in from the PCIe device.

In my case I get errors such as this:

root@cyclone5:~# ./hdparm -tT /dev/nvme0n1p1  

/dev/nvme0n1p1:
Timing cached reads:   940 MB in  2.00 seconds = 469.80 MB/sec
Timing buffered disk reads: [  204.434383] blk_update_request: critical medium error, dev nvme0n1, sector 281088 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[  204.449791] blk_update_request: critical medium error, dev nvme0n1, sector 281096 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  204.461410] Buffer I/O error on dev nvme0n1p1, logical block 34881, async page read
read(2097152) returned 266240 bytes

After that I can do an fsck.ext4 /dev/nvme0n1p1 and repair the drive, but the next read will fail again.

Sometimes the read hangs the CPU and the watchdog triggers, which genuinely sounds like code space being corrupted.

I have studied the 'ranges' and the 'dma-ranges' properties. I have dug through the code to try to understand how I can control the addresses used by the Root Port. I have not found anything that seems to work. The only things I've succeeded in doing are making the pcie driver fail to load because it can't find the Txs port, and making the nvme driver hang.

I feel I am very close to getting this to work, but I need a better understanding of how the Root Port should be configured to ensure "DMA" works from the Root Port to DRAM.

Thank you for your patience and any assistance you can offer.

0 Kudos
BrianM
New Contributor I
3,531 Views

Another week, another bit of progress.

First the observations:

1. I proved that the MSGDMA and 256KB of SRAM are not used by removing them and getting the same performance. This saves a lot of space, and performance is good enough at 140MB/s read.

2. The most important thing that I missed, which I forgot to mention last week, was the third and fourth FPGA bridges in the dts file. Here they are so others can add them, with a sketch of how to activate them from the board dts after the nodes. (Note: these have since been added to the latest kernel, you just have to activate them; I found them in the latest kernel yesterday, after spending weeks trying to figure out why DMA didn't work and adding them myself last week.)

fpga_bridge0: fpga_bridge@ff400000 {
	compatible = "altr,socfpga-lwhps2fpga-bridge";
	reg = <0xff400000 0x100000>;
	resets = <&rst LWHPS2FPGA_RESET>;
	clocks = <&l4_main_clk>;
};

fpga_bridge1: fpga_bridge@ff500000 {
	compatible = "altr,socfpga-hps2fpga-bridge";
	reg = <0xff500000 0x10000>;
	resets = <&rst HPS2FPGA_RESET>;
	clocks = <&l4_main_clk>;
};

fpga_bridge2: fpga-bridge@ff600000 {
	compatible = "altr,socfpga-fpga2hps-bridge";
	reg = <0xff600000 0x100000>;
	resets = <&rst FPGA2HPS_RESET>;
	clocks = <&l4_main_clk>;
	status = "disabled";
};

fpga_bridge3: fpga-bridge@ffc25080 {
	compatible = "altr,socfpga-fpga2sdram-bridge";
	reg = <0xffc25080 0x4>;
	status = "disabled";
};

fpgamgr0: fpgamgr@ff706000 {
	compatible = "altr,socfpga-fpga-mgr";
	reg = <0xff706000 0x1000
	       0xffb90000 0x4>;
	interrupts = <0 175 4>;
};
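
To actually use these bridges from the board-level dts, status overrides should be all that is needed. This is just a sketch: which bridges to enable depends on your design, and the labels are the ones from the nodes above.

&fpga_bridge0 {
	status = "okay";
};

&fpga_bridge1 {
	status = "okay";
};

/* only if the design uses the fpga2sdram ports */
&fpga_bridge3 {
	status = "okay";
};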

3. The evil hack I came up with to fix the read errors is documented here. It seems to work, but as with all hacks, YMMV. (No, I never understood how the DMA destination address is supposed to work; I'm not even positive I understand why this works.)

Add this to the top of your dts:

reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	// 2 MiB reserved for PCI Express DMA
	pcidma1@0 {
		reg = <0x00000000 0x00200000>;
		// Note: this is the maximum you can reserve with kernel defaults,
		// and relocating the kernel doesn't seem to work with ARM ARCH.
		no-map;
	};
};

This reserves the first 2MB of DRAM. The kernel complains:

ERROR: reserving fdt memory region failed (addr=0 size=200000)

however, it still works, as shown in 'cat /proc/iomem':

root@cyclone5:/mnt/test# cat /proc/iomem  
00200000-3fffffff : System RAM <- Notice first 2MB is not used...
 00c00000-00cabe7f : Kernel data
c0000000-c01fffff : pcie@000000000
 c0000000-c00fffff : PCI Bus 0000:01
   c0000000-c0003fff : 0000:01:00.0
     c0000000-c0003fff : nvme
ff200000-ff20007f : ff200080.msi vector_slave
ff200080-ff20008f : ff200080.msi csr
ff220000-ff223fff : c0000000.pcie Cra
ff700000-ff701fff : ff700000.ethernet ethernet@ff700000
ff702000-ff703fff : ff702000.ethernet ethernet@ff702000
ff704000-ff704fff : ff704000.dwmmc0 dwmmc0@ff704000
ff705000-ff705fff : ff705000.spi spi@ff705000
ff706000-ff706fff : ff706000.fpgamgr fpgamgr@ff706000
ff709000-ff709fff : ff709000.gpio gpio@ff709000
ffa00000-ffa00fff : ff705000.spi spi@ff705000
ffb90000-ffb90003 : ff706000.fpgamgr fpgamgr@ff706000
ffc02000-ffc0201f : serial
ffc03000-ffc0301f : serial
ffc04000-ffc04fff : ffc04000.i2c i2c@ffc04000
ffd02000-ffd02fff : ffd02000.watchdog watchdog@ffd02000
ffd05000-ffd05fff : rstmgr
ffe01000-ffe01fff : pdma@ffe01000
ffff0000-ffffffff : ffff0000.sram

Next, make the Txs bus on the Avalon-MM PCIe hard macro 2MB big by setting it to 32 bits wide with a 1MB address width and two address pages. (This might work with 1MB in 64-bit mode; I will verify that later to shrink the reserved memory area once I figure out my latest issue.)

On to the latest issue.

Interrupts are being dropped by hardware or software:

In this example I copy a 463MB file from the SD card to the NVMe drive (it seems to happen more on writes than on reads):

root@cyclone5:/mnt/test# cp /home/root/463MB.bin 3MB_Copy.bin  
[  226.418525] nvme nvme0: I/O 128 QID 1 timeout, completion polled
[  257.138480] nvme nvme0: I/O 160 QID 1 timeout, completion polled
[  290.408531] nvme nvme0: I/O 192 QID 1 timeout, completion polled
[  320.498581] nvme nvme0: I/O 260 QID 1 timeout, completion polled
[  350.569413] nvme nvme0: I/O 288 QID 1 timeout, completion polled
[  380.648522] nvme nvme0: I/O 288 QID 1 timeout, completion polled
[  410.728549] nvme nvme0: I/O 288 QID 1 timeout, completion polled
[  440.808519] nvme nvme0: I/O 288 QID 1 timeout, completion polled
[  470.888467] nvme nvme0: I/O 320 QID 1 timeout, completion polled
[  500.968466] nvme nvme0: I/O 329 QID 1 timeout, completion polled
[  534.898518] nvme nvme0: I/O 323 QID 1 timeout, completion polled
root@cyclone5:/mnt/test#  
root@cyclone5:/mnt/test# [  544.193059] systemd-journald[64]: Sent WATCHDOG=1 notification.
[  564.968528] nvme nvme0: I/O 352 QID 1 timeout, completion polled
[  595.048526] nvme nvme0: I/O 384 QID 1 timeout, completion polled
[  625.128505] nvme nvme0: I/O 418 QID 1 timeout, completion polled
[  655.848455] nvme nvme0: I/O 469 QID 1 timeout, completion polled

You can see that many, many interrupts are being dropped. I'm expecting to have to instrument the altera-msi.c code with printk's to try to figure this out. So far, all attempts to improve it by adjusting the hardware have made no difference. I'm tempted to try an old kernel, but that might open a bigger can of worms.

Again, if anyone knows anything about this issue, I'm all ears.

0 Kudos
BrianM
New Contributor I
3,488 Views

I think the issue I'm having relates to the fact that the only place the Root Port can write data in main memory is at location 0x00000000. I have managed to move the target address (the address behind the bridge):

[  165.301799] altera-pcie c4000000.pcie: host bridge /soc/bridge@c0000000/pcie@000000000 ranges:
[  165.310496] altera-pcie c4000000.pcie: Parsing ranges property...
[  165.316659] altera-pcie c4000000.pcie:      MEM 0x00c4000000..0x00c40fffff -> 0x0004000000
[  165.324966] altera-pcie c4000000.pcie: Parsing dma-ranges property...
[  165.331416] altera-pcie c4000000.pcie:   IB MEM 0x0004000000..0x00040fffff -> 0x0004000000
[  165.339966] altera-pcie c4000000.pcie: PCI host bridge to bus 0000:00
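
Those lines correspond to ranges/dma-ranges properties on the pcie node roughly like the following sketch (the other properties are omitted, and the cell layout assumes the standard PCI binding with the parent soc node using one address cell and one size cell):

pcie_0: pcie@c4000000 {
	/* ...other properties as before... */

	/* outbound: CPU 0xc4000000..0xc40fffff -> PCI bus address 0x04000000 */
	ranges = <0x82000000 0x00000000 0x04000000
		  0xc4000000 0x00000000 0x00100000>;

	/* inbound: PCI 0x04000000..0x040fffff -> CPU address 0x04000000 */
	dma-ranges = <0x82000000 0x00000000 0x04000000
		      0x04000000 0x00000000 0x00100000>;
};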

The NVMe driver locks up on a read to the device.

With the address behind the device at 0x00000000, the Root Port can overwrite the vector tables and crash the system.

It seems to me that the altera-pcie driver is immature and can't work for most applications. I don't have the time to work on this. My project is already two months behind because of this.

I'm going to assume that the SD card is fast enough for now, work on the rest of the hardware bring-up and debug, and let the software guys determine whether the SD card will work for them. If it won't, I will revisit this thread and the other threads at RocketBoards that I have participated in.

It seems to me that a single knowledgeable kernel software engineer could resolve these issues in a few weeks. Three weeks of work within intel to save months of work outside intel seems like an obvious choice. But as you have stated that intel cannot help me with linux or u-boot drivers, I'm disappointed in the level of support I've received from intel.

 

0 Kudos
BrianM
New Contributor I
3,376 Views

It's almost completely working. There is one problem left. First let me summarize something that people may not know.

With the Cyclone V pcie controller, there is no way to specify dma-ranges with the 5.4 kernel; it does nothing. However, on kernel 5.11 dma-ranges can be set, and if you set it, the pci device driver will crash.

There is a problem with read errors from pcie devices. I fixed this by adding this to the top of my board device tree dts file:

(Long comment above it, code after)

	// Reserve the first 64KB cause the pcie driver doesn't know better?
	// Assumption: the pcie driver overwrites the vector table.
	// If this is not present, the nvme has read errors.
	// The kernel complains about the reservation, but does it:
	// "ERROR: reserving fdt memory region failed (addr=0 size=10000)"
	// You can see it reserved the memory in iomem:
	// root@cyclone5:~# cat /proc/iomem 
	// 00010000-001fffff : System RAM
	// 00200000-3fffffff : System RAM
	//   00d00000-00daf47f : Kernel data
	// ff700000-ff701fff : ff700000.ethernet ethernet@ff700000
	// ff702000-ff703fff : ff702000.ethernet ethernet@ff702000
	// ff704000-ff704fff : ff704000.dwmmc0 dwmmc0@ff704000
	// ff705000-ff705fff : ff705000.spi spi@ff705000
	// ff706000-ff706fff : ff706000.fpgamgr fpgamgr@ff706000
	// ff709000-ff709fff : ff709000.gpio gpio@ff709000
	// ffa00000-ffa00fff : ff705000.spi spi@ff705000
	// ffb90000-ffb90003 : ff706000.fpgamgr fpgamgr@ff706000
	// ffc02000-ffc0201f : serial
	// ffc03000-ffc0301f : serial
	// ffc04000-ffc04fff : ffc04000.i2c i2c@ffc04000
	// ffc05000-ffc05fff : ffc05000.i2c i2c@ffc05000
	// ffd02000-ffd02fff : ffd02000.watchdog watchdog@ffd02000
	// ffd05000-ffd05fff : rstmgr
	// ffd08140-ffd08143 : ffd08140.l2-ecc
	// ffe01000-ffe01fff : pdma@ffe01000
	// ffff0000-ffffffff : ffff0000.sram
	
	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		pcidma1@0 {
			reg = <0x00000000 0x00010000>;
			no-map;
		};
	};

Note: the 5.4 kernel reserves exactly 64K for the vector table, which stops the read errors. The 5.11 kernel reserves 2 MB, which also stops the read errors. I do not understand this issue, but this is the workaround.

I believe this only applies to the Cyclone V, where the ARM vector table is not movable. I suspect the kernel is not protecting the vector table from PCIe writes unless the dts specifically reserves the memory, but this is just a guess. A kernel guru needs to be consulted to be sure.

I have a very simplified design that almost works well. There are interrupt issues which I haven't been able to get past. I have the help of a linux driver expert, and he suspects the hardware is misbehaving.

The failure:

When doing extensive writes to the NVMe, the following messages are generated, and if the timeout is set to 30 seconds, each of them represents a 30-second delay.

root@cyclone5:~# ./test_nvme.sh 
[   75.836736] nvme nvme0: I/O 904 QID 1 timeout, completion polled
[   75.981635] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
Starting Trace...
[   82.826885] nvme nvme0: I/O 896 QID 1 timeout, completion polled
[   85.116846] nvme nvme0: I/O 928 QID 1 timeout, completion polled
[   86.156823] nvme nvme0: I/O 998 QID 1 timeout, completion polled
[   87.196816] nvme nvme0: I/O 928 QID 1 timeout, completion polled
[   88.236805] nvme nvme0: I/O 929 QID 1 timeout, completion polled
[   89.276824] nvme nvme0: I/O 928 QID 1 timeout, completion polled
root@cyclone5:~# grep timeout nvme_
nvme_full_2021_03_23_debug.log   nvme_full_2021_03_25_debug.log   nvme_full_2021_03_26_debug.log   nvme_trace_2021_0323.log
nvme_full_2021_03_24_debug.log   nvme_full_2021_03_25b_debug.log  nvme_trace.log                   
root@cyclone5:~# grep timeout nvme_full_2021_03_26_debug.log 
    kworker/0:2H-76      [000] ....    27.676716: nvme_timeout: qid: 1, tag: 98, completed: 4
    kworker/0:2H-76      [000] ....    75.836728: nvme_timeout: qid: 1, tag: 904, completed: 1
    kworker/1:1H-50      [001] ....    82.826875: nvme_timeout: qid: 1, tag: 896, completed: 65
    kworker/0:2H-76      [000] ....    85.116834: nvme_timeout: qid: 1, tag: 928, completed: 65
    kworker/0:2H-76      [000] ....    86.156812: nvme_timeout: qid: 1, tag: 998, completed: -961
    kworker/0:2H-76      [000] ....    87.196806: nvme_timeout: qid: 1, tag: 928, completed: 64
    kworker/0:2H-76      [000] ....    88.236793: nvme_timeout: qid: 1, tag: 929, completed: 65
    kworker/0:2H-76      [000] ....    89.276815: nvme_timeout: qid: 1, tag: 928, completed: 57

The trace is generated by patches applied to drivers within the kernel.

Some NVMe drives have less of an issue with this than others.

I have noticed that the MSI and MSI-X interrupts are not being used and I suspect this is related to the issue.

This is lspci -v for my laptop's nvme controller:

02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (prog-if 02 [NVM Express])
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
        Flags: bus master, fast devsel, latency 0, IRQ 31, NUMA node 0
        Memory at d1800000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [158] Power Budgeting <?>
        Capabilities: [168] Secondary PCI Express
        Capabilities: [188] Latency Tolerance Reporting
        Capabilities: [190] L1 PM Substates
        Kernel driver in use: nvme

Notice that MSI-X Enable has a '+' sign after it. Here is the output of lspci -v for the Cyclone V:

00:00.0 PCI bridge: Altera Corporation Device e000 (rev 01) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00000000-00000fff [size=4K]
        Memory behind bridge: 00000000-000fffff [size=1M]
        Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [68] MSI-X: Enable- Count=513 Masked-
        Capabilities: [78] Power Management version 3
        Capabilities: [80] Express Root Port (Slot-), MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [200] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?>
lspci: Unable to load libkmod resources: error -12

01:00.0 Non-Volatile memory controller: Sandisk Corp WD Black 2018/PC SN520 NVMe SSD (rev 01) (prog-if 02 [NVM Express])
        Subsystem: Sandisk Corp WD Black 2018/PC SN520 NVMe SSD
        Flags: bus master, fast devsel, latency 0, IRQ 57
        [virtual] Memory at c0000000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [80] Power Management version 3
        Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
        Capabilities: [b0] MSI-X: Enable- Count=17 Masked-
        Capabilities: [c0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [1b8] Latency Tolerance Reporting
        Capabilities: [300] Secondary PCI Express <?>
        Capabilities: [900] L1 PM Substates
        Kernel driver in use: nvme

Notice that MSI Enable and MSI-X Enable have '-' minus signs after them.

I thought that maybe I had to enable the msi driver in the kernel, but I think the pcie-altera-msi driver is not used for the Avalon-MM PCIe controller because it is designed for the stand-alone MSI-to-GIC module.

How do I get MSI / MSI-X to enable with the pcie-altera driver?
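
For reference, this is roughly what I expect the MSI-to-GIC node and its hookup to look like, based on the upstream altr,msi-1.0 binding. The register ranges match my /proc/iomem output earlier in the thread, but the node labels, GIC interrupt number, and num-vectors value are assumptions:

msi_0: msi@ff200080 {
	compatible = "altr,msi-1.0";
	reg = <0xff200080 0x10>,		/* csr */
	      <0xff200000 0x80>;		/* vector_slave */
	reg-names = "csr", "vector_slave";
	interrupt-parent = <&intc>;
	interrupts = <0 42 4>;			/* placeholder GIC interrupt */
	msi-controller;
	num-vectors = <32>;			/* assumed vector count */
};

&pcie_0 {
	msi-parent = <&msi_0>;			/* the root port node must point at the MSI controller */
};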

Thanks for your support.

Brian

0 Kudos
BrianM
New Contributor I
3,363 Views

I got it working.

The last hurdle was the MSI interface. I'm not sure why it wasn't working, but restarting the design from scratch with my recently acquired knowledge got everything working.

I've attached my qsys file and socfpga.dtsi. Hopefully it will help others get a jump on things so they don't have to learn everything the hard way as I did.

This design is not optimized for speed, nor is it optimized for space.

NVMe read speed is around 80 MB/s.

NVMe write speed is around 50 MB/s.

Faster drives will do a little better, but even the four-lane part I have, which manages over a gigabyte per second in a PC, doesn't do much better than 110 MB/s read here. Bandwidth is limited by the ARM memory interface and by the fact that bursting logic at 125MHz causes timing violations. A burst length of one on the Txs interface has got to slow things down.

Performance is quoted with the 5.11 kernel; the 5.4 kernel is not quite as fast, but still fast enough for an embedded system.

Don't forget to enable the fpga, pcie and msi modules in your top level board dts file.
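
In the board dts those enables are just status overrides, something like this sketch (the node labels must match whatever your socfpga.dtsi actually uses):

&fpgamgr0 {
	status = "okay";
};

&pcie_0 {
	status = "okay";
};

&msi_0 {
	status = "okay";
};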

And don't forget to reserve the first 64K of DRAM as stated above. If you don't, you will get read errors.

There are probably better configurations, and eventually I'll probably try to optimize for size since my logic will need to get bigger on the next project. But for now, it finally works.

Good luck to you all.

0 Kudos