Hello,
I've developed an application on a Terasic DE10-Lite board, which has a MAX 10 FPGA and an SDRAM chip on board, using Quartus 18.1. My design uses a Nios II/f core (without cache) and the altera_msgdma IP, for which I've written a small driver. While testing it I discovered that the device sometimes never clears the busy bit, so it appears stuck, and the written data is corrupted. I've attached a patch that solves the problem, along with the .sof, .elf, .qsys and .stp files; all the details are below.
The altera_msgdma is configured with mode MM-to-MM and transfer_type set to unaligned access, no burst.
I'll use the pdf UG-01085 | 2018.09.24 as reference for everything: https://www.intel.com/content/www/us/en/docs/programmable/683130/18-1/introduction.html
Test application: create an array of bytes (uint8_t) in on-chip memory (altera_avalon_onchip_memory2) and fill it with increasing values, starting from 0. Use the DMA device to transfer a given number of bytes to SDRAM (altera_avalon_new_sdram_controller) 500 times, meaning that the same descriptor is placed 500 times in the descriptor FIFO. Wait for the DMA to finish the transfers by polling the status register, check that the copied data matches, then increase the number of bytes to transfer and repeat. The number of bytes starts at 1 and goes up to the array size.
To make it clearer: the src array holds the values (uint8_t) 0, 1, 2, ..., 255, 0, 1, 2, ...; the DMA copies 1 byte from src to dest 500 times, then 2 bytes from src to dest 500 times, and so on. When the number of bytes to transfer reaches 25, the test never ends: it keeps waiting for the DMA to assert the Descriptor Buffer Empty bit (table 298, p. 330). This is repeatable, but I think it is specific to my setup.
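The test procedure described above can be sketched on a host machine as follows. This is only an illustration, not the attached test: `fake_dma_copy` is a hypothetical software stand-in for queuing one descriptor, the array size of 1000 and the 0xba fill value are read off the GDB dump further down, and refilling dest before every length is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kArraySize = 1000;  // inferred from the GDB dump (25 + 975 bytes)
constexpr int kRepeats = 500;             // the same descriptor is queued 500 times

// Signature of a "copy n bytes from src to dst" primitive.
using CopyFn = void (*)(const std::uint8_t* src, std::uint8_t* dst, std::size_t n);

// Hypothetical stand-in for one mSGDMA descriptor; on hardware this would
// write a descriptor to the dispatcher instead of calling memcpy.
void fake_dma_copy(const std::uint8_t* src, std::uint8_t* dst, std::size_t n) {
    std::memcpy(dst, src, n);
}

// Runs the length sweep and returns how many lengths failed the comparison.
int run_test(CopyFn copy) {
    std::array<std::uint8_t, kArraySize> src{};
    std::array<std::uint8_t, kArraySize> dest{};
    for (std::size_t i = 0; i < src.size(); ++i) {
        src[i] = static_cast<std::uint8_t>(i);  // 0, 1, ..., 255, 0, 1, ...
    }

    int failures = 0;
    for (std::size_t len = 1; len <= src.size(); ++len) {
        dest.fill(0xba);  // assumed: dest is re-poisoned before each length
        for (int r = 0; r < kRepeats; ++r) {
            copy(src.data(), dest.data(), len);
        }
        // On hardware, the status register would be polled here until the
        // Descriptor Buffer Empty bit is set; this is where the real test hangs.
        if (std::memcmp(src.data(), dest.data(), len) != 0) {
            ++failures;
        }
    }
    return failures;
}
```

With the software stand-in, `run_test(fake_dma_copy)` returns 0; the point of the sketch is only to show the shape of the sweep over transfer lengths.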
I run this test under GDB without breakpoints and, after more than enough time, pause it. The DMA status register (table 297 on p. 329 and table 298 on p. 330) reads 1, meaning only the busy bit is set, and it stays like that indefinitely. This is what I see in GDB when I stop it:
52 while (dma_jolly.csr_port.status.get().descriptor_empty != 1) {
1: /x dest = {_M_elems = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, 0xb, 0xc, 0xba <repeats 975 times>}}
(gdb) p /x src
$6 = {_M_elems = {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e,
0x1f, 0x20...}}
As you can see, dest does not match src: it should start with 0x0, but the DMA has "changed" the order of the written data.
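To make the reordering concrete, the following sketch checks the first 25 bytes of the dumped dest against the known src pattern. The byte values are copied from the GDB output above; the helper names are hypothetical. It shows that dest[0..11] holds src[16..27] (including 0x19 to 0x1b, which lie beyond the 25 requested bytes) and dest[12..24] holds src[0..12].

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Dump = std::array<std::uint8_t, 25>;

// First 25 bytes of dest, copied from the GDB dump.
constexpr Dump kDumpedDest = {
    0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b,
    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b,
    0x0c};

// src is filled with increasing values, so src[i] is i truncated to 8 bits.
constexpr std::uint8_t src_at(std::size_t i) {
    return static_cast<std::uint8_t>(i);
}

// Index of the first byte where dest differs from src, or dest.size() if none.
std::size_t first_mismatch(const Dump& dest) {
    for (std::size_t i = 0; i < dest.size(); ++i) {
        if (dest[i] != src_at(i)) return i;
    }
    return dest.size();
}

// True if dest[dest_off, dest_off + len) equals src[src_off, src_off + len).
bool matches_src_run(const Dump& dest, std::size_t dest_off,
                     std::size_t src_off, std::size_t len) {
    if (dest_off + len > dest.size()) return false;
    for (std::size_t i = 0; i < len; ++i) {
        if (dest[dest_off + i] != src_at(src_off + i)) return false;
    }
    return true;
}
```

For this dump, `first_mismatch(kDumpedDest)` is 0, while `matches_src_run(kDumpedDest, 0, 16, 12)` and `matches_src_run(kDumpedDest, 12, 0, 13)` both hold.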
After many trials I turned to the Signal Tap Logic Analyzer in Quartus and found a hardware bug that "hangs" the device. It occurs only when the IP is configured with TRANSFER_TYPE set to unaligned access (in Platform Designer).
In the screenshot, the last node, [...]read_master:read_mstr_internal|scfifo:the_master_to_st_fifo|usedw[4..0], is shown as a bar chart. This is a FIFO in the read_master of the altera_msgdma, and it overflows. The read_master module reads from the Avalon-MM bus and forwards the data to the write_master through this FIFO, so it has to stop requesting data from the bus early enough not to overflow it. When the module is configured to allow unaligned access, it creates this signal (line 751 of read_master.v):
assign too_many_pending_reads = (({fifo_full,fifo_used} + pending_reads_counter) > (FIFO_DEPTH - (maximum_burst_count * 3)));
The expression assumes a pipeline depth of 3, but the pipeline is really 4 deep: the barrel shifter in MM_to_ST_Adapter.v (lines 289-300) needs an extra cycle to empty itself after all the reads from the Avalon bus have completed.
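The effect of the reserved margin can be illustrated with a small C++ model of the expression above. This is a sketch under assumptions, not the attached patch: FIFO_DEPTH = 32 is inferred from the 5-bit usedw[4..0] bus, maximum_burst_count = 1 follows from the "no burst" configuration, and passing the pipeline depth as a parameter (3 as in the original line, 4 as the analysis suggests) is purely for comparison.

```cpp
#include <cassert>

// Models the Verilog condition: stop issuing reads once FIFO occupancy plus
// reads still in flight exceed FIFO_DEPTH minus the reserved pipeline margin.
// pipeline_depth is a parameter here only so that 3 and 4 can be compared.
constexpr bool too_many_pending_reads(unsigned fifo_used,       // {fifo_full, fifo_used}
                                      unsigned pending_reads,   // pending_reads_counter
                                      unsigned fifo_depth,      // FIFO_DEPTH
                                      unsigned max_burst_count, // maximum_burst_count
                                      unsigned pipeline_depth) {
    return (fifo_used + pending_reads) >
           (fifo_depth - max_burst_count * pipeline_depth);
}

constexpr unsigned kFifoDepth = 32;  // assumption: 5-bit usedw implies depth 32
constexpr unsigned kMaxBurst = 1;    // "no burst" configuration

// With a margin of 3 the threshold is 29; with a margin of 4 it is 28.
static_assert(!too_many_pending_reads(27, 2, kFifoDepth, kMaxBurst, 3),
              "margin 3: occupancy 29 still lets reads be issued");
static_assert(too_many_pending_reads(27, 2, kFifoDepth, kMaxBurst, 4),
              "margin 4: occupancy 29 already stops new reads");
```

Under these assumptions, the original expression keeps issuing reads at a combined occupancy of 29, one entry later than a 4-deep pipeline would allow.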
The .stp file shows the first time the overflow happens at sample -2472.
src is at address 0x3e90, dest is at 0x4000000.
I have no problem uploading every source file, but the code requires an upstream copy of gcc 13.2 to compile, so I'm attaching only my main .cpp file to show how the test works.
- Tags:
- altera_msgdma
Hi,
Could you provide the whole design file so that I can duplicate the issue?
Some software library files are missing.
Thanks,
Regards,
Sheng
Hello Sheng,
Sorry for the delay. I've cleaned up the software files and left only what is needed for the test to compile.
My setup is a bit "twisted", so you will likely need a copy of the toolchain, which I've attached. I had to modify the Makefile to make it work under WSL (to escape paths), and there is a patch for the BSP Makefile (it removes 2 source files) under the buffering folder.
If you start with a clean bsp, then I'd start with
patch ../buffering_fast_bsp/Makefile bsp_makefile.patch
The toolchain is compiled to use picolibc and it is used for both the BSP and the test application, so you'll have to copy the file Software/buffering/picolibc_custom.specs to the BSP directory (as I've done under buffering_fast_bsp).
I've run make clean_all to clean all the build files. I think you can use the BSP I've provided and skip all of the steps above.
The toolchain was compiled under Ubuntu 22.04 and is not statically linked.
Cheers,
etome
Hi etome,
May I know where is the attached file?
Could you also attach the whole software folder and the .qsf? Thanks.
Regards,
Sheng
@ShengN_Intel wrote: May I know where is the attached file?
I'd like to know myself... Sorry for the delay, I'm attaching all you've requested to this reply.
Thanks,
etome
Here is the toolchain
Hi etome,
Could you also provide the .sof and .elf files that work fine after applying the patch?
I'll report this to the internal team. Please stick with the workaround in the meantime.
Thanks,
Regards,
Sheng
Hello Sheng,
Sure. This is what the test outputs when it is working fine:
nios2-terminal: connected to hardware target using JTAG UART on cable
nios2-terminal: "USB-Blaster [USB-0]", device 1, instance 0
nios2-terminal: (Use the IDE stop button or Ctrl-C to terminate)
DMA bugger
=== continuous loop ===
[dma_bugger] success = 999, fail = 0, setup time = 12517544, transfer time = 129804940
[dma_bugger] success = 1998, fail = 0, setup time = 12517515, transfer time = 129805544
[dma_bugger] success = 2997, fail = 0, setup time = 12521287, transfer time = 129805067
[dma_bugger] success = 3996, fail = 0, setup time = 12517509, transfer time = 129805073
[dma_bugger] success = 4995, fail = 0, setup time = 12521300, transfer time = 129804962
[dma_bugger] success = 5994, fail = 0, setup time = 12517506, transfer time = 129804818
[dma_bugger] success = 6993, fail = 0, setup time = 12518670, transfer time = 129805148
[dma_bugger] success = 7992, fail = 0, setup time = 12517509, transfer time = 129805062
[dma_bugger] success = 8991, fail = 0, setup time = 12517509, transfer time = 129805089
[dma_bugger] success = 9990, fail = 0, setup time = 12517512, transfer time = 129806820
[dma_bugger] success = 10989, fail = 0, setup time = 12518671, transfer time = 129804916
[dma_bugger] success = 11988, fail = 0, setup time = 12521309, transfer time = 129804786
[dma_bugger] success = 12987, fail = 0, setup time = 12517512, transfer time = 129805243
[dma_bugger] success = 13986, fail = 0, setup time = 12517509, transfer time = 129805216
Use the same .elf as before; I have attached the Quartus project with the patch applied and built.
Thanks,
etome