Intel® SoC FPGA Embedded Development Suite
Support for SoC FPGA Software Development, SoC FPGA HPS Architecture, HPS SoC Boot and Configuration, Operating Systems

DMA plan on the Agilex5

K606
Novice
491 Views

I am looking to build a very basic mSGDMA example design and test it on the Agilex-V.

The plan is to write software for the SoC (running Linux) that tells a DMA (memory-mapped to memory-mapped) to move data from one on-chip-memory block to another.

I have two ideas for a design, sketched out below:

Screenshot (85).png

My hope is that with design B, I might be able to:

1: use /dev/mem to read from ocm_write (record initial state)

2: use /dev/mem to write to ocm_read (write a simple pattern)

3: initiate a transfer from ocm_read to ocm_write using the DMA

4: use /dev/mem to read from ocm_write (check to see if the state has changed)

 

So far, I have compiled both designs A and B using only Intel IP. I have not included any custom Verilog-based drivers. Nothing has been tested on hardware yet.

 

I am making this post to hopefully get some more opinions/sanity checks that this is not barking up the wrong tree - so in that respect, does this plan seem reasonable?

 

Many thanks!

K

RolandoS_Altera
Employee
458 Views

Hello

 

This is Rolando and I will try to help you. As I understand it, you want to move a set of data from one OCM to another OCM. The closest example I can recall right now was implemented for Agilex 7: it uses the mSGDMA IP to move a set of data from one location in main memory (SDRAM) to another. This example is located at:

https://altera-fpga.github.io/rel-25.1/embedded-designs/agilex-7/f-series/soc/setup-use-bridges/ug-setup-use-bridges-agx7f-soc/

 

Please give me some time to find another example that could be closer to what you want to do.

 

Thanks

Rolando

JitLoonL_Altera
Employee
429 Views

Yes, your plan is quite reasonable and aligns with a common approach for validating simple DMA flows using Intel’s Modular SGDMA (mSGDMA) IP on SoC platforms like Agilex-V. You’re approaching this the right way by minimizing custom logic and using /dev/mem to manipulate and validate memory contents.

Here are some thoughts and sanity checks:

  1. mSGDMA Setup: Ensure that your mSGDMA is configured in memory-mapped to memory-mapped mode, with Avalon-MM ports connected to both OCM blocks properly. Avoid scatter-gather mode for this test.
  2. Memory Region Mapping: Make sure the OCM blocks are mapped to accessible physical addresses and not optimized away or restricted by MMU settings in Linux.
  3. Be aware of potential issues with cache when using /dev/mem on a Linux system.
  4. Use dma-coherent memory or flush/invalidate the cache before/after DMA if required (or mark the region as non-cacheable via device tree or MMU settings if needed).
  5. You'll need to write to the mSGDMA’s control/status registers through /dev/mem too, to provide the transfer descriptor: source addr, dest addr, length, start, etc.
  6. Programming the descriptor is often trickier than expected. Ensure you’ve got the right base address and offsets from the Platform Designer output.
  7. OCM blocks must be large enough and have enough address range to hold your test patterns.
  8. Align your transfers to word boundaries expected by the DMA.

Suggestions:

  • Add small test patterns like 0xAA55AA55 or a known incrementing pattern to easily spot transfer errors.
  • Start with a small transfer size (e.g., 16–64 bytes) and scale up once the mechanism is confirmed.
  • Use hexdump or a simple userspace C program for /dev/mem access to make reads/writes easier and more repeatable.
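
As a rough illustration of the incrementing-pattern suggestion, here is a minimal C sketch. The function names fill_incrementing and first_mismatch are hypothetical and were picked for this example. The helpers only touch the buffer they are given, so they can be tried on an ordinary array before being pointed at a mapped OCM window.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write seed, seed+1, seed+2, ... into `words` 32-bit words. */
static inline void fill_incrementing(volatile uint32_t *buf, size_t words, uint32_t seed)
{
    for (size_t i = 0; i < words; i++)
        buf[i] = seed + (uint32_t)i;
}

/*
 * Hypothetical helper: return the index of the first word that does not
 * match the incrementing pattern, or `words` if every word matches.
 */
static inline size_t first_mismatch(const volatile uint32_t *buf, size_t words, uint32_t seed)
{
    for (size_t i = 0; i < words; i++) {
        if (buf[i] != seed + (uint32_t)i)
            return i;
    }
    return words;
}
```

An incrementing pattern makes it easier than a constant fill to tell a partial or shifted transfer from a complete one, because every word position has a distinct expected value.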

Points of Caution:

  • mSGDMA might not complete the transfer if any part of source/destination addresses is incorrect or misaligned.
  • /dev/mem might be blocked or restricted depending on your kernel config — check permissions or run as root.
  • Some Agilex SoC configurations need proper HPS handoff from the bootloader to allow memory-mapped access cleanly.
RolandoS_Altera
Employee
389 Views

Hello 

 

I couldn't find an example design exactly like the one you want to create, but I was able to create a design based on our Agilex 5 GHRD. This design is very similar to your CONFIG B. The only difference is that I connected the OCR blocks to the HPS2FPGA bridge instead of the LWHPS2FPGA bridge, as this is how we do it in our GHRD. We also keep the mSGDMA interrupt disconnected.

RolandoS_Altera_0-1746145039175.png

 

I am still trying to verify the functionality of the design from Linux, which was also built from the Agilex 5 GSRD (Premium Dev Kit).

To verify this I am using the devmem2 command to write to the OCRs and to the mSGDMA registers. So far I can confirm that writing to and reading from the OCRs works fine. The mSGDMA configuration I am using is the same as the one provided in the bridges example that I shared earlier.

K404
Beginner
350 Views

Hi Rolando,

Thanks for your input!

I look forward to hearing if you can verify your design.

Many thanks!

RolandoS_Altera
Employee
193 Views

Hello

 

I was able to validate moving a set of data from the read OCRAM to the write OCRAM through the mSGDMA using U-Boot commands (this is based on the bridge example page that I shared earlier). I still have not been able to exercise it in Linux.

The snippet showing how I tested this in U-Boot is the following:

#MEMORY LOCATIONS
#read_ocr=0x40000000
#write_ocr=0x40040000

#msgdma_descriptor_slave=0x20020000
#msgdma_csr=0x20020020
#msgdma_response=0x20020040

#msgdma offsets
#msgdma_csr_control=0x4
#msgdma_descriptor_slave_read_low=0x0
#msgdma_descriptor_slave_read_high=0x14
#msgdma_descriptor_slave_write_low=0x4
#msgdma_descriptor_slave_write_high=0x18
#msgdma_descriptor_slave_length=0x8
#msgdma_descriptor_slave_burst=0xc
#msgdma_descriptor_slave_control=0x1c

setenv autoload no
dhcp
setenv serverip 10.244.157.112

tftp ${loadaddr} ghrd_a5ed065bb32ae6sr0.core.rbf
bridge disable
fpga load 0 ${loadaddr} ${filesize}
bridge enable

# Read SysID (should read 0xACD5CAFE)
md 0x20010000 1


# Write to Read OCR
mw 0x40000000 0xcafecafe 0x10
# Check 1st set of data
md 0x40000000 4


# Stop dispatcher
mw 0x20020024 0x1
# Stop descriptors
mw 0x20020024 0x20
# Reset Dispatcher
mw 0x20020024 0x2

# Write dma_read_pointer_low
mw 0x20020000 0x40000000
# Write dma_read_pointer_high
mw 0x20020014 0x0
# Write dma_write_pointer_low
mw 0x20020004 0x40040000
# Write dma_write_pointer_high
mw 0x20020018 0x0
# Write length in bytes 512
mw 0x20020008 0x200
# Write Burst counts 32, sequence number = 1
mw 0x2002000c 0x20200001

# Check status
md 0x20020020 1

# Control, wait for resp
mw 0x2002001c 0x02000000
sleep 1
# Control, wait for resp and go
mw 0x2002001c 0x82000000

# Check status (Should read 0x2)
md 0x20020020 1

# Read back (should read 0xcafecafe in each location)
md 0x40040000 10
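
For readers who want to reproduce the register values above in a C program, here is a minimal sketch of how the descriptor words can be assembled. The offsets are the ones used in the snippet. The bit names (GO in bit 31, wait-for-write-response in bit 25) and the field layout of the burst/sequence word are my reading of the values 0x82000000 and 0x20200001 and should be checked against the mSGDMA documentation. All identifiers (msgdma_build_desc, struct reg_write, and so on) are hypothetical names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Descriptor slave offsets, copied from the U-Boot snippet above. */
enum msgdma_desc_offset {
    MSGDMA_DESC_READ_LOW   = 0x00,
    MSGDMA_DESC_WRITE_LOW  = 0x04,
    MSGDMA_DESC_LENGTH     = 0x08,
    MSGDMA_DESC_BURST_SEQ  = 0x0c,
    MSGDMA_DESC_READ_HIGH  = 0x14,
    MSGDMA_DESC_WRITE_HIGH = 0x18,
    MSGDMA_DESC_CONTROL    = 0x1c
};

/* Assumed bit meanings, inferred from the values 0x02000000 and 0x82000000. */
#define MSGDMA_DESC_CTRL_WAIT_WRITE_RESP (UINT32_C(1) << 25)
#define MSGDMA_DESC_CTRL_GO              (UINT32_C(1) << 31)

#define MSGDMA_DESC_WRITES 7

/* One pending register write: offset from the descriptor slave base, and value. */
struct reg_write {
    uint32_t offset;
    uint32_t value;
};

/* Assumed layout: write burst [31:24], read burst [23:16], sequence number [15:0]. */
static inline uint32_t msgdma_burst_seq(uint8_t write_burst, uint8_t read_burst,
                                        uint16_t sequence)
{
    return ((uint32_t)write_burst << 24) | ((uint32_t)read_burst << 16) | sequence;
}

/*
 * Fill `out` with the register writes for one memory-to-memory transfer.
 * The control word comes last because setting GO submits the descriptor.
 * Returns the number of entries written (always MSGDMA_DESC_WRITES).
 */
static inline size_t msgdma_build_desc(uint64_t src, uint64_t dst, uint32_t length_bytes,
                                       uint8_t burst, uint16_t sequence,
                                       struct reg_write out[MSGDMA_DESC_WRITES])
{
    size_t n = 0;
    out[n++] = (struct reg_write){ MSGDMA_DESC_READ_LOW,   (uint32_t)src };
    out[n++] = (struct reg_write){ MSGDMA_DESC_READ_HIGH,  (uint32_t)(src >> 32) };
    out[n++] = (struct reg_write){ MSGDMA_DESC_WRITE_LOW,  (uint32_t)dst };
    out[n++] = (struct reg_write){ MSGDMA_DESC_WRITE_HIGH, (uint32_t)(dst >> 32) };
    out[n++] = (struct reg_write){ MSGDMA_DESC_LENGTH,     length_bytes };
    out[n++] = (struct reg_write){ MSGDMA_DESC_BURST_SEQ,
                                   msgdma_burst_seq(burst, burst, sequence) };
    out[n++] = (struct reg_write){ MSGDMA_DESC_CONTROL,
                                   MSGDMA_DESC_CTRL_GO | MSGDMA_DESC_CTRL_WAIT_WRITE_RESP };
    return n;
}
```

Calling msgdma_build_desc(0x40000000, 0x40040000, 0x200, 32, 1, writes) yields the same values as the mw commands above, except that this sketch writes the control word once with GO already set, whereas the snippet writes it first without GO and then with it.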

 

K404
Beginner
349 Views

Hi JitLoon,

Thanks for your reply - very useful! I have a few questions, I hope it's ok:

On point 2: would this be achievable via device tree edits?

On point 3: if I understand correctly, you are saying that /dev/mem does not flush/handle cache in any way?

Many thanks,
K

 

(By the way, sorry, I had to make another account, as the phone with my old authenticator app is having issues.)

JitLoonL_Altera
Employee
304 Views

Hi, 

Q1: "Would this [UART firmware programming] be achievable via device tree edits?"

Not directly. The device tree (DT) describes the hardware for Linux — it’s used by the kernel to understand available peripherals like UART, memory regions, DMA, etc. It does not affect how bootloaders or firmware tools work.

In terms of UART firmware burning, DT changes won’t enable UART-based firmware flashing unless:

  • You're writing a custom bootloader or application that reads firmware data via UART and writes it to flash.

  • You use DT to mark certain memory regions as non-cached (helpful if the loader runs under Linux).

Q2: "If I understand correctly, you are saying that /dev/mem does not flush/handle cache in any way?"

Correct.
/dev/mem provides raw access to physical memory — but:

  • It bypasses Linux’s cache coherency management.

  • Reads/writes using /dev/mem do not flush or invalidate the CPU cache.

  • This means if DMA wrote data to memory, but cache holds stale data, a read from /dev/mem may show incorrect results unless:

    • You manually flush/invalidate the cache in user space (not trivial).

    • Or you map the memory as non-cacheable (preferred approach).

Recommendations for DMA Testing with Cache Coherency:

  1. Use dma-coherent memory:

    • In DT: use dma-coherent; or allocate the buffer with dma_alloc_coherent() if in a kernel driver.

  2. Mark OCM region as non-cacheable via DT:

    • E.g., using no-map; in a reserved-memory node and referencing that node through memory-region entries.

  3. Avoid caching in /dev/mem mappings:

    • Open /dev/mem with O_SYNC (a flag for open(), not mmap()) and then mmap() it with MAP_SHARED to minimize caching.

  4. Flush/invalidate cache manually:

    • Possible from kernel space.

    • User-space solutions are hacky and not guaranteed.
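
To make recommendation 3 concrete, here is a minimal user-space sketch in C. It assumes a Linux system where /dev/mem is available and accessible to the caller. The names struct dev_window, dev_window_open, page_base and page_offset are hypothetical and exist only for this example. Whether the resulting mapping is really uncached depends on the architecture and kernel, so treat this as a starting point and not a guarantee.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* A mapped window onto a physical address range (hypothetical helper type). */
struct dev_window {
    void *map;               /* page-aligned address returned by mmap() */
    size_t map_len;          /* length passed to mmap(), needed for munmap() */
    volatile uint32_t *regs; /* points at the requested physical address */
};

/* mmap() offsets must be page aligned, so split the address in two parts. */
static inline uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    return phys & ~(page_size - 1);
}

static inline uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
    return phys & (page_size - 1);
}

/* Map `len` bytes starting at physical address `phys`. Returns 0 on success, -1 on error. */
static inline int dev_window_open(struct dev_window *w, uint64_t phys, size_t len)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t off = page_offset(phys, page);

    /* O_SYNC belongs to open(); MAP_SHARED belongs to mmap(). */
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    w->map_len = len + (size_t)off;
    w->map = mmap(NULL, w->map_len, PROT_READ | PROT_WRITE, MAP_SHARED,
                  fd, (off_t)page_base(phys, page));
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (w->map == MAP_FAILED)
        return -1;

    w->regs = (volatile uint32_t *)((uint8_t *)w->map + off);
    return 0;
}
```

A caller would release the window with munmap(w.map, w.map_len) when finished. The volatile qualifier on regs only stops the compiler from caching or reordering accesses; it does nothing about the CPU cache, which is why the mapping attributes still matter.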

RolandoS_Altera
Employee
78 Views

I was able to test this example in Linux and it also works. However, I found that the descriptor registers in the mSGDMA controller are write-only, so I can't write to them with the regular devmem2 command, because it performs a read, a write, and a read-back on every write. I had to modify the application so that a write performs only the write. I am attaching the application that I used to validate the design.
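
This is not the attached application, only a minimal C sketch of the idea, in case it is useful to other readers: a register write that issues a single 32-bit store with no read before or after it. The names reg_write32, reg_read32 and msgdma_wait_idle are hypothetical. The status register at CSR offset 0x0 matches the U-Boot snippet earlier in the thread, while treating bit 0 as a busy flag is an assumption that should be checked against the mSGDMA documentation.

```c
#include <stdint.h>

#define MSGDMA_CSR_STATUS  0x00u            /* CSR status offset, as read in the U-Boot test */
#define MSGDMA_STATUS_BUSY (UINT32_C(1) << 0) /* assumption: bit 0 is the busy flag */

/* Single 32-bit store, with no read-before or read-back, for write-only registers. */
static inline void reg_write32(volatile void *base, uint32_t offset, uint32_t value)
{
    *(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
}

/* Single 32-bit load, for readable registers such as the CSR status. */
static inline uint32_t reg_read32(const volatile void *base, uint32_t offset)
{
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/*
 * Poll the CSR status until the assumed busy bit clears.
 * Returns 0 once idle, or -1 if still busy after `max_polls` reads.
 */
static inline int msgdma_wait_idle(const volatile void *csr, unsigned max_polls)
{
    while (max_polls--) {
        if ((reg_read32(csr, MSGDMA_CSR_STATUS) & MSGDMA_STATUS_BUSY) == 0)
            return 0;
    }
    return -1;
}
```

Here `base` and `csr` would be pointers into a /dev/mem mapping of the descriptor slave and CSR ranges. Keeping the read and write paths in separate functions makes it harder to read a write-only register by accident.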

K606
Novice
65 Views

Hi Rolando,

 

Wow great! I wish I could also add your post to 'Accepted Solution'. Next time. Thanks for your time.
