
A10 AVMM DMA has only a single MSI to the Host; how can we distinguish whether a DMA completion comes from the READ DMA controller or the WRITE DMA controller?

JET60200
New Contributor I

Hi everyone,

I'm using an Arria 10 as a PCIe EP in an x86 Host. Per my understanding, the Altera AVMM-DMA provides only a single MSI to signal DMA completion (both the READ DMA controller and the WRITE DMA controller share the same MSI IRQ).

 

In our application we need to start PCIe DMA transmission and reception simultaneously, but in the interrupt handler we need to distinguish whether a READ DMA completion event or a WRITE DMA completion event has occurred. I can't find any register that tells this.

 

How can we distinguish them in the MSI handler?

 

Thanks very much for any help.

SengKok_L_Intel
Moderator

Hi,

 

When the driver receives the interrupt, it can poll the DONE bit of each descriptor to make sure all descriptors have finished. Since it could be either the write or the read DMA, you can poll both to confirm.
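
For illustration, a minimal sketch of an MSI handler built around that idea. This is not reference-driver code: my_dma_bookkeep, rd_flags[], and wr_flags[] are placeholders for wherever your driver keeps the per-descriptor status DWORDs that each engine writes back to host memory (clear them before starting a transfer).

#include <linux/interrupt.h>
#include <linux/types.h>

struct my_dma_bookkeep {
    volatile u32 rd_flags[128];   /* DONE flags written back by the read DMA */
    volatile u32 wr_flags[128];   /* DONE flags written back by the write DMA */
    int num_rd_desc;
    int num_wr_desc;
};

static irqreturn_t altera_dma_isr(int irq, void *dev_id)
{
    struct my_dma_bookkeep *bk = dev_id;
    bool rd_done = true, wr_done = true;
    int i;

    /* A single MSI covers both engines, so inspect both status tables. */
    for (i = 0; i < bk->num_rd_desc; i++)
        if (!bk->rd_flags[i])
            rd_done = false;
    for (i = 0; i < bk->num_wr_desc; i++)
        if (!bk->wr_flags[i])
            wr_done = false;

    if (rd_done) {
        /* every read (host-to-FPGA) descriptor reported DONE */
    }
    if (wr_done) {
        /* every write (FPGA-to-host) descriptor reported DONE */
    }
    return IRQ_HANDLED;
}

Whichever table is fully set tells you which engine completed; if both finish close together, one interrupt can service both.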

 

Regards -SK

JET60200
New Contributor I

Thanks SK,

If there's no independent IRQ number for AVMM-DMA read or write, querying the DONE bit seems to be the only way to know whether the IRQ comes from the READ DMA or the WRITE DMA, even though this approach isn't very straightforward.

Thanks for the suggestion.

 

Best regards

JET60200
New Contributor I

Hi SK,

I'm checking the A10 AVMM DMA reference Linux driver code (gen3x8_avmm_dma_Linux). In the "altera_dma.c" file, in the dma_test() function, there is the following code:

last_id = last_id + bk_ptr->dma_status.altera_dma_descriptor_num;  /* compute the new "last_id" */

if (last_id > 127) {
    last_id = last_id - 128;
    /* check whether "last_id" wrapped back to the head of descriptor[] */
    if ((bk_ptr->dma_status.altera_dma_descriptor_num > 1) && (last_id != 127))
        write_127 = 1;
}

if (write_127)
    iowrite32(127, bk_ptr->bar[0] + DESC_CTRLLER_BASE + ALTERA_LITE_DMA_RD_LAST_PTR);  /* (1) */

iowrite32(last_id, bk_ptr->bar[0] + DESC_CTRLLER_BASE + ALTERA_LITE_DMA_RD_LAST_PTR); /* (2) */

As I understand it, this writes an ID into LAST_PTR to start the DMA operation. But why can LAST_PTR be written twice in a row? To my understanding, the second iowrite32(last_id, ...) will overwrite the first iowrite32(127, ...).

Does the DMA controller support the write sequence iowrite32(a, ...) -> iowrite32(b, ...) -> iowrite32(c, ...) -> iowrite32(d, ...) and still keep running correctly?

Can you clarify a bit? Thanks

===

In https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_a10_pcie_avmm_dma.pdf, there is a passage on page 81 that reads:

" .. host software should write multiple IDs into the last pointer register. For example:

  1. Program the RD_DMA_LAST_PTR = 63.
  2. Program the RD_DMA_LAST_PTR = 127.
  3. Poll the status DWORD for read descriptor 63.
  4. Poll the status DWORD for read descriptor 127.

"

Does this explain it from Intel's perspective?

 

Thanks

SengKok_L_Intel
Moderator

Hi,

Yes, page 81 explains the behavior of multiple writes to LAST_PTR. You can also refer to page 82, the description of RD_DMA_LAST_PTR, for the full definition.
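
To illustrate the page-81 sequence as a minimal sketch (reusing the register offsets from the reference driver; rd_flags[] is a placeholder for the status-DWORD table that the read DMA writes back to host memory, cleared beforehand):

void __iomem *rd_last_ptr =
        bk_ptr->bar[0] + DESC_CTRLLER_BASE + ALTERA_LITE_DMA_RD_LAST_PTR;

iowrite32(63, rd_last_ptr);     /* kick off read descriptors 0..63 */
iowrite32(127, rd_last_ptr);    /* queue read descriptors 64..127 behind them */

while (!rd_flags[63])           /* wait for the status DWORD of descriptor 63 */
    cpu_relax();
while (!rd_flags[127])          /* then for descriptor 127 */
    cpu_relax();

This mirrors steps 1-4 quoted above: the second LAST_PTR write queues descriptors 64..127 behind the first batch rather than cancelling it.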

 

Regards -SK

JET60200
New Contributor I

Hi SK,

Thanks for your quick support!!

We met a new problem when writing our own A10 PCIe DMA driver. We use the "A10 AVMM DMA reference linux driver code (gen3x8_avmm_dma_Linux)" as a reference; in this code there is the following:

[

#define MAX_NUM_DWORDS 0x100000 /* 1M DWORDS */

altera_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
    ..
    bk_ptr->numpages = (PAGE_SIZE >= MAX_NUM_DWORDS * 4) ? 1 : (int)((MAX_NUM_DWORDS * 4) / PAGE_SIZE);

    bk_ptr->rp_rd_buffer_virt_addr = pci_alloc_consistent(dev, PAGE_SIZE * bk_ptr->numpages, &bk_ptr->rp_rd_buffer_bus_addr);
    if (!bk_ptr->rp_rd_buffer_virt_addr) {
        rc = -ENOMEM;
        goto err_rd_buffer;
    }

    bk_ptr->rp_wr_buffer_virt_addr = pci_alloc_consistent(dev, PAGE_SIZE * bk_ptr->numpages, &bk_ptr->rp_wr_buffer_bus_addr);
    if (!bk_ptr->rp_wr_buffer_virt_addr) {
        rc = -ENOMEM;
        goto err_wr_buffer;
    }
    ..
}

]

My understanding here is that "bk_ptr->rp_rd_buffer_virt_addr" and "bk_ptr->rp_wr_buffer_virt_addr" are allocated to store the userspace customer DATA[x], so they will be accessed by the A10 AVMM-DMA. They are allocated with pci_alloc_consistent(), and the maximum allocation size is 4 MBytes, as I tested on an x86_64 PC. That means that if I increase the define to "#define MAX_NUM_DWORDS 0x200000 //2M DWORDS", my x86 Linux kernel warns and the allocation fails in the call "rp_rd_buffer_virt_addr = pci_alloc_consistent(dev, PAGE_SIZE*bk_ptr->numpages, &bk_ptr->rp_rd_buffer_bus_addr);", as follows:

[
[ 391.240669] Altera PCIE 0000:01:00.0: RD_DMA_Desc: rd_cpu_virt_addr 0xffff8806dea68000, rd_phys_addr 0x6dea68000, rd_bus_addr 0x6dea68000, length 4608
[ 391.240670] Altera PCIE 0000:01:00.0: WR_DMA_Desc: wr_cpu_virt_addr 0xffff8806df6ae000, wr_phys_addr 0x6df6ae000, wr_bus_addr 0x6df6ae000, length 4608
[ 391.240672] Altera PCIE 0000:01:00.0: PAGE_SIZE = 0x00001000, MAX_NUM_DWORDS = 0x00200000, numpages 2048
[ 391.240673] ------------[ cut here ]------------
[ 391.240676] WARNING: CPU: 7 PID: 2753 at mm/page_alloc.c:2902 __alloc_pages_slowpath+0x6f/0x724
[ 391.240676] Modules linked in: altera_pcie(OE+) vfat fat intel_powerclamp coretemp intel_rapl kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul snd_hda_codec_hdmi glue_helper ablk_helper snd_soc_ssm4567 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel cryptd snd_hda_codec snd_soc_rt5640 snd_soc_rl6231 snd_soc_core snd_hda_core joydev ppdev pcspkr snd_compress regmap_i2c snd_hwdep snd_seq snd_seq_device sg snd_pcm snd_timer snd mei_me mei parport_pc parport soundcore shpchp acpi_pad snd_soc_sst_acpi i2c_designware_platform i2c_designware_core snd_soc_sst_match ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci libata crct10dif_pclmul crct10dif_common crc32c_intel
[ 391.240697] 8139too r8169 8139cp serio_raw sdhci_acpi iosf_mbi mii sdhci mmc_core video i2c_hid i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 391.240704] CPU: 7 PID: 2753 Comm: insmod Tainted: G          OE ------------  3.10.0-693.el7.x86_64 #1
[ 391.240704] Hardware name: Gigabyte Technology Co., Ltd. Z97-HD3/Z97-HD3, BIOS F6 06/11/2014
[ 391.240705] 0000000000000000 00000000acf44fa8 ffff8806dc7b3868 ffffffff816a3d91
[ 391.240706] ffff8806dc7b38a8 ffffffff810879c8 00000b568116b92d 0000000000000020
[ 391.240707] 0000000000008020 ffff88082fdd8000 0000000000000000 0000000000028020
[ 391.240709] Call Trace:
[ 391.240712] [<ffffffff816a3d91>] dump_stack+0x19/0x1b
[ 391.240714] [<ffffffff810879c8>] __warn+0xd8/0x100
[ 391.240716] [<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20
[ 391.240717] [<ffffffff8169f723>] __alloc_pages_slowpath+0x6f/0x724
[ 391.240718] [<ffffffff8108a7e4>] ? vprintk_emit+0x3c4/0x510
[ 391.240721] [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
[ 391.240724] [<ffffffff81030f8f>] dma_generic_alloc_coherent+0x8f/0x140
[ 391.240726] [<ffffffff81064341>] x86_swiotlb_alloc_coherent+0x21/0x50
[ 391.240728] [<ffffffffc067f238>] dma_alloc_attrs.constprop.15+0x85/0x87 [altera_pcie]
[ 391.240729] [<ffffffffc003a857>] altera_pci_probe+0x857/0xc22 [altera_pcie]
[ 391.240732] [<ffffffff81280e75>] ? sysfs_link_sibling+0xb5/0xe0
[ 391.240734] [<ffffffff812817c2>] ? sysfs_addrm_finish+0x42/0xe0
[ 391.240735] [<ffffffff812815f1>] ? __sysfs_add_one+0x61/0x100
[ 391.240738] [<ffffffff8136a535>] local_pci_probe+0x45/0xa0
[ 391.240739] [<ffffffff8136bbe9>] pci_device_probe+0x109/0x160
[ 391.240742] [<ffffffff8143fbf2>] driver_probe_device+0xc2/0x3e0
[ 391.240743] [<ffffffff8143ffe3>] __driver_attach+0x93/0xa0
[ 391.240744] [<ffffffff8143ff50>] ? __device_attach+0x40/0x40
[ 391.240746] [<ffffffff8143d7c3>] bus_for_each_dev+0x73/0xc0
[ 391.240747] [<ffffffff8143f56e>] driver_attach+0x1e/0x20
[ 391.240748] [<ffffffff8143f010>] bus_add_driver+0x200/0x2d0
[ 391.240750] [<ffffffff81440674>] driver_register+0x64/0xf0
[ 391.240751] [<ffffffff8136b425>] __pci_register_driver+0xa5/0xc0
[ 391.240752] [<ffffffffc003ac22>] ? altera_pci_probe+0xc22/0xc22 [altera_pcie]
[ 391.240754] [<ffffffffc003ac7b>] altera_pcie_init+0x59/0x3de [altera_pcie]
[ 391.240756] [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
[ 391.240758] [<ffffffff81100734>] load_module+0x1f64/0x29e0
[ 391.240760] [<ffffffff8134bbf0>] ? ddebug_proc_write+0xf0/0xf0
[ 391.240761] [<ffffffff810fcdd3>] ? copy_module_from_fd.isra.42+0x53/0x150
[ 391.240762] [<ffffffff81101366>] SyS_finit_module+0xa6/0xd0
[ 391.240764] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
[ 391.240765] ---[ end trace a3bc8f940255d4b4 ]---
]
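
For reference, the numbers are consistent with the kernel's contiguous-page limit, assuming 4 KB pages and the usual x86_64 MAX_ORDER of 11 (my reading of the trace, not something from the IP documentation):

  0x100000 dwords x 4 bytes = 4 MB = 1024 pages (order 10) -> the largest order the page allocator accepts
  0x200000 dwords x 4 bytes = 8 MB = 2048 pages (order 11) -> order >= MAX_ORDER, which triggers the warning above in __alloc_pages_slowpath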

==>

Our problem is that the 4 MB maximum of coherent DMA memory is not enough for our application. We need at least 64 MBytes of DMA-coherent memory for both the read buffer and the write buffer, to send and receive our data to/from the FPGA. Is there any way we can allocate a bigger DMA-coherent memory region for the FPGA AVMM-DMA engine to read from and write to?

I think these pre-allocated "rp_rd_buffer_bus_addr" and "rp_wr_buffer_bus_addr" are filled into the A10 DMA descriptor table: [RD_LOW/HIGH_SRC_ADDRESS] for reads and [WR_CTRL_LOW/HIGH_DEST_ADDRESS] for writes. How can we allocate large memory behind these two addresses?

Maybe it's a stupid question, but I hope you (or someone else) can help answer it. Thanks in advance ~

Best regards

SengKok_L_Intel
Moderator

Hi

 

I apologize that I can't help much in terms of how to edit the driver. But based on my understanding of this IP, each descriptor can support a maximum of 1 MByte, and there is a total of 128 descriptors; it depends on whether the user application layer can handle that transfer size or not. The example design uses on-chip memory, which is too small to handle the maximum transfer size. You might need to modify the design to use external memory in order to have a larger memory buffer.
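
On the host side, since each descriptor carries its own bus address and moves at most 1 MByte, the 64 MB does not have to be one contiguous coherent allocation. As a rough general-Linux sketch only (NUM_CHUNKS and CHUNK_SIZE are hypothetical values, and the descriptor programming is left as a comment), you could allocate 64 separate 1 MB coherent chunks and point one descriptor at each:

#define NUM_CHUNKS 64
#define CHUNK_SIZE (1024 * 1024)   /* 1 MB, the per-descriptor maximum */

dma_addr_t bus_addr[NUM_CHUNKS];
void *virt_addr[NUM_CHUNKS];
int i;

for (i = 0; i < NUM_CHUNKS; i++) {
    virt_addr[i] = pci_alloc_consistent(dev, CHUNK_SIZE, &bus_addr[i]);
    if (!virt_addr[i])
        goto err_free_chunks;      /* unwind whatever was already allocated */
    /* fill descriptor i: bus_addr[i] goes into RD_LOW/HIGH_SRC_ADDRESS
     * for the read direction (or the WR DEST address for writes) -- sketch only */
}

Each 1 MB chunk stays well under the 4 MB contiguous limit you hit, and one pass through up to 128 descriptors can then cover the whole 64 MB.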

 

Regards -SK

JET60200
New Contributor I

Thanks @SK, appreciate your help and suggestion ~
