cacheable PCIe BAR

Altera_Forum
Honored Contributor II
4,154 Views

Is it possible to make a BAR cacheable for the PCIe hard macro on a Stratix IV attached to a standard PC with an Intel processor?

Does anything special need to be done in the driver for such a case?

Thanks!
Altera_Forum

 


There's no such thing as a cacheable BAR (per any PCI/PCIe specification). So you'll need to explain what you mean. 

 

Which OS are you using? 

 

Cheers, 

Dave
Altera_Forum

By cacheable BAR I mean a BAR that can be cached by the Intel processor cache.

Typically, BARs are not cached by the processor cache; in this case, however, caching is desirable.

I am using Linux, CentOS 5 (2.6.18).

I modified the MTRR settings to exclude the BAR from the uncached regions. I also wrote a driver that creates a bin_attribute in /sys/... with a custom mmap() function that maps the BAR into user space without setting the _PAGE_PCD | _PAGE_PWT page flags.

When the BAR is mmapped into user space, I can issue reads to it and observe caching behavior, i.e. a second read to the same address does not go to the FPGA.

However, when I try to issue a write to the same BAR, the system reboots without any message on the screen or in the logs.

So I am wondering whether I am doing something wrong in the driver/settings, or whether the Stratix IV PCIe implementation lacks some feature that is needed for this to work properly.
Altera_Forum

 

--- Quote Start ---  

By cacheable BAR I mean a BAR that can be cached by the Intel processor cache.

 

--- Quote End ---  

That is not a function of the PCIe device; it's a function of the Intel processor. The only PCIe bus feature you can control via the configuration registers is whether the memory region is read-prefetchable or not. There are some cache-line registers, but those take effect during DMA and for bridges (at least under PCI).
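
To make the prefetchable bit mentioned above concrete, here is a minimal sketch of decoding the low flag bits of a BAR register value as laid out in the PCI configuration header. The `bar_info` struct and `decode_bar()` helper are hypothetical names used only for illustration, and for a 64-bit BAR only the low 32 bits are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one 32-bit BAR register (hypothetical helper type). */
struct bar_info {
    bool is_io;         /* bit 0: 1 = I/O space, 0 = memory space */
    bool is_64bit;      /* bits 2:1 == 10b: 64-bit memory BAR */
    bool prefetchable;  /* bit 3: memory region is prefetchable */
    uint32_t base;      /* low 32 bits of the base, flag bits masked off */
};

/* Split a raw BAR value into its flag bits and base address. */
static void decode_bar(uint32_t raw, struct bar_info *out)
{
    assert(out != NULL);
    out->is_io = (raw & 0x1u) != 0;
    if (out->is_io) {
        /* I/O BARs have no type or prefetchable bits. */
        out->is_64bit = false;
        out->prefetchable = false;
        out->base = raw & ~0x3u;
    } else {
        out->is_64bit = ((raw >> 1) & 0x3u) == 0x2u;
        out->prefetchable = (raw & 0x8u) != 0;
        out->base = raw & ~0xFu;
    }
}
```

Note that nothing in these bits says "cacheable": the prefetchable bit is the closest attribute the endpoint itself advertises.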

 

 

--- Quote Start ---  

 

Typically, BARs are not cached by the processor cache; in this case, however, caching is desirable.

I am using Linux, CentOS 5 (2.6.18).

I modified the MTRR settings to exclude the BAR from the uncached regions. I also wrote a driver that creates a bin_attribute in /sys/... with a custom mmap() function that maps the BAR into user space without setting the _PAGE_PCD | _PAGE_PWT page flags.

When the BAR is mmapped into user space, I can issue reads to it and observe caching behavior, i.e. a second read to the same address does not go to the FPGA.

However, when I try to issue a write to the same BAR, the system reboots without any message on the screen or in the logs.

So I am wondering whether I am doing something wrong in the driver/settings, or whether the Stratix IV PCIe implementation lacks some feature that is needed for this to work properly.

--- Quote End ---  

Writes should always go through to the PCIe endpoint. If you want higher performance for your writes, manipulating the processor cache is the wrong way to go. You need to implement a DMA master at the PCIe endpoint and have it DMA from main host memory.

Playing games with caches is asking for trouble, since you cannot snoop the cache and so cannot keep it consistent.

 

Cheers, 

Dave
Altera_Forum

Here are some comments from an old PCI mmap driver routine I wrote. Perhaps this still applies in the latest kernels ...

 

    /* Flags to stop the processor treating PCI memory as
     * cacheable (see asm-ppc/pgtable.h)
     *
     * (thanks to Travis Sawyer from the ppc-embedded list)
     *
     * I could have used '#ifdef CONFIG_44x', but 40x uses
     * these flags too, as do other processors. So just check
     * whether the flag exists.
     *
     * TODO:
     *   p425 Rubini; use pgprot_noncached()
     *
     *   asm-ppc/pgtable.h defines it as setting these two flags
     *
     *   So, that appears to be the 'portable' way to do it.
     *
     *   drivers/char/mem.c uses pgprot_noncached()
     */
    #ifdef pgprot_noncached
        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    #endif

Cheers,

Dave
Altera_Forum

I know about DMA and that it would be more efficient. However, in this case the goal is different: neither efficiency nor consistency is the issue.

 

pgprot_noncached() is equivalent to setting _PAGE_PCD | _PAGE_PWT flags in CentOS 5 for x86: 

 

include/asm-x86_64/pgtable.h:314:#define pgprot_noncached(prot) (__pgprot(pgprot_val(prot) | _PAGE_PCD | _PAGE_PWT)) 

 

I verified that these flags are not set when mmap() is called/executed. 
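
As a rough user-space illustration of the flag check described above (not the actual driver code), the sketch below mirrors the quoted pgprot_noncached() definition on plain integers. `PAGE_PWT` and `PAGE_PCD` use the x86 page-table bit positions (bits 3 and 4); `prot_noncached()` and `prot_requests_caching()` are hypothetical helper names. The effective memory type on real hardware also depends on the MTRRs, which this sketch does not model.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_PWT 0x008u  /* x86 page-table bit 3: page write-through */
#define PAGE_PCD 0x010u  /* x86 page-table bit 4: page cache disable */

/* Same bit operation as the quoted pgprot_noncached() macro. */
static uint64_t prot_noncached(uint64_t prot)
{
    return prot | PAGE_PCD | PAGE_PWT;
}

/* True when neither PCD nor PWT is set, i.e. the mapping itself
 * does not ask for uncached or write-through access. */
static bool prot_requests_caching(uint64_t prot)
{
    return (prot & (PAGE_PCD | PAGE_PWT)) == 0;
}
```

A driver-side check along these lines would inspect the value of vma->vm_page_prot after the custom mmap() has run.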

 

 

I think it might be a HW issue. I found the following document, which says that a particular PCIe device implementation is not cacheable due to an implementation limitation:

 

TI document "KeyStone Architecture Peripheral Component Interconnect Express (PCIe)": 

 

 

--- Quote Start ---  

No support for addressing modes other than incremental for burst transactions. 

Thus, the PCIe addresses cannot be in cacheable memory space 

--- Quote End ---  

 

 

 

Apparently, PCI had support for cache line access: 

 

Wikipedia article about Conventional_PCI#Burst_addressing 
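
To illustrate the two burst orderings being contrasted here, the following sketch generates the address sequence for a linear incrementing burst and for a cache-line-wrap burst. The function names are hypothetical, and the code assumes 32-bit words, a 4-byte-aligned start address, and a power-of-two line size; it models only the address order, not any bus protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Linear incrementing burst: each word follows the previous one. */
static void burst_linear(uint32_t start, size_t nwords, uint32_t *order)
{
    for (size_t i = 0; i < nwords; i++) {
        order[i] = start + 4u * (uint32_t)i;
    }
}

/* Cache-line wrap: cover exactly one cache line, beginning at `start`
 * and wrapping to the beginning of the same line at its end.
 * `order` must have room for line_bytes / 4 entries. */
static void burst_cacheline_wrap(uint32_t start, uint32_t line_bytes,
                                 uint32_t *order)
{
    /* line_bytes must be a power of two and at least one word. */
    assert(line_bytes >= 4u && (line_bytes & (line_bytes - 1u)) == 0);
    uint32_t line_base = start & ~(line_bytes - 1u);
    uint32_t offset = start & (line_bytes - 1u);
    for (uint32_t i = 0; i < line_bytes / 4u; i++) {
        order[i] = line_base + ((offset + 4u * i) & (line_bytes - 1u));
    }
}
```

For a 16-byte line and a start address of 0x1008, the linear order runs past the end of the line (0x1008, 0x100C, 0x1010, 0x1014), while the wrap order stays inside it (0x1008, 0x100C, 0x1000, 0x1004).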

 

 

So I am wondering whether the Stratix IV implementation of PCIe has any limitations or issues that prevent it from working correctly with processor caching.

 

 

P.S. I also tried mmapping /dev/mem at the PCIe BAR address, with the same result: reads work as expected, writes cause a system reboot.
Altera_Forum

 

--- Quote Start ---  

I know about DMA and that it would be more efficient. However, in this case the goal is different: neither efficiency nor consistency is the issue.

 

--- Quote End ---  

 

 

Ok. 

 

 

--- Quote Start ---  

 

I think it might be a HW issue ... Mmapping /dev/mem at PCIe BAR address was also tried with the same result: reads work as expected, writes cause system reboot. 

--- Quote End ---  

 

 

In this thread 

 

http://www.alteraforum.com/forum/showthread.php?t=35678 

 

there is a zip file containing a program called pci_debug. Could you try that? I've used it on the Cyclone IV GX kit, the Stratix IV GX kit, and the DE4, and it works fine on all of them. If writes still crash your system, then it definitely sounds like a hardware issue.

 

Cheers, 

Dave
Altera_Forum

Thanks for the pointer to the tool! We have a similar thing: a DLL for Tcl, which is quite flexible and convenient.

 

I tried CentOS 6 (2.6.32), which actually supports PAT and write-combining and has the ioremap_cache() function.

 

Write-combining/non-temporal writes (AVX/SSE) work as expected.  

 

Reads work as expected when the PCI BAR is mmapped as cached. Non-temporal full-cache-line writes also work as expected in this case. However, simple writes still cause the system to freeze and then reboot.
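
For readers unfamiliar with the term, a non-temporal full-cache-line write can be sketched with SSE2 streaming stores as below. This is an illustration on ordinary memory, not the code used in these tests; it assumes an x86 compiler with SSE2, a 64-byte cache line, and a hypothetical helper name `stream_cache_line()`.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si128, _mm_loadu_si128, _mm_sfence */
#include <stdint.h>

#define CACHE_LINE 64

/* Write one full 64-byte line to dst using non-temporal (streaming)
 * stores, which bypass the cache hierarchy. dst must be 64-byte
 * aligned; src may be unaligned. */
static void stream_cache_line(void *dst, const void *src)
{
    assert(((uintptr_t)dst % CACHE_LINE) == 0);
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (int i = 0; i < CACHE_LINE / 16; i++) {
        _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
    }
    /* Order the streaming stores before any later stores. */
    _mm_sfence();
}
```

In the scenario above, dst would point into the mmapped BAR rather than into normal RAM.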

 

I also tried ioremap_cache() followed by iowrite32() in the driver code, and this also causes a system freeze/reboot.

The system is a Sandy Bridge i7.
Altera_Forum

I have the same problem.

Did you ever resolve it?

Thanks!