
PCIe hprxm_master doesn't handle unaligned reads correctly

FvM
Valued Contributor III
1,583 Views
Hello,
we have an Arria 10 PCIe design with MM DMA interface, utilizing hprxm_master (bursting BAR2 rxm) for non-DMA access. Platform Designer does a good job translating between different clock domains and interface widths for various slave components. There seems, however, to be a problem with unaligned hprxm reads (crossing a 256-bit boundary) from slow slaves. In the appended SignalTap screenshot, I'm performing a 64-bit read at offset 0x1c. hprxm_master splits it into two reads (burstcount = 2). The hprxm read is requested at clock 0, the first readdatavalid arrives at clock 55, the second at clock 100. The large latency is caused by a clock-crossing bridge and interface width translation. Unfortunately, hprxm sends the completion message with data already at clock 63, without waiting for the second read, so it delivers arbitrary wrong data for the second word. The same unaligned read performed via DMA gives the correct result. So does an unaligned read from a fast slave (core clock domain + wide interface), see the second SignalTap screenshot. Here both reads occur without intermediate latency.

I consider the reported hprxm_master behaviour a bug; DMA proves that unaligned reads can work correctly with slow slaves. The IP version is the one from the QPP 22.4 design suite.

Best regards
Frank
12 Replies
wchiah
Employee
1,564 Views

Hi,


There is a known issue reported for the MCDMA in Quartus v22.4:

https://www.intel.com/content/www/us/en/docs/programmable/683821/22-4/known-issues.html

Does it sound related to what you are facing now?


Regards,

Wincent_Intel


FvM
Valued Contributor III
1,551 Views

Hello,
thanks for linking the issue report. I don't see that the reported problem is directly related. The basic problem is premature read completion by hprxm_master. The read completion arrives as expected, but with partly wrong data content.

I'll try to record some internal hprxm_master data to understand why this happens.

Best regards,
Frank

wchiah
Employee
1,496 Views

Hi,


Thanks for trying to understand why that happens.

If you find anything out, please share it with me if you don't mind.


Regards,

Wincent_Intel


wchiah
Employee
1,466 Views

Hi,

 

I wish to follow up with you about this case.

Do you have any further questions on this matter?

Otherwise, I would like to have your permission to close this forum ticket.

 

Regards,

Wincent_Intel


FvM
Valued Contributor III
1,451 Views

Hi,

the problem is not solved!

My present understanding is that the hprxm IP has no provision to handle unaligned read requests correctly. I'm waiting for a statement from the IP design team.

Regards
Frank

wchiah
Employee
1,418 Views

Hi,


Do you mean that you are contacting our internal IP design team?

Is there anything else I can help you with in the meantime?


Regards,

Wincent_Intel


FvM
Valued Contributor III
1,394 Views
Hi,
my answer was phrased ambiguously. I haven't contacted the design team yet; I'd need to make contact through an FAE. I was rather hoping that the discussion is monitored by support and forwarded accordingly. If there's a formal procedure for issuing a support request, we might choose that option.

For now we have worked around the issue by rearranging the memory map.
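
A workaround of this kind can be checked mechanically. The following is only a minimal sketch, assuming a 32-byte (256-bit) hprxm word; the register names and offsets are hypothetical and not taken from the actual design. It lists the registers whose byte range straddles a 256-bit boundary and would therefore trigger a two-beat burst read.

```python
BOUNDARY = 32  # bytes per 256-bit hprxm word (assumed from the 256-bit boundary above)


def straddling(registers, boundary=BOUNDARY):
    """Return names of registers whose byte range crosses a boundary multiple.

    registers maps a name to (byte_offset, size_in_bytes).
    """
    return [
        name
        for name, (offset, size) in registers.items()
        if offset // boundary != (offset + size - 1) // boundary
    ]


# Hypothetical register map, for illustration only.
example_map = {
    "status": (0x18, 8),   # ends at 0x1f, stays inside the first word
    "counter": (0x1C, 8),  # covers 0x1c..0x23, crosses the 0x20 boundary
    "ctrl": (0x20, 4),
}
print(straddling(example_map))
```

Any register reported by the check would have to be moved or padded so that it no longer crosses a multiple of 32 bytes.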

Best regards
Frank
wchiah
Employee
1,373 Views

Hi,


You can file an IPS case via the FAE and share the .qar file there for further debugging.

If I understand correctly, this is the design that you got from the IP catalog, right?

Or is this a custom design?


Regards,

Wincent_Intel


FvM
Valued Contributor III
1,342 Views

Hi,
I was able to reduce the test setup reproducing the error to a 32-bit Avalon MM slave (on-chip memory) connected to the hprxm master. An unaligned 64-bit read crossing the 256-bit address boundary returns wrong data. I put the test into the PCIe_fundamental demonstration project for the TR10a-HL development kit.
onchip_memory_2_1 is the 32-bit slave, onchip_memory_2_2 is a 256-bit slave that doesn't show the error.
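
To make the expected result concrete, here is a small sketch of how the 64-bit value should be extracted from the two 256-bit words of the burst. It assumes little-endian byte order and a 32-byte bus word; the function name and the test pattern are illustrative and not part of the demo project.

```python
def extract(words, addr, length, bus_bytes=32):
    """Extract a little-endian value of `length` bytes at byte address `addr`.

    words: the bus_bytes-long data words returned by the burst, starting at
    the aligned base address (addr rounded down to a multiple of bus_bytes).
    """
    data = b"".join(words)
    start = addr % bus_bytes
    return int.from_bytes(data[start:start + length], "little")


# Illustrative memory pattern: byte n holds the value n.
mem = bytes(range(64))
expected = int.from_bytes(mem[0x1C:0x24], "little")

good = extract([mem[:32], mem[32:]], 0x1C, 8)   # both beats valid
bad = extract([mem[:32], bytes(32)], 0x1C, 8)   # second beat replaced by stale zeros
print(hex(expected), hex(good), hex(bad))
```

With a known pattern in the on-chip memory, a wrong upper half of the returned value points at the second beat of the burst.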

Regards
Frank

wchiah
Employee
1,278 Views

Hi Frank,

 

Thanks for sharing your findings with me.
Did you test it on an Arria 10 device as well?

Regards,
Wincent_Intel

FvM
Valued Contributor III
1,267 Views

Hi Wincent,
thanks for your continuous interest in the matter.

I'm doing all tests on the Arria 10 development board TR10a-HL. The effect observed with my latest demo code is basically the same as documented in the SignalTap screenshots appended to the initial post. The main difference is that I stripped off the clock-crossing bridge used in the original design and any application code not related to the issue.

A new SignalTap recording is contained in the .qar if you want to check.

I'm performing an unaligned 64-bit read which crosses a 256-bit word boundary of the hprxm master, e.g. reading from address 0x1c. Accordingly, a burst read with burstcount = 2 is generated, reading the first 256 bits from 0x00 and the second 256 bits from 0x20. The hprxm master logic is expected to assemble the read data into a single completion TLP with data.
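
The address arithmetic behind this split can be sketched as follows. This is an illustration only, assuming a 32-byte bus word; the per-beat masks merely mark which byte lanes carry requested data and are not a statement about how the IP drives byteenable.

```python
BUS_BYTES = 32  # 256-bit hprxm data width


def split_read(addr, length, bus_bytes=BUS_BYTES):
    """Return (aligned_base, burstcount, lane_masks) for a read request.

    lane_masks holds one bit mask per beat; bit n is set when byte lane n
    of that beat carries requested data.
    """
    first = addr // bus_bytes
    last = (addr + length - 1) // bus_bytes
    masks = []
    for word in range(first, last + 1):
        word_base = word * bus_bytes
        lo = max(addr, word_base) - word_base
        hi = min(addr + length, word_base + bus_bytes) - word_base
        masks.append(((1 << (hi - lo)) - 1) << lo)
    return first * bus_bytes, last - first + 1, masks


base, burstcount, masks = split_read(0x1C, 8)
print(hex(base), burstcount, [hex(m) for m in masks])
```

For the read at 0x1c, the upper four lanes of the word at 0x00 and the lower four lanes of the word at 0x20 are needed, so both beats must be valid before the data can be returned.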

I tested two cases:
- hprxm reads from a 32-bit MM slave: each 256-bit read is performed in 8 consecutive read cycles, so the unaligned read takes 16 cycles + slave latency.
- hprxm reads from a 256-bit MM slave: the unaligned read takes only 2 cycles + latency.
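
The cycle counts for the two cases follow from the width ratio. A tiny sketch, assuming the width adapter always fetches the full master word from the narrower slave (slave latency not included; the function name is made up for illustration):

```python
def slave_transfers(burstcount, master_bits=256, slave_bits=32):
    """Number of slave-side read transfers needed to serve a master burst."""
    ratio = max(1, master_bits // slave_bits)
    return burstcount * ratio


case1 = slave_transfers(2, slave_bits=32)    # 32-bit slave
case2 = slave_transfers(2, slave_bits=256)   # 256-bit slave
print(case1, case2)
```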

I can see that hprxm assembles the read data correctly. Unfortunately, in case 1 it doesn't wait for the burst read to complete and sends arbitrary data from a previous read instead. Case 2 returns correct data; the completion TLP is sent off after the burst read has finished.

Looking at the internal hprxm read processing, I see the same state sequence triggered in both cases. I haven't yet recognized a provision to wait for the burst read to finish before sending the completion TLP. Either it's not there or it's not triggered due to wrong assumptions.
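
The observed behaviour can be reproduced with a toy model. This is not the hprxm implementation, only a sketch of the missing condition: the completion must not be sent before burstcount readdatavalid beats have arrived. The cycle numbers are the ones from the first SignalTap recording (readdatavalid at clocks 55 and 100, completion at clock 63); all names are made up for illustration.

```python
def assemble_completion(beats, send_cycle, burstcount, stale):
    """Return the words a completion sent at send_cycle would carry.

    beats: list of (cycle, word) readdatavalid events in burst order.
    stale: previous buffer content, kept for beats that have not arrived yet.
    """
    buf = list(stale)
    for i, (cycle, word) in enumerate(beats[:burstcount]):
        if cycle <= send_cycle:
            buf[i] = word
    return buf


def earliest_safe_send(beats, burstcount):
    """Cycle of the last readdatavalid of the burst."""
    return beats[burstcount - 1][0]


beats = [(55, "word@0x00"), (100, "word@0x20")]
stale = ["old0", "old1"]

premature = assemble_completion(beats, 63, 2, stale)
correct = assemble_completion(beats, earliest_safe_send(beats, 2), 2, stale)
print(premature, correct)
```

In the model, sending at clock 63 returns the first word plus stale buffer content, while waiting for the second readdatavalid returns both words.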

Why are we using the bursting hprxm interface instead of the default 32-bit BAR interface?
- It's necessary to support native memory access by a 64-bit processor.
- We want to implement efficient access to 256-bit slaves in DMA and non-DMA mode.

As previously mentioned, we have adjusted the application register map to avoid unaligned reads, so it's not an urgent problem any more. Nevertheless, I'd appreciate a correction of the apparent hprxm bug.

Best regards

Frank    

wchiah
Employee
1,237 Views

Hi Frank,

Thanks for sharing the solution with us. The best help I can offer on this is to submit an internal ticket for the issue.
I hope the related IP design team will look at it.

Following our support policy, I have to put this case in closed status.

Hence, this thread will be transitioned to community support.

If you have a new question, feel free to open a new thread to get support from Intel experts.

Otherwise, the community users will continue to help you on this thread. Thank you.

If your support experience falls below a 9 out of 10, I kindly request the opportunity to rectify it before concluding our interaction. If the issue cannot be resolved, please inform me of the cause so that I can learn from it and strive to enhance the quality of future service experiences. 

 

Regards,

Wincent_Intel


