Arria 10 HPS EMIF DDR3 Calibration Failure

gschuell
Novice

I am bringing up a custom Arria 10 SX board, and my DDR3 SODIMM fails to calibrate when connected through the HPS EMIF conduit.

The EMIF example design calibrates properly using the same timing parameters as I have in the HPS project.

However, when I program the FPGA via JTAG and then run the U-Boot SPL, the DDRCAL always fails.  I started with the GHRD for the Arria 10 SoC Dev Kit and simply changed the EMIF settings from DDR4 to the DDR3 configuration that passes calibration as a standalone example design.

Most of the debug guidelines suggest getting the example design to calibrate successfully and then use those settings for the HPS project, but what other things could cause the calibration failure only on the HPS side?

Are there any debug interfaces that could be useful in gathering additional calibration status information?

 

Thanks

AdzimZM_Intel
Employee

Hi!


"The EMIF example design calibrates properly using the same timing parameters as I have in the HPS project."

  • Can you share the pin connections for the EMIF interface in the EMIF example design?


"Most of the debug guidelines suggest getting the example design to calibrate successfully and then use those settings for the HPS project, but what other things could cause the calibration failure only on the HPS side?"

  • Once the FPGA EMIF confirms that the memory and the device are working properly, the next step is to check the U-Boot side.


"Are there any debug interfaces that could be useful in gathering additional calibration status information?"

  • Unfortunately, we don't have a debug interface for debugging the HPS EMIF IP directly.
  • To debug the HPS EMIF IP, we instantiate an FPGA EMIF interface in the HPS I/O bank and check the calibration report from the EMIF Debug Toolkit.


Regards,

Adzim


gschuell
Novice

Hello Adzim,

 

Thank you for your reply. 

I've attached the pin connections for the EMIF example design for your review. 

Please let me know if this is what you were looking for.

 

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


Thank you for providing the pin assignment.

  • The pins are placed at the HPS EMIF pin locations.
  • Calibration succeeds with this pin assignment and these IP timing parameters.
    • Can you provide the EMIF IP settings from the Parameter Editor GUI?
    • Please also provide the memory datasheet so I can check the settings against it.


However, there is an issue in U-Boot when using the HPS EMIF IP.

  1. Can you share some snapshots of the errors?
  2. Are there any timing violations after compiling the design?
  3. Which Quartus version and OS environment are you using?


Thanks!

Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

I will gather this information and send it as soon as I get a chance.

 

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


Is there any update on this thread?


Thank you.


gschuell
Novice

Hi Adzim,

Sorry for the delayed response.  I have been working on creating a clean version of the project as I have attempted a number of permutations during debug and wanted to start everything from scratch.

One of the issues I've been struggling with related to your questions above is not being clear on a supported combination of versions for Quartus, ARM DS, and U-Boot to use for this project.  When I run newer versions of U-Boot  (such as 2022.10 per the Bootloader for Arria10 instructions on RocketBoards), I get errors complaining about the symbol file format (I believe it is DWARF5 vs the previously used DWARF4 symbols) such as this when running the "run-u-boot.ds" debugger script:

[target] Starting debug server
[target] Waiting for debug server to start accepting connections
[target] Debug server started successfully
[target]  
Connected to running target Cortex-A9_0
Execution stopped in ABT mode at S:0xFFE00080
S:0xFFE00080   B        {pc} ; 0xffe00080
WARNING(CMD315): Target is not running
Target has been reset
Execution stopped in ABT mode at S:0xFFE00080
S:0xFFE00080   B        {pc} ; 0xffe00080
Restoring Binary file /$MY_PROJECT_FOLDER/a10_soc_devkit_ghrd/software/bootloader/u-boot-socfpga/spl/u-boot-spl-dtb.bin into memory
   Restoring section S:0x00000000 to S:0x00012A44 into memory S:0xFFE00000 to S:0xFFE12A44
ERROR(CMD685-IMG54-IMG33):  
# in /$MY_PROJECT_FOLDER/a10_soc_devkit_ghrd/software/bootloader/run-u-boot.ds:11 while executing: symbol-file u-boot-socfpga/spl/u-boot-spl
! Failed to load symbols for "u-boot-spl"
! Failed to demand load DWARF debugging information: section .debug_info, offset 0xc
! Section .debug_line offset 0x0: Invalid line table version number 5

I'm currently working on Linux with Quartus 23.1 Pro, ARM DS 2022.2, and U-Boot 2022.10.  Is there a recommended combination of versions that I should be using for this effort?
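
The final log line ("Invalid line table version number 5") suggests the debugger is rejecting DWARF 5 line tables. As a minimal sketch, assuming the debugger output has been saved as text, the rejected version can be extracted with a regular expression. The function name and the sample text are illustrative, and other Arm DS releases may word the message differently:

```python
import re

# Pattern for the message seen in the log above. Other debugger versions
# may phrase it differently (assumption).
LINE_TABLE_VERSION = re.compile(r"Invalid line table version number (\d+)")


def unsupported_dwarf_versions(log_text):
    """Return the sorted DWARF line-table versions the debugger rejected."""
    return sorted({int(m.group(1)) for m in LINE_TABLE_VERSION.finditer(log_text)})


# Excerpt copied from the log in this post.
sample_log = """\
! Failed to load symbols for "u-boot-spl"
! Failed to demand load DWARF debugging information: section .debug_info, offset 0xc
! Section .debug_line offset 0x0: Invalid line table version number 5
"""

print(unsupported_dwarf_versions(sample_log))
```

If the reported version is 5, one thing to try is rebuilding the bootloader with GCC's `-gdwarf-4` option so the symbols are emitted in the older format. Whether that suits a given toolchain and U-Boot release would need to be verified.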

 

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


You're already using the latest version. This should be fine.


Since you're using a custom board, can you confirm the following points:

  • Has the board skew parameter been set to the right configuration for the EMIF IP on the custom board?
  • Does the EMIF example design pass calibration and work properly on the custom board?
  • On the Arria 10 SoC Dev Kit, do the EMIF example design and GHRD work properly?


Can you take some snapshots of the EMIF IP settings and share them here?

Also, can you let me know which memory part you used? You can share the memory datasheet if possible.


Thanks!

Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

Thanks for the confirmation on the versioning.  Please see answers to your questions below:

 

"Has the board skew parameter been set to the right configuration for the EMIF IP on the custom board?"

-I do not have accurate skew information at the moment and am using defaults here.

 

"Does the EMIF example design pass calibration and work properly on the custom board?"

-The EMIF example design passes calibration, but for some reason the traffic generator does not report a result (pass, fail, or timeout).

I have added an ISSP block to toggle the global reset and monitor the EMIF memory status bits.  I've attached screenshots showing the ISSP result as well as the EMIF Toolkit calibration report information.

 

"On the Arria 10 SoC Dev Kit, do the EMIF example design and GHRD work properly?"

-I have created DDR4 HILO and DDR3 HILO EMIF example designs and GHRD designs for the Arria 10 SoC Dev Kit and these all work properly.

 

"Can you take some snapshots of the EMIF IP settings and share them here?"

-Please see attached.

 

"Also, can you let me know which memory part you used? You can share the memory datasheet if possible."

-I've attached the SODIMM and memory component datasheets for your review.

 

Thanks,

Greg

 

 

 

 

AdzimZM_Intel
Employee

Hi Greg,


Thank you for the information.


I have checked the EMIF settings against the memory datasheet you provided.

Some of your settings may not match the datasheet.

I suggest comparing them with the snapshot that I will attach later.


For the HPS EMIF design on the custom board, do you see any timing issues in the timing report?


I would also suggest performing a board simulation first to get accurate skew information, because inaccurate skew values are a potential cause of the calibration issue.


Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

Thank you for sending the updated parameters.  I have applied the suggested settings and retested with similar results (i.e. EMIF example design passes local calibration with similar margins and HPS fails to calibrate and/or hangs on U-Boot SPL).

I did have some additional questions:

1.  We are currently testing with a memory clock of 400MHz (DDR3-800) with the thought that slowing down the memory interface would give us the best chance of successful calibration.  Is this a good idea, or should we run at the rated DDR3-1600 speed of the SODIMM?

2.  Are the settings you provided taking into account our memory clock of 400MHz, or should we make additional adjustments (such as dropping our CAS latency to 6 and write CAS latency to 5)?

3.  For tRRD, tWTR, and tRTP, you changed the values from 4 cycles to 3.  Is this correct per pg. 81 of the memory datasheet?  Also, would you set tFAW to 50ns for DDR3-800 or 40ns for DDR3-1600 since we're using x16 devices?

4.  Since the EMIF example design passes calibration, are there things specific to the GHRD that we should be looking at (e.g. reset, PLL lock status, any debug registers, etc) to let us know the status of the calibration?
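
For reference, the clock arithmetic behind questions 1 and 2 can be sketched as follows. This is only a worked calculation: the function names are illustrative, and the latency of 6 cycles is the value proposed in question 2, not a confirmed setting.

```python
def ddr_clock(data_rate_mts):
    """Derive memory clock (MHz) and clock period tCK (ps) from a DDR data rate in MT/s."""
    clock_mhz = data_rate_mts / 2        # DDR transfers data twice per clock cycle
    tck_ps = 1_000_000 / clock_mhz       # period of a clock given in MHz, in picoseconds
    return clock_mhz, tck_ps


def latency_ns(cycles, tck_ps):
    """Convert a latency expressed in clock cycles to nanoseconds."""
    return cycles * tck_ps / 1000


for name, rate in (("DDR3-800", 800), ("DDR3-1600", 1600)):
    clock_mhz, tck_ps = ddr_clock(rate)
    print(f"{name}: {clock_mhz:.0f} MHz memory clock, tCK = {tck_ps:.0f} ps")

# CAS latency of 6 cycles at a 400 MHz memory clock (tCK = 2500 ps)
print(latency_ns(6, 2500), "ns")
```

Whether a given CAS latency is valid at a given clock still has to be checked against the speed bin table in the memory datasheet.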

 

Thanks,

Greg

gschuell
Novice

Hi Adzim,

 

I also got some skew information from the board designer and have attached it here.  I did not notice a major difference in the results when it was applied.

 

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


Thank you for the update.


1. We are currently testing with a memory clock of 400MHz (DDR3-800) with the thought that slowing down the memory interface would give us the best chance of successful calibration. Is this a good idea, or should we run at the rated DDR3-1600 speed of the SODIMM?

  • Yes, calibration can sometimes pass at a lower frequency.
  • Have you tried running at 800 MHz? That would require changing the settings that depend on the memory clock frequency.


2. Are the settings you provided taking into account our memory clock of 400MHz, or should we make additional adjustments (such as dropping our CAS latency to 6 and write CAS latency to 5)?

  • Yes, I used the 400 MHz memory clock to set tIS, tIH, tRRD, tFAW, tWTR, and tRTP.
  • The CAS latency and write CAS latency are taken from the speed bin table for DDR3-1600.


3. For tRRD, tWTR, and tRTP, you changed the values from 4 cycles to 3. Is this correct per pg. 81 of the memory datasheet? Also, would you set tFAW to 50ns for DDR3-800 or 40ns for DDR3-1600 since we're using x16 devices?

  • My understanding from the datasheet you provided is that the base device is configured as 256 Meg x 8.
  • The page size for this configuration is therefore 1 KB.
  • I therefore referred to the DDR3-1600 speed grade column and the 1 KB page size row.
  • The cycle values were calculated based on the 400 MHz memory clock.
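
The conversion from a datasheet time to clock cycles at 400 MHz can be sketched as below. The 7.5 ns input and the 4-clock floor are illustrative assumptions (some datasheets specify a parameter as the greater of a clock count and a time), while the 40 ns and 50 ns tFAW values are the ones mentioned in the question. The real numbers should come from the memory datasheet.

```python
def to_cycles(t_ps, tck_ps, min_cycles=0):
    """Round a timing value up to whole clock cycles, with an optional floor in clocks."""
    cycles = -(-t_ps // tck_ps)   # integer ceiling division, avoids float rounding
    return max(cycles, min_cycles)


TCK_400MHZ_PS = 2500  # clock period of a 400 MHz memory clock

# Illustrative 7.5 ns parameter: time alone gives 3 cycles ...
print(to_cycles(7500, TCK_400MHZ_PS))
# ... but an assumed minimum of 4 clocks would raise it to 4.
print(to_cycles(7500, TCK_400MHZ_PS, min_cycles=4))

# tFAW candidates from the question: 40 ns and 50 ns
print(to_cycles(40000, TCK_400MHZ_PS), to_cycles(50000, TCK_400MHZ_PS))
```

If the datasheet does specify a minimum clock count for a parameter, that floor could account for a difference between 3 and 4 cycles at this clock rate.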



4. Since the EMIF example design passes calibration, are there things specific to the GHRD that we should be looking at (e.g. reset, PLL lock status, any debug registers, etc) to let us know the status of the calibration?

  • The HPS EMIF is a hardened IP that is not exposed to user logic.
  • The IP does not provide a status signal for analyzing the calibration status.
  • That is why we use an FPGA EMIF placed at similar pin locations, with a similar EMIF IP configuration, to debug the interface.
  • Currently, however, calibration passes on the FPGA EMIF.


Now that you have the skew information, do you see any timing violations in the design after compilation?


Thanks.

Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

Thank you for the information. 

 

I don't see any timing violations in the HPS or EMIF example designs with or without the board skew being applied.

 

Given that the EMIF example design passes calibration (at least per the local_cal_status_pass signal), can you think of anything that would cause the HPS to fail other than something in the HPS design structure, timing errors, or the U-Boot SPL bootloader configuration? 

 

Does the HPS calibration do anything different than what occurs during the EMIF example design calibration (e.g. the EMIF always exercises the same address, but the HPS does a more exhaustive memory test)?

 

Or should any DDR3 memory configuration that passes calibration for the EMIF example design always pass when combined with the HPS EMIF conduit?

 

For the HPS test, I am programming the full FPGA bitstream (i.e. not using Early Release mode) via JTAG and then running the ARM DS bootloader script from RocketBoards to load and execute the SPL initialization code.  Does this seem like the best way to test the HPS calibration, or should I try another method?

 

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


"Given that the EMIF example design passes calibration (at least per the local_cal_status_pass signal), can you think of anything that would cause the HPS to fail other than something in the HPS design structure, timing errors, or the U-Boot SPL bootloader configuration?"

  • From the EMIF perspective, we always check an HPS EMIF design by implementing an FPGA EMIF interface with the HPS EMIF pinout, so that we can check for issues between the memory and the FPGA device.
  • In this case, though, there is no issue on the EMIF side.


"Does the HPS calibration do anything different than what occurs during the EMIF example design calibration (e.g. the EMIF always exercises the same address, but the HPS does a more exhaustive memory test)?"

  • No, it runs through a similar process, as described in the User Guide.


"Or should any DDR3 memory configuration that passes calibration for the EMIF example design always pass when combined with the HPS EMIF conduit?"

  • It should pass calibration just as the FPGA EMIF design does.


"For the HPS test, I am programming the full FPGA bitstream (i.e. not using Early Release mode) via JTAG and then running the ARM DS bootloader script from RocketBoards to load and execute the SPL initialization code. Does this seem like the best way to test the HPS calibration, or should I try another method?"


Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

We did some more testing with the EMIF example design with various configurations on both our custom board and the Arria 10 SoC Development Kit.  I've attached a spreadsheet detailing the results as well as a document describing the board skew information (trace lengths are shown in mm and skew is calculated in ps).
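
As a rough illustration of how skew in ps follows from trace lengths in mm, here is a minimal sketch. The propagation delay of 6.7 ps/mm is an assumed typical figure, and the trace lengths are hypothetical rather than taken from the attached document. Actual values depend on the PCB stack-up and are best obtained from the board designer or a board simulation.

```python
# Assumed propagation delay (ps per mm); depends on the PCB stack-up.
PS_PER_MM = 6.7


def skew_ps(lengths_mm, ps_per_mm=PS_PER_MM):
    """Skew within a group: delay difference between the longest and shortest trace."""
    lengths = list(lengths_mm.values())
    return (max(lengths) - min(lengths)) * ps_per_mm


# Hypothetical trace lengths (mm) for one DQS group.
dqs_group = {"DQ0": 42.0, "DQ1": 43.5, "DQ2": 41.2, "DQS0": 42.8}

print(f"skew within group: {skew_ps(dqs_group):.1f} ps")
```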

Based on these results, it appears that we pass the "local_cal" in all configurations on both boards with what to our eyes appear to be equivalent (or in some cases better) timing margins. 

But for the dev kit, in all configurations (regardless of PLL, memory clock, or Half/Quarter rate user clock), we always additionally pass the traffic generator test.

On our custom board, it looks like we pass the traffic generator test when we are operating in quarter rate user clock mode, but in half rate user clock mode we don't get any result from the traffic generator test (no pass, fail, or timeout).  Do you have any idea what could cause this?  I would at least expect a fail or timeout if there was an issue detected during the calibration.

Also, when I step through the U-Boot SPL code in the debugger on the HPS design, I am seeing the code jump to an error handler that says to "hang forever" very close to a comment that says that the MPFE hang workaround should be complete.  This looks like it would be just before or possibly during DDR calibration.  The MPFE seems to refer to the "Multi-port Front End" memory arbiter.  Any ideas what could cause this?

Thanks,

Greg

AdzimZM_Intel
Employee

Hi Greg,


Thank you for the update on your tests.


On our custom board, it looks like we pass the traffic generator test when we are operating in quarter rate user clock mode, but in half rate user clock mode we don't get any result from the traffic generator test (no pass, fail, or timeout). Do you have any idea what could cause this? I would at least expect a fail or timeout if there was an issue detected during the calibration.

  • I'm not aware of this issue occurring in half-rate clock mode.
  • You could try running with a few different seeds and check whether the result changes.
  • So far, though, calibration passes on every board, which means the memory interface should be working properly on the board.


Also, when I step through the U-Boot SPL code in the debugger on the HPS design, I am seeing the code jump to an error handler that says to "hang forever" very close to a comment that says that the MPFE hang workaround should be complete. This looks like it would be just before or possibly during DDR calibration. The MPFE seems to refer to the "Multi-port Front End" memory arbiter. Any ideas what could cause this?


Regards,

Adzim


gschuell
Novice

Hi Adzim,

 

Thank you for your response.  To the best of my knowledge, I am creating the bootloader per the Rocketboards guidelines but it would be very helpful to communicate directly with someone that is familiar with this flow (especially for custom hardware).

You mentioned that you were in contact with someone from the Embedded team.  Would it be possible to have them contact me directly at this point? 

We have been fighting this problem for quite a while, and we cannot make any further progress with our board bring-up or software development until it is resolved.

 

Thanks,

Greg

 

AdzimZM_Intel
Employee

Hi Greg,


I think it would be better to open a separate thread for the bootloader part,

so that an embedded expert can help you with it directly.

Please let me know if you have any concerns. Thanks!


Regards,

Adzim

