MHada
Beginner
563 Views

DDR3 - CYCLONE V SOC FAILING ON HARDWARE

We have designed a board based on the Cyclone V SoC with DDR3 (8 Gb, three devices per board) and built three units. Two of them work perfectly, but on the third one U-Boot and Linux do not boot. Suspecting the assembly, we replaced all three DDR3 devices, but the result is the same. I am not sure whether replacing the Cyclone V SoC will help.

 

On further diagnosis, I see the following messages:

 

U-Boot SPL 2013.01.01 (Sep 09 2019 - 16:53:19)

BOARD : Altera SOCFPGA Cyclone V Board

CLOCK: EOSC1 clock 25000 KHz

CLOCK: EOSC2 clock 10000 KHz

CLOCK: F2S_SDR_REF clock 0 KHz

CLOCK: F2S_PER_REF clock 0 KHz

CLOCK: MPU clock 925 MHz

CLOCK: DDR clock 300 MHz

CLOCK: UART clock 100000 KHz

CLOCK: MMC clock 50000 KHz

CLOCK: QSPI clock 370000 KHz

RESET: WARM

INFO : Watchdog enabled

SDRAM: Initializing MMR registers

SDRAM: Calibrating PHY

SEQ.C: Preparing to start memory calibration

SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; Start VFIFO 6 ; Phase 2 ; Delay 13

SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; End  VFIFO 7 ; Phase 2 ; Delay 8

SEQ.C: DQS Enable ; Group 0 ; Rank 0 ; Center VFIFO 6 ; Phase 6 ; Delay 11

SEQ.C: Read Deskew ; DQ 0 ; Rank 0 ; Left edge 22 ; Right edge 27 ; DQ delay 2 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 1 ; Rank 0 ; Left edge 17 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 2 ; Rank 0 ; Left edge 17 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 3 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 4 ; Rank 0 ; Left edge 21 ; Right edge 27 ; DQ delay 2 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 5 ; Rank 0 ; Left edge 19 ; Right edge 27 ; DQ delay 1 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 6 ; Rank 0 ; Left edge 17 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 7 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Write Deskew ; DQ 0 ; Rank 0 ; Left edge 31 ; Right edge 18 ; DQ delay 6 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 1 ; Rank 0 ; Left edge 31 ; Right edge 20 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 2 ; Rank 0 ; Left edge 31 ; Right edge 22 ; DQ delay 4 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 3 ; Rank 0 ; Left edge 31 ; Right edge 23 ; DQ delay 4 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 4 ; Rank 0 ; Left edge 31 ; Right edge 19 ; DQ delay 6 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 5 ; Rank 0 ; Left edge 31 ; Right edge 21 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 6 ; Rank 0 ; Left edge 31 ; Right edge 23 ; DQ delay 4 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 7 ; Rank 0 ; Left edge 31 ; Right edge 21 ; DQ delay 5 ; DQS delay 4

SEQ.C: DM Deskew ; Group 0 ; Left edge 31; Right edge 24; DM delay 3

SEQ.C: Read after Write ; DQ 0 ; Rank 0 ; Left edge 25 ; Right edge 22 ; DQ delay 2 ; DQS delay 10

SEQ.C: Read after Write ; DQ 1 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 2 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 3 ; Rank 0 ; Left edge 20 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 4 ; Rank 0 ; Left edge 24 ; Right edge 22 ; DQ delay 2 ; DQS delay 10

SEQ.C: Read after Write ; DQ 5 ; Rank 0 ; Left edge 22 ; Right edge 22 ; DQ delay 1 ; DQS delay 10

SEQ.C: Read after Write ; DQ 6 ; Rank 0 ; Left edge 20 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 7 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: DQS Enable ; Group 1 ; Rank 0 ; Start VFIFO 6 ; Phase 2 ; Delay 9

SEQ.C: DQS Enable ; Group 1 ; Rank 0 ; End  VFIFO 7 ; Phase 2 ; Delay 4

SEQ.C: DQS Enable ; Group 1 ; Rank 0 ; Center VFIFO 6 ; Phase 6 ; Delay 7

SEQ.C: Read Deskew ; DQ 8 ; Rank 0 ; Left edge 22 ; Right edge 27 ; DQ delay 2 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 9 ; Rank 0 ; Left edge 17 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 10 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 11 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 12 ; Rank 0 ; Left edge 21 ; Right edge 27 ; DQ delay 2 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 13 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 14 ; Rank 0 ; Left edge 18 ; Right edge 27 ; DQ delay 0 ; DQS delay 9

SEQ.C: Read Deskew ; DQ 15 ; Rank 0 ; Left edge 19 ; Right edge 27 ; DQ delay 1 ; DQS delay 9

SEQ.C: Write Deskew ; DQ 8 ; Rank 0 ; Left edge 31 ; Right edge 21 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 9 ; Rank 0 ; Left edge 31 ; Right edge 21 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 10 ; Rank 0 ; Left edge 31 ; Right edge 24 ; DQ delay 4 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 11 ; Rank 0 ; Left edge 31 ; Right edge 22 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 12 ; Rank 0 ; Left edge 31 ; Right edge 20 ; DQ delay 6 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 13 ; Rank 0 ; Left edge 31 ; Right edge 21 ; DQ delay 5 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 14 ; Rank 0 ; Left edge 31 ; Right edge 23 ; DQ delay 4 ; DQS delay 4

SEQ.C: Write Deskew ; DQ 15 ; Rank 0 ; Left edge 31 ; Right edge 23 ; DQ delay 4 ; DQS delay 4

SEQ.C: DM Deskew ; Group 1 ; Left edge 31; Right edge 25; DM delay 3

SEQ.C: Read after Write ; DQ 8 ; Rank 0 ; Left edge 25 ; Right edge 22 ; DQ delay 2 ; DQS delay 10

SEQ.C: Read after Write ; DQ 9 ; Rank 0 ; Left edge 20 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 10 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 11 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 12 ; Rank 0 ; Left edge 24 ; Right edge 22 ; DQ delay 2 ; DQS delay 10

SEQ.C: Read after Write ; DQ 13 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 14 ; Rank 0 ; Left edge 21 ; Right edge 22 ; DQ delay 0 ; DQS delay 10

SEQ.C: Read after Write ; DQ 15 ; Rank 0 ; Left edge 23 ; Right edge 22 ; DQ delay 1 ; DQS delay 10

SEQ.C: CALIBRATION FAILED

SEQ.C: Calibration Summary

SEQ.C: Calibration Failed

SEQ.C: Error Stage : 1 - VFIFO

SEQ.C: Error Substage: 1 - GUARANTEED READ

SEQ.C: Error Group : 2

### ERROR ### Please RESET the board ###

 

 

Analyzing the above, I find that calibration is failing for Group 2 [DQ16 to DQ23]. But is the cause on the DDR3 side or on the FPGA side? Or is it an assembly issue after all? The remaining two identical boards are working perfectly fine.

 

Please let us know.
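For anyone reading a log like the one above, here is a minimal Python sketch that pulls the per-DQ edge values and the failing group out of the preloader output. It only assumes the line layout shown in this post; the function names and the 8-DQ-per-group mapping (taken from the "Group 2 = DQ16 to DQ23" reading above) are illustrative assumptions, not part of any Intel tool.

```python
import re

# Per-DQ report lines, e.g.
# "SEQ.C: Read Deskew ; DQ 0 ; Rank 0 ; Left edge 22 ; Right edge 27 ; ..."
DQ_LINE = re.compile(
    r"SEQ\.C:\s*(?P<stage>Read Deskew|Write Deskew|Read after Write)\s*;"
    r"\s*DQ\s*(?P<dq>\d+)\s*;.*?Left edge\s*(?P<left>\d+)\s*;"
    r"\s*Right edge\s*(?P<right>\d+)"
)
# Summary line, e.g. "SEQ.C: Error Group : 2"
ERROR_GROUP = re.compile(r"SEQ\.C:\s*Error Group\s*:\s*(\d+)")

DQ_PER_GROUP = 8  # assumption: each DQS group covers 8 DQ pins


def summarize(log_text):
    """Return ({stage: {dq: (left, right)}}, failing_group_or_None)."""
    edges = {}
    failed_group = None
    for line in log_text.splitlines():
        m = DQ_LINE.search(line)
        if m:
            stage_edges = edges.setdefault(m["stage"], {})
            stage_edges[int(m["dq"])] = (int(m["left"]), int(m["right"]))
            continue
        m = ERROR_GROUP.search(line)
        if m:
            failed_group = int(m.group(1))
    return edges, failed_group


def groups_with_deskew(edges):
    """Groups for which at least one per-DQ line was printed."""
    return sorted({dq // DQ_PER_GROUP
                   for stage in edges.values() for dq in stage})


def dq_range(group):
    """First and last DQ index of a group under the 8-per-group assumption."""
    first = group * DQ_PER_GROUP
    return first, first + DQ_PER_GROUP - 1
```

Run against the log in this post, `summarize` would report per-DQ edges only for DQ 0 to 15 and a failing group of 2, which matches the observation that groups 0 and 1 got through their deskew stages before the sequencer gave up on group 2. Capturing several boots and comparing the results may show whether the failure always lands on the same group.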

 

10 Replies
HBhat2
New Contributor I
290 Views

Hi,

 

I faced similar issues in the past. In my experience, some of the termination resistors (for control/address signals) had gone bad or were not properly soldered to the pads. Apart from that, solder balls shorting adjacent data lines could be another cause.

 

With Regards,

HPB

MHada
Beginner
290 Views

Hi HBhat2,

 

Thanks for the reply. I have also faced similar issues before, and they were resolved through assembly corrections on the address lines.

 

In this case I have checked everything and even replaced the components once, including the decoupling caps as well as the termination resistors. Still no improvement.

 

I even exposed the vias below the DDR3 devices to check the DDR3 data lines. All of them show a similar resistance of around 5.6K on the 20K DMM range, so I am confident that the DDR3 data lines are shorted neither to each other nor to ground. Also, during calibration (which eventually fails), all the data lines show toggling data, which rules out the FPGA being disconnected from the DDR3 on the data path.

 

We have actually delivered 6 boards without any such problem. But this one has held up the whole development for the last 25 days.

 

I want to read further into these debug messages to understand whether we actually have a timing margin issue. How can I investigate this further? Can I use the EMIF Toolkit? I don't think so.

MHada
Beginner
290 Views

One more small thing: it is not that the board never boots. It boots once in a while and then gets stuck; this has also happened at times.

NurAida_A_Intel
Employee
290 Views

Hi MHada,

 

You can also go through this calibration checklist to see if anything applies to this issue --> https://www.intel.com/content/www/us/en/programmable/support/support-resources/support-centers/devic...

 

I would also suggest generating an example design with the same device and the same memory settings and checking whether calibration passes or fails. If it passes, the issue is possibly related to the board or to some other build issue. If it fails, it is most likely a DDR3 interface issue.

 

Thanks

 

Regards,

NAli1

 

 

MHada
Beginner
290 Views

Hi NAli1,

 

Does your suggestion work on the HPS side of the Cyclone V SoC as well?

 

My failure is on the HPS-side DDR3 interface, which is working on the remaining 6 boards.

 

Thanks

NurAida_A_Intel
Employee
290 Views

Hi MHada,

 

I'm sorry that I misunderstood your issue in the first place.

 

Let me restate the issue to make sure that we are on the same page:

  • You have three boards using a Cyclone V SoC with DDR3 - 8Gb [3 devices on every board].
  • 1 out of 3 boards has a calibration failure.
  • You replaced all three DDR3 devices but still face the issue.

 

If the above understanding is correct, then there seems to be no issue with the IP, since the DDR3 is working fine on the other boards. If the IP had an issue, it would very likely also fail on the other 2 boards.

 

This looks like board-to-board variation. Please re-check all the board settings and connections. If they are good, there is a high possibility that the issue is on the FPGA side. I suggest you contact your sales FAE and proceed with the ERMA process.

 

Thanks

 

Regards,

NAli1

 

 

 

MHada
Beginner
290 Views

Hi NAli1,

 

You have understood the whole thing perfectly.

 

The only thing left is to replace the FPGA. All the other components and connections have been verified. Most of them have even been assembled twice, including the DDR3.

 

Now, when I discussed this here in India with the FAE from Cytech Macnica, they said it could be a timing issue caused by board-to-board variation: the other boards may have passed only marginally, and this board is failing on timing. But if the margin were that thin, there should have been at least one complaint about the working boards, since my customers have been using them for the last 6 months.

 

In my last 15 years I have never seen a design work on 6 boards [4 in the first stage and 2 in the second stage, with the one failing board left; I have to deliver 7 in total] and then suddenly fail on one due to PCB variation and a marginal timing issue. The only difference between the two stages was the addition of a mounting hole at a remote corner.

 

If it really is a marginal timing issue, then how do I debug it further, if at all?

MHada
Beginner
290 Views

In case I have caused some confusion:

 

We had two stages of the board -

 

4 were delivered in the first stage and 3 were built in the second stage. In the second stage only one mounting hole was added at one corner, with no track modification. In the second stage, 2 boards out of three are working. Overall, the DDR interface is untouched across all 7 boards, of which 6 are working; the remaining one is the board this question is about.

 

NurAida_A_Intel
Employee
290 Views

Dear MHada,

 

I totally understand the pain you are facing.

 

Yes, we have seen this kind of timing issue cause calibration failure. But in this case, I assume you have checked and verified that the DDR3 settings and timing parameters match the device specs in the datasheet. In other words, the DDR timing in Quartus is clean and, as you said, you have checked and verified all the components and connections. Plus, the DDR3 is passing on the other boards, which makes me strongly believe it is not due to the DDR3 IP. There are two possibilities I can think of: either the issue is due to the board or the board settings entered in Quartus, or it is the FPGA itself. Before you swap the FPGA, you may want to try the recommendations below and see if they make any difference to your issue.

 

Based on the report, the calibration failure you are facing is at Stage 1:

SEQ.C: Error Stage : 1 - VFIFO

SEQ.C: Error Substage: 1 - GUARANTEED READ

which is usually due to an address and command skew issue, with respect to the memory clock (CK), at the memory device.

A recommended measurement is:

 

  • At the memory device, using balanced scope probes, measure the memory clock rising edge and the chip select low pulse.
  • Verify the setup and hold time is reasonably balanced.
  • If not, there may be setup or hold violations on the address/command which typically results in a “Guaranteed Read Failure” error message.
  • If the HPS SDRAM GUI has Dynamic ODT set to an RZQ/# value, set it to disabled and set the Nominal ODT to the required termination value, recompile, and see if this makes any difference to your tests.
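The setup/hold balance check in the bullets above comes down to simple arithmetic on three scope timestamps. A minimal sketch follows, assuming you read off the time at which chip select goes low, the memory clock rising edge that samples it, and the time at which chip select goes high again. The function name and the sample numbers are hypothetical, not measurements from this board. For reference, the 300 MHz DDR clock reported in the log gives a clock period of about 3.33 ns.

```python
def setup_hold_balance(t_cs_fall_ns, t_ck_rise_ns, t_cs_rise_ns):
    """Return (setup, hold, offset) in ns for one chip select low pulse.

    setup  = time from CS# going low to the CK rising edge
    hold   = time from the CK rising edge to CS# going high
    offset = distance of the CK edge from the center of the CS# pulse;
             0 means perfectly balanced, positive means the clock edge
             arrives late in the pulse (more setup, less hold).
    """
    setup = t_ck_rise_ns - t_cs_fall_ns
    hold = t_cs_rise_ns - t_ck_rise_ns
    offset = (setup - hold) / 2
    return setup, hold, offset


# Hypothetical readings, in ns relative to the scope trigger.
setup, hold, offset = setup_hold_balance(0.0, 2.0, 3.0)
print(f"setup={setup} ns, hold={hold} ns, offset from center={offset} ns")
```

Taking the same three readings on a working board and on the failing board, with the same probes and probing points, gives a direct comparison of how far each board sits from the balanced case.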

 

Also, just a quick check: make sure you have run a board simulation and entered the correct board skew. If not, please use the latest board skew parameter tool to accurately calculate the board skews:

https://www.intel.com/content/www/us/en/programmable/solutions/technology/memory/estimator/board-ske...

 

 

Regarding the debug, unfortunately the Cyclone V HPS EMIF does not support the External Memory Interface Toolkit. To debug the HPS EMIF, you can change the settings inside the preloader software to enable Runtime Calibration Report and Debug Level info. In addition, you can use the preloader software to check the status of the HPS SDRAM PLL. Refer to "Using the Preloader To Debug the HPS SDRAM" on page 64 for more information. --> https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/an/an-cv-av-soc-ddg.pdf

 

Hopefully this helps.

 

Thanks

 

Regards,

NAli1

 

 

MHada
Beginner
290 Views

Hi NAli,

 

Thanks a lot for the detailed reply.

 

I will check all the points and reply in a couple of days.

 

Thanks a lot!!
