
Arria II GX Altmemphy DDR3 Bit Error problem

Altera_Forum

I am using Quartus II 10.0 SP1 with the .191 patch (yes, I installed the .191 patch for the memory controller problem). The device is an Arria II GX EP2AGX45DF29C5.

 

I really hope I can explain this error; it has had me pulling my hair out for the past month.

 

My design has been working great for 2 years. It's a video design in which I'm using the Altmemphy DDR3 SDRAM high-performance controller, interfacing with an external Micron MT41J64M16LA-15E (1 Gb) RAM. The SDRAM is essentially a video frame buffer, where red, green, blue (RGB) pixel data is stored as 24-bit data in a 32-bit word (the upper 8 bits are not used, while the lower 24 bits are RGB data). As I mentioned, for two years the design has worked perfectly, writing and reading images out of the SDRAM with no unwanted artifacts. That is, until a necessary, recent architecture change.

 

Due to larger image sizes and a higher frame rate (60 images per second), I've been forced to pack data into SDRAM more efficiently. So now, instead of ignoring the upper byte in each 32-bit word, I am packing the next pixel's component into it. Where the first three pixels used to be

 

0x00RRGGBB 

0x00RRGGBB 

0x00RRGGBB 

 

The data is now written as: 

 

0xRRGGBBRR 

0xGGBBRRGG 

0xBBRRGGBB 
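
A minimal software sketch of the packed layout above, for reference when generating test data. It assumes the pixel bytes are laid back to back in R, G, B order with the first byte in the most significant position of each 32-bit word, which matches the pattern shown but is not confirmed beyond that. The function name `pack_rgb` and the sample pixel values are made up for illustration.

```python
def pack_rgb(pixels):
    """Lay (r, g, b) byte triples back to back and cut the stream into 32-bit words."""
    stream = bytearray()
    for r, g, b in pixels:
        stream += bytes((r, g, b))
    # Zero-fill the tail so the stream is a whole number of 32-bit words.
    stream += bytes(-len(stream) % 4)
    # First byte of each group of four becomes the most significant byte.
    return [int.from_bytes(stream[i:i + 4], "big") for i in range(0, len(stream), 4)]


# Four hypothetical pixels occupy exactly three 32-bit words.
pixels = [(0x11, 0x22, 0x33), (0x44, 0x55, 0x66), (0x77, 0x88, 0x99), (0xAA, 0xBB, 0xCC)]
words = pack_rgb(pixels)
print([f"0x{w:08X}" for w in words])
# prints ['0x11223344', '0x55667788', '0x99AABBCC']
```

With this layout the top byte of every word carries live data, so bit 31 toggles with the image content instead of staying at zero as it did in the unpacked format.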

 

The new problem is artifacts in my image. I've traced it through, and it is indeed happening on the write to SDRAM. It only happens with certain data patterns, where each write into the controller might be all '1's one cycle and all '0's the next. The data interface to the SDRAM controller is a 4-beat burst, which equates to two beats of 64 bits. What is really confusing is that, at first glance, using that previously unused lane of data seems to point to a slam dunk: it's that one lane of data going to the SDRAM. But the physical data bus is 16 bits wide, so if it were a pad-to-out problem or something like that, you'd see it on every 16-bit boundary. So I have to exonerate the physical interface to the RAM, especially since the old, unpacked architecture never exhibits this problem.

 

Here is roughly what I'm seeing in the actual RAM. Data that should look like:

 

0xFFFFFF00 

0x0000FFFF 

0xFF000000 

0xFFFFFF00 

 

looks like: 

 

0xFFFFFF00 

0x00007FFF 

0xFF000000 

0x7FFFFF00 
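
One way to pin down exactly which bits differ is to XOR the expected and observed words. The sketch below uses the eight values quoted above. The `lane_width=16` column is an assumption: it treats each 32-bit word as two 16-bit slices on the 16-bit physical bus, and the real mapping depends on how the controller orders local data onto DQ, which is not stated here. The function name `bit_errors` is made up for illustration.

```python
expected = [0xFFFFFF00, 0x0000FFFF, 0xFF000000, 0xFFFFFF00]
observed = [0xFFFFFF00, 0x00007FFF, 0xFF000000, 0x7FFFFF00]


def bit_errors(expected, observed, lane_width=16):
    """Return (word index, bit, bit % lane_width, direction) for every differing bit."""
    errors = []
    for idx, (exp, obs) in enumerate(zip(expected, observed)):
        diff = exp ^ obs  # a 1 marks each bit that changed
        for bit in range(32):
            if (diff >> bit) & 1:
                direction = "1->0" if (exp >> bit) & 1 else "0->1"
                errors.append((idx, bit, bit % lane_width, direction))
    return errors


for err in bit_errors(expected, observed):
    print(err)
# prints (1, 15, 15, '1->0') then (3, 31, 15, '1->0')
```

In this sample both errors are a single bit dropping from 1 to 0, at bit 15 of the second word and bit 31 of the fourth. Both positions are 15 modulo 16, so under the assumed mapping it may be worth checking whether they share a physical data pin before ruling anything out.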

 

 

My second inclination is this: in order to satisfy the 64-bit-wide writes to the controller, I have to run my data through a 32-to-64-bit FIFO, and the upper nibble of the FIFO output may have a long line or some long path to the guts of the memory controller that is RIGHT on a hairy edge.

 

Please do not ask me architecture questions about doing this or that; it has to be done this way, given the limitations of our hardware. We are experts at the video we do, but not at the details of the Altera fabric. Can anyone offer a path for me to troubleshoot this, or explain how to fix it using the timing analyzer? I have never been able to take any timing closure classes, but this seems like such a detailed issue that I may have to dig down to the nuts and bolts.

 

For the most part the design works great, in that the video is relatively uncorrupted, but certain data patterns reveal this issue. The last pieces of the puzzle are that these designs work about 75% of the time, and that if I cool only the FPGA, the problem goes away but returns as soon as the device heats up again.

 

Any help is greatly appreciated. I realize this is throwing mud at the wall, but I'll take any help, even if it's a high-level suggestion on where to get started. I installed the .191 patch and regenerated all my cores; is there anything else I need to do to get the memory controller patch in .191 to take?
Altera_Forum

Hi Mike, 

 

I was wondering if you have been able to find the root cause and a solution to your problem.  

It seems I am having similar problems, also with the Arria II GX and DDR3. 

Hope to hear from you!
Altera_Forum

As of today, 6/13, I still have not found the problem, Jheij. I have fallen back to a simple DDR3-only system, writing and reading canned 32-bit data and doing compares. So far I am not seeing any bit errors in simulation or in my design.

 

I suspect the real issue you and I are seeing is due to a long line somewhere, and my new test system is so small and compact that the fitter has a much easier job. I'll let you know what I find beyond that.
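
For anyone building a similar write/read/compare test, here is a rough software model of the idea, using the alternating all-ones/all-zeros data that triggers the errors in the full design. Everything in it is hypothetical: `write_word` and `read_word` stand in for whatever access path the real test system has, and a Python dictionary plays the part of the SDRAM.

```python
def stress_words(count):
    """Canned 32-bit data that alternates all-ones and all-zeros on every word."""
    return [0xFFFFFFFF if i % 2 == 0 else 0x00000000 for i in range(count)]


def write_read_compare(write_word, read_word, base, words):
    """Write words from base upward, read them back, and list every mismatch."""
    for offset, word in enumerate(words):
        write_word(base + offset, word)
    mismatches = []
    for offset, word in enumerate(words):
        got = read_word(base + offset)
        if got != word:
            # Record address, expected, observed, and the XOR of the two.
            mismatches.append((base + offset, word, got, word ^ got))
    return mismatches


# Stand-in for the memory: a dictionary, plus a read path that drops bit 15.
memory = {}


def faulty_read(addr):
    return memory[addr] & ~0x00008000


good = write_read_compare(memory.__setitem__, memory.__getitem__, 0, stress_words(8))
bad = write_read_compare(memory.__setitem__, faulty_read, 0, stress_words(8))
print(len(good), len(bad))
# prints 0 4
```

A model like this only checks the compare logic. Because the failures here are pattern- and temperature-dependent, the hardware test would need the same data patterns, and probably a fitter result as congested as the full design, to reproduce them.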