Hi, I'm trying to upscale a 1024x768 image to 1280x1024 using the Altera VIP Scaler II IP core. I have set up the Clocked Video Input IP core to receive the image signal; the image is then upscaled by the Scaler II core and finally converted back using the Clocked Video Output IP core. With the nearest-neighbor algorithm this works great: the image is displayed completely on the connected TFT display. But when I switch to a better algorithm like bilinear or polyphase interpolation, approximately 5 lines at the bottom of the screen show garbage and the underflow signal goes high. All other pixels are displayed correctly. I have tried all kinds of different settings in all 3 IP cores, such as increasing the FIFO depth to the maximum possible, but there is absolutely no change in the output image. It looks like the scaling algorithm is waiting for more data at the end of the frame to finish the upscaling process, so the FIFO in the Clocked Video Output core runs empty and repeats the last available pixel for the rest of the image. Since I do not have any external memory connected, I can't add a framebuffer. But I don't think this is the actual problem, because changing the FIFO settings does not change anything. Does anybody have other suggestions of what might be the problem and how to solve it? I have attached screenshots of the VIP Scaler II and Clocked Video Output IP core settings. Probably I missed something simple.
The bilinear and polyphase filters require access to multiple pixels vertically. You have 8 vertical taps, so roughly 4 lines have to be buffered in the scaler before output data can begin. I'm guessing that is the cause of your underflow. Your video output has 42 lines of vertical blanking, so if you trigger the process that feeds data into the scaler near the start of the output blanking (rather than at the start of the active portion of the output), you should have plenty of margin for the scaler to be spitting out pixels before the active portion of the display output. So I think this is probably just a question of when you kick off the scaler input process relative to the output display timing. I've been using the VIP scaler (I & II) for the last 5+ years and never had any problems with it.
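To put rough numbers on that margin, here is a minimal sketch. The assumption that an N-tap vertical filter needs about N/2 input lines of lookahead before emitting its first output line is mine, not from the IP documentation; the exact latency depends on the scaler's internal pipeline.

```python
# Rough margin check: vertical filter latency vs. output vertical blanking.
# Assumption (not from the IP documentation): an N-tap vertical filter
# needs roughly N/2 input lines buffered before the first output line.
V_TAPS = 8
LINES_BUFFERED = V_TAPS // 2      # ~4 input lines of filter latency
OUT_V_BLANKING = 42               # vertical blanking lines at the output

margin = OUT_V_BLANKING - LINES_BUFFERED
print(f"filter latency ~{LINES_BUFFERED} lines, "
      f"blanking budget {OUT_V_BLANKING} lines, margin {margin} lines")
```

Even if the real latency is a few lines more, the 42-line blanking interval leaves a comfortable cushion.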
Thank you for your answer! The main problem is probably my understanding of how the Clocked Video Output IP core works exactly. How do I change the output-input timing relation? Since the input timing is fixed, I can only change the output timing. And since I do not have a framebuffer, the output frame rate must match the input frame rate exactly. My understanding from the VIP Suite User Guide is the following: as soon as a new input frame starts, the scaler starts upscaling the video data delivered by the CVI core. When the FIFO in the CVO core has reached the set threshold, output of the active image data starts. This provides some synchronization to lock both frame rates, but it violates the configured output timing values if the frame rates do not match otherwise. Is this correct, or did I miss an important point?
I have not used the Clocked Video Input or Output cores so I'm not sure how they work. The easiest way to figure this out may be to simulate the whole pipeline so you can see exactly what's happening. Either that or use SignalTap. On a different topic, are you aware that the aspect ratios of your input and output video are not the same? 1024/768 = 4/3 whereas 1280/1024 = 5/4. This means your images are being distorted slightly (compressed horizontally, I think). That may not matter in your application, but if it does, then you need to change your output resolution to something like 1280x960.
--- Quote Start --- My understanding from the VIP Suite User Guide is the following: as soon as a new input frame starts, the scaler starts upscaling the video data delivered by the CVI core. When the FIFO in the CVO core has reached the set threshold, output of the active image data starts. This provides some synchronization to lock both frame rates, but it violates the configured output timing values if the frame rates do not match otherwise. Is this correct, or did I miss an important point? --- Quote End --- Your understanding is correct. CVO underflow indicates that data is leaving the FIFO faster than it arrives. The first pixel leaves the CVO when the threshold is hit, and then over the course of your first ~1020 lines the FIFO drains down to zero. You should be able to observe this with SignalTap or simulation. It's been a while, but I believe the root cause will be a clock/line timing mismatch between the input and the output. You should have been able to change the FIFO size and threshold and see a corresponding effect on the location/size of your garbage. Try doubling/halving the threshold and see if it changes. (If it doesn't, the behavior described above is not why you're getting underflow and you'll have to figure out the actual reason.) I'm also not sure why nearest neighbor worked fine and the 8-vertical-tap filters didn't. The difference between the two should simply be an additional ~8 lines of latency between the first pixel entering the scaler and the first output pixel being emitted. I agree with the suggestion to rig up a simple simulation to sanity-check your clocks/timings. (As you can expect, the simulation itself will take some time to run.)
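As a toy illustration of that drain (the numbers here are hypothetical, chosen only to show the mechanism): if the CVO consumes slightly more pixels per line than the scaler delivers, the FIFO level falls linearly from the start threshold, and underflow hits after threshold/deficit lines.

```python
# Toy model of CVO FIFO drain (illustrative numbers, not measured values).
# If the output side consumes a few pixels per line more than the scaler
# delivers, the FIFO level falls linearly from the start threshold.
threshold = 2000          # pixels in FIFO when output starts (hypothetical)
deficit_per_line = 2      # net pixels lost per output line (hypothetical)

lines_until_underflow = threshold // deficit_per_line
print(f"underflow after ~{lines_until_underflow} output lines")
```

A deficit of only a couple of pixels per line is enough to produce a failure near the bottom of a ~1024-line frame, which is why a small clock/line mismatch is a plausible culprit.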
I tried simulating the scaler chain. First I changed the input and output timing to exactly the same frame rates: 1024x768 is being upscaled to 1280x1024, so the scaling factors are 1.25 horizontally and 1.3333 vertically. The input pixel clock is 60MHz, the output clock is 60MHz x 1.25 x 1.3333 = 100MHz. The input size including blanking is 1248x801; the output size is 1248 x 1.25 and 801 x 1.3333 = 1560x1068. Both give not only exactly the same frame rate of 60.0211Hz but also the same effective data rates (output data rate = input data rate x 1.25 x 1.3333), so the buffering requirement should be minimal. Both FIFOs in the CVI and CVO cores are set to 16380 pixels (16 lines at a width of 1024 and 12.8 lines at 1280). But the result on the display is exactly the same as before: the last 5 lines are wrong. They are filled with a single color, probably from some pixels at the end of the previous line. In the simulation, sooner or later the underflow or overflow signal (or both) gets set, depending on the CVI and CVO FIFO settings. A CVO start threshold of around 2000 pixels seems to be the best value, giving the earliest starting point without underflow being set after a few lines. Output of the active frame starts around line 5 of the active input data. At input line 22, overflow gets set. That seems strange to me because of the large buffer size. Measuring the horizontal output timing, it matches the settings: 1560 pixels per line. So I have absolutely no idea why the FIFO under- or overflows; 18 lines of FIFO buffer should be enough for perfectly matched timing. It is really strange. The result on the display is always the same regardless of the FIFO and timing settings, unless I use something totally off. Then I get either no display at all, or I can clearly see a FIFO over/underflow somewhere in the middle of the screen. But that looks totally different from the problem in the last 5 lines.
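The frame-rate math above can be sanity-checked in a few lines of plain Python, using the totals quoted in this post:

```python
# Verify that the input and output frame rates match for the quoted totals.
IN_CLK_HZ, OUT_CLK_HZ = 60e6, 100e6
IN_TOTAL_W, IN_TOTAL_H = 1248, 801      # input frame size including blanking
OUT_TOTAL_W, OUT_TOTAL_H = 1560, 1068   # 1248 * 1.25, 801 * 1.3333

fin = IN_CLK_HZ / (IN_TOTAL_W * IN_TOTAL_H)
fout = OUT_CLK_HZ / (OUT_TOTAL_W * OUT_TOTAL_H)
print(f"input {fin:.4f} Hz, output {fout:.4f} Hz")
```

Since 100/60 = 5/3 and 1560*1068 = (5/3) * 1248*801 exactly, the two frame rates are identical (~60.0211 Hz), confirming that the mismatch is not in the nominal timing numbers.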
From the output image, it looks like the overflow and underflow signals are set spuriously, because up to the last 5 lines the image is completely fine. I have attached screenshots of the first few lines of the 1st and 3rd frame. The almost randomly changing signals on the Avalon interface look suspicious. Can you give me any hints on what signals I should look at to identify the problem?
Seems like you should be able to get this to work. So you have the CVI running at 60MHz and the CVO at 100MHz. What clock is the scaler running off of? If the scaler is running at 100MHz, I can't see how this would not work with all the buffering you have in the pipeline. At 1248x801 the active portion of an input frame lasts 1248*768/60e6 = 15.97ms. At 1560x1068 the active portion of an output frame lasts 1560*1024/100e6 = 15.97ms (exactly the same). As long as the scaler can keep up with the CVO's demand for pixels (it should if running at 100MHz), this should be working for you. I don't advocate designing by trial and error, but if you already have the scaler at 100MHz, you might try speeding it up to 150MHz and see what happens. Also, an underflow makes sense if the scaler can't keep up with the CVO, but I definitely don't understand the overflow condition. The backpressure between the VIP cores should prevent that (I think).
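That active-window equality is easy to double-check with the same numbers:

```python
# Active-portion duration on each side: total line length in pixels
# times the number of active lines, divided by the pixel clock.
in_active_s = 1248 * 768 / 60e6      # input: 1248-pixel lines, 768 active
out_active_s = 1560 * 1024 / 100e6   # output: 1560-pixel lines, 1024 active
print(f"input active {in_active_s * 1e3:.4f} ms, "
      f"output active {out_active_s * 1e3:.4f} ms")
```

Both come out to 15.9744 ms, so over a whole frame the scaler has exactly as much time to consume the input as the CVO spends emitting the output; only line-level phase differences have to be absorbed by the FIFOs.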
--- Quote Start --- Also, an underflow makes sense if the scaler can't keep up with the CVO, but I definitely don't understand the overflow condition. The backpressure between the VIP cores should prevent that (I think). --- Quote End --- Overflow is from the clocked video input. It can't stop accepting data from the pins, and backpressure only controls Scaler reading from the FIFO. In other words: the scaler isn't accepting data fast enough for the CVI, and isn't producing data fast enough for the CVO.
Right, that makes sense. I forgot about the inability to backpressure the input source. In my application I pull images from a frame buffer in SDRAM, which makes overflow impossible. Still sounds like the scaler isn't running fast enough.
I've tried a 150MHz system clock for the scaler instead of 100MHz, but it changed absolutely nothing. The output image shows the same problem, and all the signals in the simulation look identical too, including overflow being set during input line 22. The only exception is the alt_vip_cl_scl_0_dout_ready signal, which now goes low periodically since the data rate is lower than the clock rate.
For testing purposes, you could try inserting a Frame Buffer before and/or after the scaler, or try running the CVI/CVO faster as well, not just the scaler.
Since I do not have any memory connected to the FPGA, I cannot use a framebuffer. Instead, I replaced the CVI with the Test Pattern Generator. This should behave like a framebuffer, since it can deliver data whenever requested. Now the image is displayed correctly and the underflow signal stays inactive, as expected. The next thing I tried was replacing CVI and CVO with CVI II and CVO II. But since those are limited to a 4k FIFO depth, the problems in the output image start around line 25 instead of 1019. So I went back to the CVI but kept using CVO II. Now it looks the same as before, but after a couple of minutes the image changed from false data in the last 5 lines to false data in the last ~1015 lines.
bktemp - Good idea to plug in the test pattern generator and verify that the pipeline functions correctly without the CVI. Based on everything you've described here, I am stumped as to what the problem is. If you figure it out, please post back here and let us know. Good luck!
You mean the main clock for the CVI/CVO, not the pixel clock? All 3 main clocks are connected together and run at 150MHz. The input pixel clock is 60MHz, the output pixel clock is 100MHz.