Re: msgdma stuck in busy state

Altera_Forum · 07-04-2016

Hello :)

I have a problem with the msgdma. When I submit a descriptor to it and set the go bit, it stays in the busy state.

I'm using the SoCKit with a Cyclone V and Quartus 14.1.

The msgdma is connected to an f2h_sdram port as an Avalon-MM master for read and write.

I used the following setup for the msgdma:

MM to MM

Data Width: 32

Data Path FIFO Depth: 32

Descriptor FIFO Depth: 128

Response Port: disabled

Max transfer length: 1 KB

Aligned access

Everything else disabled.

To initialize the msgdma I reset it by setting the "stop dispatcher" bit and then the "dispatcher reset" bit. After this the status is "Response buffer empty" and "descriptor buffer empty".

After the initialization I set "stop descriptors", create the descriptors and write them to the descriptor register. When I write the control part of the descriptor I have the "go" bit set. After this I clear "stop descriptors".

Immediately after this the status changes to "Response buffer empty" and "busy" and stays stuck there...
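
For anyone comparing raw register values, a small helper like the one below can turn a status word into the flag names used in this post. It is only a sketch: the bit positions are assumed from the mSGDMA dispatcher CSR map in the Embedded Peripherals IP User Guide and should be checked against the version of the core in use, and `decode_status` is a made-up name.

```python
# Assumed mSGDMA dispatcher status bits (verify against your IP version's CSR map).
STATUS_BITS = {
    0: "busy",
    1: "descriptor buffer empty",
    2: "descriptor buffer full",
    3: "response buffer empty",
    4: "response buffer full",
    5: "stopped",
    6: "resetting",
    7: "stopped on error",
    8: "stopped on early termination",
    9: "irq",
}


def decode_status(value):
    """Return the names of all flags set in a raw status register value."""
    return [name for bit, name in sorted(STATUS_BITS.items()) if value & (1 << bit)]


# State reported right after the reset (bits 1 and 3 set under the assumed map).
print(decode_status(0x0A))
# Stuck state: busy plus response buffer empty (bits 0 and 3 set).
print(decode_status(0x09))
```

With these assumed bit positions, the stuck state described above would read back as 0x09.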

Does anybody have an idea why the msgdma could stay busy?

EDIT: If you need any additional information just tell me :)

Thx Tom

Altera_Forum · 07-05-2016

It looks like there is a problem with the f2h_sdram interface.

I added two on-chip memories to my design, configured one as ROM and one as RAM, and transferred data between them without any problem.

I have also seen that after I send a descriptor to the msgdma it stays busy. I can clear this by setting the reset dispatcher bit of the control register. But I can't get the msgdma into the stopped status, even if I have done a reset before. It seems like the msgdma module stays stuck in a transfer even after I have reset it. When I submit another descriptor after the reset, it stays in the descriptor buffer, another sign that the old descriptor is still in progress after the reset.

Is there a known issue with the connection between the msgdma and the HPS f2h_sdram port? Or could anybody provide me with a Qsys configuration example?

Altera_Forum · 07-05-2016

Tom -

What do you have the burst size configured to in the mSGDMA for the f2h_sdram interface? I had a problem with the mSGDMA that I've been meaning to open an SR about. In my case it was an Arria 10 SoC design and the mSGDMA was MM to streaming vs. your MM to MM. But my problem was also on the f2h_sdram MM interface. What I saw was that if the mSGDMA MM burst size (f2h_sdram interface) was smaller than the transfer size then the f2h_sdram interface would hang. If the MM burst size is smaller than the transfer size then the mSGDMA must submit more than one read request to the MM interface to complete the transfer. Let's say the burst size is 256 bytes and the transfer size is 1024 bytes. The mSGDMA will submit four 256-byte read requests to the MM interface in rapid succession (the MM interface buffers the requests). The MM interface then completes the first 256-byte burst transfer and stops. The mSGDMA is stuck waiting for the remaining three bursts that never come.

I saw this in simulation using an Avalon BFM to model the f2h_sdram interface. What I described happened 100% of the time whenever the transfer size was larger than the MM burst size. I was never able to clearly capture this happening in hardware with SignalTap but other than that the hardware was behaving the same as simulation: Whenever the transfer size exceeded the burst size the mSGDMA was stuck busy waiting for data from the HPS.

So my advice is to configure your mSGDMA burst size to be the same size as your configured maximum transfer size and see what happens. Or at least test with a transfer size that is <= your configured burst size.
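
To make the arithmetic concrete, here is a minimal sketch of how many read requests a transfer needs for a given burst size. `bursts_needed` is just an illustrative name and the sizes are the ones from the example above; it does not model the mSGDMA itself.

```python
def bursts_needed(transfer_bytes, burst_bytes):
    """Number of bursts a master must issue to cover one transfer (ceiling division)."""
    if transfer_bytes <= 0 or burst_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-transfer_bytes // burst_bytes)


# 1024-byte transfer with 256-byte bursts: the multi-request case that hung.
print(bursts_needed(1024, 256))
# Burst size equal to the transfer size: a single request, the suggested test.
print(bursts_needed(1024, 1024))
```

Any configuration where this returns 1 avoids the back-to-back requests, which is why it makes a useful first test.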

I always hesitate to submit SRs for complicated problems like this because the SR process gets bogged down in an endless back and forth between time zones that are 10-12 hours apart. But I'll get around to it eventually. I still don't know 100% that the problem in hardware was exactly what I saw in simulation, but from what I could see it looked like it was.

Good luck. Please post back what you see.

Bob

Altera_Forum · 07-05-2016

--- Quote Start ---

Tom -

What do you have the burst size configured to in the mSGDMA for the f2h_sdram interface? I had a problem with the mSGDMA that I've been meaning to open an SR about. In my case it was an Arria 10 SoC design and the mSGDMA was MM to streaming vs. your MM to MM. But my problem was also on the f2h_sdram MM interface. What I saw was that if the mSGDMA MM burst size (f2h_sdram interface) was smaller than the transfer size then the f2h_sdram interface would hang. If the MM burst size is smaller than the transfer size then the mSGDMA must submit more than one read request to the MM interface to complete the transfer. Let's say the burst size is 256 bytes and the transfer size is 1024 bytes. The mSGDMA will submit four 256-byte read requests to the MM interface in rapid succession (the MM interface buffers the requests). The MM interface then completes the first 256-byte burst transfer and stops. The mSGDMA is stuck waiting for the remaining three bursts that never come.

--- Quote End ---

Bob, I've been having my own issues with the mSGDMA and park mode, and am already 2+ months into an SR about it. I'm a fan of getting SRs in, at least to get it recorded with Altera.

I haven't seen this issue on a streaming-to-MM in 15.1.2 on a Cyclone V with Max burst=32 and max transfer much larger, with a 128-bit-wide interface.

Altera_Forum · 07-05-2016

Thanks, derim. As I said, what I saw in simulation could be a BFM problem. But the hardware never worked with transfer size > burst size. I would like to go back and do more SignalTapping to make sure the behavior in sim and hardware was the same, but it's hard to justify the time now that we have it working.

Are you using the prefetcher or just feeding descriptors into the descriptor slave interface? Our data interface is also 128 bits.

Bob

Altera_Forum · 07-05-2016

We're actually using the prefetcher, because when I was first setting the mSGDMA up I ran into a Quartus bug with bad values being written to the descriptor slave interface -- this was apparently fixed in 15.1, I believe. We're running in park mode, so the prefetcher actually provides us with better control, especially as we're trying to avoid using any additional kernel modules.

Altera_Forum · 07-05-2016

Maybe the prefetcher is the difference (we're not using it yet, nor is the OP), but the MM transactions should be the same either way. Time will tell. Curious to see what the OP (Tom) comes back with.

Bob

Altera_Forum · 07-06-2016

I don't think that the f2h_sdram interface supports queuing up multiple burst transfers.

Altera_Forum · 07-06-2016

--- Quote Start ---

So my advice is to configure your mSGDMA burst size to be the same size as your configured maximum transfer size and see what happens. Or at least test with a transfer size that is <= your configured burst size.

--- Quote End ---

Hello Bob,

Thanks for your post.

I followed your advice but this didn't solve the problem. You can see my configuration in the picture (https://www.dropbox.com/s/d7174gyidtpgzqf/qsys_msgdma.png?dl=0; I need to link it that way because the forum resizes the image on upload and it becomes unreadable).

I will keep trying to get the msgdma to work... I'm open to ideas and hints :confused:

Tom

Altera_Forum · 07-06-2016

--- Quote Start ---

I don't think that the f2h_sdram interface supports queuing up multiple burst transfers.

--- Quote End ---

In simulation the Avalon-MM BFM did swallow four read requests before the first burst of read data was returned. In the real system it's sometimes hard to tell (for me anyway) what Qsys injects between entities on the Avalon interconnect. I actually have two mSGDMAs, one for each direction, and they share the f2h_sdram port.

I'm going to have to invest some time into fully understanding what was happening in the hardware before I open an SR on this.

Altera_Forum · 07-06-2016

--- Quote Start ---

Hello Bob,

Thanks for your post.

I followed your advice but this didn't solve the problem. You can see my configuration in the picture (https://www.dropbox.com/s/d7174gyidtpgzqf/qsys_msgdma.png?dl=0; I need to link it that way because the forum resizes the image on upload and it becomes unreadable).

I will keep trying to get the msgdma to work... I'm open to ideas and hints :confused:

Tom

--- Quote End ---

Interesting, Tom. Your mSGDMA settings are almost identical to mine except that my max transfer size is 4096 bytes, so max burst count is 256. Sorry this didn't work for you.
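
As a quick sanity check of those numbers, the sketch below converts a maximum transfer size into a burst count in bus words. It assumes the burst count is expressed in data-width words (beats), and `max_burst_count` is only an illustrative helper; the 128-bit and 32-bit widths and the 4096-byte and 1 KB sizes are the ones mentioned earlier in the thread.

```python
def max_burst_count(max_transfer_bytes, data_width_bits):
    """Beats needed to move max_transfer_bytes in a single burst on a bus of the given width."""
    bytes_per_beat = data_width_bits // 8
    if bytes_per_beat <= 0 or max_transfer_bytes % bytes_per_beat:
        raise ValueError("transfer size must be a positive multiple of the bus width")
    return max_transfer_bytes // bytes_per_beat


# 4096-byte maximum transfer on a 128-bit interface.
print(max_burst_count(4096, 128))
# 1 KB maximum transfer on a 32-bit interface (the settings from the first post).
print(max_burst_count(1024, 32))
```

Both cases work out to 256 beats, so under this assumption the two setups would need the same maximum burst count to cover a full transfer in one burst.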

I'll post back with whatever I learn about the problem I was having. More testing required.

Bob

Altera_Forum · 07-06-2016

Hey Bob

I found the problem...

It wasn't my Qsys configuration, it was the environment variables in u-boot. The variable fpga2sdram_handoff was set to 0x0; I changed it to 0x00003fff to activate the f2h_sdram bridge. Now it works, or let's say I have other problems now :rolleyes: :D
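
For reference, here is a rough sketch of what the two values mean if the variable is treated as a 14-bit port-enable mask. The layout used here (read data ports in bits 0-3, write data ports in bits 4-7, command ports in bits 8-13) is an assumption based on the Cyclone V HPS SDRAM controller register description and should be verified there; `enabled_ports` is a made-up helper.

```python
# Assumed layout of the 14-bit FPGA-to-SDRAM port enable mask (check the HPS register map).
PORT_GROUPS = {
    "read data ports": range(0, 4),
    "write data ports": range(4, 8),
    "command ports": range(8, 14),
}


def enabled_ports(mask):
    """Map each port group to the list of port indices whose enable bit is set."""
    return {
        group: [index for index, bit in enumerate(bits) if mask & (1 << bit)]
        for group, bits in PORT_GROUPS.items()
    }


# Old value: every port is still held in reset, so the DMA master never gets a response.
print(enabled_ports(0x0))
# New value: all 14 bits set, every port released.
print(enabled_ports(0x3FFF))
```

A design that only uses some of the ports could presumably get by with a narrower mask, but 0x3fff is the simplest value to test with.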

I will check whether I have the same problem with transfer size and burst size as you.

Altera_Forum · 07-06-2016

--- Quote Start ---

Hey Bob

I found the problem...

It wasn't my Qsys configuration, it was the environment variables in u-boot. The variable fpga2sdram_handoff was set to 0x0; I changed it to 0x00003fff to activate the f2h_sdram bridge. Now it works, or let's say I have other problems now :rolleyes: :D

I will check whether I have the same problem with transfer size and burst size as you.

--- Quote End ---

That's great to hear. Hopefully you don't have any other issues!

Altera_Forum · 07-07-2016

I think that the Avalon-MM BFM simulates the IP generated by Qsys, not the hard IP used by the ARM core. It may not be a valid simulation.

Altera_Forum · 07-07-2016

--- Quote Start ---

I think that the Avalon-MM BFM simulates the IP generated by Qsys, not the hard IP used by the ARM core. It may not be a valid simulation.

--- Quote End ---

I've been using the mSGDMA Qsys IP-- not the Arm DMA IP. I haven't run a simulation, but the Arm DMA IP is the Corelink DMA-330-- this is a very different beast.

Altera_Forum · 08-15-2016

After consulting with Altera support: the mSGDMA core does have a bug and behaves slightly differently from the manual. According to them, the SW_RESET bit needs to be set and then cleared BY SOFTWARE to reset correctly, due to some bugs in the FIFO resets. This means that in certain states the reset bit doesn't automatically return to zero to indicate that the reset is complete. I can post the Altera support reply if anyone is interested. Hopefully the IP core bug and documentation can be synchronized.
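
A minimal sketch of that set-then-clear sequence is below. It assumes SW_RESET corresponds to the "reset dispatcher" bit (bit 1) of the control register at CSR offset 0x04 and that "resetting" is status bit 6 at offset 0x00; these positions, the `reset_dispatcher` name, and the `read32`/`write32` accessors are all illustrative and need to be checked against the core's documentation. The dictionary at the end only stands in for real register access.

```python
import time

CSR_STATUS = 0x00          # assumed CSR offsets
CSR_CONTROL = 0x04
CONTROL_RESET = 1 << 1     # assumed "reset dispatcher" (SW_RESET) bit
STATUS_RESETTING = 1 << 6  # assumed "resetting" status bit


def reset_dispatcher(read32, write32, timeout_s=0.1):
    """Set the reset bit, clear it again from software, then wait for 'resetting' to drop.

    read32(offset) and write32(offset, value) are caller-supplied register accessors.
    Returns True if the core left the resetting state before the timeout expired.
    """
    control = read32(CSR_CONTROL) & 0xFFFFFFFF
    write32(CSR_CONTROL, control | CONTROL_RESET)
    # Workaround: clear the bit explicitly instead of relying on it clearing itself.
    write32(CSR_CONTROL, control & ~CONTROL_RESET & 0xFFFFFFFF)
    deadline = time.monotonic() + timeout_s
    while True:
        if not (read32(CSR_STATUS) & STATUS_RESETTING):
            return True
        if time.monotonic() >= deadline:
            return False


# Stand-in register file for trying the function without hardware.
fake_regs = {CSR_STATUS: 0x0A, CSR_CONTROL: 0x00}
print(reset_dispatcher(fake_regs.__getitem__, fake_regs.__setitem__))
```

The timeout is there so that a core that never leaves the resetting state shows up as a `False` return instead of a hung loop.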

Altera_Forum · 05-18-2017

--- Quote Start ---

After consulting with Altera support: the mSGDMA core does have a bug and behaves slightly differently from the manual. According to them, the SW_RESET bit needs to be set and then cleared BY SOFTWARE to reset correctly, due to some bugs in the FIFO resets. This means that in certain states the reset bit doesn't automatically return to zero to indicate that the reset is complete. I can post the Altera support reply if anyone is interested. Hopefully the IP core bug and documentation can be synchronized.

--- Quote End ---

I know it's a year later, but I'd appreciate the Altera support reply.

Altera_Forum · 05-18-2017

Me too, actually. The reply was basically that there was a bug. If you read the release notes for Quartus, it sounds like the FIFO reset bug is fixed. I haven't tested it though as I built a workaround in RTL to manage the FIFO bug. The mSGDMA has been working pretty well.

Altera_Forum · 05-18-2017

--- Quote Start ---

Me too, actually. The reply was basically that there was a bug. If you read the release notes for Quartus, it sounds like the FIFO reset bug is fixed. I haven't tested it though as I built a workaround in RTL to manage the FIFO bug. The mSGDMA has been working pretty well.

--- Quote End ---

Thanks. Not sure what issue I'm running into, then. When I set the reset bit in the control register, it clears itself automatically, but the resetting bit in the status register never gets deasserted. The whole thing hangs, stuck in a resetting state.

Altera_Forum · 05-18-2017

--- Quote Start ---

Thanks. Not sure what issue I'm running into, then. When I set the reset bit in the control register, it clears itself automatically, but the resetting bit in the status register never gets deasserted. The whole thing hangs, stuck in a resetting state.

--- Quote End ---

I remember having this issue as well, at one point. It depended on what mode the mSGDMA was already running in. If it was in a state that it couldn't reset from (i.e. certain "park" modes), then resetting was troublesome.