Altera_Forum
Honored Contributor I
2,925 Views

Mp3 decoder too slow on NiosII

Hi all,

 

I'm designing an MP3 player for the DE2 board that uses libmad for MP3 decoding, but decoding is currently too slow for real-time playback. The core is the fastest Nios II core (Nios II/f) with an 8 KB instruction cache and a 16 KB data cache, and the CPU clock is set to 100 MHz. It still takes 380 s to decode a 274 s MP3 file. Can anyone tell me whether that is reasonable?

 

I have also tried adding a custom instruction to replace the "mad_f_mul" macro in libmad, but the decoding speed stays the same or even gets worse. I can't understand why.

 

The toolchain I'm using is Quartus II 9.1.

 

Any guidelines would be appreciated, thanks!
20 Replies
Altera_Forum
Honored Contributor I
76 Views

Did you tell gcc that a new custom instruction (CI) is being used?

Altera_Forum
Honored Contributor I
76 Views

I just use the new CI macro from system.h. Won't gcc compile the new CI automatically? It still gives the correct results.

Altera_Forum
Honored Contributor I
76 Views

Have you enabled -O2 or -O3?

 

If you rip out all calls into libc, can you fit the whole code into internal memory? 

(It might be that the resident set fits in the cache though.)
Altera_Forum
Honored Contributor I
76 Views

No, I got that result with optimization off. Does it make that much difference?

I tried enabling -O3 once, but the program didn't work correctly. It seemed to stop at some point and a lot of the code was never executed, so I changed it back to no optimization.

 

I don't quite understand what you mean by "If you rip out all calls into libc, can you fit the whole code into internal memory? (It might be that the resident set fits in the cache though.)". I use the 512 KB SRAM as memory and the program size is 167 KB.
Altera_Forum
Honored Contributor I
76 Views

Definitely use -O3 - the speed-up is huge. Also, create an on-chip RAM, find the most frequently called functions, and move them into this RAM until it's full.

 

Bill
Altera_Forum
Honored Contributor I
76 Views

Thank you. I shall try -O3 or -O2 again.

I don't know how to create on-chip RAM and move a function into it. I only know how to add the on-chip RAM component to the system using SOPC Builder. What else should I do?
Altera_Forum
Honored Contributor I
76 Views

 

--- Quote Start ---  

Thank you. I shall try -O3 or -O2 again.

I don't know how to create on-chip RAM and move a function into it. I only know how to add the on-chip RAM component to the system using SOPC Builder. What else should I do?

--- Quote End ---  

 

 

Use this before functions going into on-chip RAM:

    void function(void) __attribute__((section(".onchip_mem")));

    void function(void)
    {
    }

Of course, onchip_mem has to match the name used in SOPC.

 

Bill
Altera_Forum
Honored Contributor I
76 Views

 

--- Quote Start ---  

cpu clk is set to 100Hz.  

--- Quote End ---  

 

I imagine the CPU is set to 100 MHz. If you are really using a 100 Hz CPU clock, that might be the problem.
Altera_Forum
Honored Contributor I
76 Views

You are right, the CPU is set to 100 MHz.

Altera_Forum
Honored Contributor I
76 Views

FYI, here is an MP3 player that handles the audio in real time and might be useful: http://www.nioswiki.com/index.php?title=nios2embeddedevaluationkit/mp3_player&highlight=mp3

Altera_Forum
Honored Contributor I
76 Views

Thanks to you all. I used -O1 and it can run in real time now.

Altera_Forum
Honored Contributor I
76 Views

I recommend turning the system clock down to find out how much slack you have. For example, I wouldn't rely on it being 'realtime' if it fails to keep up when you drop the frequency by 10%. Also, -O2 should give you better performance.

Altera_Forum
Honored Contributor I
76 Views

I have tried -O2 and -O3, but the same problem occurs: the program never enters the decoding process. I use an external interrupt to set a "begin decoding" flag and poll this flag in the main function. With -O2 or -O3 the flag is set correctly, but the polling doesn't work. If I set a breakpoint there, the program never reaches that part of the main function. I don't know what the optimizer really does at the different levels. Can you give me some idea about it?

Altera_Forum
Honored Contributor I
76 Views

It might be how the polling is implemented. Are you familiar with the keyword 'volatile'? http://en.wikipedia.org/wiki/volatile_variable

Variables that should be volatile are one of the many things to look out for once you start increasing the optimization level. When you declare a variable volatile you are basically telling the compiler that the value can change at any time without the CPU being involved (a key characteristic of a register in a slave port).

If that doesn't help, maybe you can copy the code you think is the culprit into this post and one of us can figure out why the optimization level is causing problems. *Usually* these problems are caused by the application code and corner cases that the developer hasn't thought of. I say usually since -O3 can sometimes bring in some surprises :)

Altera_Forum
Honored Contributor I
76 Views

You are right. Declaring the variables 'volatile' solved the problem immediately. The decoding time for the 274 s MP3 file dropped to 127 s under -O3. Thanks a lot.

Altera_Forum
Honored Contributor I
76 Views

@BadOmen

Correct, changing from -O2 to -O3 might introduce hardcore bugs. With -O3 we had to add -fno-rename-registers to get rid of some functional differences between the DEBUG and RELEASE versions.
Altera_Forum
Honored Contributor I
76 Views

I am also working on this.

Can you share the code?
Altera_Forum
Honored Contributor I
76 Views

Yes, but the final program includes more than just the MP3 decoding, so you may need to pick out the part you need. The program is not with me right now, and I'm not sure I still have it. How can I share the code with you?

Altera_Forum
Honored Contributor I
76 Views

I am glad to hear from you.

 

Altera_Forum
Honored Contributor I
14 Views

bravefjz (http://www.alteraforum.com/forum/member.php?u=34987), 

Please advise: were you able to find it?

If not, can you explain how to do it?

Thank you