Hi all,
I'm designing an MP3 player for the DE2 board that uses libmad for MP3 decoding, but decoding is currently too slow for real-time playback. The CPU is the fastest Nios II core (Nios II/f) with an 8 KB instruction cache and a 16 KB data cache, and the CPU clock is set to 100 MHz. It still takes 380 s to decode a 274 s MP3 file. Can anyone tell me whether that is reasonable? I have also tried adding a custom instruction to replace the "mad_f_mul" macro in libmad, but the decoding speed stays the same or even gets worse, and I can't understand why. I'm using Quartus II 9.1. Any guidelines would be welcome, thanks!
Did you tell gcc that new custom instructions (CI) are being used?
I just use the new CI macro from system.h. Doesn't gcc compile the new CI automatically? It still gives the correct results, though.
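For reference, here is a minimal sketch of the kind of multiply that "mad_f_mul" performs and how a custom-instruction macro could be swapped in behind it. It assumes libmad's fixed-point format with 28 fractional bits and the plain 64-bit multiply-and-shift variant of the macro. The names fixed_t, fixed_mul_sw, FIXED_MUL and fixed_from_double are made up for illustration, and ALT_CI_FIXED_MUL is a hypothetical macro name; the real name in system.h depends on what the custom instruction was called in SOPC Builder.

```c
#include <assert.h>
#include <stdint.h>

#define FRACBITS 28              /* assumption: 28 fractional bits, as in libmad */

typedef int32_t fixed_t;         /* illustrative stand-in for mad_fixed_t */

/* Software reference: widen to 64 bits, multiply, drop the extra
   fractional bits. Relies on arithmetic right shift of negative
   values, which gcc provides. */
fixed_t fixed_mul_sw(fixed_t x, fixed_t y)
{
    return (fixed_t)(((int64_t)x * y) >> FRACBITS);
}

/* ALT_CI_FIXED_MUL is a HYPOTHETICAL custom-instruction macro; when it
   is not defined (e.g. on a PC) the software version is used. */
#ifdef ALT_CI_FIXED_MUL
#define FIXED_MUL(x, y) ((fixed_t)ALT_CI_FIXED_MUL((x), (y)))
#else
#define FIXED_MUL(x, y) fixed_mul_sw((x), (y))
#endif

/* Converts a value in [-8.0, 8.0) to the fixed-point format. */
fixed_t fixed_from_double(double v)
{
    assert(v >= -8.0 && v < 8.0);
    return (fixed_t)(v * (double)(1L << FRACBITS));
}
```

Comparing FIXED_MUL against fixed_mul_sw on a handful of inputs is a cheap way to confirm that the custom instruction really returns the same values before measuring its speed.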
Have you enabled -O2 or -O3?
If you rip out all calls into libc, can you fit the whole code into internal memory? (It might be that the resident set fits in the cache, though.)
No, I got that result with optimization off. Does it make that much difference?
I tried enabling -O3 once, but the program didn't work correctly: it seemed to stop at some point and a lot of the code was never executed, so I changed back to no optimization. I don't quite understand what you mean by "If you rip out all calls into libc, can you fit the whole code into internal memory? (It might be that the resident set fits in the cache though.)". I use the 512 KB SRAM as memory, and the program size is 167 KB.
Definitely use -O3; the speed-up is huge. Also, create on-chip RAM, find the most frequently called functions, and move them into this RAM until it's full.
Bill
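One rough way to pick candidates for on-chip RAM is to time the suspect functions and compare them. The sketch below uses the standard C clock() call as a stand-in for whatever timer the board provides; on the real system a profiler or a hardware timestamp timer would probably give better numbers. time_calls and dummy_work are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef void (*work_fn)(void);

/* Keeps the compiler from optimizing the dummy workload away. */
static volatile unsigned long sink;

/* Placeholder workload; on the board this would be a decoder routine. */
void dummy_work(void)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < 100000UL; i++)
        acc += i * i;
    sink = acc;
}

/* Returns the CPU time in seconds spent in `runs` calls of fn,
   or -1.0 if clock() is not available on this system. */
double time_calls(work_fn fn, int runs)
{
    assert(fn != NULL && runs > 0);

    clock_t start = clock();
    if (start == (clock_t)-1)
        return -1.0;

    for (int i = 0; i < runs; i++)
        fn();

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_calls(dummy_work, 100) for each candidate and sorting the results gives a first idea of which functions are worth moving.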
Thank you. I'll try -O3 or -O2 again.
I don't know how to create on-chip RAM and move a function into it. I only know how to add the on-chip RAM component to the system using SOPC Builder. What else should I do?
--- Quote Start ---
Thank you. I shall try -O3 or -O2 again. I don't know how to create onchip RAM and move any function to it. I just know how to add the onchip RAM component to the system using SOPC, what else should I do?
--- Quote End ---
Put this declaration before each function that should go into on-chip RAM:
/* Ask the linker to place function() in the .onchip_mem section. */
void function(void) __attribute__((section(".onchip_mem")));
void function(void)
{
    /* function body */
}
Of course, onchip_mem has to match the name used in SOPC Builder.
Bill
--- Quote Start ---
cpu clk is set to 100Hz.
--- Quote End ---
I imagine the CPU is set to 100 MHz. If you really are using a 100 Hz CPU clock, that might be the problem.
You are right, the CPU is set to 100 MHz.
FYI, here is an MP3 player that handles the audio in real time, which might be useful: http://www.nioswiki.com/index.php?title=nios2embeddedevaluationkit/mp3_player&highlight=mp3
Thanks to you all. I used -O1 and it can run in real time now.
I recommend dialing the system clock back to find out how much slack you have. For example, I wouldn't rely on it being 'realtime' if it fails to keep up when you drop the frequency by 10%. Also, -O2 should give you better performance.
I have tried -O2 and -O3, but the same problem occurs: the program doesn't enter the decoding process. I use an external interrupt to set a "begin decoding" flag and poll this flag in the main function. With -O2 or -O3 the flag is set correctly, but the polling doesn't work. If I set a breakpoint there, the program never reaches that part of the main function. I don't know what optimization really does at the different levels. Can you give me some idea?
It might be how the polling is implemented. Are you familiar with the keyword 'volatile'? http://en.wikipedia.org/wiki/volatile_variable Variables that should be volatile are one of the many things to look out for once you start increasing the optimization level. When you declare a variable volatile, you are basically telling the compiler that its value can change at any time without the code it is compiling being involved (a key characteristic of a register in a slave port, and also of a flag written by an interrupt handler), so it must re-read the variable on every access.
If that doesn't help, maybe you can copy the code you think is the culprit into this thread and one of us can figure out why the optimization level is causing problems. *Usually* these problems are caused by the application code and corner cases that the developer hasn't thought of. I say usually because -O3 can sometimes bring in some surprises :)
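A minimal sketch of the flag-polling pattern described above, written so that it also runs on a PC. A standard C signal handler stands in for the external-interrupt ISR, which is an assumption made only so the example is self-contained; on the board the handler would be registered through the Nios II HAL instead. The names begin_decoding, on_start_irq and poll_until_started are hypothetical.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the polling loop. Without volatile,
   an optimizing compiler may read the flag once and never re-check it,
   which matches the "polling never works at -O2/-O3" symptom. */
static volatile sig_atomic_t begin_decoding = 0;

/* Stand-in for the external-interrupt ISR (hypothetical name). */
static void on_start_irq(int signo)
{
    (void)signo;
    begin_decoding = 1;
}

/* Installs the handler, fires it once, then polls the flag the way a
   main loop would. Returns the flag value seen when the loop exits. */
int poll_until_started(void)
{
    void (*previous)(int) = signal(SIGINT, on_start_irq);
    assert(previous != SIG_ERR);

    raise(SIGINT);                 /* simulates the interrupt arriving */

    while (!begin_decoding) {
        /* busy-wait; every iteration re-reads the flag from memory */
    }

    signal(SIGINT, previous);      /* restore the earlier handler */
    return begin_decoding;
}
```

In this host version the handler runs before the loop starts, so the loop exits at once; on real hardware the interrupt arrives asynchronously and the volatile qualifier is what keeps the loop checking memory.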
You are right. Declaring the variables 'volatile' solved the problem immediately. The decoding time for the 274 s MP3 dropped to 127 s under -O3. Thanks a lot.
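To put the numbers from this thread in perspective, the ratio of decode time to audio length shows how much headroom there is: 380 s / 274 s is about 1.39 (too slow for real time), while 127 s / 274 s is about 0.46. The sketch below does that arithmetic. The clock estimate assumes that decode time scales inversely with CPU frequency, which is only an approximation because memory latency does not scale the same way. The function names are hypothetical.

```c
#include <assert.h>

/* Decode time as a fraction of audio length.
   Below 1.0 the decoder is faster than real time. */
double realtime_load(double decode_seconds, double audio_seconds)
{
    assert(decode_seconds >= 0.0 && audio_seconds > 0.0);
    return decode_seconds / audio_seconds;
}

/* Rough lowest clock (MHz) that would still just keep up, ASSUMING
   decode time is inversely proportional to the CPU clock. */
double min_clock_mhz(double clock_mhz, double load)
{
    assert(clock_mhz > 0.0 && load >= 0.0);
    return clock_mhz * load;
}
```

With the -O3 figures, min_clock_mhz(100.0, realtime_load(127.0, 274.0)) comes out at roughly 46 MHz, so a 10% drop in clock frequency should leave plenty of margin under that assumption.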
@BadOmen
Correct, changing from -O2 to -O3 might introduce hardcore bugs. With -O3 we had to add -fno-rename-registers to get rid of some functional differences between the DEBUG and RELEASE versions.
I am also working on this.
Could you share the code?
Yes, but the final program includes more than just the MP3 decoding, so you may need to pick out the part you need. The program is not with me right now, and I'm not sure I still have it. How can I share the code with you?
I am glad to hear from you.
bravefjz (http://www.alteraforum.com/forum/member.php?u=34987),
Please advise: were you able to find it? If not, could you explain how to do it? Thank you.