Nios® II Embedded Design Suite (EDS)

Increasing the speed of consecutive IOWR functions

Altera_Forum
Honored Contributor II

Hi everyone, 

 

I have a DSP Builder system that contains some registers. 

When I write to these registers from the Nios II using the IOWR macro, there are about 30 clock cycles between two consecutive writes: 

IOWR(base,0,1); 

IOWR(base,1,2); 

 

Is there any way to speed up the Nios II's accesses to these registers? 

 

Thanks, 

 

Omer
2 Replies
Altera_Forum
Honored Contributor II

There could be several reasons. First, the access from the Nios II CPU to the registers you write with IOWR could itself be slow, for example if the CPU and the registers are in different clock domains and there is no clock crossing pipeline bridge between them. In that case, the synchronization logic that is automatically added between the master and the slave is rather slow. 

Access from the Nios II CPU to the program memory could also be slow. In that case, it can take several clock cycles for the CPU to fetch the second IOWR instruction. 

You could make it quicker by using memory caches, if they aren't already enabled. An instruction cache will make fetching the second IOWR instruction faster. If you have a data cache, you can write your registers using regular pointers instead of IOWR. The values will then be written to the data cache, and once you have written everything, you can trigger a cache flush and all the registers will be written in a burst.
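
To illustrate the data cache suggestion, here is a minimal sketch rather than a tested implementation. It assumes a Nios II core with a data cache, the HAL function alt_dcache_flush() from sys/alt_cache.h, that the compiler predefines __NIOS2__ on target, and that the registers are reached through a cacheable address. REG_COUNT and write_regs_cached are hypothetical names, and the host stub exists only so the code compiles off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __NIOS2__                /* assumption: predefined by the Nios II toolchain */
#include <sys/alt_cache.h>      /* Nios II HAL: alt_dcache_flush() */
#else
/* Host stand-in so the sketch compiles off-target; it does nothing. */
static void alt_dcache_flush(void *start, uint32_t len)
{
    (void)start;
    (void)len;
}
#endif

#define REG_COUNT 8u            /* hypothetical number of 32-bit registers */

/* Store `count` values with ordinary pointer writes, then flush once. */
static void write_regs_cached(volatile uint32_t *regs,
                              const uint32_t *values, size_t count)
{
    assert(count <= REG_COUNT);

    for (size_t i = 0; i < count; ++i) {
        /* Ordinary store: on target with a data cache this can stay
           in the cache instead of going straight to the peripheral. */
        regs[i] = values[i];
    }

    /* Write the cached lines covering the register range back out. */
    alt_dcache_flush((void *)regs, (uint32_t)(count * sizeof regs[0]));
}
```

On target this would be called as write_regs_cached((volatile uint32_t *)base, values, 2), with base being the register block's cacheable address. The cache works on whole lines, so neighbouring registers in the same line may be read or rewritten too, and a line can be evicted before the flush. This approach therefore only suits registers where the write order and exact timing don't matter.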
Altera_Forum
Honored Contributor II

If the Avalon slave isn't a 32-bit slave, you'll have a bus width adapter that converts the 32-bit cycle into (say) four 8-bit ones (with the byte enables set appropriately). If you have a clock crossing bridge as well, it can get very slow. 
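
As a rough illustration of why a narrow slave costs extra cycles, the sketch below models a 32-bit write being split into 8-bit transfers, one per enabled byte lane. It is a simplified host-side model, not the actual adapter logic: narrow_beat and split_write32 are hypothetical names, and little-endian byte lanes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One narrow transfer as seen by a hypothetical 8-bit slave. */
typedef struct {
    uint32_t byte_addr;   /* byte address of this transfer */
    uint8_t  data;        /* the byte carried by this transfer */
} narrow_beat;

/*
 * Split one 32-bit write into 8-bit transfers.
 * Bit i of byteenable set means byte lane i (data bits 8*i+7..8*i)
 * is written. Returns the number of transfers stored in out[]
 * (at most 4).
 */
static size_t split_write32(uint32_t addr, uint32_t data,
                            uint8_t byteenable, narrow_beat out[4])
{
    size_t n = 0;

    assert((addr & 3u) == 0);   /* this model expects a word-aligned address */

    for (unsigned lane = 0; lane < 4; ++lane) {
        if (byteenable & (1u << lane)) {
            out[n].byte_addr = addr + lane;
            out[n].data = (uint8_t)(data >> (8u * lane));
            ++n;
        }
    }
    return n;
}
```

With all four byte enables set, a single IOWR becomes four slave transfers in this model, so the per-write cost is roughly multiplied before any clock crossing delay is added on top.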

 

Also make sure you've compiled everything with optimisation enabled (-O2 or -O3); otherwise the code itself will get large.