Nios® II Embedded Design Suite (EDS)
Support for Embedded Development Tools, Processors (SoCs and Nios® II processor), Embedded Development Suites (EDSs), Boot and Configuration, Operating Systems, C and C++

IOWR/IORD latency

Altera_Forum
Honored Contributor II

Hi all! 

 

Sorry for the previous empty thread; I have no idea what happened.

I'm seeing some strange latency when using the IOWR/IORD macros. I added a custom 8-bit slave in Qsys, and it takes a huge number of cycles to read or write its registers. I thought the problem was a mistake in my peripheral, but then I tried reading the on-chip memory and got the same result!

 

 

Here is the code; I'm using the performance counter:

 

 

int main()
{
    PERF_RESET(PERFORMANCE_COUNTER_0_BASE);
    PERF_START_MEASURING(PERFORMANCE_COUNTER_0_BASE);

    PERF_BEGIN(PERFORMANCE_COUNTER_0_BASE, 1);
    IORD_8DIRECT(ONCHIP_MEMORY2_0_BASE, PRER_LO);
    PERF_END(PERFORMANCE_COUNTER_0_BASE, 1);

    PERF_BEGIN(PERFORMANCE_COUNTER_0_BASE, 2);
    IORD_16DIRECT(ONCHIP_MEMORY2_0_BASE, PRER_LO);
    PERF_END(PERFORMANCE_COUNTER_0_BASE, 2);

    PERF_BEGIN(PERFORMANCE_COUNTER_0_BASE, 3);
    IORD_32DIRECT(ONCHIP_MEMORY2_0_BASE, PRER_LO);
    PERF_END(PERFORMANCE_COUNTER_0_BASE, 3);

    perf_print_formatted_report(PERFORMANCE_COUNTER_0_BASE, 50000000, 3,
                                "IORD_8", "IORD_16", "IORD32");
    return 0;
}

 

And what I get: 

 

--Performance Counter Report--
Total Time : 10 usec (532 clock-cycles)
+---------------+-----+------------+---------------+------------+
| Section       |  %  | Time (usec)| Time (clocks) |Occurrences |
+---------------+-----+------------+---------------+------------+
| IORD_8        |   9 |          1 |            51 |          1 |
+---------------+-----+------------+---------------+------------+
| IORD_16       |   9 |          1 |            50 |          1 |
+---------------+-----+------------+---------------+------------+
| IORD32        |   8 |          0 |            47 |          1 |
+---------------+-----+------------+---------------+------------+

OK, the timer itself adds some time to this: I measured about 30 clock cycles of overhead. So that leaves about 20 clock cycles per read, which still seems bad. What could I be doing wrong?

I'm using Quartus 15.0 Web Edition and a Nios II Gen2 /e core.
Altera_Forum
Honored Contributor II

You will probably find that the performance counters represent considerable overhead. Try reading the registers 10,000 times in one performance counter. Then divide the time by 10,000 to get a result that isn't lost in noise.
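A minimal sketch of that suggestion, not a tested implementation. The base-address names are taken from the original post; `NIOS2_TARGET_BUILD`, `N_READS`, `measure_iord32_batch` and `cycles_per_access` are hypothetical names made up for illustration, and offset 0 stands in for `PRER_LO`, which the post does not define. The target-only part is guarded so the arithmetic helper can also be compiled on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average cycles per access for a batch of reads.
 * overhead_cycles is the one-off PERF_BEGIN/PERF_END cost measured separately. */
uint64_t cycles_per_access(uint64_t section_cycles,
                           uint64_t overhead_cycles,
                           uint32_t iterations)
{
    assert(iterations > 0);
    if (section_cycles <= overhead_cycles) {
        return 0; /* measurement is entirely overhead */
    }
    return (section_cycles - overhead_cycles) / iterations;
}

#ifdef NIOS2_TARGET_BUILD /* hypothetical macro: define it only in the Nios II BSP build */
#include "system.h"
#include "io.h"
#include "altera_avalon_performance_counter.h"

#define N_READS 10000u

/* Time N_READS reads inside ONE section, so the counter overhead is paid once. */
uint64_t measure_iord32_batch(void)
{
    uint32_t i;

    PERF_RESET(PERFORMANCE_COUNTER_0_BASE);
    PERF_START_MEASURING(PERFORMANCE_COUNTER_0_BASE);

    PERF_BEGIN(PERFORMANCE_COUNTER_0_BASE, 1);
    for (i = 0; i < N_READS; i++) {
        (void)IORD_32DIRECT(ONCHIP_MEMORY2_0_BASE, 0); /* offset 0 is an assumption */
    }
    PERF_END(PERFORMANCE_COUNTER_0_BASE, 1);

    PERF_STOP_MEASURING(PERFORMANCE_COUNTER_0_BASE);
    return (uint64_t)perf_get_section_time((void *)PERFORMANCE_COUNTER_0_BASE, 1);
}
#endif
```

On the target you would call `cycles_per_access(measure_iord32_batch(), 30, N_READS)`, where 30 is the overhead figure from the original post. Note that the result still includes the loop's own increment, compare and branch instructions, so it is an upper bound on the cost of the read itself; unrolling several reads per loop pass would reduce that share.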

Altera_Forum
Honored Contributor II

Nios II /e is a slow processor that requires a minimum of 5 clock cycles to complete one instruction.

 

The overhead from the performance counter is large compared with the total measured time per IORD (about 30 of the 50 cycles). I agree with Galfonz that you should run many iterations of IORD to get a more accurate figure.
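To put numbers on this, here is the arithmetic using only the figures reported in the original post (51, 50 and 47 measured cycles, and a self-measured overhead of about 30 cycles). The last line assumes, purely for illustration, 10,000 reads at roughly 20 cycles each.

```latex
\begin{align*}
\text{IORD\_8:}  \quad & 51 - 30 = 21 \text{ cycles} \\
\text{IORD\_16:} \quad & 50 - 30 = 20 \text{ cycles} \\
\text{IORD32:}   \quad & 47 - 30 = 17 \text{ cycles} \\[4pt]
\text{overhead share, single read:} \quad & \frac{30}{50} = 60\% \\
\text{overhead share, 10{,}000 reads:} \quad & \frac{30}{30 + 10000 \times 20} = \frac{30}{200030} \approx 0.015\%
\end{align*}
```

With a single read per section, more than half of each measurement is counter overhead, so differences of a few cycles between the 8-, 16- and 32-bit reads are not meaningful. With a large batch the overhead becomes negligible.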

 

Also try looking at a simulation. That would definitely be a better way to understand the behavior of the RTL.