In my project, I wrote a loop to blink an LED:
void delay(void);
int main(void)
{
    int i=0;
    for(i=0;i<100;i++)
    {
        IOWR_ALTERA_AVALON_PIO_DATA(QD_PIO_0_BASE,0xff);
        delay();
        IOWR_ALTERA_AVALON_PIO_DATA(QD_PIO_0_BASE,0x0);
        delay();
        // printf("Hello NIOS II! %d\n",i);
    }
    return 0;
}
void delay(void)
{
    alt_u32 i =0;
    while(i < 100000)
    {
        i++;
    }
}
The output waveform has a 30 ms period and my clock frequency is 100 MHz, so it looks like each "i++" loop iteration takes 15 clock cycles. Is that right? How can I find out how many cycles each instruction takes in my project?
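The arithmetic behind the 15-cycle estimate can be sketched as follows. This is only an illustration under the assumptions stated in the post (30 ms period, 100 MHz clock, two delay() calls per period, 100000 iterations per call); the function and parameter names are hypothetical and are not part of the Nios II HAL. The result is an average over the whole loop iteration, not the cost of the increment alone.

```c
#include <stdint.h>

/* Average clock cycles per delay-loop iteration, derived from the measured
 * output period. Hypothetical helper: names are illustrative only.
 * Call and loop-setup overhead is ignored, since it is tiny compared with
 * 100000 iterations. */
static uint64_t cycles_per_iteration(uint32_t period_us,
                                     uint32_t clk_hz,
                                     uint32_t delays_per_period,
                                     uint32_t iterations_per_delay)
{
    /* Clock cycles elapsed during one full on/off period. */
    uint64_t total_cycles = (uint64_t)period_us * clk_hz / 1000000u;

    /* Loop iterations executed during that same period. */
    uint64_t total_iterations =
        (uint64_t)delays_per_period * iterations_per_delay;

    return total_cycles / total_iterations;
}
```

With the numbers above, cycles_per_iteration(30000, 100000000, 2, 100000) evaluates to 15: 3,000,000 cycles per period spread over 200,000 iterations.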
3 Replies
If you are worried about how long NIOS instructions take, be aware that NIOS is a very slow processor. The free one is staggeringly slow and inefficient. If this is a concern, use one of the SoC chips with a built-in ARM processor, or write your algorithm in Verilog or VHDL. Almost any external micro will be faster than NIOS as well.
Thanks for your reply.
I am not worried about it; I just want to know how long NIOS instructions take, as this may be helpful. I read the objdump file, which looks like assembly language. For the i++ loop, it shows:
void delay(void)
{
    alt_u32 i =0;
    while(i < 100000)
  80031c: e0ffff17 ldw r3,-4(fp)
  800320: 008000b4 movhi r2,2
  800324: 10a1a7c4 addi r2,r2,-31073
  800328: 10fff92e bgeu r2,r3,800310 <__reset+0xff7f8310>
    {
        i++;
    }
}
  80032c: e037883a mov sp,fp
  800330: df000017 ldw fp,0(sp)
  800334: dec00104 addi sp,sp,4
  800338: f800283a ret
Does each line cost one clock cycle?
https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/hb/nios2/n2cpu_nii5v1.pdf
See "Instruction Performance" on page 5-11, 5-19, or 5-21, depending on which core you're using.
Your question was about instruction performance, but if you really just care about higher-level C function or loop execution times, AN391 is a good read: https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/an/an391.pdf
The Performance Counter IP block in particular is very useful.
Many things can be done in a single cycle. But getting the compiler to emit the best code and constructing optimized hardware can each become a small research project. For example, if you rewrote your delay() in a form that GCC likes a little better, it looks like it would average 3 cycles per loop iteration on an "F" core:
void delay(void)
{
register int i =0;
const register int limit = 100000;
for(i=0; i < limit; i++) {
}
}
And the assembly (gcc -S foo.c): (.L3 is the loop iterator increment, followed by the .L2 "blt" compare against the 100000)
delay:
addi sp, sp, -12
stw fp, 8(sp)
stw r17, 4(sp)
stw r16, 0(sp)
addi fp, sp, 8
mov r17, zero
movhi r16, 2
addi r16, r16, -31072
mov r17, zero
br .L2
.L3:
addi r17, r17, 1
.L2:
blt r17, r16, .L3
addi sp, fp, -8
ldw fp, 8(sp)
ldw r17, 4(sp)
ldw r16, 0(sp)
addi sp, sp, 12
ret
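As a rough cross-check of the 3-cycles-per-iteration estimate, the blink period it implies can be computed and compared against a scope measurement. This is a sketch under assumptions: it takes the 3-cycle figure at face value, ignores the function prologue, epilogue, and PIO writes, and assumes no cache or memory stalls. The helper name is hypothetical.

```c
#include <stdint.h>

/* Predicted blink period in microseconds for a given per-iteration cost.
 * Hypothetical helper: it only scales the loop count by an assumed
 * cycles-per-iteration figure and does not model call overhead or stalls. */
static uint64_t blink_period_us(uint32_t iterations_per_delay,
                                uint32_t cycles_per_iteration,
                                uint32_t delays_per_period,
                                uint32_t clk_hz)
{
    /* Total clock cycles spent in delay loops during one on/off period. */
    uint64_t total_cycles = (uint64_t)iterations_per_delay
                          * cycles_per_iteration
                          * delays_per_period;

    /* Convert cycles to microseconds at the given clock frequency. */
    return total_cycles * 1000000u / clk_hz;
}
```

For 100000 iterations per delay, two delays per period, and a 100 MHz clock, blink_period_us(100000, 3, 2, 100000000) gives 6000, so the output period would be expected to drop from about 30 ms to about 6 ms if the estimate holds.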