Read and write timings on Avalon interface

Altera_Forum

Hello! 

 

I'm having some issues with the timings of read and write operations to my custom peripherals. 

 

I am using a Nios II processor, a PIO peripheral for testing, and my custom peripheral. The problem is the duration of the read and write operations. I will refer only to the write operation here. 

 

The testing procedure involved making writes to my peripheral, writing 1s and 0s to the PIO, and watching the whole mess on a scope. I also examined the objdump from the Nios II IDE to confirm that those writes are the only thing the code is doing. 

 

My peripheral works at 50MHz and I have tested the following working configurations (among others...): 

- Processor at 100MHz and my peripheral at 50MHz. In this case the write operation took 140ns or 14!!! clock cycles. 

- Processor and peripheral at 50MHz. In this case the write operation has a variable(!!!) length, namely for writing 0 it needs 2 clock cycles and for writing something else about 6 cycles. (I must say that I am referring only to the execution of iowr operation, without other register loads). 

 

Writing to the PIO always took 2 clock cycles (so 40ns or 20ns, depending on the clock speed). 
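
To make the conversion between scope readings and clock cycles explicit, here is a minimal C sketch of the arithmetic used in the figures above. The function names are my own for illustration and are not part of any Altera library.

```c
/* Convert a duration measured on the scope (in ns) into clock cycles
 * at a given clock frequency (in MHz): cycles = t_ns * f_MHz / 1000.
 * Integer arithmetic is used, so fractional cycles are truncated. */
static unsigned long ns_to_cycles(unsigned long t_ns, unsigned long f_mhz)
{
    return t_ns * f_mhz / 1000UL;
}

/* Inverse conversion: duration in ns of a given number of cycles. */
static unsigned long cycles_to_ns(unsigned long cycles, unsigned long f_mhz)
{
    return cycles * 1000UL / f_mhz;
}
```

With these helpers, `ns_to_cycles(140, 100)` gives the 14 cycles quoted for the 100MHz case, and `cycles_to_ns(2, 50)` and `cycles_to_ns(2, 100)` give the 40ns and 20ns quoted for the PIO write.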

 

The read and write timing settings of my peripheral's Avalon-MM slave interface are all 0, so there is no delay there. 

 

Could someone explain this behaviour? 

 

I would like my Nios 2 core to run at 100MHz so the 14 cycle operation is unacceptable. I was thinking of using a memory or a fifo to communicate with my peripheral... 

 

Thank you!
Altera_Forum

Hello, 

 

Is the time you give (14 cycles) the time between two write operations? 

 

Keep in mind that each time you write data to your peripheral, the processor first has to fetch that data from somewhere, which can take some time and depends on how you code the access. 

 

For example, in my case, one function reads several registers one after another, and it takes 9 cycles between each read. In another function, a loop continuously reads one register until a flag is set, and there the reads are spaced 23 cycles apart. 

In another function, in a loop I write values from a large table to the peripheral, and it takes 100 cycles between each write. 

 

So, I am not so surprised about your results, but maybe people more expert on this can give more precise explanations. 

 

Jérôme
Altera_Forum

I guess that the 14 clock cycles are due to the clock domain crossing logic between the CPU and the peripheral. But still, 14 seems a lot!

Altera_Forum

Are you running code out of internal (tightly coupled M9K) memory? 

If you are running from anything else (via the i-cache) that could easily cause 'random' timings. 

If you avoid calls into libc, and sort out the fubar of main() v int_main() you should be able to get a very small object image. 

 

You also want to have all other data areas tightly coupled - except for anything that has to be external (in which case the d-cache may also be pointless). Otherwise cycle times will not be deterministic. 

 

Also, your peripheral needs to expose a 32-bit interface, otherwise there are likely to be multiple cycles - even for a byte access. 

 

The difference between writing zero and non-zero could well be in the object code: the non-zero value needs to be generated, whereas zero is already available in r0. 
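
As a rough illustration of that last point, here is a minimal C sketch with two writes that look identical in source but can differ in generated code. On Nios II, r0 always reads as zero, so a store of 0 can use it directly, while a non-zero constant first has to be built in a register (unless the compiler already holds it in one). `fake_peripheral_reg` is a hypothetical host-side stand-in so the sketch builds anywhere; it is not a real register, and the exact instructions emitted depend on the compiler and optimisation level, so the objdump remains the authority.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped peripheral register.
 * On target this would be an address taken from system.h. */
static volatile uint32_t fake_peripheral_reg;

static void write_zero(volatile uint32_t *reg)
{
    /* Can be a single store whose source is r0 (the zero register). */
    *reg = 0;
}

static void write_value(volatile uint32_t *reg)
{
    /* The 32-bit constant must be materialised in a register first
     * (typically two instructions), and only then stored. */
    *reg = 0x12345678u;
}
```

On a core with a data cache, a plain volatile store like this does not necessarily bypass the cache; the HAL `IOWR` macros from `io.h` are the usual way to do that on target.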

 

14 cycles sounds more like the time for an SDRAM access - especially a full burst for a cache line. 

 

With no other Avalon bus cycles, I believe the CPU will continue execution following the write until another Avalon cycle is requested. Similarly, it will continue after issuing a read until an instruction needs the value. 

 

A 'lateral thought' option is to use custom instructions to interface with the peripheral, not the Avalon bus .....
Altera_Forum

Thank you for your answers! 

 

I've done some digging, and the 14 clock cycles appear to be caused by the clock domain crossing logic... 

 

My code is run from internal M9K memory. 

 

Just to be clear, I measured the 14 cycles this way: 

write 1 to pio 

write value to peripheral 

write 0 to pio 

write 1 to pio 

write 0 to pio 

write 1 to pio 

write same value to peripheral 

write 0 to pio 

 

This way all the required values were stored in registers and the only operations made were writes (checked objdump). Writing to pio needed 2 cycles, and writing to peripheral then pio required 16 cycles... 
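
For anyone wanting to repeat the measurement, the sequence above can be sketched in C roughly as follows. This is a host-side model, not the code actually run: `io_write`, `trace_log` and the `TGT_*` names are hypothetical stand-ins that just record each write, so the ordering can be checked without hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical targets: the test PIO and the custom peripheral. */
enum target { TGT_PIO, TGT_PERIPH };

struct write_event { enum target tgt; uint32_t data; };

#define MAX_EVENTS 16
static struct write_event trace_log[MAX_EVENTS];
static size_t trace_len;

/* Stand-in for a bus write: records the write instead of performing it. */
static void io_write(enum target tgt, uint32_t data)
{
    if (trace_len < MAX_EVENTS) {
        trace_log[trace_len].tgt = tgt;
        trace_log[trace_len].data = data;
        trace_len++;
    }
}

/* The eight writes listed in the post, in the same order. */
static void measure_sequence(uint32_t value)
{
    io_write(TGT_PIO, 1);        /* pulse 1 starts                      */
    io_write(TGT_PERIPH, value); /* write under test                    */
    io_write(TGT_PIO, 0);        /* pulse 1 ends: peripheral + PIO write */
    io_write(TGT_PIO, 1);        /* pulse 2 starts (reference)          */
    io_write(TGT_PIO, 0);        /* pulse 2 ends: PIO write only        */
    io_write(TGT_PIO, 1);        /* pulse 3 starts                      */
    io_write(TGT_PERIPH, value); /* same value written again            */
    io_write(TGT_PIO, 0);        /* pulse 3 ends                        */
}

/* Cycles attributable to the peripheral write alone: the width of a
 * pulse containing it minus the width of a PIO-only pulse. */
static unsigned periph_write_cycles(unsigned pulse_with_periph,
                                    unsigned pulse_pio_only)
{
    return pulse_with_periph - pulse_pio_only;
}
```

With the numbers above, `periph_write_cycles(16, 2)` gives the 14 cycles. On target, `io_write` would be replaced by the HAL macros, for example `IOWR_ALTERA_AVALON_PIO_DATA()` for the PIO and `IOWR()` for the custom peripheral, with base addresses taken from `system.h`.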

 

 

Next week I want to try to run the avalon interface logic at 100MHz so it will have no delays and the rest of my peripheral on a clock divided by 2.