Can I accelerate?

Altera_Forum · ‎03-31-2005

Hello, everyone!

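My Nios II C code is:

#include "system.h"
#include "altera_avalon_pio_regs.h"

int main()
{
    unsigned count;

    while(1)
    {
        /* read the 32-bit counter value from the input PIO */
        count=IORD_ALTERA_AVALON_PIO_DATA(PIO_IN_BASE);
        /* write the value straight back to the output PIO */
        IOWR_ALTERA_AVALON_PIO_DATA(PIO_OUT_BASE,count);
    }
    return 0;
}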

In my project, I use a counter module to count clock cycles. The Nios II reads 32-bit PIO data from the counter module, then writes the data back. The counter module records the value that is written back. After analysing the record, I found that each pass through the while loop costs 14 clocks. That's too slow!

How can I reduce the number of clocks per loop?

Would assembly code do better?

Altera_Forum · ‎03-31-2005

That means each read from or write to the PIO costs 7 clocks.

I ran the same test. With the Standard or Fast Nios II, 7 clocks are used per access, even with a 64 KB instruction cache configured. With the Economy Nios II it takes even more clocks.

Can anyone analyse the behaviour of the Avalon bus during these 7 clocks?

Altera_Forum · ‎03-31-2005

For a simple example like this, assembler code probably won't do better than optimised C. I assume you are using a release build - if not then please select one.

But if you want to check then please open up a Nios II shell, change to <project dir>/Release and type `make obj/<filename>.s`. The .s file shows the assembly language code which the compiler has generated.

Please post the assembler code and we can comment more usefully on what's going on.

Altera_Forum · ‎04-01-2005

Thanks.

I ran this test again with a release build; the read and write together now take 10 clocks. The assembly language code is:

    .file "count_pio.c"
    .section .text
    .align 2
    .global main
    .type main, @function
main:
    movhi r3, %hiadj(67584)
    addi r3, r3, %lo(67584)
.L2:
    ldwio r2, 0(r3)
    stwio r2, 0(r3)
    br .L2
    .size main, .-main
    .ident "GCC: (GNU) 3.4.1 (Altera Nios II 1.1 b137)"

Altera_Forum · ‎04-01-2005

I wasn't using a release build, but my assembly language code is just the same as Where200's.

Altera_Forum · ‎04-01-2005

Thank you, wombat.

I ran this example again with a release build, using the Standard/Fast Nios II. Reading and writing together now take only 6 clocks.

Altera_Forum · ‎04-01-2005

One point of interest - in the code which is generated:

.L2:
ldwio r2, 0(r3)
stwio r2, 0(r3)
br .L2

the processor will stall for two cycles after the ldwio because it can't use the value read from memory until two cycles later. It can use other registers though so the compiler will usually be able to insert other instructions here to keep the CPU busy.

Altera_Forum · ‎04-01-2005

Uncached memory accesses are generally slow with Nios II. For fast I/O you can use custom instructions that access the hardware directly. With that you can come down to 2 to 3 cycles per access.

Regards,

Thomas

Altera_Forum · ‎04-01-2005

The custom instruction idea is interesting.

In addition, you should consider making a dedicated hardware unit (interfaced as a custom instruction, an Avalon peripheral or a PIO) which takes the workload off the Nios core and returns only precomputed results to it. That way you will be less reliant on the I/O speed.

I don't know your application, of course, but often some rethinking of the architecture can put more functionality into hardware, and the speed increase can be dramatic.

Given more details, I am sure many people on the Nios forum could give suggestions in that direction as well.

regards

henning