
HI! I don't know instruction cycle

Altera_Forum
Honored Contributor II

Hi!

I want a fast CopyMemory function, so I wrote it with inline asm (see the code below).

I don't know how many machine cycles LDW and STW take.

Why are LDW and STW so slow?

Test kit: Cyclone, 50 MHz, STD

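void CopyMemory(void *IN_pDS, void *IN_pSC, int IN_iSZ)
{
    int tmp;                  // scratch register chosen by the compiler (was hard-coded r7)

    IN_iSZ >>= 2;             // divide by 4: one 4-byte word is moved per iteration
    if (IN_iSZ <= 0) return;  // the loop body always runs once, so skip empty copies

    asm volatile
    (
        "1: \n\t"                   // local label, safe if the function is inlined twice
        "ldw %3, 0(%1) \n\t"        // ?? cycles (STD)
        "stw %3, 0(%0) \n\t"        // ?? cycles (STD)
        "addi %1, %1, 4 \n\t"       // 1 cycle (STD)
        "addi %0, %0, 4 \n\t"       // 1 cycle (STD)
        "addi %2, %2, -1 \n\t"      // 1 cycle (STD)
        "bne %2, r0, 1b \n\t"       // 2 cycles (STD); r0 always reads as zero
        : "+r"(IN_pDS), "+r"(IN_pSC), "+r"(IN_iSZ), "=&r"(tmp)  // operands the asm modifies
        :
        : "memory"                  // the asm reads and writes memory behind the compiler's back
    );
}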
Altera_Forum
Honored Contributor II

Hi,

Are you using Nios II/f (the fast core)?

There seem to be dependencies between your instructions, so you are losing some cycles.

ldw/stw should take only one cycle when they hit cache memory.

You might also prefer to use DMA.

What about posting a dump of the generated assembly code instead of the C inline assembly?

Sylvain
Altera_Forum
Honored Contributor II

The cycle times for LDW and STW are in the Nios II Processor Reference Handbook, chapter 16.

There is a two-cycle load-use stall on the result of LDW, so you should move your STW two instructions further down.

You could also consider calculating (IN_pDS + IN_iSZ) before the loop and comparing against it, as that would save one add per iteration.

And finally, you will gain by unrolling the loop if you are moving lots of data.
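
The end-pointer and unrolling suggestions can be sketched in plain C as follows. This is only an illustration under assumptions: the name copy_words_unrolled, the uint32_t pointer types, and the unroll factor of two are hypothetical and not from the thread, and the compiler remains free to reschedule the instructions, so the generated assembly should still be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): copies n_bytes / 4 words. */
void copy_words_unrolled(uint32_t *dst, const uint32_t *src, size_t n_bytes)
{
    assert(dst != NULL && src != NULL);

    size_t n_words = n_bytes >> 2;                 /* whole 32-bit words only */
    uint32_t *dst_end = dst + n_words;             /* computed once, replaces the counter */
    uint32_t *dst_pair_end = dst + (n_words & ~(size_t)1); /* end of the even part */

    /* Unrolled by two: both loads come before either store, so each store
       is separated from the load that produces its value. */
    while (dst != dst_pair_end) {
        uint32_t a = src[0];
        uint32_t b = src[1];
        dst[0] = a;
        dst[1] = b;
        src += 2;
        dst += 2;
    }

    /* At most one word is left over when n_words is odd. */
    if (dst != dst_end) {
        *dst = *src;
    }
}
```

Comparing the destination pointer against a precomputed end address removes the separate counter decrement from the loop, and the unrolling spreads the branch cost over two words instead of one.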
Altera_Forum
Honored Contributor II

Is the memory SDRAM, on-chip RAM, or something else?

If it's SDRAM, then the bus turnaround (or read and write addresses falling in different columns, etc.) may be killing you.

Try using one or more additional registers so that you can do multiple LDWs and then multiple STWs.

Of course it would be nice if Nios II supported post- or pre-increment addressing. That would eliminate the ADDIs.

You can also try DMA, as Sylvain suggests. I'm not sure whether it does multiple reads and then multiple writes. It does have an internal FIFO.

The key is to do two or more reads back to back, and then the writes.
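
A minimal C sketch of the "several reads, then several writes" pattern is shown below. The function name copy_words_burst4, the word-count parameter, and the group size of four are assumptions made for illustration; whether grouping actually helps depends on the memory and bus in the system, which the thread does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): copies n_words 32-bit words,
   reading four words back to back before writing any of them. */
void copy_words_burst4(uint32_t *dst, const uint32_t *src, size_t n_words)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;

    /* Main loop: four consecutive reads into temporaries, then four writes. */
    for (; i + 4 <= n_words; i += 4) {
        uint32_t w0 = src[i];
        uint32_t w1 = src[i + 1];
        uint32_t w2 = src[i + 2];
        uint32_t w3 = src[i + 3];
        dst[i]     = w0;
        dst[i + 1] = w1;
        dst[i + 2] = w2;
        dst[i + 3] = w3;
    }

    /* Tail: copy the remaining zero to three words one at a time. */
    for (; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```

Each temporary needs its own register, which is why an inline asm version of this loop would use several scratch registers instead of one.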

 

Ken