Nios® V/II Embedded Design Suite (EDS)

When compiling with Nios II SBT, the optimization option (-O2) causes the program to stop working.

mametarou963

Hi.

I am developing for a Nios II processor on a Cyclone 10 FPGA.
When I enable compiler optimization (-O2), the program stops running.
What could be causing this?

The development environment is as follows:
* FPGA: Cyclone10 10CL016YE144C8G
* ROM: EPCQ16ASI8N
* Quartus Prime: Ver. 18.0

 

The details are as follows. Sorry this is so long.

-----detail-----

This is the source code around the area where the problem seems to be occurring.

 

```C
#define GG_MMIO_OUTPUT_STS_ROW_MAX 3
#define GG_REG_IOWR_ADDRESS_DECODER(base, offset, data) IOWR_16DIRECT(base, offset * 4, data)
#define EXT_BUFFERS_BASE 0x3400000
#define GG_REG_ADDRESS_DECODER_BASE EXT_BUFFERS_BASE

unsigned char ggMmioInitialize(void)
{
    unsigned char result = GG_MMIO_RET_OK;
    GG_TIM_ATTRIBUTE_ST timParamSt;
    unsigned char idx;
    for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
    {
        GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, 0x00 ); // stops here.
    }

    memset( ggMmioInputSettingsTable, GG_SYS_DEFAULT_BYTE, sizeof(GG_MMIO_INPUT_PARAM_ST) * GG_MMIO_INPUT_SETTINGS_MAX );

    // ...
}
```

 

* When the code above is compiled with optimization level 2 (-O2) and run in the debugger, execution stops at the line marked `stops here`.
* With optimization level 0 (-O0), it runs without problems.
* With -O2 and `#pragma GCC optimize ("O0")` added so that only the function `ggMmioInitialize` is left unoptimized, it also runs without problems.
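One detail worth checking with the pragma approach: according to the GCC documentation, `#pragma GCC optimize` applies to every function defined after it in the source file, not only to the next one. Below is a minimal sketch of two ways to confine `-O0` to a single function, assuming a GCC-based toolchain such as the `nios2-elf-gcc` used by the Nios II SBT. The function names and the `NO_OPT_ATTR` macro are hypothetical, and a simple counter stands in for the register writes so that the sketch also builds on a host PC.

```c
#include <assert.h>

/* GCC-specific features are guarded so that other compilers still build this. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPT_ATTR __attribute__((optimize("O0")))
#else
#define NO_OPT_ATTR
#endif

/* Option 1: the optimize attribute affects this one function only. */
NO_OPT_ATTR int count_writes_attr(int rows)
{
    int written = 0;
    assert(rows >= 0);
    for (int idx = 0; idx < rows; idx++) {
        written++;              /* placeholder for one register write */
    }
    return written;
}

/* Option 2: push/pop confines the pragma to the functions in between. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O0")
#endif
int count_writes_pragma(int rows)
{
    int written = 0;
    assert(rows >= 0);
    for (int idx = 0; idx < rows; idx++) {
        written++;              /* placeholder for one register write */
    }
    return written;
}
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

/* Defined after pop_options: built with the command-line level (e.g. -O2). */
int count_writes_default(int rows)
{
    int written = 0;
    assert(rows >= 0);
    for (int idx = 0; idx < rows; idx++) {
        written++;              /* placeholder for one register write */
    }
    return written;
}
```

All three functions return the same result; only the optimization level used to compile each one differs, which can be confirmed by comparing their disassembly.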

I believe the problem is caused by compile-time optimization of the function `ggMmioInitialize`.
To investigate in detail, I compared the assembler code generated without optimization against the code generated with optimization.

Both listings are shown below.

 

* assembler code WITHOUT OPTIMIZATION

 

```
unsigned char ggMmioInitialize(void)
{
0: defff704 addi sp,sp,-36
4: dfc00815 stw ra,32(sp)
8: df000715 stw fp,28(sp)
c: df000704 addi fp,sp,28
unsigned char result = GG_MMIO_RET_OK;
10: e03ffb05 stb zero,-20(fp)
GG_TIM_ATTRIBUTE_ST timParamSt;
unsigned char idx;
for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
14: e03ffb45 stb zero,-19(fp)
18: 00000b06 br 48 <ggMmioInitialize+0x48>
{
GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, 0x00 );
1c: e0bffb43 ldbu r2,-19(fp)
20: 1085883a add r2,r2,r2
24: 1085883a add r2,r2,r2
28: 1007883a mov r3,r2
2c: 0080d034 movhi r2,832
30: 1885883a add r2,r3,r2
34: 0007883a mov r3,zero
38: 10c0002d sthio r3,0(r2)
GG_TIM_ATTRIBUTE_ST timParamSt;
unsigned char idx;
for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
3c: e0bffb43 ldbu r2,-19(fp)
40: 10800044 addi r2,r2,1
44: e0bffb45 stb r2,-19(fp)
48: e0bffb43 ldbu r2,-19(fp)
4c: 108000f0 cmpltui r2,r2,3
50: 103ff21e bne r2,zero,1c <mmioInputStatusSamplingTimerExpired+0xfffff4b8>
GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, 0x00 );
}
memset( ggMmioInputSettingsTable, GG_SYS_DEFAULT_BYTE, sizeof(GG_MMIO_INPUT_PARAM_ST) * GG_MMIO_INPUT_SETTINGS_MAX );
54: 01800504 movi r6,20
58: 01403fc4 movi r5,255
5c: 01000034 movhi r4,0
60: 21000004 addi r4,r4,0
64: 00000000 call 0 <ggMmioInitialize>

// ...
```

 

* assembler code WITH OPTIMIZATION

 

```
unsigned char ggMmioInitialize(void)
{
c: defff804 addi sp,sp,-32
10: dfc00715 stw ra,28(sp)
14: dc000615 stw r16,24(sp)
unsigned char idx;
for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
{
GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, 0x00 );
18: 0080d034 movhi r2,832
1c: 1000002d sthio zero,0(r2)
20: 10800104 addi r2,r2,4
24: 1000002d sthio zero,0(r2)
28: 0080d034 movhi r2,832
2c: 10800204 addi r2,r2,8
30: 1000002d sthio zero,0(r2)
memset( ggMmioInputSettingsTable, GG_SYS_DEFAULT_BYTE, sizeof(GG_MMIO_INPUT_PARAM_ST) * GG_MMIO_INPUT_SETTINGS_MAX );
memset( ggMmioInputSts, GG_SYS_DEFAULT_BYTE_ZERO, sizeof(ggMmioInputSts) );
34: 00c00034 movhi r3,0
...
```
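As a cross-check of the two listings: `movhi r2,832` loads `832 << 16 = 0x3400000`, which is `EXT_BUFFERS_BASE`, and the stores land at byte offsets 0, 4 and 8 from it. Both versions therefore appear to write to the same three addresses. The arithmetic can be sketched on a host PC as follows; the helper names are hypothetical and are not part of the Nios II HAL.

```c
#include <assert.h>
#include <stdint.h>

#define EXT_BUFFERS_BASE 0x3400000u   /* value taken from the post */

/* movhi rB, IMM16 places IMM16 in the upper halfword and clears the lower one. */
uint32_t movhi_value(uint16_t imm16)
{
    return (uint32_t)imm16 << 16;
}

/* Byte address written for a given loop index: base + idx * 4,
 * mirroring GG_REG_IOWR_ADDRESS_DECODER(base, idx, data). */
uint32_t decoder_address(uint32_t base, unsigned idx)
{
    assert(idx < 3);   /* GG_MMIO_OUTPUT_STS_ROW_MAX in the post */
    return base + idx * 4u;
}
```

For example, `movhi_value(832)` equals `EXT_BUFFERS_BASE`, and `decoder_address(EXT_BUFFERS_BASE, 2)` gives `0x3400008`, the target of the last `sthio` in either listing.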

Neither listing appears to contain anything wrong.
The optimized code does, however, contain the following instruction:

```
sthio zero,0(r2)
```

Assuming execution stops at this instruction, I modified the source as follows
so that `sthio` no longer takes `zero` as its operand.

 

```C
#define GG_MMIO_OUTPUT_STS_ROW_MAX 3
#define GG_REG_IOWR_ADDRESS_DECODER(base, offset, data) IOWR_16DIRECT(base, offset * 4, data)
#define EXT_BUFFERS_BASE 0x3400000
#define GG_REG_ADDRESS_DECODER_BASE EXT_BUFFERS_BASE
unsigned char ggMmioInitialize(void)
{
    unsigned char result = GG_MMIO_RET_OK;
    GG_TIM_ATTRIBUTE_ST timParamSt;
    volatile int write_data = 0x00; // fixed
    unsigned char idx;
    for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
    {
        GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, write_data ); // fixed
    }

    memset( ggMmioInputSettingsTable, GG_SYS_DEFAULT_BYTE, sizeof(GG_MMIO_INPUT_PARAM_ST) * GG_MMIO_INPUT_SETTINGS_MAX );

    // ...
}
```

 

The lines marked `fixed` are the modified lines.

Compiled with optimization (-O2), the assembler code looks like this:

 

```
unsigned char ggMmioInitialize(void)
{
c: defff704 addi sp,sp,-36
GG_TIM_ATTRIBUTE_ST timParamSt;
unsigned char idx;
volatile int write_data = 0x00;
10: d8000615 stw zero,24(sp)
for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
{
GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, write_data );
14: d8c00617 ldw r3,24(sp)
unsigned char ggMmioInitialize(void)
{
18: dfc00815 stw ra,32(sp)
1c: dc000715 stw r16,28(sp)
unsigned char idx;
volatile int write_data = 0x00;
for( idx=0; idx<GG_MMIO_OUTPUT_STS_ROW_MAX; idx++ )
{
GG_REG_IOWR_ADDRESS_DECODER( GG_REG_ADDRESS_DECODER_BASE, idx, write_data );
20: 0080d034 movhi r2,832
24: 10c0002d sthio r3,0(r2)
28: d8c00617 ldw r3,24(sp)
2c: 10800104 addi r2,r2,4
30: 10c0002d sthio r3,0(r2)
34: d8c00617 ldw r3,24(sp)
38: 0080d034 movhi r2,832
3c: 10800204 addi r2,r2,8
40: 10c0002d sthio r3,0(r2)
```

 

With this change, `sthio` takes `r3` as its operand instead of `zero`,
and the program runs.

To summarize, my questions are as follows.

* The program stopped working when I enabled the optimization option. What could be causing this?
* Based on my investigation, execution appears to get stuck at the `sthio zero ...` instruction.

Does this instruction have anything to do with the failure?
Or is there another cause?

Best regards.

EBERLAZARE_I_Intel

Hi,


May I know where you got the source code?


I am looking into the issue, please allow me some time to get back to you.


mametarou963

Thank you for your response.

 

The source code belongs to my company, so it is not public.

I am posting excerpts here to the extent that it is not a problem.

EBERLAZARE_I_Intel

Hi,


I am also unsure about `sthio zero`; I couldn't find any related information on it.


Were there any errors in the Quartus or Nios II builds and compilations?


mametarou963

There were no problems with the build or compilation.
However, at run time, execution stops near the following instruction:

 

```
sthio zero,0(r2)
```

If `sthio` does not take `zero` as an operand, the function executes without any problem.
Specifically, I tried the following two approaches:
* Using a pragma so that the function in question is not optimized and `sthio` takes a register other than `zero` as its operand
* Using `volatile` for the variable that supplies the operand, so that `sthio` takes a register other than `zero` as its operand

Built in either of these ways, the program runs without any problem.

Therefore, I suspect that the problem is related to executing `sthio` with `zero` as its operand.

EBERLAZARE_I_Intel

Hi,


Here is the page from our Nios II reference guide; `sthio` is "store halfword to memory or I/O peripheral":

https://www.intel.com/content/www/us/en/docs/programmable/683836/current/sth-sthio.html


Can you confirm whether the description matches what you are experiencing?


mametarou963

You are right.
This is exactly what I am experiencing.

I see the following in this document.

 

```
Operation: Mem16[rA + σ(IMM16)] ← rB15..0
Assembler syntax: sthio rB, byte_offset(rA)
```

 

I checked the syntax of `sthio zero ...`.
According to the Nios II Processor Reference Guide, `zero` is register r0, so the instruction matches the documented syntax.
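The machine words in the listings can be decoded to confirm this. The sketch below assumes the I-type instruction layout described in the Nios II Processor Reference Guide (A in bits 31..27, B in bits 26..22, IMM16 in bits 21..6, OP in bits 5..0) and takes the `sthio` opcode value `0x2d` from the words in the listings above. The type and function names are hypothetical, and the code is meant to run on a host PC.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a Nios II I-type instruction word (assumed layout, see above). */
typedef struct {
    uint32_t a;      /* bits 31..27: base address register      */
    uint32_t b;      /* bits 26..22: data register for stores   */
    uint32_t imm16;  /* bits 21..6 : 16-bit immediate (offset)  */
    uint32_t op;     /* bits 5..0  : opcode                     */
} nios2_itype;

#define NIOS2_OP_STHIO 0x2du   /* taken from the listed words, e.g. 0x1000002d */

nios2_itype decode_itype(uint32_t word)
{
    nios2_itype f;
    f.a     = (word >> 27) & 0x1fu;
    f.b     = (word >> 22) & 0x1fu;
    f.imm16 = (word >> 6)  & 0xffffu;
    f.op    = word & 0x3fu;
    /* Round-trip check: the fields must reassemble to the original word. */
    assert(((f.a << 27) | (f.b << 22) | (f.imm16 << 6) | f.op) == word);
    return f;
}

/* Returns 1 if the word is an sthio whose data register is r0 (zero). */
int is_sthio_from_zero(uint32_t word)
{
    nios2_itype f = decode_itype(word);
    return f.op == NIOS2_OP_STHIO && f.b == 0u;
}
```

Decoding `0x1000002d` (`sthio zero,0(r2)`) and `0x10c0002d` (`sthio r3,0(r2)`) gives the same opcode, base register and offset; only the B field differs (0 versus 3), so under this layout both words are well-formed encodings of the same instruction.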

EBERLAZARE_I_Intel

Hi,


Are you still facing any roadblocks?


Or do you have any other questions?



p/s: If any answer from the community or Intel Support are helpful, please feel free to give best answer or rate 4/5 survey.


mametarou963

Hi.

I am still facing obstacles.
The problems described in my original post have not been resolved.

I believe the instruction described in the documentation you pointed me to is the one I am having problems with.

However, the "assembler code WITHOUT OPTIMIZATION" shown in my original post runs, while the "assembler code WITH OPTIMIZATION", which contains `sthio zero,0(r2)`, stops execution.

Is there any problem with the code shown in "assembler code WITH OPTIMIZATION"?

I am stuck because the code runs without optimization but not with optimization.

EBERLAZARE_I_Intel

Hi,


Got it, I will need to check with our internal team on this.


Just to check, would it be possible for you to share the example design for this? Also, which version are you using, and in what machine environment did you test this?


EBERLAZARE_I_Intel

Hi,


Do you have any update? Could we get the example design somehow?


EBERLAZARE_I_Intel

Hi,


If you could provide the code so that we can try to replicate this on our side, that would be helpful.


Alternatively, a simple design that shows the issue would also work for us.


We have not received a response from you to our previous reply. If you have a new question, please log in to ‘https://supporttickets.intel.com’, view the details of the desired request, and post a response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support, and community users will be able to help you with follow-up questions.





mametarou963

Sorry for the delay.

I have tried many things, and the simplest sample code that reproduces the problem is as follows.

 

* The following source is executable

```
#pragma GCC optimize ("O0")
static void SampleFunction( )
{
IORD_16DIRECT( 0x3400000 , 12 );
}
```

* On the other hand, the following source does not run.
(Specifically, if you place a breakpoint on `IORD_16DIRECT` in the debugger and step after execution stops there, it fails.)

```
static void SampleFunction( )
{
IORD_16DIRECT( 0x3400000 , 12 );
}
```


Since -O2 is set as the compile option for the whole project, -O0 is applied to the former source and -O2 to the latter.

Is there any possible cause for this?

 
