Hi!
I've been fighting stack and data corruption for a long time. Now I've decided to run the eCos tests, and I've seen that some of them don't work. In one of them (only one so far) I've found stack corruption: "stack base not word aligned". I've also found some timer tests that don't work, but I haven't investigated them yet. Has anyone experienced something similar? Any ideas?
Alex
Here's another example. While running the test "stress_threads.c" in the ISS (simulator), I got the following error:
"
INFO:<Stress threads test compiled on Feb 14 2005>
State dump 1 (0 hours, 1 minutes) [numbers >>0]
Handler-invocations: 4 4 4 4 3 3 3 3 2 2 2 2 1 0 0 0 0 0 0
malloc()-tries/failures: -- 3700 0
client_makes_request: 0
Memory system: Total=0x000ba6a0 Free=0x000ba47c Max=0x000ba47c
Stack usage:
Error! : Failed memory access in component cpu - Reading data 0x12f1d3 from uninitialised memory (addr = 0x8ff000)
cpu::nios2ModelRun: ERROR: [19761461] load signed byte access to address 0x8ff000 returned uninitialized memory (valid mask=0x0)
"
The Nios II system I am using only has a timer, flash and RAM. The eCos packages are ONLY the defaults from the "Nios II development board", "default". So one would expect the eCos tests to work with the minimum Nios II configuration and the default kernel, wouldn't one? Any ideas?
Alex
Part of the release process for the Nios II eCos snapshots is that we ensure that all these tests pass. To understand why they're failing for you, can you provide more details on how you are running them and what hardware you are using?
Hi!
I've been testing with different hardware settings, but for the last try I used the standard 1c20 Altera example (altera/kits/nios2/examples/vhdl/niosII_cyclone_1c20/standard/standard), leaving only the cpu, ram, flash, timer and jtag_uart.
I open the ptf with the nios2configtool and change the text, rodata and rwdata regions to ext_ram. I also uncheck "Work with a ROM monitor" and set the startup type to ROMRAM. Then I build the eCos kernel and the eCos tests. That works OK.
To run the tests, I created a new custom project and copied the test ELF executables into my project tree. Then I configured the project with the ptf of the evaluation board and tried to run it with the ISS. The result is:
"
Info : Successfully read SOPC Builder PTF file 'C:\altera\NIOS2_Projects\NIOSII_standard\std_1c20.ptf'
Info : The SOPC Builder system contains the following modules:
Info : Bus module 'cpu_instruction_master_bus' - avalon.dll
Info : Bus module 'cpu_data_master_bus' - avalon.dll
Info : Master module 'cpu' - altera_nios2.dll
Info : Slave module 'ext_flash' - altera_avalon_flash.dll
Info : Address span: 0x0000-0x7FFFFF (cpu_instruction_master_bus)
Info : Address span: 0x0000-0x7FFFFF (cpu_data_master_bus)
Info : Slave module 'ext_ram' - altera_memory.dll
Info : Address span: 0x800000-0x8FFFFF (cpu_instruction_master_bus)
Info : Address span: 0x800000-0x8FFFFF (cpu_data_master_bus)
Info : Slave module 'sys_clk_timer' - altera_avalon_timer.dll
Info : Address span: 0x920800-0x92081F (cpu_data_master_bus)
Info : Slave module 'jtag_uart' - altera_avalon_jtag_uart.dll
Info : Address span: 0x920820-0x920827 (cpu_data_master_bus)
Warning : SOPC Builder system component reconfig_request_pio is not supported by the simulator. Simulation may be incorrect if your software attempts to access it
Info : Slave module 'reconfig_request_pio' - altera_avalon_pio.dll
Info : Address span: 0x920890-0x92089F (cpu_data_master_bus)
Info : Slave module 'uart1' - altera_avalon_uart.dll
Info : Address span: 0x9208A0-0x9208BF (cpu_data_master_bus)
Warning : SOPC Builder system component sysid is not supported by the simulator. Simulation may be incorrect if your software attempts to access it
Info : Slave module 'sysid' - altera_avalon_sysid.dll
Info : Address span: 0x920828-0x92082F (cpu_data_master_bus)
Info : Configuring 'std_1c20' model
Info : PTF Setting jtag_uart/SYSTEM_BUILDER_INFO/Iss_Launch_Telnet="0" detected
Info : 'jtag_uart' character stream will be displayed in this window
Info : The host communication device for stdin is jtag_uart
Info : The host communication device for stdout is jtag_uart
Info : The host communication device for stderr is jtag_uart
Info : PTF Setting uart1/SYSTEM_BUILDER_INFO/Iss_Launch_Telnet="0" detected
Info : 'uart1' character stream will be displayed in this window
Info : Running 'std_1c20' model
INFO:<Stress threads test compiled on Feb 15 2005>
State dump 1 (0 hours, 1 minutes) [numbers >>0]
Handler-invocations: 4 4 4 3 3 3 3 2 1 1 1 1 1 1 0 0 0 0 0
malloc()-tries/failures: -- 3200 0
client_makes_request: 0
Memory system: Total=0x000b9fb0 Free=0x000b9d8c Max=0x000b9d8c
Error! : Failed memory access in component cpu - Reading data 0x12f1db from uninitialised memory (addr = 0x8ff000)
cpu::nios2ModelRun: ERROR: [14781352] load signed byte access to address 0x8ff000 returned uninitialized memory (valid mask=0x0)
Stack usage:
Info : Component cpu's program has terminated
Info : Instructions executed = 14781352
Info : cpu simulation return code 10
Info : Exiting std_1c20 model with return code 10 (0 fatal errors, 1 error, 2 warnings)
"
I have not yet been able to run it on the development board because we have some problems with it, but we have had similar results with other hardware configurations, so we don't know what the reason could be. Also note that when we run it on the board, it hangs as well, and the debugger does not respond. Other tests work, but I think this test should work too, as it checks the stability of the system. If you know of anything I can try...
Thanks,
Alex
I've got to confess that I always test eCos on hardware, rather than through the ISS. However after a cursory glance through the code, it seems that there is an assumption that all memory is initialised to zero within the stack checking part of this test. This is what's probably causing the problem in the ISS, because it's (rightly) saying that the memory is actually uninitialised.
It looks like you might be able to solve the problem by turning on CYGFUN_KERNEL_THREADS_STACK_CHECKING. This should make sure that the stack contents are explicitly initialised. The code to look at is contained in stackmon.h.
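To illustrate the point about initialisation, here is a minimal sketch in C of how a stack-usage scan of this kind can work. It is not the code from stackmon.h: the helper names `stack_clear` and `stack_used` are hypothetical, and the sketch assumes a descending stack whose unused bytes read as zero. The scan is only meaningful if somebody wrote those zeros first, which is exactly what a simulator that tracks uninitialised memory will complain about otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: explicitly clear a stack area before the thread
 * runs, so that a later scan never reads memory nobody has written. */
void stack_clear(unsigned char *base, size_t size)
{
    assert(base != NULL);
    memset(base, 0, size);
}

/* Hypothetical helper: estimate the high-water mark of a descending stack.
 * Bytes that are still zero at the low end are counted as never used. */
size_t stack_used(const unsigned char *base, size_t size)
{
    assert(base != NULL);
    size_t untouched = 0;
    while (untouched < size && base[untouched] == 0)
        untouched++;
    return size - untouched;
}
```

Without the `stack_clear` step, `stack_used` reads whatever happens to be in RAM: real hardware returns some value, while the ISS reports the read as an error.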
I usually test on hardware too, but as the tests didn't work, I tried the ISS simulator in order to see whether the problem was software or hardware.
The fact is that if I run them on the evaluation board (now I can), all the tests that didn't work before now work, except "stress_threads". If I debug it, it runs until "main", and then continues. When I stop it, the value of PC is 0x10000 (more or less), which is not a valid address for executing code, is it? Obviously, the program doesn't work. I've also seen that the console sometimes shows "Bad byte in chunk".
NOW, I have activated xxx_STACK_CHECKING, and also CYGPKG_INFRA_DEBUG, because the previous option needs asserts activated. Now, when I run "stress_threads", it says
"
ASSERT FAIL: <1>stream.cxx[585]Cyg_ErrNo Cyg_StdioStream::write() Stream object is not a valid stream!
ASSERT FAIL: <1>stream.cxx [ 585] Cyg_ErrNo Cyg_StdioStream::write()
"
I have seen asserts like these many times while running my own code: asserts that disappear (sometimes, not always) if a line of code is added (e.g. a printf). I thought some part of my code was wrong, but looking at this, I think there's another problem. Notice that this assert happens before "main": I have the debugger configured to stop when the user program starts, and the asserts appear before that. Any ideas?
Alex
Although it's a little painful, all I can suggest at this point is that you remove the asserts (since that just seems to be causing new problems), and start stepping through the code to see where it's blowing up. That may help to pinpoint the cause of the problem.
The fact is I don't think ASSERT is the problem, because before activating the "use asserts" option, the test didn't work at all. I only did it in order
Also, if I now simulate the SAME program with the ISS, it does not show any assert, only the same error:
"
Failed memory access in component cpu - Reading data 0x12f1db from uninitialised memory (addr = 0x8ff000)
cpu::nios2ModelRun: ERROR: [14859983] load signed byte access to address 0x8ff000 returned uninitialized memory (valid mask=0x0)
"
So it's very strange that the execution on the board and the ISS differ that way. And now... I have run "make clean" in eCos and rebuilt (without changing anything), and the error in stress_threads has changed. Now it says:
"
ASSERT FAIL: <1>malloc.cxx[144]void* malloc() Allocator has returned badly aligned data!
ASSERT FAIL: <1>malloc.cxx [ 144] void* malloc()
"
So... it's strange... again. I don't know exactly what to test next, because I've removed the asserts and I'm back at the same point where I began: the program hangs. I can hardly debug, and when I get the program to stop (for some reason I CAN'T add breakpoints) the PC is at 0x1013c, or similar, which is not good, given that the RAM is at 0x800000 ??
Alex
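Both "stack base not word aligned" and "Allocator has returned badly aligned data!" are alignment complaints. For reference, a minimal sketch of what such a check looks like in C. The function `is_aligned` is hypothetical and is not the code from malloc.cxx; the alignment the eCos allocator actually requires is not stated in this thread, so the values used below are only examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if ptr is a multiple of align, else 0.
 * align must be a non-zero power of two (e.g. 4 for a 32-bit word). */
int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

For example, `is_aligned(p, 4)` is true for an address such as 0x800000 and false for an odd address such as 0x12f1db. A pointer that fails such a check right after an otherwise unchanged rebuild suggests the allocator's own bookkeeping was overwritten, rather than a logic error in the caller.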
OK. I've got one thing that makes sense.
I've modified the "stress_threads.c" file in order to reduce the number of threads in the system. If I reduce MAX_HANDLERS to 5, with one listener and one client, it works OK. BUT if I increase the number of listeners and clients to 4, it sometimes says "Bad byte in chunk". Looking at the code, there is a task that allocates a series of memory blocks, fills them with patterns, and later frees them after checking that the pattern is unchanged. So the point is that the pattern check is failing. Why? I don't know. If I increase the number of tasks, it hangs. But the fact is that there's enough free memory
"
Memory system: Total=0x000c5df8 Free=0x000c5bd4 Max=0x000c5bd4
"
so I don't understand what the problem is... ??
Alex
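The fill-and-verify scheme described above can be sketched in a few lines of C. This is only an illustration of the technique, not the code from stress_threads.c: the names `chunk_fill` and `chunk_first_bad` and the particular pattern (seed plus offset, truncated to a byte) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill a chunk with a byte pattern derived from a seed. */
void chunk_fill(unsigned char *chunk, size_t size, unsigned char seed)
{
    assert(chunk != NULL);
    for (size_t i = 0; i < size; i++)
        chunk[i] = (unsigned char)(seed + i);
}

/* Hypothetical helper: return the offset of the first byte that no longer
 * matches the pattern, or size if the whole chunk is intact. */
size_t chunk_first_bad(const unsigned char *chunk, size_t size, unsigned char seed)
{
    assert(chunk != NULL);
    for (size_t i = 0; i < size; i++)
        if (chunk[i] != (unsigned char)(seed + i))
            return i;
    return size;
}
```

A check like this fails only if something wrote into the chunk between the fill and the free, so logging the offset and address of the first bad byte, rather than just the message, can help narrow down which writer is responsible.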
The test you are running makes a lot of calls to malloc. Understanding exactly what the memory requirements are here would require checking what block size is being used by the memory allocator.
Given that this test runs successfully when given more memory (e.g. if you run from SDRAM on the Altera reference designs), I could believe you are simply running out of memory somewhere.
I've looked at the mallocs, and the test only calls malloc in one place. The program counts the times malloc fails, and I have added a check in order to see whether malloc returns null.
The result is that my check never triggers (it seems that all mallocs succeed), and the trace of the test is:
"
State dump 1 (0 hours, 1 minutes) [numbers >>0]
Handler-invocations: 3 3 2 2 2 1 1 1 1 1 0 0 0 0 0 0 0 0 0
malloc()-tries/failures: -- 1700 0
client_makes_request: 0
Memory system: Total=0x000b9e60 Free=0x000b9c40 Max=0x000b9c34
Stack usage:
Interrupt : Interrupt stack used 4096 size 4096
Idle : Idlethread stack used 440 size 2048
Main : stack used 1140 size 5688
Handler : stack used 2616 size 7128
Handler : stack used 2616 size 7128
Handler : stack used 2680 size 7128
Handler : stack used 2688 size 7128
Handler : stack used 2664 size 7128
Handler : stack used 2616 size 7128
Handler : stack used 2692 size 7128
Handler : stack used 2616 size 7128
Handler : stack used 2680 size 7128
Handler : stack used 2688 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Listener : stack used 1120 size 5688
Listener : stack used 1048 size 5688
Listener : stack used 1080 size 5688
Listener : stack used 900 size 5688
Client : stack used 344 size 5688
Client : stack used 368 size 5688
Client : stack used 300 size 5688
Client : stack used 292 size 5688
State dump 2 (0 hours, 2 minutes) [numbers >>0]
Handler-invocations: 47 40 36 29 23 16 13 12 9 8 2 1 0 0 0 0 0 0 0
malloc()-tries/failures: -- 23588 0
client_makes_request: 0
Memory system: Total=0x000b9e60 Free=0x000b9c40 Max=0x000b9c34
Stack usage:
Interrupt : Interrupt stack used 4096 size 4096
Idle : Idlethread stack used 440 size 2048
Main : stack used 1140 size 5688
Handler : stack used 2688 size 7128
Handler : stack used 2712 size 7128
Handler : stack used 2688 size 7128
Handler : stack used 2712 size 7128
Handler : stack used 2688 size 7128
Handler : stack used 2696 size 7128
Handler : stack used 2692 size 7128
Handler : stack used 2688 size 7128
Handler : stack used 2692 size 7128
Handler : stack used 2696 size 7128
Handler : stack used 2664 size 7128
Handler : stack used 2632 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Handler : stack used 0 size 7128
Listener : stack used 1160 size 5688
Listener : stack used 1152 size 5688
Listener : stack used 1136 size 5688
Listener : stack used 1152 size 5688
Client : stack used 376 size 5688
Client : stack used 368 size 5688
Client : stack used 400 size 5688
Client : stack used 436 size 5688
"
So it seems that there's still 0x000b9c34 of memory free, so ... I am now working on running the test on the evaluation board with all the modules, including the SDRAM...
Alex
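The "stack used N size M" lines in a dump like this can also be checked mechanically for remaining headroom. The following is a small sketch in C; the struct `stack_report` and both functions are hypothetical names introduced here for illustration, and the numbers have to be copied in by hand from the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one "stack used N size M" entry from the dump. */
struct stack_report {
    const char *name;
    size_t used;
    size_t size;
};

/* Bytes left between the reported high-water mark and the end of the stack. */
size_t stack_headroom(const struct stack_report *r)
{
    assert(r != NULL && r->used <= r->size);
    return r->size - r->used;
}

/* Count the entries that report no headroom at all. */
size_t count_full_stacks(const struct stack_report *r, size_t n)
{
    size_t full = 0;
    for (size_t i = 0; i < n; i++)
        if (stack_headroom(&r[i]) == 0)
            full++;
    return full;
}
```

Fed with the entries above, the only one with zero headroom is the interrupt stack (used 4096, size 4096). A reading where used equals size only says the scan found the whole area touched; on its own it does not show whether that stack actually overflowed, but it may be worth a closer look.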