
EPCS Controller Woes....

Altera_Forum

Hello, 

 

Trying to port a NiosI project to NiosII. To get started I've successfully created my custom board, and made a minimal project with a NiosII/f and a NiosII/e and SDRAM. 

 

One key new feature for us is the ability to use the new Flash Bootloader, so I've included an EPCS controller as well. The problem is that Quartus is unable to fit this design whenever the EPCS controller is included, even though A&S shows only 52% LE and 59% memory usage in my EP1C12.

 

The error is "Can't Place 254 RAM cells" with no other elaboration. 

 

As a secondary question: can someone comment on the likelihood of using an additional ST serial flash device within the NII Flash Programming facility? (Our board has only a 4Mb EPCS and an M25P80 8Mb serial device - no CFI-type flashes.)

 

Thanks, 

Ken
11 Replies
Altera_Forum

Maybe I should rephrase the question. I have a dual processor NiosI system on my custom board that I would like to port to NiosII and would like some pointers. 

 

Currently Nios_0 is the "main" cpu and Nios_1 is a small secondary processor that acts as a high priority IRQ handler and pre-processor.

 

Nios_0 runs and resets at the beginning of 16MB of sdram (the only external mem in the system). Nios_1 runs and resets in a 10K chunk of on-chip ram.

 

Each processor has a very small but separate chunk of on chip ram that holds its vtable. 

 

Another small piece of on chip ram acts as a "mailbox" for the two processors to talk through, but bulk data is processed in sdram buffers. 

 

In deciding how to port to NiosII and the bootloader, the best settings for the Reset and Exception addresses of the two processors are not apparent. Do I reset to the epcs_controller or sdram? Can the epcs_controller be the reset for both processors? Can the sdk handle both processors running out of sdram (at different offsets)?

 

Are there any multi-processor examples available that demonstrate the features I require? Are there any pointers for making use of more than 60% of the onchip ram? (I had inched up to 78% on my NiosI system.)

 

Any help would be appreciated. We've been at this for some time now and everything always works in isolation (USB, streaming, processing, realtime IRQ), but putting together the final commercial quality system is proving elusive.

 

Thanks, 

Ken
Altera_Forum

Hello, 

 

(Is this thing on? :)

Isn't there anyone interested in more than hello_world type activities?

Hint - Forums work best when there is a mix of questions *and* answers :)

I'll answer my own questions here (eventually) and participate by helping others where I can for a while longer, but some expert injections are really needed to get this forum hopping.

 

IMO of course. 

 

Ken
Altera_Forum

Hi Kenland, 

 

It is best to reset to your flash epcs_controller, not SDRAM. This way, the reset vector will have a known valid address at reset. The flash can be used to hold reset vectors for both processors, and should probably contain 2 separate reset vectors, one for each Nios II processor. The bootloaders located in flash will then copy the code into SDRAM, where it is executed.

 

Does this help? 

 

Best regards, 

Stephen 

 

Altera Embedded Applications Engineering
Altera_Forum

Hi Stephen, 

 

So I can set both processors to Reset from the EPCS_controller but I'll have to supply different Offsets?

 

Will the SDK be smart enough to block these two areas for proper operation of malloc() say? 

 

So say my auxiliary processor only needs 8KB for its entire world and my main proc needs an unknown amount (but ~1MB) for its code, with the balance of the 16MB for shared data. How would I set this up?

 

Are there any multi-proc examples that would demonstrate this? An Altera FAE had told me there was an 8 proc example floating around, but no mention of the use of the Flash Programmer/Bootloader. 

 

I'll be happy to summarize and share my results once I'm up and running.

 

Thanks, 

Ken
Altera_Forum

Hi Ken, 

 

Sorry, we didn't mean to leave you out in the cold. I see you have a number of things going on. Stephen gives one suggestion for boot-loading (where to put the reset vectors). Really the answer is "wherever the code is"... so if you have a boot loader in EPCS flash at a known offset, then put the reset vector there. Likewise, if you're using an on-chip ram that is configured with the boot loader, you'd reset there.

 

About resetting/booting from EPCS:  

When you choose to put your prog/data memory into, say, SDRAM for a software project (in the IDE), and your reset vector points to some other memory (flash, EPCS chip), the IDE will assume during the software build that you want to boot out of the non-volatile memory. It will create flash programming files that place a small boot loader before your code + initialized data, and when you use the flash programmer it will program this data into flash starting at the reset address you specified in SOPC Builder. This boot loader is Altera-provided and gets placed without any user code. In the case of a conventional flash chip the boot loader is prepended to the data going into flash; for the EPCS flash, the boot loader lives in a small on-chip memory (more detail on this below).

 

About resetting/booting from on-chip memory:  

If you take this approach you're on your own to write a boot loader. You'd do so by setting up a small software project in the IDE and specifying that program/data memory be housed in that on-chip memory. As part of the software build process, the IDE will generate memory initialization files that Quartus will pick up and use to initialize the memory during place & route. The "small" example designs and hello_led software app demonstrate this. So... this path isn't nearly as automated as our flash boot loaders. However, it can provide you with additional flexibility. For example, you can set up an on-chip memory that Nios boots from, and then once that boot procedure is done, use that same memory for some other useful purpose like your vector table, scratch pad RAM, or the mailbox between CPUs.

 

On multiple CPUs: 

You are correct that you can run multiple CPUs out of the same physical memory, provided that their program/data spaces won't clash with each other. The caveat here is that with the first release of Nios II, we don't provide a clean way of doing this (for example, in the IDE you can choose a memory device for program/data space in a given application -- not a memory *range* within a memory device). Really what this sets up is a linker script which tells the linker where it can and cannot use memory. So, for multiple CPUs sharing a single memory you'll need to modify a linker script to segregate a single memory into two areas, and then use this linker script for each processor's software build. Writing linker scripts is not what I'd call exciting, so if you really do need to share a common memory device between two Nios processors, I'd suggest creating a software project using the auto-generated linker script (which will make a file called 'generated.x' in your syslib), then tweaking this linker script to dice up the memory between the two CPUs, and finally using the new scripts for building software for each CPU. Just so you know, supporting multiple CPUs in a clean manner (from the IDE GUI) is definitely on the road map as far as things to come!
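
As a rough illustration of "dicing up" one memory device, here is a minimal Python sketch that computes two non-overlapping regions of a 16MB SDRAM and prints them as a GNU ld-style MEMORY fragment. The base address, the 8KB/remainder split, and the region names sdram_aux and sdram_main are assumptions for illustration only; the real generated.x uses its own names and contains much more than this.

```python
SDRAM_BASE = 0x01000000          # assumed base address, not taken from a real system
SDRAM_SIZE = 16 * 1024 * 1024    # 16MB SDRAM, as discussed in this thread


def split_memory(base, size, first_len):
    """Split [base, base + size) into two adjacent, non-overlapping regions."""
    if not 0 < first_len < size:
        raise ValueError("first region must be smaller than the whole memory")
    return (base, first_len), (base + first_len, size - first_len)


def memory_fragment(regions):
    """Render {name: (origin, length)} as a GNU ld MEMORY block."""
    lines = ["MEMORY", "{"]
    for name, (origin, length) in regions.items():
        lines.append(f"    {name} : ORIGIN = 0x{origin:08x}, LENGTH = 0x{length:x}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical split: 8KB for the auxiliary CPU, the rest for the main CPU.
aux, main = split_memory(SDRAM_BASE, SDRAM_SIZE, 8 * 1024)
fragment = memory_fragment({"sdram_aux": aux, "sdram_main": main})
print(fragment)
```

In practice each CPU's linker script would reference only its own region, so that neither build can place code, data, heap, or stack in the other CPU's range.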

 

On the no-fits due to memory: 

It sounds like the error about not placing the 254 ram blocks is a separate issue -- is this still a problem? Instantiating the EPCS controller will use some memory - 1Kbyte of memory in the form of M4K blocks if I remember correctly, which is used for the boot-loader code that I mentioned above. If I remember correctly a RAM block is (at smallest) 512 bits of memory (M512)… so the message that 254 could not be placed is a bit weird. I've seen this problem when, for example, I'd done something silly like make a 2K ram and tell SOPC Builder to use an M-RAM block that my device didn't have. If the problem is beyond these basics, please let me know and perhaps we can talk about it over email.

 

edit: Sorry, I just realized you want to boot both CPUs out of the same EPCS chip. I'm not sure if this will fly -- time for an experiment. If Stephen doesn't know offhand, one of us will probably try an experiment out. I suppose whether this can be done depends on how flexible the boot loader is that I described above.
Altera_Forum

Hi Jesse, 

 

Thanks for the extended response :)

 

My initial thought/fear was that the EPCS_controller had MRAM blocks specified or something, because I had selected Cyclone as my device. Anyway, it seems to be doing much better now. 

 

I don't necessarily want to do anything a particular way. I'm more asking what the best practice is for someone wanting to build an EPCS-bootable, multiple-cpu system (easy debugging is very important).

 

I wish the multi-cpu enhancements were here already! Doing this on NiosI means I've got one cpu debugging off of JTAG and one off of the serial port. I'd much rather have a unified/shared JTAG connection for the whole system. I may very well wind up needing a third cpu as well.

 

I notice in my dual NiosII system that the epcs_controller address space spans only 2047 bytes (7FFH). Is this correct for a 4Mb EPCS? Or is that memory range something other than the bits in the device? I would expect to have at least 2Mbits or 250KB at my disposal.

 

My minimal dual N2 system now builds and so I'm ready to dive into the IDE. Wish me luck.

 

Ken
Altera_Forum

Luck! 

 

How did it go?
Altera_Forum

Hi Kerri! 

 

Thanks for the encouragement! 

 

Well, with a shaking hand I hit the Nios IDE button in SOPC builder and started a Hello_World project. My .ptf was all selected for me and the syslib was set to SDRAM and of course I chose to let the IDE manage my makefile. Great Start. 

 

Well... snag. The build fails with the error that a gcc-lib won't fit in onchip ram. The onchip ram location is a very small chunk that I have my Exception Address set to for my COMM processor. I assumed this was like the vtable in NiosI? This is my main proc and I have it set to reset to the epcs_controller. Should I change my Exception Address to point to SDRAM?

 

I don't see any interface to select where anything is to reside other than the syslib. Do the custom board settings have any bearing here? I assumed they were only there to specify the chip, the MHz, the Flash sources, and to provide a temp config to do boot loading (not sure about the last part). So my custom board system is very bare. I can't seem to open any of the devkit boards to compare.

 

I'm probably just missing something simple. This was in the wee hours last night.

 

Any obvious mistakes/misunderstandings? I'm about to start working on it again here.

 

Thanks, 

Ken
Altera_Forum

Hello All, 

 

The snag turned out to be that my onchip exception memory was too small. I had it set to 512 bytes which was just a guess. NiosI requires only 256 bytes. 

 

Anyway, a search on Altera's Find Answers turns up info that doesn't specify this size.

 

As a temporary fix, I've set the Exception Address to sdram.

 

It would be nice to know how big this area needs to be and how it relates to ISRs. My system (and I imagine everyone else's?) needs to minimize IRQ latency. An app note or at least a FAQ probably wouldn't be out of the question. This very topic has been discussed at length on usenet (related to register windows on NiosI).

 

Ken
Altera_Forum

Hi Ken, 

 

Just wanted to follow up on what I wrote last week. We're working on some more official documentation concerning the stuff below, but I figured it might be useful in the interim to get some of this info up on the forum!

 

Currently you cannot direct both CPUs to boot from the same EPCS controller. Rather than acting as a memory-mapped device, it operates on a 'poke a register, read a result' basis, so having two CPUs fight over the EPCS controller is a bad idea. I had a discussion with our engineer who is responsible for that (and is now thinking about multi-CPU boot-up for the future), who advised me of this. The same EPCS controller *can* be used to house code for two processors, however; it just requires a bit of a kludgy process to get into place. If you have separate non-volatile storage on your board it will be an easier process (nevertheless, I've outlined the steps below).

 

As far as the IDE goes, the chief limitation, as I alluded to the other day, is that currently you can only select program/data memories on a device-by-device basis (so multiple processors are easy if they all use different memory peripherals). For splitting up a single SDRAM you'll need to either make linker script modifications to partition your memory, or do some creative things with vector table assignments (details below).

 

Debugging multiple CPUs *is* supported in the IDE and that is what I had alluded to earlier, as we'd tested 8 CPUs at once. I can send you an email if you like with a brief write-up on how to do this. The two important tricks are to enable multiple simultaneous run/debug sessions in the "Nios II Run and Build Settings" area of the IDE preferences, and to choose separate TCP ports for each CPU that you're debugging in the debug setup window prior to starting the debug session (the GDB debugger running behind the scenes uses TCP ports to communicate with the debug IDE).

 

some other notes:  

on making your code fit into a small on-chip memory: by default a lot of stuff will get linked into even a simple application. Such is life in a "hosted" environment with device drivers and C libraries. However, you can make it smaller. I suggest checking out hello_world.c in the software examples and reading the comments.. it will tell you how to make things much smaller. There was a thread on this very topic the other day on this forum, I believe. Also consider doing a "free-standing" application (look at the Hello Freestanding example program). These tools will allow you to compile tiny programs that run from onchip memory. 

 

on exception locations in sopc builder: These need to be segregated. As you allude to, this isn't too clear. Really it depends on how big the exception handling code is. The default code we link in is a little over 0x300 bytes long (0x308 to be specific - just check an .objdump file and look for the code starting at the specified interrupt vector). In addition to the basic interrupt setup and ISR 'funnel' code, this software includes things like SW implementations of multiply & divide for Nios II systems that don't have HW multiply or divide enabled (i.e., the Nios II /e core). However, the software developer may very well wish to chuck our ISR code in favor of something simpler that takes fewer bytes. The only SOPC Builder requirement is that the exception address be aligned to a 0x20 boundary in the memory map.
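
To make the size question concrete, here is a small Python sketch that checks the two constraints mentioned above: the exception address must sit on a 0x20 boundary, and the memory behind it must have room for the handler (0x308 bytes for the default code, per the figure quoted above). The function name and the base address 0x2000 are made up for illustration.

```python
DEFAULT_HANDLER_SIZE = 0x308  # size of the default exception code quoted above
ALIGNMENT = 0x20              # SOPC Builder alignment requirement quoted above


def check_exception_region(exc_addr, mem_base, mem_size,
                           handler_size=DEFAULT_HANDLER_SIZE):
    """Return a list of problems; an empty list means the placement looks OK."""
    problems = []
    if exc_addr % ALIGNMENT != 0:
        problems.append("exception address is not aligned to 0x20")
    if not mem_base <= exc_addr < mem_base + mem_size:
        problems.append("exception address is outside the memory")
    else:
        # Bytes left between the exception address and the end of the memory.
        room = mem_base + mem_size - exc_addr
        if room < handler_size:
            problems.append(
                f"only 0x{room:x} bytes available, handler needs 0x{handler_size:x}")
    return problems


# A 512-byte on-chip RAM at a hypothetical base address versus a 1KB one.
small = check_exception_region(0x2000, 0x2000, 512)
large = check_exception_region(0x2000, 0x2000, 1024)
print(small)
print(large)
```

By this arithmetic a 512-byte region is too small for the default handler, which would be consistent with the build failure reported earlier in the thread, while 1KB leaves some headroom.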

 

that said, here is a cool multi-processor trick that deals with the exception addresses: If you want to share a common memory between two CPUs, consider putting the exception location for CPU #1 at the beginning of shared memory, and then the exception location for CPU #2 at the location of the "end" of memory that is available to CPU #1. The reason is that the linker script we automatically generate in the IDE for a software project will link your app just after the exception address (and exception code) for a given processor. It will mark the "top" of the available memory space at the exception location for the next CPU. So... if I have an SDRAM from 0x1000000 to 0x1ffffff, and put the exception addr for CPU #2 at 0x1800000, half the memory will be linked against for CPU #1, and half for CPU #2. Both CPUs can live blissfully unaware that there is another processor in the system, other than the decreased memory bandwidth of course! For additional customization of how memory is partitioned, you'll have to edit the linker scripts to tell the linker where it can and can't put code/data/stack.
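
The arithmetic behind this trick can be sketched in Python. The sketch models only the behaviour described above (each CPU's region starts at its exception address and ends where the next CPU's exception address, or the end of the memory, begins). It is not the IDE's actual linker-script generator, and the function name is made up.

```python
def regions_from_exception_addrs(mem_start, mem_end, exc_addrs):
    """Map each exception address to the (start, end) range its CPU would use.

    mem_end is inclusive, matching the 0x1000000..0x1ffffff notation above.
    """
    addrs = sorted(exc_addrs)
    if not addrs or addrs[0] < mem_start or addrs[-1] > mem_end:
        raise ValueError("exception addresses must lie inside the memory")
    regions = {}
    for i, start in enumerate(addrs):
        # The region ends just before the next CPU's exception address,
        # or at the end of the memory for the last CPU.
        end = addrs[i + 1] - 1 if i + 1 < len(addrs) else mem_end
        regions[start] = (start, end)
    return regions


# The example above: 16MB SDRAM, CPU #1 at the start, CPU #2 halfway.
layout = regions_from_exception_addrs(0x1000000, 0x1FFFFFF,
                                      [0x1000000, 0x1800000])
for start, end in layout.values():
    size_mb = (end - start + 1) // (1024 * 1024)
    print(f"0x{start:07x}..0x{end:07x}  ({size_mb} MB)")
```

With the addresses from the example, each CPU ends up with an 8MB range, and moving the second exception address changes the split without touching a linker script.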

 

some conceptual steps for booting code for two cpus out of the same epcs chip: This is just a concept; we haven't tried it yet and it would require a bit of development to get right:

 

1. Designate a 'master' and 'slave' CPU.

2. Have the master boot out of EPCS, and the slave boot out of some other on-chip memory which directs the slave to spin in a loop until a register is set (by the master CPU, at the appropriate time). This could be done in assembly language if you want to keep the slave's boot memory ultra small. Of course, after the boot process this memory's job is done, so you could use it for your processor-message box or something else that's useful.

3. Ensure that the code/data spaces of the application code for each CPU don't trample each other (as outlined above, either with exception address placement, or custom linker scripts).

4. Build the application software for the slave CPU, and get an .elf file for it.

5. (Here is where we diverge from "have done" into "could be done"): Use a tool such as nios2-elf-objcopy to create a hex representation of the .elf that was compiled for the slave CPU. Then, suck this data into some initialized memory *for the master CPU* via a perl script or something similar (really what would be done is to have the perl script create a .c file with a big C array of data which contains the hex dump of the other .elf file).

6. Build the master CPU's code. Its .elf file will now load both master and slave CPUs' code & data into memory!

7. Program this .elf file into the EPCS chip and have the EPCS boot loader copy it into memory.

8. When the master CPU wakes up, have it poke a register somewhere that the slave is aware of -- when this happens, your slave's boot program should direct it to jump to its application code at whatever address you linked the software to.
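
Step 5 could be sketched roughly as follows, in Python rather than perl. This assumes the slave's .elf has already been converted to a raw binary image (for example with nios2-elf-objcopy -O binary) instead of a hex text file, so the master's code would also need to know the address the image was linked for. The array name slave_image and the helper name are made up for illustration, and as with the rest of these steps, none of this has been tried on hardware.

```python
def bytes_to_c_array(data, name="slave_image", per_line=12):
    """Render raw bytes as C source defining a const byte array and its length."""
    if not data:
        raise ValueError("refusing to emit an empty array")
    lines = [f"const unsigned char {name}[] = {{"]
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"const unsigned int {name}_len = {len(data)};")
    return "\n".join(lines) + "\n"


# Stand-in data; in practice this would be the slave's binary image,
# e.g. open("slave.bin", "rb").read() (the file name is hypothetical).
source = bytes_to_c_array(bytes([0x00, 0x01, 0xFE, 0xFF]))
print(source)
```

The generated .c file would be added to the master CPU's project, and the master would copy the array to the slave's link address before releasing the slave in step 8.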

 

The big disadvantage in the above process (besides getting it going in the first place) is that any change in the slave CPU's code will require the process to be repeated in order to get the EPCS chip reprogrammed.

 

Whew, that is a bit of info to slog through, just reading back to myself what I typed! Like I say, the above will require a bit of work to get done. Again, this is all applicable only for the immediate future. Multi-CPU support in a nice clean manner is yet to come, although it is way too early to say whether we'll have a clean mechanism for booting multiple CPUs out of the same EPCS chip.
Altera_Forum

Hey Jesse, 

 

Boy, I hope you can type 70+ wpm!  

 

Where there is a will there is a way. I think you guys are plenty smart to figure out a clean multi-cpu system. Softcores and FPGAs cry out for multi-cpu solutions.

 

Maybe you build and link the system much like you do a single cpu, but use compiler directives to assign cpus (much like you can assign code and variables to memories).

 

Or just bite the bullet and add the necessary logic to the linker to place multiple builds into N shared (and some non-shared) memories. You could start with a simple first come first served algorithm for partitioning - much like the manual linker script editing.

 

Anyway, any company that can create the SOPC builder can crack this nut. Think about the complex interconnections and dependencies that are specified there with so little effort on the user's part.

 

I think I'll stick with my multiple NiosI system architecture for now. It's basically what you suggested, with a master running out of sdram and the slaves running super tight code within onchip mem. Although I guess I need a way to place it. I guess I build a cpu_2.hex, cpu_3.hex, etc. Then they check their mailboxes for commands and data.

 

Could your engineers write a tight reset loop that consumes say one M4k block (like your cool bootloader) and loops there until the master loads their code somewhere and then writes a jumpto address to break that tiny loop? 

 

Actually I guess I could write it since I wrote an srec loader. That was in C and a lot more than 1 M4k! The loop and jumpto in asm would be nothing though. 

 

One thing you could pass on is that not everyone has the largest Stratix device to hold large full service footprints. :)

 

Thanks, 

Ken