Nios® V/II Embedded Design Suite (EDS)

ISR driven UART is missing characters (even at 9600bps... same prob at 230400)

Honored Contributor II

Any ideas on what I'm missing here that is causing me to miss characters?  


I'm interfacing to a device that shoots out RS232 packets (up to 200 per second if the bps can handle it; I can select the bps, but the framing is always N-8-1). Right now, I have the device (and the UART) running at 9600bps and locked to 5 packets per second, although this problem also happens at 115200 and 230400, and is terrible at 460800bps (which is where I need to get).


The UART seems to be missing characters, especially towards the end of the 'packets', which are around 100 bytes long. Towards the front of the packet I seem to get everything, and towards the rear it starts to miss them. The ISR itself generally takes 500ns to execute; that doesn't include the call latency, but I can't see that being a problem at 9600bps. Also, the ring buffer is 4k in size. I set a breakpoint so that I can debug after it gets 1024 characters, and characters are already missing from the ring buffer at that point, so it can't be a wrapping/overflow problem... grrrr.
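To put numbers on the timing argument above: with N-8-1 framing each character occupies 10 bit times on the wire (1 start, 8 data, 1 stop). The sketch below is a minimal host-side calculation, not code from the original post. The function names are hypothetical, and the 50 MHz CPU clock used in the usage note is an assumption, since the thread never states ALT_CPU_FREQ.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character for N-8-1: 1 start + 8 data + 1 stop. */
#define BITS_PER_CHAR_8N1 10u

/* Time occupied by one character, in nanoseconds (truncated). */
uint32_t char_time_ns(uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)((BITS_PER_CHAR_8N1 * 1000000000ull) / baud);
}

/* CPU clock cycles that elapse between back-to-back characters (truncated).
 * cpu_hz is whatever clock the processor actually runs at. */
uint32_t cycles_per_char(uint32_t cpu_hz, uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)cpu_hz * BITS_PER_CHAR_8N1) / baud);
}
```

Under the assumed 50 MHz clock, char_time_ns(9600) is roughly 1.04 ms and cycles_per_char(50000000u, 460800u) is roughly 1085 cycles. A 500ns handler body is small next to the character time at every rate discussed, so whatever loses characters has to cost far more than the handler body itself.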


Here's how I'm setting up my code in main to test this problem: 

alt_irq_init (ALT_IRQ_BASE); 


unsigned char ch; 

unsigned int baudrate=9600; 



//register the ISR handler for UART 0




//register the ISR handler for UART 1 




//enable Interrupts for UART 0 



//enable Interrupts for UART 1 




Then there's the big 'while(1)' in main():

while (1) 

{ //update heartbeat on JP5 header 

jp5tracker ^= 0x01;//D0 on L.A. 



{ //toggle LEDG1 when characters are processed from VN100 plugged into the HSMC-ICB 

greenLEDtracker ^= 0x02; 



jp5tracker ^= 0x02;//D1 on L.A. 


i++;//counter to set a breakpoint with 


Here's the ISR handler and some helpers for the ring buffer.


void UsartInit_2(unsigned int BaudRate_2) 

unsigned int divisor_2; 

divisor_2 = (ALT_CPU_FREQ/BaudRate_2)+1; 




void IsrUsart_2(void* context_2, unsigned int id_2) 

IOWR_ALTERA_AVALON_PIO_DATA(JP5_BASE, 0xFF);//inserted to time/track this routine 

//grab data byte and save it 


//move/wrap buffer pointer 

if (USART_RX_buffer_2 == '*') 


if (USART_RX_hd_2 == USART_BUFFER_SIZE_2) USART_RX_hd_2 = 0; 

USART_RX_full_2 = (USART_RX_hd_2==USART_RX_tl_2); 

IOWR_ALTERA_AVALON_PIO_DATA(JP5_BASE, 0x00);//inserted to time/track this routine 


unsigned char UsartIsEmpty_2() 

return ((!USART_RX_full_2) && (USART_RX_hd_2==USART_RX_tl_2)); 


//this grabs a single character from the RX circular buffer. 

unsigned char GetUsart_2(void) 

unsigned char rxChar_2; 

//should never call this w/o checking status, but it may happen. this keeps

//the ring buffer sane during that sort of abuse. 

if (((!USART_RX_full_2) && (USART_RX_hd_2==USART_RX_tl_2)))  

return 0x00; 



if (USART_RX_tl_2 == (USART_BUFFER_SIZE_2))  



return rxChar_2;  
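Several lines of the listing above did not survive in the post, so here is a minimal, self-contained sketch of the same head/tail/full-flag ring buffer idea for comparison. It is not the poster's code: the names rb_t, rb_put, rb_get, and rb_is_empty are hypothetical, and it is written so it can be exercised on a host PC without any UART.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SIZE 4096u   /* same 4k size the post mentions */

typedef struct {
    volatile uint8_t data[RB_SIZE];
    volatile size_t  head;  /* next slot the producer (ISR) writes */
    volatile size_t  tail;  /* next slot the consumer (main loop) reads */
    volatile bool    full;  /* tells full from empty when head == tail */
} rb_t;

bool rb_is_empty(const rb_t *rb)
{
    return !rb->full && rb->head == rb->tail;
}

/* Producer side: store one byte. Returns false (byte dropped) when full. */
bool rb_put(rb_t *rb, uint8_t byte)
{
    assert(rb != NULL);
    if (rb->full)
        return false;
    rb->data[rb->head] = byte;
    rb->head = (rb->head + 1u) % RB_SIZE;
    rb->full = (rb->head == rb->tail);
    return true;
}

/* Consumer side: fetch one byte. Returns false when nothing is buffered.
 * NOTE: 'full' is written by both sides, so on a real target the consumer
 * would need to mask the UART interrupt around this update. */
bool rb_get(rb_t *rb, uint8_t *out)
{
    assert(rb != NULL && out != NULL);
    if (rb_is_empty(rb))
        return false;
    *out = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) % RB_SIZE;
    rb->full = false;
    return true;
}
```

Returning a status from rb_get, instead of returning 0x00 on an empty buffer, lets the caller tell "no data" apart from a received NUL byte. Whether that matters depends on the packet format.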


Anyway, if I write some debug code on a PC to send data, then as long as I insert some 'sleeps' between characters it never misses a character. Also, I can watch the ISR's debug 'flip' of JP5: the ISR does get called and the circular buffer does get populated; it is just missing characters. I have verified (with a PC's serial port) that the packets are being sent properly and contain all the data.



Here's what I'm using: 

DE2-115 development board (Cyclone IV) with HSMC-ICB (Altera's INK)

Quartus 13.0 SP1 with EDS, the UART is:

UART (RS-232 Serial Port) 

Name altera_avalon_uart 



Some BSP settings: 

enable_small_c_library: 'checked'; 'nocheck' for enable_gprof, enable_reduced_device_drivers, and enable_sim_optimize

'none' for sys_clk_timer, and timestamp_timer
Honored Contributor II

Here is a screenshot of the problem occurring. The green trace is the input waveform at 9600bps, and the ISR line is a logic analyzer pod on the line that the ISR drives high on entry and low on exit, so not a lot of time is being spent there. Around 80% of the characters are captured into the circular buffer. Something similar happens at 115200.

Honored Contributor II

I can't hand-decode the serial data from the scope image (too small), but it might be worth doing it. 

Have you also checked that all the receive bytes are correct? 


It might be that the receiver is losing sync with the data stream (i.e. missing a start bit) and then treating a mid-byte bit as a start bit.

One possibility is that the clock being fed to the UART is not accurate enough. Typically a UART uses a 4x or 16x clock in order to allow for differences in the clock speeds.

For continuous receive, a slightly fast clock is probably a lot better than a slightly slow one, as it generates a small amount of line-idle between the stop bit of one character and the start bit of the next.
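The clock-accuracy question can be checked with a little arithmetic. The sketch below is hedged throughout: it assumes the UART core produces baud = clock / (divisor + 1), which should be verified against the UART core documentation for the version in use, and the 50 MHz clock in the usage note is also an assumption because the thread never states ALT_CPU_FREQ. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Baud rate the core would generate for a given divisor register value.
 * ASSUMPTION: baud = clk / (divisor + 1); check the UART core docs. */
uint32_t uart_actual_baud(uint32_t clk_hz, uint32_t divisor)
{
    return clk_hz / (divisor + 1u);
}

/* Signed baud error in hundredths of a percent (truncated toward zero).
 * Negative means the receiver's clock is slower than the sender's. */
int32_t baud_error_centipercent(uint32_t actual, uint32_t wanted)
{
    assert(wanted > 0);
    return (int32_t)((((int64_t)actual - (int64_t)wanted) * 10000)
                     / (int64_t)wanted);
}
```

Under those assumptions, the expression from the first post, (ALT_CPU_FREQ/BaudRate_2)+1, gives a divisor of 109 at 460800bps. That works out to roughly 454545 baud, about 1.4% slow, whereas a divisor of 107 would be about 0.5% fast. At 9600bps the same expression is only a few hundredths of a percent off, so the divisor alone would not explain losses at the low rate.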
Honored Contributor II

Thanks dsl.  


I think that the UART CORE is spotting the start-bits--mostly. (there's never garbage in the ring-buffer). But, I'd expect framing errors if it was seeing activity on the bus and not getting start/stop bits where the CORE expected them. But I don't get any "Frame Errors" in the status registers. And based on the documentation of the module, I don't see a way to disable that feature, so it should be capable of generating them if I unmask/configure them. 


What I'm observing is that HyperTerminal on Windows (and cutecom on Linux) never misses anything from the device I'm trying to use with the FPGA, even up to ridiculous baud rates like over 900K. The DE2-115 board's level converters aren't rated for those speeds, but 9600bps and 230400bps are well within the 250kbps rating of the devices. I'm really starting to lose hair on this...
Honored Contributor II

Can you feed the IRQ line itself out? 

Or use signaltap to trace the information inside the fpga? 


It might be that something is disabling interrupts - although the long latencies at only 9600 baud do look 'iffy'.


I'd suggest using a hardware receive fifo instead of a software ring buffer - especially if you want to run at high baud rates. 

Depending on what else the nios cpu is doing you may not even need to take a rx interrupt - just look at the fifo status where the code currently looks if anything has been written to the ring buffer.
Honored Contributor II

We are still evaluating, so still using the web-edition NIOSII and IP. So not much hardware debug info available. 


I've tried just pasting stuff into HyperTerminal at 115200, and the FPGA loses characters at roughly the same rate as it does with the little device I'm trying to listen to. cutecom has a setting to insert a delay between characters: if I set it to 2ms at 115200 the FPGA gets every character, and at 230400 I have to raise it to 3ms to avoid problems.



Since the code that I've provided seems to work correctly at a lower rate, maybe the problem is in how I'm 'setting up' the hardware. I'm using the BSP/project template for 'hello world freestanding', and as part of that I call


alt_irq_init (ALT_IRQ_BASE); 



and that is before I start my code: 




I think I'm going to try ripping out all the hardware except for the JTAG uart, the uart I'm using and one timer next.
Honored Contributor II

OK, did the above and recompiled the code, and now every time I run through the ISR I check the flags. This is what I get after I try to send a 500-byte file via HyperTerminal's 'send text file' at 115200bps (N-8-1):



I decode this to be (right to left) 

ROE (receive overrun error) 

TMT (Transmit Empty) 

TRDY (Transmit Ready)

RRDY (Receive Ready)

E (exception) 
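The hand decode above can be mechanized. The sketch below is a host-testable helper with hypothetical names. The bit positions are an assumption based on the usual altera_avalon_uart status register layout and should be checked against altera_avalon_uart_regs.h in the BSP. The raw status value from the post did not survive, so 0x01E8 in the usage note is simply what the listed flags would add up to under that assumed layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED status bit layout; verify against altera_avalon_uart_regs.h. */
typedef struct { uint16_t mask; const char *name; } status_flag_t;

static const status_flag_t k_status_flags[] = {
    { 0x0001u, "PE"   },  /* parity error */
    { 0x0002u, "FE"   },  /* framing error */
    { 0x0004u, "BRK"  },  /* break detect */
    { 0x0008u, "ROE"  },  /* receive overrun error */
    { 0x0010u, "TOE"  },  /* transmit overrun error */
    { 0x0020u, "TMT"  },  /* transmit shift register empty */
    { 0x0040u, "TRDY" },  /* transmit ready */
    { 0x0080u, "RRDY" },  /* receive character ready */
    { 0x0100u, "E"    },  /* exception: OR of the error bits */
};

/* Writes the names of all set flags (LSB first, space separated) into out
 * and returns how many flags were set. Names that do not fit are cut off. */
int uart_status_decode(uint16_t status, char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    int count = 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof k_status_flags / sizeof k_status_flags[0]; i++) {
        if (status & k_status_flags[i].mask) {
            if (count > 0)
                strncat(out, " ", out_len - strlen(out) - 1);
            strncat(out, k_status_flags[i].name, out_len - strlen(out) - 1);
            count++;
        }
    }
    return count;
}
```

With this layout, uart_status_decode(0x01E8, buf, sizeof buf) reports ROE, TMT, TRDY, RRDY, and E. ROE is the interesting one: it means a new character finished arriving before the previous one had been read out of the receive register.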


So, it appears that the NIOS2 has an IRQ latency of around 2-3ms?
Honored Contributor II

The JTAG debugger uses interrupts, NFI how often or how long they'll run with interrupts disabled. 

The nios cpu itself is capable of almost zero interrupt latency (use a separate register bank for the ISR).

The Altera HAL interrupt entry/exit code is another matter entirely! There could be a lot of instructions between the cpu taking the interrupt and your ISR actually running. 

OTOH 2-3ms does seem an awful lot.
Honored Contributor II

I have the same problem with that code in 'run' or 'debug' mode.  


Using the UART with FIFOs from the wiki pages and putting a 512-character buffer on it solves the problem. I also swapped to just polling the RXRDY flag in a background loop, so now there are only about 800 'events' and no ISR overhead. I'm getting all the characters and still have quite a bit of idle time on the processor, whereas before it was always overwhelmed by ISR calls, even at 9600bps. The design now runs flawlessly at 230400bps.
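The polling approach described above can be sketched as follows. This is not the wiki UART's API: uart_rx_port_t and its two callbacks are a hypothetical abstraction, chosen so the drain loop can be tested on a host with a simulated FIFO. On the target, the callbacks would wrap whatever status and data register accessors the FIFO UART actually provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a FIFO-backed UART receiver. */
typedef struct {
    bool    (*rx_ready)(void *ctx);  /* true while the FIFO holds data */
    uint8_t (*rx_read)(void *ctx);   /* pops one byte from the FIFO */
    void     *ctx;
} uart_rx_port_t;

/* Background-loop drain: copy up to max buffered bytes, return the count. */
size_t uart_poll_drain(const uart_rx_port_t *port, uint8_t *dst, size_t max)
{
    assert(port != NULL && dst != NULL);
    size_t n = 0;
    while (n < max && port->rx_ready(port->ctx))
        dst[n++] = port->rx_read(port->ctx);
    return n;
}

/* Simulated FIFO so the loop can be exercised without hardware. */
typedef struct { const uint8_t *bytes; size_t len; size_t pos; } fake_fifo_t;

bool fake_rx_ready(void *ctx)
{
    const fake_fifo_t *f = ctx;
    return f->pos < f->len;
}

uint8_t fake_rx_read(void *ctx)
{
    fake_fifo_t *f = ctx;
    return f->bytes[f->pos++];
}
```

The design choice is that the hardware FIFO absorbs bursts, so the software only has to visit the UART often enough to keep the FIFO from filling. At 230400bps N-8-1 that is 23,040 characters per second, so a 512-character FIFO takes about 22 ms to fill, a far looser deadline than one character time.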


I was reading somewhere that the NIOS2 is actually quite slow compared to a lot of the MCUs that I'm used to using: ISR response on the order of 200 clocks vs much lower latency for things like AVR/PIC/TMS/etc. So, it would seem like I've re-discovered the wheel.


I just can't believe that the ISR performance would be so terrible though...
Honored Contributor II

If you configure the nios to have all its code and 'normal' data in tightly coupled memory, ensure everything is compiled with -O3, and avoid using any of the Altera HAL/libc code then you can do quite a lot with a 100MHz cpu (and run 1 instruction almost every clock). 

It does take a little effort to stop gcc generating code that has pipeline stalls following memory reads (I don't think the model that gcc is working on is correct).

There is also a hidden config menu for the nios cpu that lets you (amongst other things) generate a /f without the dynamic branch prediction logic. This is needed if you really want to guarantee the execution time.

The altera docs do not make it 100% clear that the cpu stalls for the full duration of any Avalon cycle - even writes are not 'posted'. 

The only undocumented stall I've found is a single cycle stall when a read from a tightly coupled data block immediately follows a write to the same block (both addresses would have to be presented at the same time!).
Honored Contributor II

Oh - the ISR performance 'problem' is probably all in the code that runs after the cpu takes the interrupt and before your ISR runs. 

If you write your own ISR entry/exit you'll probably do a better job!
Honored Contributor II

I'm just starting with these tools, but that does make sense... it is behaving like a stall/flush. I was trying to get the baseline running before I started optimizations to add more features. 


I'm wondering if I need to swap the processor to a Nios II/S and lower the amount of instruction cache. Refilling that would slow ISR performance too, but the II/F runs really fast in a straight line. It takes about 30 minutes to Generate/Recompile before I can run my timing tests, so I need to implement some other things first and then I can come back to this one and optimize it.
Honored Contributor II

Hi David, 

I don't understand your problem, but your UART code was the first that worked for me.

Thank you. I had a problem in Quartus 12 with NIOS2 EDS 12: with #include <sys/alt_irq.h>, these calls were not recognized:


alt_irq_register(UART1_IRQ,&context_uart1,IsrUart1 ); // install UART1 ISR 

alt_irq_enable (UART1_IRQ);  


When I use your init_uart and handler routine, it works:

// UART1 interrupt handler

void IsrUart1(void* context, unsigned long id) 

int ch; 



// printf("In the interrupt=%X\n",ch); 

// if a character has arrived, read it



// printf("Reading character=%X\n",ch); 


if (ch==GPS_PREFIX) rx_start=1; 

if (rx_start==1) 

{ rx_buf[rx_cnt]= ch; 

if (rx_cnt<RX_BUF) rx_cnt++; 

if (rx_cnt==RX_BUF) { rx_flag=1; rx_start=0; } 


// UART1 interrupt initialization

void init_uart1() 

alt_irq_init (ALT_IRQ_BASE); 

// alt_sys_init(); //  


// set 9600 Bd for a NIOS2 running at 80 MHz




// enable the UART1 interrupt






Jan, Czech Republic
Honored Contributor II

Glad it could help you. And just like dsl said, my problem was all in how the ISR was handled. The code was fine; I had effectively shot myself in the foot with the NIOSII configuration (lots of cache, etc.). That's what makes problems like mine so complicated to fix: there was nothing wrong with the 'C', it was in how I had configured the NIOSII and the UART. By enlarging the buffer and leaving the CPU alone, I was able to run all the way up to 250Kbps, which is the limit of the DE2-115 level converter, i.e. as fast as my hardware could support. The FPGA/NIOSII/UART could have gone faster.
