
Custom Linux is 1000x slower

Honored Contributor II

I am using an Altera Cyclone V SoC dev board. 



I built a custom Linux platform with Yocto and finally got everything working well enough to run my AOCL program. However, my FPGA program now runs about 1000x slower than it does on a prebuilt image. I also get the following error on boot, which I am guessing is related to the poor performance: 

hwclock: can't open '/dev/misc/rtc': No such file or directory
INIT: Entering runlevel: 5


There are no rtc files on the board as far as I can tell (find / -name "*rtc*" returns nothing). 



Two questions then: 


  1. Any ideas why I am taking a 1000x performance hit using my custom Linux build? 

  2. If I am right that the hwclock/rtc error is the culprit, any ideas on how to fix it? Do I need to add something to my Yocto build or device tree? 

1 Reply
Honored Contributor II



I've modified my Linux image and fixed the issue with the `rtc`; however, this did not help the performance! 


Some interesting behaviour, though: when I launch my AOCL kernel multiple times, the first launch has an execution time somewhere in the range `0 < t < 1000ms`, and all subsequent launches report times of almost exactly `t=1000ms`. This leads me to believe that there is a `1s` clock somewhere that is responsible for the performance hit. My thinking is that the first kernel launch can fall anywhere in the clock cycle and returns on the next edge, hence `0 < t < 1000ms`, whereas each subsequent launch starts on a clock edge and returns on the following one, hence `t=1000ms`. 
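To sanity-check that reasoning, here is a minimal Python sketch of the hypothesis. It is only a model, not a measurement: it assumes the kernel really finishes in about `8.5ms` (the figure from the prebuilt image), that completion is only noticed on a `1s` tick, and that each launch starts as soon as the previous one returns. The names `observed_times`, `PERIOD_MS` and `KERNEL_MS` are made up for illustration, and the model says nothing about where such a tick would come from.

```python
import math

PERIOD_MS = 1000.0  # assumed period of the unknown 1 s tick (hypothesis only)
KERNEL_MS = 8.5     # assumed true kernel time, taken from the prebuilt-image run


def observed_times(first_launch_ms, launches,
                   kernel_ms=KERNEL_MS, period_ms=PERIOD_MS):
    """Return the time each back-to-back launch appears to take, in ms.

    Each kernel really finishes after kernel_ms, but the host is modelled
    as noticing completion only on the next multiple of period_ms.
    """
    times = []
    start = first_launch_ms
    for _ in range(launches):
        done = start + kernel_ms
        # First tick at or after the real completion time.
        noticed = math.ceil(done / period_ms) * period_ms
        times.append(noticed - start)
        # The next launch begins as soon as this one is reported complete.
        start = noticed
    return times


if __name__ == "__main__":
    # First launch arrives 350 ms into a tick period.
    print(observed_times(350.0, 4))  # [650.0, 1000.0, 1000.0, 1000.0]
```

Under those assumptions the model reproduces the pattern: the first time depends on where the launch lands in the period, and every later launch takes a full period.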


I also tested the `vector_add` example on both my custom image and the prebuilt image (16.1). On my custom image it reports a variable kernel time in the range `0 < t < 1000ms`, like I was seeing with my own kernel. On the prebuilt image I get consistent performance of around `t=8.5ms`. 


I would love to hear your thoughts and suggestions! :)