I'm looking to integrate my Galileo into a publish-subscribe sensor framework I work on. I'm using an Arduino sketch to get the sensor measurements, then a system call to run a Python script that generates the reading and publishes it to the network.
But it seems quite slow on the Python front. I appreciate it's a slower processor than I might be used to, but I still thought Python would be pretty quick. Even just a hello world print takes a while (maybe a second). Publishing the sensor reading on top of this adds about 2 seconds, so that's quite a long time per reading for me.
Wondered if I'm doing something wrong, or is that about right? I'm using a pretty recent IoT dev image, and have done an opkg update and upgrade.
Wondered if I'm doing something wrong, or is that about right?
I think you are just about right. When developing one of my projects ("Arduino Car - Controlled by Intel Galileo", https://youtu.be/8kUqpHOWR6Q, which you can find at http://fernando.bl.ee/embed-rt/index.html, "ftinetti - Real-Time & Embedded Systems"), I had to replace the Python HTTP handling with my own HTTP server (the car received each control command 1, 2 or even 3 seconds late, which made it impossible to drive).
In general, I think the combination of Python and an Arduino sketch does not work very well for handling events (e.g. sensor readings) in a timely manner.
I believe the main reason Python is delayed so much is how it is called: making a system call from an Arduino sketch might slow it down a little, but it also depends on the complexity of the Python code.
If you wish to improve the performance of the code, my best suggestion is to unify your code. If you can translate the Python code to Arduino, or even better translate the Arduino part to Python, this might bring a major performance improvement.
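If a full rewrite is too much, one way to stop paying the interpreter start-up cost on every reading is to keep a single Python process running and feed it one reading per line. This is only a rough sketch: the names `serve` and `publish` are made up for illustration, `publish` is a placeholder for whatever actually sends the message, and how the sketch delivers the lines (a pipe, a FIFO, a file) is an assumption.

```python
def publish(reading):
    """Placeholder for the real publish step (hypothetical, e.g. an MQTT call)."""
    print("published: %s" % reading)


def serve(lines, publish_fn=publish):
    """Publish one reading per non-empty input line; return how many were sent."""
    count = 0
    for line in lines:
        reading = line.strip()
        if not reading:
            continue  # ignore blank lines between readings
        publish_fn(reading)
        count += 1
    return count


# To run as a long-lived process, add at the bottom of the script:
#     import sys
#     serve(sys.stdin)
```

The interpreter and any heavy imports are then loaded once, and each reading only costs the work done inside `publish`.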
Yeah, that would make sense, but I'm doing my testing by calling Python from the command line in a Linux session over SSH. I even simplified the test case down to a hello world print statement, and that takes about 1 second, maybe 2 seconds, just to print. Then things get worse the more of the full Python code I run: with an RDF library to construct the message and MQTT to send it, it takes about 9 seconds when I profile the code.
Not sure yet whether it's the performance of the board or something odd specific to the IoT image. I've got the same functionality in Java code too, so I tried that last night, and that was also slow. I tried running the Python code via Cython in case that might help; still slow. My next thought is to try something similar compiled from C# and see how that goes, and possibly to try it on the smaller 50 MB Linux image too.
The only other idea I have is to knock up a quick VM and limit the CPU there to see if it's just CPU related. Open to ideas though!
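For anyone wanting to separate the two costs mentioned here, this is a small sketch that times interpreter start-up and module import separately. The function names are made up for illustration, and `json` is only a stand-in module; swap in whichever RDF or MQTT module you actually import.

```python
import importlib
import subprocess
import sys
from timeit import default_timer


def time_startup(runs=3):
    """Average seconds to start a fresh interpreter that exits immediately."""
    start = default_timer()
    for _ in range(runs):
        subprocess.call([sys.executable, "-c", "pass"])
    return (default_timer() - start) / runs


def time_import(module_name):
    """Seconds to import a module (near zero if it is already loaded)."""
    start = default_timer()
    importlib.import_module(module_name)
    return default_timer() - start


def report(module_name="json"):
    """Print both timings; 'json' is a stand-in for the real dependency."""
    print("startup: %.3f s" % time_startup())
    print("import %s: %.3f s" % (module_name, time_import(module_name)))
```

Calling `report()` on the board gives a rough idea of how much of the delay is start-up rather than the script's own work.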
I also see that it is not only the system call from Arduino code that makes it slow; even in interactive mode it is quite slow to respond (e.g. I was hoping to use OpenCV to capture images, but just loading the module with 'import cv/cv2' took more than 5 seconds in interactive mode). I think the IoT image ships with too many packages, and with all the daemons running in the background it is too resource-intensive. Maybe by uninstalling some of the default packages in this image that we're not going to use (e.g. Wyliodrin, Node.js, OpenCV, etc.) we could increase performance?
Yeah, I wondered if it might be something like that: the board's got fewer resources in general, so maybe that image overstretches it. I looked at what was using CPU via the top command, and the big hitters were really the sketch being called and Node.js. I uninstalled a bunch of stuff and didn't really notice much performance difference. Java performance seemed worse, so I'm wondering if it's something around those languages (I'm going to try some compiled stuff) or the board spec itself. I'm not convinced yet that it's CPU though: I can run the JavaCV build of OpenCV (so even more overhead) on a standard Pi (so 700 MHz) and get a few frames per second. OK, not rocket fast, but certainly quicker than 5 seconds just to get the imports.
Well, now we're narrowing it down a bit; it doesn't seem to be the CPU. The remaining suspects are either SD card access, since the great majority of cards are slower than a pendrive, or memory management when running the IoT image. I suspect it is related to one of those two: when I was running mjpg-streamer (webcam HTTP streaming) in the IoT image, there were 7 seconds of delay in the stream, while running mjpg-streamer from a smaller image gave only 3 seconds of delay. I'm going to get those images running again and see if top shows anything different.
Run the command:
systemctl list-unit-files | grep enabled
The output will be all the services that are currently enabled on your Galileo. You can disable all that you are not using, that'll save you CPU time that can be used on your script.
Also, have you run top while your script is in the background? How does it behave? Does it make a difference if you change the priority of the script using top?
This is the output in the IoT image:
This output is already without Node.js, which I had uninstalled along with its dependencies and dependents just before. It made an enormous difference, at least for the C program running in conjunction with my script, which now has only 3 seconds of delay. top shows the process using 46-70% of the CPU if I keep its niceness at 0. If I change its niceness to -20, top shows CPU usage by the process going as high as 96%, but the delay stays the same. To me it's evident that Node.js uses a lot of CPU, which interferes with C programs. Python scripts, on the other hand, seem to have some other issue, since their performance stays pretty much the same, which I assume is down to Python being an interpreted language.
I believe you are right, Python has that disadvantage since it is interpreted. But you might be able to improve the Galileo's performance a little more if you disable all the services that you don't need from the list above. For example, I believe your project does not use Bluetooth, so you can disable the Bluetooth service with the following command:
systemctl disable bluetooth.service
You can do that with all the services that you don't need while your script is running.
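If you want to keep a record of what is enabled before you start disabling things, a small helper like the one below could do it. This is just a sketch: the function names are made up, and it assumes the listing has the unit name in the first column and the state in the second, as `systemctl list-unit-files` prints it.

```python
import subprocess


def parse_enabled(listing):
    """Return unit names whose state column reads 'enabled'."""
    units = []
    for line in listing.splitlines():
        parts = line.split()
        # Header and footer lines never have 'enabled' as the second field.
        if len(parts) >= 2 and parts[1] == "enabled":
            units.append(parts[0])
    return units


def list_enabled_units():
    """Ask systemctl for all unit files and keep only the enabled ones."""
    output = subprocess.check_output(
        ["systemctl", "list-unit-files", "--no-pager"])
    return parse_enabled(output.decode("utf-8", "replace"))
```

Calling `list_enabled_units()` on the board returns a plain list of names that you can save to a file and compare after each round of `systemctl disable`.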
Sorry, I've been a bit slow: a few other projects on the go, and a few hurdles along the way.
I tried xbolshe's image, as I was curious whether a newer kernel and a lightweight setup made much difference. The difference is quite substantial: just calling the Python print "hello" test was quicker, hard to be precise but around 0.5 s rather than the 1 s before. The bigger test, which creates an RDF message and publishes it via MQTT, took around 9 seconds on the previous image and takes around 5 seconds now.
Checking top while this ran, the Python process used between 80% and 90% of the CPU during that time (unfortunately RDF messages seem quite expensive; I can use JSON, but it's not a bad benchmark to see the difference).
In this image there's very little enabled in systemctl; it looks to me like quite a bare minimum (sketch, getty, network).
I'll try to compile a C# executable that does something similar. My guess is that with the interpreter out of the way, it might be pretty quick.
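For comparison, the JSON alternative mentioned here needs nothing beyond the standard library. A minimal sketch follows; the function name and the field names (`sensor`, `value`, `timestamp`) are purely illustrative and not the framework's real message format.

```python
import json
import time


def build_payload(sensor_id, value, timestamp=None):
    """Serialise one reading as a JSON string (field names are illustrative)."""
    if timestamp is None:
        timestamp = time.time()  # seconds since the epoch
    message = {"sensor": sensor_id, "value": value, "timestamp": timestamp}
    return json.dumps(message, sort_keys=True)
```

Timing this against the RDF version on the same board would show how much of the 80-90% CPU is message construction rather than the MQTT send.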
Scrub that, C# needs .NET and running via Mono is probably going to be even worse performance.
Edit 2: I've gone back to the standard IoT image I was using before; having to set up all the build tools was going to drive me crazy! I turned off nearly all non-essential services and watched top for a while to see if anything ran occasionally and consumed a chunk of CPU. Now, creating and sending the message seems to take around 4 seconds. So, a good improvement overall, but not quite in the ballpark I need for real-time messaging, especially as I wanted to call this from a sketch. I still think some compiled code might be a lot quicker; I'll see if I can knock something up in C++.
So, a good improvement overall, but not quite in the ballpark I need for real-time messaging, especially as I wanted to call this from a sketch. I still think some compiled code might be a lot quicker; I'll see if I can knock something up in C++.
I don't think real time is an area the Galileo is aimed at.
Yeah, I also think that compiled C/C++ code will give better results on your project, especially since it concerns real-time (or near-real-time, haha) messaging. I think that anything with an intermediate step between the source code and the code the processor executes, such as bytecode in Python and Java (maybe .NET too, I'm not into C#), will cause a general slow-down on the Galileo. As @FGT said, the Galileo is probably not designed for real-time work. If you want to delve into that, I'd suggest using the Edison.
@Intel_Peter Yes, that's exactly what I'm doing. I'm disabling the running processes I don't need and making a list of them in case I ever stumble upon a similar situation. The improvements so far are relatively small, but performance is much better than before. Just going a bit off-topic, would there be any way to build a new Galileo image from the IoT image but selecting resources by exclusion (IoT image minus Node.js, Bluetooth, Wyliodrin, etc.)?
I still need to add the full functionality for a like-for-like comparison against the equivalent Python/Java code, but, the C++ version I've knocked together is hugely faster, coming in at under a second. It will be interesting to see whether calling it from the sketch slows things down, but the performance is manageable now at least.
but, the C++ version I've knocked together is hugely faster, coming in at under a second
gabrielw6, that makes sense; getting rid of the interpreter was expected to increase the speed of the script. Getting it to send messages in under a second is impressive, but I would expect an increased delay when calling it from a sketch.
would there be any way to build a new Galileo image from the IoT image but selecting resources by exclusion (IoT image minus Node.js, Bluetooth, Wyliodrin, etc.)?
vinnieb, have you checked the Intel® Quark™ BSP 1.2.0 (https://downloadcenter.intel.com/download/23197/Intel-Quark-BSP)?