Hi,
We have machines with 2x Xeon Phi cards.
When I run Ganglia in the MIC OS by installing the Ganglia k1om RPM, I think the following occurs.
Since the MICs are bridged/MASQueraded via the HOST, the Ganglia metrics of the MICs seem to come from the same IP as the HOST (from the Ganglia collector node's point of view).
This wouldn't be a problem if we didn't have 2x MICs in each server. As it is, the Ganglia metrics from mic0 and mic1 appear to come from the same IP, so the metrics of mic0 overwrite the metrics of mic1 and vice versa.
A solution would be to use the "override_hostname" and "override_ip" configuration options of Ganglia. However, MPSS 3.3 ships a Ganglia 3.1.7 k1om RPM, and the override options were added (I believe) in Ganglia 3.3+.
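For illustration, here is a minimal sketch of what the override settings could look like once a gmond that supports them is running on the cards. It only prints a "globals" fragment per card. The helper name mic_globals and the host names and addresses are hypothetical, and in a real gmond.conf the two settings would be merged into the existing globals section rather than added as a second one.

```shell
#!/bin/sh
# Sketch: print a gmond.conf "globals" fragment that gives each coprocessor
# its own identity as seen by the collector. This assumes a gmond new enough
# to understand override_hostname/override_ip (3.3+ according to the post).
# The names and addresses passed in below are hypothetical examples.

mic_globals() {
    # $1 = host name to report, $2 = IP address to report
    cat <<EOF
globals {
  override_hostname = "$1"
  override_ip = "$2"
}
EOF
}

mic_globals "node01-mic0" "172.31.1.1"
mic_globals "node01-mic1" "172.31.2.1"
```

With distinct names and addresses per card, the collector should no longer see mic0 and mic1 as a single source.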
Would it be possible to get either a newer Ganglia version k1om RPM, or the SRC RPM of ganglia-3.1.7-r0.k1om.rpm so I can try to update it to a newer version myself?
I have tried to build stock Ganglia 3.6.0 within the MIC, but I am having difficulties with the stock .spec file and the build dependencies for the MIC OS.
I haven't noticed this, but then I have tried only a limited number of configurations (no MASQuerading). Are you broadcasting from the coprocessors or unicasting to a central node? Have you tried unicasting from the coprocessors to their host and then having gmetad poll the hosts? This should work and cut down on network traffic off the host, but you then need to put up with the hosts being polled directly.
Another workaround would be to set up two "clusters" for collection purposes. By default, the coprocessors are in "mic_cluster". You could configure mic0 on each host to be in "mic0_cluster" and mic1 to be in "mic1_cluster". Of course, you would then need to have each of them send to a different collector node and aggregate the data later.
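As a rough sketch of the two-cluster workaround, the script below prints the cluster and udp_send_channel sections each card's gmond.conf might carry. The helper name mic_cluster_conf and the collector host names are hypothetical; 8649 is Ganglia's default port. The fragments would replace the corresponding sections of the card's existing configuration.

```shell
#!/bin/sh
# Sketch: print per-card gmond.conf sections that put mic0 and mic1 into
# separate Ganglia clusters, each unicasting to its own collector.
# Collector host names are hypothetical placeholders.

mic_cluster_conf() {
    # $1 = cluster name, $2 = collector host, $3 = collector port
    cat <<EOF
cluster {
  name = "$1"
}
udp_send_channel {
  host = $2
  port = $3
}
EOF
}

mic_cluster_conf "mic0_cluster" "collector-a.example.com" 8649
mic_cluster_conf "mic1_cluster" "collector-b.example.com" 8649
```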
I can look into what is needed to build the newer Ganglia. But just so I know: did you compile directly on the coprocessor? Have you tried cross-compiling on the host? What environment variables/options did you use for configure?
Hi,
We unicast to 1 collector node. We could separate the Ganglia instances of the two MIC cards, but we would like to have all hosts and MICs in 1 Ganglia "cluster".
As for compiling, I never got configure to complete due to the massive build dependencies of a 'stock' Ganglia source RPM. That's why I was wondering whether the original k1om source RPM was available, or how it was built.
For now we have decided not to use Ganglia on the MICs.
FYI - I received the following instructions on rebuilding Ganglia on the coprocessor from one of the developers. The instructions describe how to build Ganglia on the coprocessor itself, not cross-compiling. Where they talk about making sure various libraries and tools are installed, that means installing the corresponding RPM files from mpss-3.x/k1om on the coprocessor. The directions for doing this are in the MPSS User's Guide, 11.3 Installing Card Side RPMs.
Once you have rebuilt Ganglia, you will want to be sure that this version of Ganglia and all the necessary supporting libraries are installed each time the coprocessor is booted. If you use an NFS-mounted root file system, or if you use the StaticRAM option (see the documentation for micctrl) with a cpio file you create after installing Ganglia on the card, then you are OK. If you use the default RAM file system, you will want to either use an NFS file system for /usr/local or copy the contents of /usr/local to the host system and put them in a directory such as /var/mpss/common/usr/local, which will cause them to be loaded on each reboot.
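For the default RAM file system case, copying /usr/local to the host could be sketched roughly as follows. The helper copy_tree is hypothetical and simply copies one directory tree into another with tar, so that permissions and symlinks survive. The commented-out ssh line assumes the card is reachable as mic0 and that its tar accepts the same options; neither is confirmed in this thread.

```shell
#!/bin/sh
# Sketch: stage a copy of the card's /usr/local under the host directory
# that MPSS loads onto the card at boot. Nothing is copied until
# copy_tree is called explicitly.

copy_tree() {
    # Copy the contents of directory $1 into directory $2 (created if needed),
    # using tar so that permissions and symbolic links are preserved.
    mkdir -p "$2" && tar -C "$1" -cf - . | tar -C "$2" -xf -
}

# Example, on a host that sees the card's /usr/local through a mount point
# (the source path here is hypothetical):
#   copy_tree /mnt/mic0/usr/local /var/mpss/common/usr/local
#
# Example, pulling directly from the card over ssh (assumes host name mic0):
#   mkdir -p /var/mpss/common/usr/local
#   ssh mic0 tar -C /usr/local -cf - . | tar -C /var/mpss/common/usr/local -xf -
```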
You can compile Ganglia 3.3. One way is to follow the steps below:
1. Follow the directions to set up the k1om repo as per MPSS_Userguide 11.3.2
2. Make sure package libuuid1 is installed and make sure a link libuuid.so is in place (or create one)
ln -s libuuid.so.1.3.0 libuuid.so
3. Install the following packages:
gcc, libapr-1-0, libapr-1-dev, gawk, libconfuse0, libconfuse-dev, libexpat1, libexpat-dev, libpcre0, libpcre-dev, make
4. Make sure there is a link to libapr-1.so (or create one as follows)
ln -s libapr-1.so.0 libapr-1.so
5. Move or rename all the .la (libtool archive) files under /usr/lib64
(libapr-1.la libconfuse.la libexpat.la libpcrecpp.la libpcre.la libpcreposix.la)
The reason is that the configure script provided in the Ganglia package picks these files up as dependency libraries, which leads to an error later in the build.
6. Unpack the Ganglia 3.3.7 tarball and make it available on the card. I used an NFS mount on the card.
7. Execute the following command:
./configure --host=x86_64-pc-linux --build=x86_64-pc-linux --enable-shared=yes --enable-static=no
8. You may have to execute the same command under libmetrics and gmond/modules as well.
9. Execute make and then make install. gmond will be installed under /usr/local/sbin.
10. You need to modify your configuration for the card to set it as deaf. This can be done by commenting out all send/accept channels. Otherwise you will get a runtime error. To run it, use the -c option pointing to your configuration file. You can also use Ganglia 3.3.7 on the host and use the provided ganglia/mpss-ganglia on the card.
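The library preparation in steps 2, 4 and 5 and the build in steps 7 and 9 could be scripted roughly as below. This is only a sketch: the function names are hypothetical, LIBDIR defaults to /usr/lib64 as an assumption about where the card keeps these libraries, and nothing runs until prepare_libs or build_ganglia is called explicitly on the coprocessor.

```shell
#!/bin/sh
# Sketch of steps 2, 4, 5, 7 and 9, meant to be run on the coprocessor.
# LIBDIR is an assumption; override it if the libraries live elsewhere.
LIBDIR="${LIBDIR:-/usr/lib64}"

ensure_link() {
    # Inside LIBDIR, create symlink $2 -> $1 unless $2 already exists.
    ( cd "$LIBDIR" && { [ -e "$2" ] || [ -L "$2" ] || ln -s "$1" "$2"; } )
}

hide_la_files() {
    # Step 5: rename the libtool archives out of the way instead of deleting them.
    for f in libapr-1.la libconfuse.la libexpat.la \
             libpcrecpp.la libpcre.la libpcreposix.la; do
        if [ -f "$LIBDIR/$f" ]; then
            mv "$LIBDIR/$f" "$LIBDIR/$f.disabled"
        fi
    done
}

prepare_libs() {
    ensure_link libuuid.so.1.3.0 libuuid.so &&   # step 2
    ensure_link libapr-1.so.0 libapr-1.so &&     # step 4
    hide_la_files                                # step 5
}

build_ganglia() {
    # $1 = unpacked Ganglia source tree (step 6).
    # The configure flags are copied from step 7.
    ( cd "$1" &&
      ./configure --host=x86_64-pc-linux --build=x86_64-pc-linux \
                  --enable-shared=yes --enable-static=no &&
      make && make install )
}
```

A typical run would be prepare_libs followed by build_ganglia with the path to the unpacked source tree. Renaming the .la files rather than removing them makes it easy to restore them afterwards.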
