I'm trying to build OFED-3.5-2-MIC for my 5110P coprocessor. The host OS is CentOS 6.6 with Linux kernel 2.6.32-504.1.3.el6.x86_64.
I ran install.pl as root and got an error while it was building intel-mic-ofed-compat-rdma-3.5-OFED.3.5.2.MIC.src.rpm. The output is:
-I/usr/src/kernels/2.6.32-504.1.3.el6.x86_64/arch/x86/include \
-Iarch/x86/include/generated \
-D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -fstack-protector -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_AVX=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=2048 -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -g -pg -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fno-dwarf2-cfi-asm -fconserve-stack -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(main)" -D"KBUILD_MODNAME=KBUILD_STR(compat)" -D"DEBUG_HASH=18" -D"DEBUG_HASH2=35" -c -o /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat/.tmp_main.o /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat/main.c
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:55,
from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.34.h:19: error: redefinition of typedef 'mmc_pm_flag_t'
include/linux/mmc/pm.h:25: note: previous declaration of 'mmc_pm_flag_t' was here
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:58,
from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:198: error: redeclaration of enumerator 'ETH_FLAG_TXVLAN'
include/linux/ethtool.h:405: note: previous definition of 'ETH_FLAG_TXVLAN' was here
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:199: error: redeclaration of enumerator 'ETH_FLAG_RXVLAN'
include/linux/ethtool.h:406: note: previous definition of 'ETH_FLAG_RXVLAN' was here
make[3]: *** [/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat/main.o] Error 1
make[2]: *** [/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat] Error 2
make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.32-504.1.3.el6.x86_64'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.t1Z2c9 (%build)
RPM build errors:
user build does not exist - using root
group build does not exist - using root
user build does not exist - using root
group build does not exist - using root
Bad exit status from /var/tmp/rpm-tmp.t1Z2c9 (%build)
This looks like a kernel compatibility issue, but I can't find any document that clearly states which versions of OFED, MPSS and the Linux kernel are compatible with each other.
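The three errors in the log share one pattern: a compat header declares a symbol that this kernel's own headers already declare. A likely explanation, stated here as an assumption rather than something the thread confirms, is that the CentOS 6.6 kernel carries backports of these definitions while compat-2.6.34.h and compat-2.6.37.h assume a stock 2.6.32. The minimal Python sketch below only pairs each GCC error with its follow-up note so the clashing symbols can be listed; the function name and the sample log variable are hypothetical.

```python
import re

# GCC diagnostics of the two kinds seen in the build log.
ERROR_RE = re.compile(
    r"^(?P<file>\S+):(?P<line>\d+): error: "
    r"(?:redefinition of typedef|redeclaration of enumerator) "
    r"'(?P<symbol>\w+)'"
)
# The note that follows each error and points at the kernel's own definition.
NOTE_RE = re.compile(
    r"^(?P<file>\S+):(?P<line>\d+): note: "
    r"previous (?:declaration|definition) of '(?P<symbol>\w+)' was here"
)


def find_conflicts(log_text):
    """Return (symbol, compat_location, kernel_location) for each clash."""
    conflicts = []
    pending = None  # most recent error still waiting for its note
    for raw in log_text.splitlines():
        line = raw.strip()
        err = ERROR_RE.match(line)
        if err:
            pending = (err["symbol"], f'{err["file"]}:{err["line"]}')
            continue
        note = NOTE_RE.match(line)
        if note and pending and note["symbol"] == pending[0]:
            conflicts.append(
                (pending[0], pending[1], f'{note["file"]}:{note["line"]}')
            )
            pending = None
    return conflicts


# Lines copied from the build log above.
SAMPLE_LOG = """\
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.34.h:19: error: redefinition of typedef 'mmc_pm_flag_t'
include/linux/mmc/pm.h:25: note: previous declaration of 'mmc_pm_flag_t' was here
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:198: error: redeclaration of enumerator 'ETH_FLAG_TXVLAN'
include/linux/ethtool.h:405: note: previous definition of 'ETH_FLAG_TXVLAN' was here
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:199: error: redeclaration of enumerator 'ETH_FLAG_RXVLAN'
include/linux/ethtool.h:406: note: previous definition of 'ETH_FLAG_RXVLAN' was here
"""

if __name__ == "__main__":
    for symbol, compat_loc, kernel_loc in find_conflicts(SAMPLE_LOG):
        print(f"{symbol}: compat {compat_loc} vs kernel {kernel_loc}")
```

Each symbol reported this way is already provided by the kernel headers at the second location, which is consistent with a compat layer written for a different kernel than the one installed.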
I also tried installing OFED 1.5.4.1 by following the Intel® Manycore Platform Software Stack (Intel® MPSS) guide. The installation appeared to succeed, but the service could not start successfully:
device node GUID
------ ----------------
scif0 4c79bafffe3005d2
[root@xeonphi-server-mic0 ~]# ibv_devinfo
hca_id: scif0
transport: SCIF (2)
fw_ver: 0.0.1
node_guid: 4c79:baff:fe30:05d2
sys_image_guid: 4c79:baff:fe30:05d2
vendor_id: 0x8086
vendor_part_id: 0
hw_ver: 0x1
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 1001
port_lmc: 0x00
link_layer: SCIF
One of the people I work with also ran into trouble installing OFED-3.5-2-MIC on a RHEL system that had recently been upgraded to a later kernel. The solution was to install OFED-3.12-1 instead. OFED-3.5-2-MIC, as you could probably tell by the name, is not part of the mainline of OFED development. It is a good idea to get back on the mainline, unless you have a compelling reason to use the -MIC version.
Hi Roth,
Thanks for your reply.
I want to use the scif0 virtual InfiniBand adapter for communication between the host and a coprocessor in the same node. Will OFED-3.12-1 work for that purpose, or must I use a -MIC version?
Yes, OFED-3.12-1 supports the virtual InfiniBand connection. You will have the same advantages using OFED-3.12-1 as you get with OFED-3.5-2-MIC.
Hi Roth,
Thanks for your recommendation.
I've successfully reinstalled OFED-3.12-1 with Xeon Phi support. My environment is now MPSS 3.4.4, Linux kernel 2.6.32-358.el6.x86_64, OFED-3.12-1 and Intel MPI 5.0.3.048. I tried to use the virtual InfiniBand functionality, but ran into the problems below.
I ran ib_read_bw on the host and then ib_read_bw 192.0.2.100 on the coprocessor (mic0), and got this output:
[root@xeonphi-server OFED-3.12-1]# ib_read_bw
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Read BW Test
Dual-port : OFF Device : scif0
Number of qps : 1 Transport type : IW
Connection type : RC Using SRQ : OFF
CQ Moderation : 100
Mtu : 4096
Link type : Ethernet
Gid index : 0
Outstand reads : 255
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x3e8 QPN 0x0002 PSN 0xa8d12d OUT 0xff RKey 0x000001 VAddr 0x007f22c60a1000
GID: 76:121:186:48:05:211:00:00:00:00:00:00:00:00:00:00
ethernet_read_keys: Couldn't read remote address
Unable to read to socket/rdam_cm
Failed to exchange data between server and clients
[root@xeonphi-server-mic0 micshare]# ib_read_bw 192.0.2.100
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
---------------------------------------------------------------------------------------
RDMA_Read BW Test
Dual-port : OFF Device : scif0
Number of qps : 1 Transport type : Unknown
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 4096
Link type : SCIF
Outstand reads : 255
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
ethernet_read_keys: Couldn't read remote address
Unable to read from socket/rdam_cm
Failed to exchange data between server and clients
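The two parameter summaries above do not agree: the host reports scif0 with transport type IW and link type Ethernet, while the coprocessor reports transport type Unknown and link type SCIF. One possible reading, offered only as an assumption, is that the two ib_read_bw binaries come from different perftest builds. The minimal Python sketch below just makes that comparison mechanical; the function names are hypothetical and the key list is taken from the output quoted in this thread.

```python
import re

# Field names printed in the perftest parameter summary quoted in this thread.
KNOWN_KEYS = [
    "Dual-port", "Device", "Number of qps", "Transport type",
    "Connection type", "Using SRQ", "TX depth", "CQ Moderation", "Mtu",
    "Link type", "Gid index", "Outstand reads", "rdma_cm QPs",
    "Data ex. method",
]
KEY_RE = re.compile(
    "(" + "|".join(re.escape(k) for k in KNOWN_KEYS) + r")\s*:\s*"
)


def parse_banner(text):
    """Extract {field: value}; a line may hold one or two 'key : value' pairs."""
    fields = {}
    for line in text.splitlines():
        matches = list(KEY_RE.finditer(line))
        for i, m in enumerate(matches):
            # A value runs up to the next known key or to the end of the line.
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            fields[m.group(1)] = line[m.end():end].strip()
    return fields


def diff_banners(a, b):
    """Return {field: (value_a, value_b)} where the sides differ (None = absent)."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Summaries copied from the host and coprocessor output above.
HOST_BANNER = """\
Dual-port : OFF Device : scif0
Number of qps : 1 Transport type : IW
Connection type : RC Using SRQ : OFF
CQ Moderation : 100
Mtu : 4096
Link type : Ethernet
Gid index : 0
Outstand reads : 255
rdma_cm QPs : OFF
Data ex. method : Ethernet
"""
MIC_BANNER = """\
Dual-port : OFF Device : scif0
Number of qps : 1 Transport type : Unknown
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 4096
Link type : SCIF
Outstand reads : 255
rdma_cm QPs : OFF
Data ex. method : Ethernet
"""

if __name__ == "__main__":
    delta = diff_banners(parse_banner(HOST_BANNER), parse_banner(MIC_BANNER))
    for field, (host_value, mic_value) in delta.items():
        print(f"{field}: host={host_value!r} mic={mic_value!r}")
```

This is a diagnostic aid, not a fix. Whether the mismatch is what breaks the address exchange is not established anywhere in the thread.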
I also tried to run the Intel MPI Benchmarks over the DAPL fabric. The command and its output are:
[root@xeonphi-server /tmp]# mpirun -genv I_MPI_DEBUG 2 -host host -n 1 /opt/intel/impi/5.0.3.048/bin64/IMB-MPI1 Sendrecv : -host mic0 -n 1 /tmp/IMB-MPI1
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): RLIMIT_MEMLOCK too small
[0] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
[1] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-scif0
[1] MPI startup(): DAPL provider ofa-v2-scif0
[1] MPI startup(): dapl data transfer mode
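Rank 0 in the output above reports "RLIMIT_MEMLOCK too small" just before it declares the dapl fabric unavailable. RDMA libraries pin memory, so a small locked-memory limit is a plausible contributor, though the thread does not confirm that it is the cause. The minimal Python sketch below reads the limit of the current process using the standard resource module; the helper names are hypothetical.

```python
import resource


def format_limit(value):
    """Render an rlimit value given in bytes; RLIM_INFINITY means no cap."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("soft memlock limit:", format_limit(soft))
    print("hard memlock limit:", format_limit(hard))
```

Limits are inherited, so the check is only meaningful when run from the same shell that launches mpirun. Where Python is not available, as may be the case on the coprocessor, ulimit -l reports the same soft limit in KiB. A common remedy on OFED hosts is to raise the memlock entries in /etc/security/limits.conf and log in again; whether that alone makes the dapl fabric available here remains unverified.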
Some environment information:
[root@xeonphi-server /tmp]# env | grep I_MPI
I_MPI_ROOT=/opt/intel/impi/5.0.3.048
I_MPI_MIC=enable
I_MPI_FABRICS=dapl
I_MPI_DAPL_PROVIDER=ofa-v2-scif0
[root@xeonphi-server tmp]# rpm -qa | grep dapl
dapl-utils-2.1.2-1.x86_64
dapl-devel-static-2.1.2-1.x86_64
dapl-2.1.2-1.x86_64
dapl-debuginfo-2.1.2-1.x86_64
dapl-devel-2.1.2-1.x86_64
I'm quite confused by these issues and am looking forward to your reply.
Thank you.
Hello,
Is there any solution to the problem described in the previous post?
I am facing the same issue. My setup is: MPSS 3.2.1, kernel 2.6.38.8, OFED 3.2.1, CentOS.
Server side (mic):
[root@bricks06-mic0 ~]# ib_read_lat
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x01 QPN 0x200049 PSN 0xc86176 OUT 0x10 RKey 0x40003002 VAddr 0x000000006c7000
pp_read_keys: No such file or directory
Couldn't read remote address
Unable to write to socket/rdam_cm
Failed to exchange date between server and clients
Client side:
[root@bricks06 bin]# ./ib_read_lat -d scif0 192.0.2.101
---------------------------------------------------------------------------------------
Device not recognized to implement inline feature. Disabling it
ethernet_read_data: Couldn't read reports
Unable to read from socket/rdam_cm
---------------------------------------------------------------------------------------
RDMA_Read Latency Test
Dual-port : OFF Device : scif0
Number of qps : 1 Transport type : IW
Connection type : RC Using SRQ : OFF
TX depth : 1
Mtu : 4096
Link type : IB
Outstand reads : 255
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x3e8 QPN 0x0003 PSN 0x295fc7 OUT 0xff RKey 0x000002 VAddr 0x00000001cba000
ethernet_read_keys: Couldn't read remote address
Unable to read from socket/rdam_cm
Failed to exchange data between server and clients
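Both sides above report "Data ex. method : Ethernet", meaning the queue pair details are exchanged over an ordinary TCP socket before any RDMA traffic starts. One thing worth ruling out is whether that TCP connection can be made at all. The minimal Python sketch below probes a TCP port; it assumes perftest's default port of 18515 (change it if -p was passed), and the function name is hypothetical.

```python
import socket

# Assumption: perftest listens on its default port unless -p/--port was given.
PERFTEST_DEFAULT_PORT = 18515


def can_connect(host, port=PERFTEST_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

With the server side waiting, calling can_connect("192.0.2.101") from the client machine shows whether the port is reachable. The probe uses up the connection the server is waiting for, so the server must be restarted before the real test is run. If the port is reachable, the differing message prefixes in the two outputs (pp_read_keys on the coprocessor, ethernet_read_keys on the host) hint that the two binaries may come from different perftest versions, and the server above was started without -d scif0 while the client used it; both observations are possible leads, not confirmed causes.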
Happy new year :)
I would really appreciate any help with my last post.
Does this error indicate that my InfiniBand setup is not configured correctly for verbs? I ask because IP over IB communication with the Xeon Phi works here.
Please let me know.
Thanks much,
Mrunal
Hi Mrunal,
Sorry for the late reply.
I never found a solution to this issue and eventually gave up on it.
I'm sorry that I can't be of more help :(
Thanks for your reply.
So you never got RDMA over InfiniBand set up at all? Or did you use some alternative configuration?
Thanks,
Mrunal
