
Working combo for RHEL 6.7 + MPSS 3.6 + OFED + lustre ?

Tommi_T_
New Contributor I
791 Views

Hi,

I tried to recompile the Lustre packages for MPSS 3.6 (using OFED 1.5.4.1), but it seems that the gcc 5.1.1 shipped with MPSS is stricter and the build fails:

/tmp/lustre-release/lnet/klnds/o2iblnd/o2iblnd_cb.c: In function 'kiblnd_resolve_addr':
/tmp/lustre-release/lnet/klnds/o2iblnd/o2iblnd_cb.c:1213:14: error: implicit declaration of function 'rdma_set_reuseaddr' [-Werror=implicit-function-declaration]
         rc = rdma_set_reuseaddr(cmid, 1);
              ^
cc1: all warnings being treated as errors

So is there any working (M)OFED version that can be used with the RHEL 6.7 kernel, MPSS 3.6, and Lustre on the Phis?

 

4 Replies
William_Howell
Beginner

I'm leaving my original comment below in case it helps others, but the fix for my original problem was to set the following two lines in ofed.conf to 'n'. It would still be useful if anyone could explain why these packages cause problems.

intel-mic-ofed-compat-rdma=n
intel-mic-ofed-compat-rdma-devel=n
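
For anyone who wants to script this change, here is a minimal sketch. It assumes GNU sed and that both keys already appear in the file as key=value lines. The function name set_mic_compat_off, the scratch file, and the extra libibverbs line are illustrative only and do not come from OFED itself.

```shell
# Hypothetical helper: force the two MIC compat-rdma entries to 'n'.
# Assumes GNU sed (-i, -E) and that the keys already exist in the file.
set_mic_compat_off() {
    conf="$1"
    sed -i -E 's/^(intel-mic-ofed-compat-rdma(-devel)?)=.*/\1=n/' "$conf"
}

# Demonstration on a scratch file (contents are illustrative, not a real ofed.conf).
demo_conf="$(mktemp)"
printf '%s\n' \
    'intel-mic-ofed-compat-rdma=y' \
    'intel-mic-ofed-compat-rdma-devel=y' \
    'libibverbs=y' > "$demo_conf"
set_mic_compat_off "$demo_conf"
cat "$demo_conf"
rm -f "$demo_conf"
```

Only the two matching keys are rewritten; every other line in the file is left as it was.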

 

 

 

#== original post  ==

I also have problems, though with different errors, building on:

RHEL 6.7 (2.6.32-573.7.1.el6.x86_64)

MPSS 3.6

OFED-3.5-2-MIC

In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:55,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.34.h:19: error: redefinition of typedef 'mmc_pm_flag_t'
include/linux/mmc/pm.h:25: note: previous declaration of 'mmc_pm_flag_t' was here
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:58,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:13: error: redefinition of 'proto_ports_offset'
include/linux/in.h:292: note: previous definition of 'proto_ports_offset' was here
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:58,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:198: error: redeclaration of enumerator 'ETH_FLAG_TXVLAN'
include/linux/ethtool.h:405: note: previous definition of 'ETH_FLAG_TXVLAN' was here
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.37.h:199: error: redeclaration of enumerator 'ETH_FLAG_RXVLAN'
include/linux/ethtool.h:406: note: previous definition of 'ETH_FLAG_RXVLAN' was here
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:62,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-3.1.h:32: error: redefinition of 'ip_is_fragment'
include/net/ip.h:249: note: previous definition of 'ip_is_fragment' was here
In file included from /var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-2.6.h:63,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/include/linux/compat-3.2.h:160: error: conflicting types for '__netdev_printk'
include/linux/netdevice.h:2788: note: previous declaration of '__netdev_printk' was here
make[3]: *** [/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat/main.o] Error 1
make[2]: *** [/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5/compat] Error 2
make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/intel-mic-ofed-compat-rdma-3.5] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.32-573.7.1.el6.x86_64'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.80r4sI (%build)

Any help would be appreciated.

 

 

Frances_R_Intel
Employee

For the coprocessor, you will want to follow the directions in Intel® Manycore Platform Software Stack (Intel® MPSS) User's Guide (section 3.5.6 in the MPSS 3.6 release) and install the Lustre files from the k1om directory that comes with the MPSS release. The version of Linux kernel which runs on the coprocessor does not support the rdma_set_reuseaddr function.

As for rebuilding Lustre for your host machine, is this the same version you used previously? I do not believe rdma_set_reuseaddr is supported in the kernel that comes with RHEL 6.7 either, but I might be wrong.
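
One rough way to check whether a given header tree mentions rdma_set_reuseaddr at all is sketched below. The helper name header_declares and the header directory are assumptions (a typical kernel-devel layout; OFED may install its own headers elsewhere), and GNU grep is assumed. A match only shows that the name appears in a header, not that the running kernel actually provides the function.

```shell
# Hypothetical helper: succeed if any .h file under DIR mentions SYMBOL
# as a whole word. Assumes GNU grep (-r, --include).
header_declares() {
    dir="$1"
    symbol="$2"
    grep -rqw --include='*.h' -- "$symbol" "$dir"
}

# Assumed location of the RDMA headers for the running kernel.
hdr_dir="/usr/src/kernels/$(uname -r)/include/rdma"
if [ -d "$hdr_dir" ] && header_declares "$hdr_dir" rdma_set_reuseaddr; then
    echo "rdma_set_reuseaddr found under $hdr_dir"
else
    echo "rdma_set_reuseaddr not found under $hdr_dir"
fi
```

The same check can be pointed at the coprocessor (k1om) kernel headers or at an OFED source tree by changing the directory argument.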

Frances_R_Intel
Employee

And to William - there should have been no conflicts. The contents of the /var/tmp/OFED_topdir directory are strictly the contents from the OFED-3.5-2-MIC file, right? Could you check to see what package include/linux/mmc/pm.h came from?
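
On an RPM-based host, that ownership question can be answered with rpm -qf. The sketch below wraps it in a small helper; the name owning_package is made up for illustration. The full path is an assumption too: it takes the compiler's relative include/linux/mmc/pm.h to be under the kernel build directory named in the make output.

```shell
# Hypothetical helper: print the RPM package that owns a file.
owning_package() {
    path="$1"
    if [ ! -e "$path" ]; then
        echo "no such file: $path" >&2
        return 1
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not found on this system" >&2
        return 2
    fi
    rpm -qf "$path"
}

# Assumed absolute location of the header reported in the build log.
owning_package "/usr/src/kernels/$(uname -r)/include/linux/mmc/pm.h" || true
```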

EDIT:

I went back over some old forum issues and found a note saying if you run into trouble with OFED-3.5-2-MIC switch to OFED-3.12-1. Try that. Both support the coprocessor and OFED-3.12-1 is in the main line of OFED releases.

William_Howell
Beginner

Frances Roth (Intel) wrote:

And to William - there should have been no conflicts. The contents of the /var/tmp/OFED_topdir directory are strictly the contents from the OFED-3.5-2-MIC file, right? Could you check to see what package include/linux/mmc/pm.h came from?

EDIT:

I went back over some old forum issues and found a note saying if you run into trouble with OFED-3.5-2-MIC switch to OFED-3.12-1. Try that. Both support the coprocessor and OFED-3.12-1 is in the main line of OFED releases.

 

Hi Frances, and thank you for helping look into this. 

The contents of /var/tmp/OFED_topdir were indeed entirely from OFED-3.5-2-MIC and unmodified. I've since cleaned the node, so I would have to repeat the build to check where pm.h came from. I'll be happy to do so if you would find it useful.

I've given the host a fresh OS and MPSS install to try OFED-3.12-1. I now get a different set of redefinition errors, shown below. I haven't posted the full output of the attempted install because it is quite large, but I have saved it in case it becomes useful. Here are some details:

build047:~$ uname -r
2.6.32-573.7.1.el6.x86_64

build047:~$ gcc --version | grep ^gcc
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)

build047:~$ rpm -qa | grep `uname -r`
mpss-modules-dev-2.6.32-573.7.1.el6.x86_64-3.6-1.x86_64
kernel-headers-2.6.32-573.7.1.el6.x86_64
mpss-modules-2.6.32-573.7.1.el6.x86_64-3.6-1.x86_64
kernel-2.6.32-573.7.1.el6.x86_64
kernel-devel-2.6.32-573.7.1.el6.x86_64

build047:~$ wget https://www.openfabrics.org/downloads/OFED/ofed-3.12-1/OFED-3.12-1.tgz
build047:~$ tar -xzf OFED-3.12-1.tgz
build047:~$ cd OFED-3.12-1

build047:~$ ./install.pl --with-xeon-phi -vvv --all

# The install fails, referencing a temporary log file whose contents end with:

make[1]: Entering directory `/usr/src/kernels/2.6.32-573.7.1.el6.x86_64'
  CC  /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/compat/main.o
In file included from /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.h:63,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.37.h:15: error: redefinition of 'proto_ports_offset'
include/linux/in.h:292: note: previous definition of 'proto_ports_offset' was here
In file included from /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.h:65,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.39.h:196:1: warning: "PTR_RET" redefined
In file included from /usr/src/kernels/2.6.32-573.7.1.el6.x86_64/arch/x86/include/asm/processor.h:31,
                 from include/linux/prefetch.h:14,
                 from include/linux/list.h:7,
                 from include/linux/mm_types.h:7,
                 from include/linux/kmemcheck.h:4,
                 from include/linux/skbuff.h:18,
                 from include/linux/if_ether.h:136,
                 from include/linux/netdevice.h:29,
                 from /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.29.h:5,
                 from /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.h:55,
                 from <command-line>:0:
include/linux/err.h:64:1: warning: this is the location of the previous definition
In file included from /var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-2.6.h:67,
                 from <command-line>:0:
/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/include/linux/compat-3.1.h:32: error: redefinition of 'ip_is_fragment'
include/net/ip.h:249: note: previous definition of 'ip_is_fragment' was here
make[3]: *** [/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/compat/main.o] Error 1
make[2]: *** [/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12/compat] Error 2
make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/compat-rdma-3.12] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.32-573.7.1.el6.x86_64'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.KNLxBj (%build)


RPM build errors:
    user vlad does not exist - using root
    group vlad does not exist - using root
    user vlad does not exist - using root
    group vlad does not exist - using root
    Bad exit status from /var/tmp/rpm-tmp.KNLxBj (%build)
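
Since the full build log is large, a short filter can reduce it to just the names that clash between the compat-rdma headers and the kernel headers. This is a sketch only: the helper name list_conflicts is made up, and it assumes gcc printed plain ASCII quotes around symbol names, as in the output above. The sample lines are shortened from this thread.

```shell
# Hypothetical helper: list the unique symbols named in
# redefinition-style compiler errors in a build log.
list_conflicts() {
    log="$1"
    grep -E "error: (redefinition of|redeclaration of enumerator|conflicting types for)" "$log" \
        | sed -E "s/.*'([^']+)'.*/\1/" \
        | sort -u
}

# Demonstration on a few lines shortened from the log above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
compat-2.6.37.h:15: error: redefinition of 'proto_ports_offset'
include/linux/in.h:292: note: previous definition of 'proto_ports_offset' was here
compat-3.1.h:32: error: redefinition of 'ip_is_fragment'
EOF
list_conflicts "$sample"
rm -f "$sample"
```

The "note:" lines are skipped on purpose, so each conflicting symbol is reported once.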

 

 

 
