Hi,
I'm trying to collect data with MPI_Allgatherv into a large receive buffer whose total size exceeds 2 GB. As far as I understand from this thread (http://software.intel.com/en-us/forums/topic/361060), this is not supported. Unfortunately, when I try to use the -ilp64 option with mpiifort, I run into several problems:
1) When using include 'mpif.h' to pull in MPI, I run the following commands:
mpiifort -warn -O1 -g -traceback -check bounds -i8 -c gather.f
mpiifort -warn -O1 -g -traceback -check bounds -ilp64 gather.o -o gather.exe-ilp64
mpirun -ilp64 ./gather.exe-ilp64
It aborts with:
Assertion failed in file ../../i_rtc_cache.c at line 638: buf_end_palign > buf_start_palign
Assertion failed in file ../../i_rtc_cache.c at line 638: buf_end_palign > buf_start_palign
2) When including the MPI definitions through a "use mpi" statement, I can't compile the test program with '-i8', because the compiler reports that the interface is incompatible. I guess this is because it doesn't know that I want to use the ILP64 interface. When compiling and linking in one go, it does work with only '-ilp64', but not if I add '-i8':
mpiifort -warn -O1 -g -traceback -check bounds -ilp64 gather.f -o gather.exe-ilp64
mpirun -ilp64 ./gather.exe-ilp64
After this, the program still crashes, but now with the following error message:
Fatal error in PMPI_Allgatherv: Invalid count, error stack:
PMPI_Allgatherv(1430): MPI_Allgatherv(sbuf=0x2b33c8000010, scount=0, dtype=0x4c000829, rbuf=0x2b34b66b3010, rcounts=0x7fff2b9e7b70, displs=0x7fff2b9e7b60, dtype=0x4c000829, MPI_COMM_WORLD) failed
PMPI_Allgatherv(1375): Negative count, value is -1071939176
BUFRECV = 5.55500000000000
Fatal error in PMPI_Allgatherv: Invalid count, error stack:
PMPI_Allgatherv(1430): MPI_Allgatherv(sbuf=0x2b53e8000010, scount=0, dtype=0x4c000829, rbuf=0x2b54d66b3010, rcounts=0x7fff75f545f0, displs=0x7fff75f545e0, dtype=0x4c000829, MPI_COMM_WORLD) failed
PMPI_Allgatherv(1375): Negative count, value is -484441656
or with
Fatal error in PMPI_Allgatherv: Invalid count, error stack:
PMPI_Allgatherv(1430): MPI_Allgatherv(sbuf=0x2b8c98000010, scount=0, dtype=0x4c000829, rbuf=0x2b8d866b3010, rcounts=0x7fff83e96470, displs=0x7fff83e96460, dtype=0x4c000829, MPI_COMM_WORLD) failed
PMPI_Allgatherv(1375): Negative count, value is -1883799144
forrtl: error (69): process interrupted (SIGINT)
Image PC Routine Line Source
libpthread.so.0 00002B5F11907251 Unknown Unknown Unknown
libdaploucm.so.2 00002B5F12F7869C Unknown Unknown Unknown
libmpi_dbg.so.4 00002B5F10E8676F Unknown Unknown Unknown
libmpi_dbg.so.4 00002B5F10E83718 dapl_rc_poll_recv 296 dapl_poll_rc.c
libmpi_dbg.so.4 00002B5F10E8330D MPID_nem_dapl_rc_ 124 dapl_poll_rc.c
libmpi_dbg.so.4 00002B5F10FC18C7 MPID_nem_network_ 23 mpid_nem_network_poll.c
libmpi_dbg.so.4 00002B5F10DCD90E MPIDI_CH3I_Progre 735 ch3_progress.c
libmpi_dbg.so.4 00002B5F10F2B592 MPIC_Wait 568 helper_fns.c
libmpi_dbg.so.4 00002B5F10F290E9 MPIC_Sendrecv 206 helper_fns.c
libmpi_dbg.so.4 00002B5F10F2BA18 MPIC_Sendrecv_ft 717 helper_fns.c
libmpi_dbg.so.4 00002B5F10D7890E MPIR_Allgatherv_i 770 allgatherv.c
libmpi_dbg.so.4 00002B5F10D7965F MPIR_Allgatherv 955 allgatherv.c
libmpi_dbg.so.4 00002B5F10D799B0 MPIR_Allgatherv_i 1000 allgatherv.c
libmpi_dbg.so.4 00002B5F10D7C822 PMPI_Allgatherv 1400 allgatherv.c
libmpigf.so.4 00002B5F10AA4279 Unknown Unknown Unknown
libmpi_ilp64.so 00002B5F108709C3 Unknown Unknown Unknown
gather.exe-ilp64 0000000000403D1B MAIN__ 56 gather.f
gather.exe-ilp64 0000000000402F1C Unknown Unknown Unknown
libc.so.6 00002B5F11DBECDD Unknown Unknown Unknown
gather.exe-ilp64 0000000000402E19 Unknown Unknown Unknown
So that makes me wonder whether I actually compiled it properly.
The test program is attached. Output of mpiifort -show:
ifort -I/software/intel/impi/4.1.3.048/intel64/include -I/software/intel/impi/4.1.3.048/intel64/include -L/software/intel/impi/4.1.3.048/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /software/intel/impi/4.1.3.048/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/4.1 -lmpigf -lmpi -lmpigi -ldl -lrt -lpthread
and ifort --version:
ifort.orig (IFORT) 13.1.3 20130607
Copyright (C) 1985-2013 Intel Corporation. All rights reserved.
grtz
Steven
Hi Steven,
You are (in the first step) correctly compiling and linking with ILP64. However, this does not enable support for messages larger than 2 GB.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
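To make the limit discussed above concrete, here is a minimal Python sketch (not from the thread) that estimates whether a gather would cross the 2 GB mark. The names `INT32_MAX`, `total_message_bytes` and `exceeds_2gb` are hypothetical, and treating the limit as exactly 2^31 - 1 bytes is an assumption; the thread only says "larger than 2 GB".

```python
# Assumption: the limit is taken as the largest signed 32-bit value, in bytes.
INT32_MAX = 2**31 - 1


def total_message_bytes(counts, element_size):
    """Total bytes gathered when rank r contributes counts[r] elements."""
    return sum(counts) * element_size


def exceeds_2gb(counts, element_size, limit=INT32_MAX):
    """True if the combined receive buffer would be larger than the assumed limit."""
    return total_message_bytes(counts, element_size) > limit
```

For example, two ranks each contributing 150,000,000 double-precision values (8 bytes each) give 2.4 billion bytes, which this check flags, whereas 100,000,000 values per rank stay below the assumed limit.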
Thanks for the confirmation! I worked around it by partitioning the data into smaller messages.
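The actual partitioning code was not posted. As a rough illustration only, the following Python sketch shows one possible way to plan such a split: every rank sends at most an equal share of a per-call budget in each round, so no single round exceeds the budget. The function `plan_rounds` and its parameters are hypothetical, and `max_total` is an assumed per-call element budget that you would derive from the 2 GB limit and the element size.

```python
def plan_rounds(counts, max_total):
    """Split per-rank element counts into rounds whose totals stay <= max_total.

    counts[r] is the number of elements rank r contributes overall.
    Returns a list of rounds; each round holds, per rank, the offset into
    that rank's own data and the number of elements to send in that round.
    """
    nranks = len(counts)
    if nranks == 0:
        return []
    # Equal share of the budget per rank, so the round total cannot exceed it.
    per_rank_cap = max_total // nranks
    if per_rank_cap < 1:
        raise ValueError("max_total must be at least the number of ranks")

    sent = [0] * nranks  # elements already scheduled for each rank
    rounds = []
    while any(s < c for s, c in zip(sent, counts)):
        round_counts = [min(per_rank_cap, c - s) for s, c in zip(sent, counts)]
        rounds.append({"offsets": list(sent), "counts": round_counts})
        sent = [s + n for s, n in zip(sent, round_counts)]
    return rounds
```

Each round would then map to one MPI_Allgatherv call that uses that round's counts, with the offsets indicating where each rank's slice starts within its own contribution. Ranks that have already sent everything simply contribute a count of zero in later rounds.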
If ILP64 doesn't enable this, how about this paper: http://ijssst.info/Vol-12/No-1/paper3.pdf ? As I understand it, the authors compare the performance of Intel MPI and MVAPICH over InfiniBand using the Intel micro benchmark on Intel Westmere processors. In the experiments they vary the message size up to 16 MB. The Allgather results show everything working, but I would have expected it not to run if messages larger than 2 GB are unsupported. Or am I misunderstanding the Intel MPI limitation?