Beginner
62 Views

Determining DIRECT_IO_FACTOR on Linux

Hello,

I have compiled Quantum Espresso (QE) on a Linux cluster using intel2016.4.258 and openmpi-2.1.1. At run time I get an error with IOSTAT=121. DIRECT_IO_FACTOR (the block length in bytes) is set to 8 in the QE code.

My understanding is that DIRECT_IO_FACTOR is system dependent. Is there a way to find out what the correct DIRECT_IO_FACTOR value is on a given Linux system?

Thank you,

Vahid

7 Replies
Black Belt Retired Employee

Can you explain in more detail what DIRECT_IO_FACTOR is supposed to mean? It is not a term I have seen before.

IOSTAT 121 is "CWDERROR", which isn't documented (grrr), but by the name I would assume it has something to do with changing the working directory.

Black Belt

One can guess the definition of the constant DIRECT_IO_FACTOR from the comments in https://github.com/maxhutch/quantum-espresso/blob/master/Modules/io_files.f90, but there is some ambiguity. That code provides for two cases: file storage unit is either (i) a byte or (ii) a double-precision word (8 bytes on X86/64). If Ifort is used without the -assume byterecl option, I think DIRECT_IO_FACTOR would need to be 4, but you have to consult the QE authors regarding this.

Quite clearly, DIRECT_IO_FACTOR is a factor provided in the QE code to allow adjustment for the different file storage units of different Fortran compilers. It would serve users best if this configuration constant were to be replaced using equivalent facilities available in standard Fortran (e.g., F2003).

The INQUIRE statement with the IOLENGTH= clause could be used to infer the proper value to use for DIRECT_IO_FACTOR on any system.
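
As a minimal sketch of that idea (not taken from the QE sources), the program below asks the compiler how many RECL units one double-precision word occupies. The program name, the local kind parameter dp, and the variable names are my own; dp is only assumed to match QE's DP kind.

```fortran
program recl_units
  ! Sketch: report how many RECL units one double-precision word needs.
  implicit none
  ! Assumption: this matches the DP kind used by QE (an 8-byte real).
  integer, parameter :: dp = selected_real_kind(14, 200)
  real(dp) :: x
  integer  :: units_per_dp

  x = 0.0_dp
  ! IOLENGTH= returns the record length, in the compiler's RECL units,
  ! of an unformatted record holding the listed items.
  inquire (iolength = units_per_dp) x

  print '(a,i0)', 'RECL units per double-precision word: ', units_per_dp
end program recl_units
```

Build it with exactly the same flags as QE. If the RECL unit is one byte, it prints 8; any other value suggests the unit is larger than a byte, and a factor defined as "units per double-precision word" would have to change accordingly.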

Module ISO_FORTRAN_ENV provides the constant FILE_STORAGE_SIZE, about which the IFort documentation says:

    FILE_STORAGE_SIZE is the size of the file storage unit expressed in bits. To use this constant, compiler option assume byterecl must be enabled.

You could try running again after recompiling with the option -assume byterecl. Without this option, Intel Fortran uses 4 bytes as the unit of file storage, and that discrepancy may have led to the I/O error that you reported.
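
For completeness, here is a small sketch that prints FILE_STORAGE_SIZE; the program name is made up for illustration. Given the documentation quoted above, the value is only meaningful with Intel Fortran when -assume byterecl is in effect, so the INQUIRE statement with IOLENGTH= is probably the more reliable probe for the units actually used by RECL=.

```fortran
program storage_unit
  ! Sketch: print the file storage unit reported by ISO_FORTRAN_ENV.
  use, intrinsic :: iso_fortran_env, only : file_storage_size
  implicit none

  ! FILE_STORAGE_SIZE is expressed in bits; divide by 8 to get bytes.
  print '(a,i0,a)', 'File storage unit: ', file_storage_size, ' bits'
  print '(a,i0)',   'Bytes per unit:    ', file_storage_size / 8
end program storage_unit
```

Compile it with the same options as the application, for example `ifort -assume byterecl storage_unit.f90`.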

Beginner

Hello Steve and mecej4,

The code that is giving the error is EPW which is installed with the QE package. In the readwfc.f90 file, we have the following section to open wave function files generated by QE, read and close:

  !--------------------------------------------------------
  subroutine readwfc ( ipool, recn, evc0 )
  !--------------------------------------------------------
  !
  !  open wfc files as direct access, read, and close again
  !
  ! RM - Nov/Dec 2014
  ! Imported the noncolinear case implemented by xlzhang
  !
  !-------------------------------------------------------------
  USE io_files, ONLY : prefix, tmp_dir
  USE units_ph, ONLY : lrwfc, iuwfc
  USE kinds,    ONLY : DP
  USE wvfct,    ONLY : npwx
  USE pwcom,    ONLY : nbnd
  USE noncollin_module,ONLY : npol
  USE mp_global,ONLY : nproc_pool, me_pool
  USE mp_global,ONLY : npool
  !
  implicit none
  integer :: recn, ipool
  !  kpoint number
  !  poolfile to be read (not used in serial case)
  complex(kind=DP) :: evc0 ( npwx*npol, nbnd )
  character (len=3) :: nd_nmbr0
  ! node number for shuffle
  !
  integer :: unf_recl, ios
  character (len=256) :: tempfile

!  open the wfc file, read and close
  !
  CALL set_ndnmbr ( ipool, me_pool, nproc_pool, npool, nd_nmbr0)
  !
#if defined (__ALPHA)
#  define DIRECT_IO_FACTOR 2    
# else
#  define DIRECT_IO_FACTOR 8
#endif

#if defined(__MPI)
  tempfile = trim(tmp_dir) // trim(prefix) // '.wfc' // nd_nmbr0
# else
  tempfile = trim(tmp_dir) // trim(prefix) // '.wfc'
#endif
  unf_recl = DIRECT_IO_FACTOR * lrwfc
  !
  open  (iuwfc, file = tempfile, form = 'unformatted', &
          access = 'direct', iostat=ios, recl = unf_recl)
  IF (ios /= 0) call errore ('readwfc', 'error opening wfc file', iuwfc)
  read  (iuwfc, rec = recn) evc0
  close (iuwfc, status = 'keep')
  !
  !
  end subroutine readwfc

The error occurs when ios turns out to be 121. I have printed DIRECT_IO_FACTOR in this subroutine and it is 8. lrwfc is defined as:

lrwfc,     & ! the length of wavefunction record
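
One way to learn more about a failing OPEN like the one above is to request the runtime's message text with IOMSG= and to check that the file exists. The following is a hypothetical stand-alone probe, not part of EPW; the file name and record length are placeholders to be replaced with the real tempfile and DIRECT_IO_FACTOR * lrwfc values.

```fortran
program probe_wfc
  ! Hypothetical probe: report why a direct-access OPEN fails.
  implicit none
  character(len=256) :: fname, msg
  integer :: iu, ios, recl_try
  logical :: found

  fname    = 'prefix.wfc1'   ! placeholder: use the real wfc file name
  recl_try = 8               ! placeholder: use DIRECT_IO_FACTOR * lrwfc

  ! First check that the path resolves to an existing file.
  inquire (file = trim(fname), exist = found)
  print *, 'file exists: ', found

  msg = ''
  ! status='old' makes the OPEN fail rather than create a missing file;
  ! the EPW code shown above does not specify a status.
  open (newunit = iu, file = trim(fname), form = 'unformatted', &
        access = 'direct', status = 'old', recl = recl_try, &
        iostat = ios, iomsg = msg)
  if (ios /= 0) then
     print '(a,i0,2a)', 'open failed, iostat = ', ios, ': ', trim(msg)
  else
     close (iu)
  end if
end program probe_wfc
```

The message text returned in msg is compiler specific, but it is usually more informative than the bare IOSTAT number.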

In addition, I did compile QE (and EPW) with the following flag:

FFLAGS         = -O2 -assume byterecl -g -traceback

I will try the INQUIRE statement to see what value it gives.

Many thanks for your inputs.

Vahid

Black Belt

This confirms the previous responses: the RECL unit must be consistent with your application's build settings. If you choose -assume byterecl, it must be spelled exactly that way; the same behavior is implied if you set -standard-semantics. Either option makes the record length count in bytes. You should also remove any files left over from runs built with inconsistent settings.

Beginner

I did manage to get the IOLENGTH printed and its value is 4. The option -assume byterecl was enabled during compiling of the code. 

Vahid

Black Belt

If the file that you are attempting to read was generated on a system other than Linux+Ifort, you will have to watch out for incompatibilities in the record markers, etc. The two values provided for DIRECT_IO_FACTOR in the file that you showed in #4 are 2 and 8. These do not agree with the values in the file that I gave a link to, in which the two values are 1 and 8. However, I do not know what the macros __SX6 and __ALPHA signify and what relation they might have with record size units.

Beginner

All the files are generated on the same system using the same ifort and openmpi modules.

I think SX6 refers to NEC MathKeisan machines, while ALPHA refers to Linux on Alpha with a Compaq/HP Fortran compiler. Neither applies to my case, so DIRECT_IO_FACTOR is in fact 8 as far as QE is concerned.

Vahid
