Re: XML in FORTRAN

Intel_C_Intel · ‎12-01-2003

I have to have a model read its configuration from an XML file. I'm going to use MSXML4 in C++ (because I know how and I've used it already!). So I want to define some structures in a module so I can access them from the C++ layer.

However, the structures have allocatable elements, so is the following the correct 'redefinition'? And also, where should I allocate/deallocate?

extern "C" struct
{
    char[32] Recipe_ID;
    double Means[];
    double Std_Devns[];
    double Covariances[][];
    long n_Observations;
} MMSPC_mp_t_Recipe;
TYPE t_Recipe
    CHARACTER Recipe_ID*32
    REAL*8, ALLOCATABLE :: Means( : )
    REAL*8, ALLOCATABLE :: Std_Devns( : )
    REAL*8, ALLOCATABLE :: Covariances( : ,: )
    INTEGER*4 n_Observations
END TYPE t_Recipe

Intel_C_Intel · ‎12-01-2003

Also, how do you redefine in C/C++ a TYPE with nested allocatable arrays of other TYPEs?

Jugoslav_Dujic · ‎12-01-2003

Alas, it's not a correct translation (although it may appear so at first glance). CVF implementation of ALLOCATABLE component arrays is the same as for POINTER arrays (although the semantic is different) -- 28-byte descriptor for 1-dimension array. See page "Handling arrays and Visual Fortran array descriptors" under Mixed-language chapter of Programmer's Guide.

Thereis even a CVF sample called "DESCRIPT" containing samples with manipulation with array descriptor. Take a look at it.

Jugoslav

Steven_L_Intel1 · ‎12-01-2003

A warning - the DESCRIPT sample was not updated when the descriptor layout changed. The current layout is correctly described in the Programmer's Guide.

Intel_C_Intel · ‎12-01-2003

Hmmm. This is getting more complicated than I hoped. To be honest, I'm not sure if I'm following the DESCRIPT stuff at all.

It ispopulating the array of POINTERS with the addresses of the elements of farray yes? And populating the array in the 'C' fashion? So you use the pointer array in FORTRAN and the actual array in C?

Bring on FORTRAN CLR compatibility! Please! :smileyvery-happy:

Steven_L_Intel1 · ‎12-01-2003

You cannot use ALLOCATABLE for this purpose. It is incompatible with what C is using.

Look instead at the "Integer POINTER" extension, which is different from Fortran 90 POINTER. You would declare the components of the structure as

integer (INT_PTR_KIND())

Intel_C_Intel · ‎12-01-2003

I'm going to have to think about this.I might get my head around it eventually.:smileysad:

Maybe I'll just pass each value individually, which will involve writing lots of functions...hmmm....maybe not...

Intel_C_Intel · ‎12-02-2003

Sorry for harping on with this, but....

My FORTRAN currently has a module with lots of types containing allocatable arrays of values, and types containingg allocatable arrays of other types.

Even if I have to fiddle with the definitions of 'allocatable' elements etc., I still think it would be possible to access and populate these module values directly from C/C++ and then use the data in FORTRAN.

Am I correct? If so, I'll start fiddling with test code to get it working.

Cheers,

Dan

Jugoslav_Dujic · ‎12-02-2003

Depends on your definition of "populate". Does it include allocation?

One of maindesign issuesin mixed-language programming is who allocates the memory? Then, there's the rule of thumb that the same side must deallocate it. Further, I think it's natural that memory is allocated by the side who knows the required size first. Which is your case?

IMO it's not so hard to populate F90 array descriptors on C++ side. That has the feature that C++ side is "ugly" (it fiddles with F90 array descriptors) but you keep Fortran side clean. On the other hand, you can have C++ side "clean" (int*'s and float*'s within the structure), but you'll have to fiddle with Cray pointers on Fortran side. Based on limited information I have, I would go for the first option, like (I'mwriting from memory about what the descriptor contains, so it's just a scratch):

typedef struct tagF90_DESCRIPTOR {

void* lpStartAddress; // returned by new foo[size]

int nIndividualSize; // = sizeof (foo)

int iFlags;// allocated, etc.

int iOffset; // = -sizeof(foo)

int iStart; // = 1 by default

int iSize; // number of elements

int iStride; // = either 1 or sizeof(foo), check it

} F90_DESCRIPTOR, LPF90_DESCRIPTOR*;

Maybe you could even define a proper C++ constructor/desctructor for it.

typedef struct tagYourStruct {

F90_DESCRIPTOR Data;

} YourStruct;

Shoud match:

TYPE YourStruct

REAL, ALLOCATABLE::Data(:)

END TYPE

A good thing about F90_DESCRIPTOR structures is that one matches everything allocatable: real, integer, type(), so you can write ugly code only once.

Jugoslav

Intel_C_Intel · ‎12-03-2003

Jugoslav, thanks for the reply.

With regard to 'who knows what', the sizes of the arrays are stored in the XML file, so I guess the C will 'know' the size first and could 'allocate'. However, since I'll need to store the sizes for use in FORTRAN loops etc., the FORTRAN side will need to 'know' and could do the allocation/deallocation.

I was hoping to have a single call from FORTRAN to C to read the file, then return to FORTRAN to do some maths with thedata store inthe FORTRAN module.I'm essentially in a FORTRAN environment utilizing a bit of C not the other way around - I'd like FORTRAN to retain 'control' so to speak, so I would favour keeping the FORTRAN 'clean' also.

Intel_C_Intel · ‎12-03-2003

Steve,

Can I ask what has changed in the decriptor layout? My documentation seems consistent with the DESCRIPT example...? Do I have the 'wrong' help file? (I have updated to 6.6C)

Thanks,

Dan

Steven_L_Intel1 · ‎12-03-2003

The fourth longword, marked Reserved in DESCRIPT, is instead as follows (from the Programmer's Guide):

The fourth longword (bytes 12 to 15) contains the low-order bit set if the array has been defined (storage allocated). Other bits may also be set by the compiler within this longword, for example, to indicate a contiguous array.

Please note that with Intel Visual Fortran, the descriptor layout will change again in a way incompatible with CVF. An extra longword is added. The new layout is as follows

The components of the current Intel Fortran array descriptor on IA-32 systems are as follows:

- The first longword (bytes 0 to 3) contains the base address. The base address plus the offset defines the first memory location (start) of the array.
- The second longword (bytes 4 to 7) contains the size of a single element of the array.
- The third longword (bytes 8 to 11) contains the offset. The offset is added to the base address to define the start of the array.
- The fourth longword (bytes 12 to 15) contains the low-order bit set if the array has been defined (storage allocated). Other bits may also be set by the compiler within this longword, for example, to indicate a contiguous array.
- The fifth longword (bytes 16 to 19) contains the number of dimensions (rank) of the array.
- The sixth longword (bytes 20 to 23) is reserved and should not be explicitly set.
- The remaining longwords (bytes 24 to 103) contain information about each dimension (up to seven). Each dimension is described by three additional longwords:
  - The number of elements (extent)
  - The distance between the starting address of two successive elements in this dimension, in bytes.
  - The lower bound
Note

The format for the descriptor on Itanium-based systems is identical to that on IA-32 systems, except that all fields are 8-bytes long, instead of 4-bytes.

Message Edited by MADsblionel on 12-03-2003 05:46 AM

Jugoslav_Dujic · ‎12-03-2003

In that case, I think it could be better that you go through somedocumentation and reverse-engineering (take a look at TRANSFER function, which can help you reverse-engineer a descriptor, e.g.

integer:: iRawBytes(7)

type t_something

real, allocatable:: x(:)

end type t_something

type(t_something):: t

allocate(t%x(23))

iRawBytes = TRANSFER(t, iRawBytes, 7)

do i=1,7

write(*,"(z8.5)") iRawBytes(i)

end do

will give you hex representation of a F90_DESCRIPTOR containing an array of 23 reals. Now, you should match it (fill properly in the same way) in C and there you go. Write few C wrappers, e.g. void F90_Alloc(F90_DESCRIPTOR*, int nSize, int nSizeOf) and void F90_Dealloc(F90_DESCRIPTOR*) and your C code should look "almost" clean. Basically, that means that you're reinventing the F90 ALLOCATE statement, but I don't see a cleaner way.

One potential gotcha is that you'll probably have to write C routines for cleanup as well (F90_Dealloc above). Although the type(t_something) as received by Fortran code will look to the compiler identical as if it were normally ALLOCATEd by Fortran, you'll potentially have problems with DEALLOCATE -- if it was allocated by C RTL using malloc or new, it probably cannot be released by Fortran RTL using DEALLOCATE.

Jugoslav

eddie_breeveld · ‎12-04-2003

Seeing all these structures shared between languages reminds me of a problem we once had sharing C++ and Fortran structures: It was solved by putting the SEQUENCE statementafterthe Fortran TYPE statement - it stops the compiler 'optimising' the structure.

I don't think that theC++ compiler does such optimisation for you, but it might.

Eddie

Steven_L_Intel1 · ‎12-04-2003

Without SEQUENCE, the compiler is allowed to reorder and pad the components of the derived type. Be sure to use SEQUENCE if you're sharing a derived type with another language.