Intel® C++ Compiler
Support and discussions for creating C++ code that runs on platforms based on Intel® processors.

Any tips on updating old C code to support 64-bit integers?

segmentation_fault
New Contributor I
482 Views

I want to update a matrix solver package written in C nearly 25 years ago; it has over 140k lines of code. Here is the link:

 

https://www.netlib.org/linalg/spooles/spooles.2.2.html

 

It works fine for matrices up to 1 million by 1 million in size. Beyond that it crashes. I believe the reason is that the package does not support 64-bit integers. Is there a quick way to fix that?


I tried compiling it with the CFLAGS -DMKL_ILP64 -DLONGLONG, but that did not change the crashing behaviour. I suppose I need to modify each source file manually?

1 Solution
jimdempseyatthecove
Black Belt
398 Views

>>Or am I totally oversimplifying the complexity of all this?

Yes

The preferred method is to use the type size_t, as this becomes the appropriate integer type for the targeted platform.

e.g. on a 32-bit platform it is a 32-bit unsigned integer; on a 64-bit platform, a 64-bit unsigned integer.
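A quick way to confirm this on your build machine (a sketch; the helper name is mine, not from SPOOLES):

```c
#include <stddef.h>

/* size_t is wide enough to index any object the platform can address,
   so it tracks the pointer width: 4 bytes on a typical 32-bit target,
   8 bytes on a typical 64-bit target. */
size_t index_width_bytes(void) {
    return sizeof(size_t);
}
```

On a 64-bit build this returns 8, so size_t indexes can go past the 2^31 limit that a 32-bit int hits.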

 

Changing all int (and unsigned int) types to 64-bit types will almost certainly introduce other errors, as much of the code was invariably written assuming int == 32-bit.

 

You must look at the code carefully before making such changes. The following procedure (a sketch) may help reduce the effort needed to make the necessary changes.

 

1) Identify the #define(s), literal(s), and variable(s) that are used, directly or indirectly, to specify values larger than a 2GB address span from a base address. By "address span": if, for example, an array of doubles is used, this is > (2GB / sizeof(double)). In other words, compilers have been known to take the index in its declared integer type, multiply it by the size of the array element (e.g. 8 for double), and add that to the base of the array, as opposed to first promoting the index to size_t and then multiplying by the element size. Do not rely on assumptions about what the compiler will do, or has done in the past: be explicit.
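As a concrete illustration of the point above (a sketch; flat_index is a hypothetical helper, not part of SPOOLES): doing the row/column arithmetic in size_t keeps every intermediate 64 bits wide, whereas the same expression with int operands would overflow once the product passes 2^31 - 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Flattened row-major index into an ncols-column matrix. Because every
   operand is size_t, the multiply row * ncols is done in 64-bit
   arithmetic on a 64-bit platform; with int operands it would be done
   in 32 bits and overflow for large matrices. */
size_t flat_index(size_t row, size_t col, size_t ncols) {
    return row * ncols + col;
}
```

For a hypothetical 100,000 x 100,000 matrix of doubles, flat_index(99999, 99999, 100000) is 9,999,999,999, already well past INT32_MAX.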

 

An example of the above might be a "#define N ..." or "int nParticles;"

You would change those to the appropriate 64-bit declarations ***; then...

Do your egrep for all references to those literals/variables.

 

Next, for each of the references:

a) if they are used to declare a static array (or a fixed allocation in a class/struct), the code must be changed to make that array allocatable.

b) if they are used in an expression to produce a result, that result must have the appropriate width (e.g. size_t).

 

Then, for all arrays (potentially) allocated to these larger sizes (note: it is the size in bytes of the array, not just the index value, that must be > 2GB), every index used on those arrays must also be declared size_t.
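Point (a) above, converting a fixed-size array into an allocatable one sized in size_t, might look like this sketch (the function name is illustrative, not from SPOOLES):

```c
#include <stddef.h>
#include <stdlib.h>

/* Before: "double work[N][N];" with a #define N -- a fixed allocation
   that cannot grow past 2GB. After: allocate from the heap, computing
   the byte count entirely in size_t so the n * n multiply cannot
   overflow a 32-bit int. The caller must free() the result. */
double *alloc_square_matrix(size_t n) {
    return malloc(n * n * sizeof(double));
}
```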

 

The process is something of a cascade, starting with the values that specify the "counts", then the arrays, then the indexes of the arrays, then the arrays and/or variables that hold indexes into other of these large arrays. Note that indexes used, for example, to scan strings need not (necessarily) be upsized.

 

Be meticulous and you may get all the necessary changes done before your first test run.

 

***

Keep the original code in a separate development folder so that you can produce verification data from test runs with arrays a few hundred MB in size.

 

Also, if your code uses a random number generator to produce indexes, make sure the RNG produces values in the appropriate ranges.

 

The aim is to remove the existing coding errors without introducing new ones.

 

Jim Dempsey

View solution in original post

8 Replies
jimdempseyatthecove
Black Belt
461 Views

You will have to edit the source files to change the int declarations that are used for array indexing to a 64-bit type such as int64_t (from <stdint.h>). This may include promoting some expressions that generate indexes. Example:

int64_t mySize = ((int64_t)N) * N;
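The cast must come before the multiply; a sketch of the difference (function names are mine):

```c
#include <stdint.h>

/* Wrong: N * N is still evaluated in 32-bit int, so it overflows
   (undefined behaviour) before the cast widens the already-wrong
   result. Do not call this with N > 46340. */
int64_t bad_size(int N)  { return (int64_t)(N * N); }

/* Right: widening one operand first makes the multiply itself 64-bit. */
int64_t good_size(int N) { return (int64_t)N * N; }
```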

Expect that you will make some errors in the process of conversion.

Jim Dempsey

segmentation_fault
New Contributor I
440 Views

Thanks! I am surprised there is no automated tool to do this. Surely I am not the first person to want to update 32-bit code for 64 bits.

 

Per your suggestion I ran the following grep command at the top level:

 

 

$ egrep -r '^\s+int\s+' * | wc -l
5434

 

 

Most int declarations look like one of these:

 

 

int      size,
int      index[]
int      *ploc
int   **pydistinct,
int    *ivec1 = IVinit2(n) ;
int   *newToOld = perm->newToOld = IVinit(size, -1) ;

 

 

Next I ran this and it only found five matches:

 

 

$ egrep -r '\(\(int\)' *
MPI/drivers/testScatterDenseMtx.c:      map[v] = ((int) Drand_value(&drand)) % nproc ;
MPI/drivers/testScatterInpMtx.c:      map[v] = ((int) Drand_value(drand)) % nproc ;
MPI/drivers/testSplitDenseMtx.c:   map[v] = ((int) Drand_value(&drand)) % nproc ;
MPI/drivers/testSplitInpMtx.c:   map[v] = ((int) Drand_value(drand)) % nproc ;

 

 

Would a simple global replace of all these matches to int64_t get me on the right track? Or am I totally oversimplifying the complexity of all this?


segmentation_fault
New Contributor I
327 Views

Thanks for the detailed response. Sounds like it is going to be quite an undertaking, as the code is 140k lines long. I guess I am not understanding why changing all ints to int64s will cause problems. I realize it will create a larger executable and use more memory, but the machines I am running this on have plenty of RAM. Can you give an example of a likely scenario I would encounter by doing such a thing?

ShanmukhS_Intel
Moderator
352 Views

Hi,


Reminder:

Has the solution provided helped?


If this resolves your issue, please mark the answer as an accepted solution; this would help others with a similar issue. Thank you!


Best Regards,

Shanmukh.SS



jimdempseyatthecove
Black Belt
305 Views

>> Sounds like it is going to be quite an undertaking as the code is 140k lines long.

Typically there are only a few places that need changing, and those places can be identified easily. For example, assume the problem is a particle simulation where you have a variable named nParticles, each particle having properties.

In an AoS (array-of-structures) program, you need only concentrate on code involving nParticles: "for(int i=0; i<nParticles; ++i)"

And then examine how "i" is used.

In an SoA (structure-of-arrays) program, you may have additional arrays whose sizes are on the order of (or a small multiple of) nParticles.

And then examine how any loop control variable is used.

Note, the compiler diagnostics may aid in locating places that need to be addressed. For example:

a warning when comparing "int" with "size_t" (a signed 32-bit integer with an unsigned 64-bit integer). If need be, you can declare an inline operator function template that generates a compile-time error to catch all problematic places. (You may get false positives, such as if(i==0), but those are easily rectified.)

The compiler should also catch function calls that expect an "int" but are passed a 64-bit integer.
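The mixed signed/unsigned comparison those diagnostics flag is genuinely dangerous, not just cosmetic; a minimal sketch (hypothetical function):

```c
#include <stddef.h>

/* In "i < n" the int is implicitly converted to size_t, so a negative
   i becomes a huge unsigned value and the test silently goes the wrong
   way -- exactly what a -Wsign-compare style diagnostic warns about. */
int looks_in_range(int i, size_t n) {
    return i < n;
}
```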

The size of the code isn't proportional to the amount of work.

 

Jim Dempsey

jimdempseyatthecove
Black Belt
304 Views

>>why changing all ints to int64s will cause problems?

Well, consider your code containing a large number of system/library function calls requiring 32-bit integer arguments...

You would then have introduced (number of such calls) × (number of int arguments) new errors.
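A sketch of that failure mode (legacy_api and round_trip are hypothetical names): a prototype that still takes a 32-bit int silently narrows any 64-bit count passed through it.

```c
#include <stdint.h>

/* Stand-in for any system/library call whose prototype still says
   "int": the argument is narrowed to 32 bits at the call site. */
int32_t legacy_api(int32_t n) {
    return n;
}

/* A 64-bit count does not survive the round trip once it exceeds
   INT32_MAX; the implicit int64 -> int32 conversion truncates it. */
int64_t round_trip(int64_t n) {
    return legacy_api(n);
}
```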

Jim Dempsey

ShanmukhS_Intel
Moderator
182 Views

Hi,


Thanks for accepting the solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Best Regards,

Shanmukh.SS

