I am having trouble understanding exactly what the capabilities of the Intel C++ compilers are with respect to 128-bit floating-point computations. Do the compilers offer full 128-bit computations as in IEEE Std 754-2008 (not the 80-bit extended formats), and are those computations available on Windows and Linux? Thx

Hi,

Could you please elaborate on what exactly you are looking for? Please provide links to the documentation you are referring to, if applicable.

Also, please provide details of the environment you are working in.

**OS Version**:

**Compiler Version:**

Thanks & Regards

Goutham

Hi,

I'm trying to solve a maximum likelihood optimization problem that is blowing up on commercial statistics packages. It is blowing up in such a way that I have confirmed for certain that their maximum likelihood algorithms are incapable of handling this particular problem. I have coded in C++ an object framework that uses the Numerical Recipes algorithms, with good success on test data. However, the real problem contains several million doubles in about 50-space. I am concerned about the accumulation of the logarithms and want to do the matrix algebra in long doubles so as to rule out significant-digit issues (at least as much as possible). At present I am using VS 2019, but I also own a high-end Intel Linux workstation with several compilers installed. Years ago, I had to do an EBCDIC to Mac binary conversion for 32-bit floats using IBM format tapes, so I am familiar with the details of working with machine-level floats.
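The accumulation concern can be probed with a small experiment. The sketch below is only an illustration, not the poster's code: the function names `sum_double` and `sum_long_double` are hypothetical, and it assumes the log-likelihood terms have already been computed and stored as doubles. It adds the same terms once with a `double` accumulator and once with a `long double` accumulator so the two totals can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Add the terms with a plain double accumulator. */
double sum_double(const double *terms, size_t n) {
    assert(terms != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += terms[i];
    }
    return acc;
}

/* Add the same terms, but keep the running total in long double.
   This only helps on targets where long double is wider than double. */
long double sum_long_double(const double *terms, size_t n) {
    assert(terms != NULL || n == 0);
    long double acc = 0.0L;
    for (size_t i = 0; i < n; ++i) {
        acc += terms[i];
    }
    return acc;
}
```

If the two totals differ noticeably on the real data, rounding in the double accumulator is probably worth worrying about. On toolchains where `long double` has the same format as `double` (MSVC, for example), both functions return the same value, so the comparison says nothing there.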

The Intel C++ Compiler 19.1 Developer Guide & Reference lists, in the Compiler Option Details section, a -mlong-double-n option where n can be set to 128 on Linux machines, making long double a 128-bit IEEE Std 754-2008 float. However, I cannot confirm that the Intel compiler actually performs 128-bit floating-point operations in C++, or that it will for certain do the calculations I want. So, before purchasing the latest Intel Parallel Studio XE version, I want to know three things. If I buy the Windows version for use with Visual Studio, am I locked into at most 80-bit long doubles? If I buy the Linux version, can I for certain compute with 128-bit long doubles, and does the compiler support all long double functions and operations? If the Linux version does not strictly adhere to IEEE Std 754-2008, where does it deviate?
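One way to see what `long double` actually is under a given compiler and option set is to print its properties from the standard `<float.h>` header. The following is a minimal sketch under stated assumptions: the helper names `describe_significand` and `print_long_double_report` are hypothetical, and the mapping from significand width to format name assumes a radix-2 floating-point system.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Map a significand width in bits to the format that usually has it.
   Assumes FLT_RADIX == 2. */
const char *describe_significand(int bits) {
    assert(bits > 0);
    switch (bits) {
    case 53:  return "binary64 (same as double)";
    case 64:  return "x87 80-bit extended";
    case 106: return "double-double";
    case 113: return "IEEE 754 binary128";
    default:  return "unrecognized format";
    }
}

/* Print what this compiler and target use for long double. */
void print_long_double_report(void) {
    printf("sizeof(long double) = %zu bytes\n", sizeof(long double));
    printf("LDBL_MANT_DIG       = %d bits\n", LDBL_MANT_DIG);
    printf("format              = %s\n", describe_significand(LDBL_MANT_DIG));
}
```

The size alone does not settle the question, because padding can make an 80-bit value occupy 16 bytes. A significand width of 64 bits points to the 80-bit extended format, while 113 bits points to binary128.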

Thanks for your help.

Not sure if you have seen this page? https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-...

Yes, I did take a look at that page, but it is for the 128 bit decimal format (radix 10) and does not really discuss the binary floating point standard (radix 2).

After consulting with our Developers, I think Intel compilers do offer 128 bit computations.

For example on Linux:

    $ cat f128.c
    __float128 f(__float128 x, __float128 y, __float128 z) {
        return x * y + z;
    }
    $ icc f128.c -c -V
    Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.2.254 Build 20200623
    Copyright (C) 1985-2020 Intel Corporation. All rights reserved
    $

On Windows, you need to use the _Quad type along with the /Qoption,cpp,--extended_float_types option.

    _Quad f (_Quad x, _Quad y, _Quad z) {
        return x * y + z;
    }

    C:\temp>icl /Qoption,cpp,--extended_float_types f128.c /c
    Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.2.254 Build 20200623
    Copyright (C) 1985-2020 Intel Corporation. All rights reserved.
    f128.c
    C:\temp>

Thanks,

Thanks for your reply. The godbolt.org site helped. Using some of the icc compilers there, I was able to ascertain that the size of long double is 16 bytes. However, the Windows version is still a mystery. If I understand your compiler offerings correctly, I must choose an operating system before I buy. It's not clear whether a single purchase can be used on more than one operating system. The documentation for the /Qlong-double option of the Intel compiler says:

"However, the alignment requirement of the data type is 16 bytes, and its size must be a multiple of its alignment, so the size of the long double on Windows is also 16 bytes. Only the lower 10 bytes (80 bits) of the 16 byte space will have valid data stored in it."

As I understand this, the Windows version only does extended double (80-bit) computation, so my only real option with the Intel compiler is Linux. I still have to check whether the 128-bit operations actually use the full 128-bit float.
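Whether arithmetic really uses the full 128-bit format can be measured empirically. The sketch below assumes a compiler that provides the `__float128` extension used in the Linux example earlier in the thread (GCC on x86-64 also provides it), and the function names are hypothetical. It halves `eps` until `1 + eps` rounds back to 1 and counts the steps, which yields the significand width under the default round-to-nearest mode. Binary128 has a 113-bit significand, so a result of 113 would indicate full quad precision, while 64 would indicate the 80-bit extended format.

```c
#include <assert.h>

/* Count significand bits by halving eps until 1 + eps rounds back to 1.
   volatile forces each sum to be stored in the declared type, so extra
   register precision or constant folding cannot hide the rounding. */
int float128_significand_bits(void) {
    volatile __float128 one = 1;
    __float128 eps = 1;
    int bits = 0;
    for (;;) {
        volatile __float128 sum = one + eps;
        if (sum == one) {
            break;
        }
        eps /= 2;
        ++bits;
        assert(bits < 1000); /* guard against a loop that never converges */
    }
    return bits;
}

/* The same measurement for long double, for comparison. */
int long_double_significand_bits(void) {
    volatile long double one = 1.0L;
    long double eps = 1.0L;
    int bits = 0;
    for (;;) {
        volatile long double sum = one + eps;
        if (sum == one) {
            break;
        }
        eps /= 2;
        ++bits;
        assert(bits < 1000); /* guard against a loop that never converges */
    }
    return bits;
}
```

Comparing the two return values on a given compiler shows directly whether `__float128` buys more precision than `long double` on that target.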

If you could clear any of this up, it would be appreciated.

Thanks

Somehow I marked the wrong message as the solution. This is the solution to this issue. I was able to use https://godbolt.org to work with the 128-bit floats. It appears the Linux version is ahead of MSVC in working with long doubles, and that's the direction I'm headed. Thanks for your help.

If you use the _Quad type in your source code and compile with /Qoption,cpp,--extended_float_types, then it will be 128 bits.
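A source file that has to build on both platforms could hide the type name behind a typedef. The sketch below is an illustration resting on several assumptions: `quad_t`, `quad_mul_add`, and `check_quad_size` are hypothetical names; `_WIN32` together with `__INTEL_COMPILER` is assumed to identify the Intel compiler on Windows, where `_Quad` still needs the `/Qoption,cpp,--extended_float_types` option; and every other target is assumed to provide `__float128`.

```c
#include <assert.h>

#if defined(_WIN32) && defined(__INTEL_COMPILER)
typedef _Quad quad_t;      /* needs /Qoption,cpp,--extended_float_types */
#else
typedef __float128 quad_t; /* assumed: the compiler provides this extension */
#endif

/* Same computation as f() in the examples earlier in the thread,
   written once against the typedef. */
quad_t quad_mul_add(quad_t x, quad_t y, quad_t z) {
    return x * y + z;
}

/* Sanity check that the chosen type occupies 16 bytes. */
void check_quad_size(void) {
    assert(sizeof(quad_t) == 16);
}
```

Calling `check_quad_size()` once at startup would catch a build where the typedef silently resolved to something narrower than expected.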

Yes, you would need a license for each platform.

A 30-day trial is available here: https://software.intel.com/content/www/us/en/develop/tools/parallel-studio-xe/choose-download.html

Let us know how it goes.

Thanks,

Hi Jay Tuthill,

Glad to know that your issue is resolved!

Could you please let us know if we can close this thread from our side?

Regards

Goutham

Yes, you can close this thread. Thanks for all your help!

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.

Any further interaction in this thread will be considered community only.

Have a good day!

Thanks & Regards

Goutham
