- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am looking for fast and robust C/C++ code for 6x6 matrix (double precision, general matrix) inversion optimized for current INTEL Xeon CPUs.
I found the following Intel document (http://www.intel.com/design/pentiumiii/sml/245044.htm), but I have no idea how to recompile this code under C/C++ Intel compiler 12.x.
I tried to use IPP matrix inversion function too, but the matrix is probably near singular, and this function gives very often wrong results.
Is there any highly optimized and robust 6x6 matrix inversion function for Intel compiler?
I found the following Intel document (http://www.intel.com/design/pentiumiii/sml/245044.htm), but I have no idea how to recompile this code under C/C++ Intel compiler 12.x.
I tried to use IPP matrix inversion function too, but the matrix is probably near singular, and this function gives very often wrong results.
Is there any highly optimized and robust 6x6 matrix inversion function for Intel compiler?
Link Copied
10 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You appear to have a contradiction here. If your data require full pivoting for reliable solution, you aren't likely to be able to optimize better than a competent compiler can do. You will need first to find and test a reliable implementation, then do what is possible e.g. with #pragma vector aligned to get it vectorized. If you specialize with fixed dimensions at compile time, you should get better optimization than you would in a variable sized case.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
OK ... I need to know now, what is the proper way to recompile C/C++ SIMD enhanced code mentioned in the document (http://www.intel.com/design/pentiumiii/sml/245044.htm)by Intel C++ compiler 12.x
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting Michal Kvasnicka
OK ... I need to know now, what is the proper way to recompile C/C++ SIMD enhanced code mentioned in the document (http://www.intel.com/design/pentiumiii/sml/245044.htm)by Intel C++ compiler 12.x
Hi,
I'd like to ask a question: Did you have any compilation issues or problems? If Yes, could you provide some details?
Best regards,
Sergey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Just open a document (http://www.intel.com/design/pentiumiii/sml/245044.htm) and try to recompile the code on section 5.2 C Code in General Case with Streaming SIMD Extensions (page 10).
Obviously, a lot of header files are missing + initializations, etc. ... so finaly the code in this form is not possible to compile by current Intel C++ compiler. There is no information about proper compiler switches, options, linking libraries, etc. ... nothing!
Michal
Obviously, a lot of header files are missing + initializations, etc. ... so finaly the code in this form is not possible to compile by current Intel C++ compiler. There is no information about proper compiler switches, options, linking libraries, etc. ... nothing!
Michal
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting Michal Kvasnicka
...Obviously, a lot of header files are missing + initializations, etc. ... so finaly the code in this form is not possible to compile by current Intel C++ compiler. There is no information about proper compiler switches, options, linking libraries, etc. ... nothing!..
If youinclude the following headers the codes will be compiled:
[bash] ... #include
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting Michal Kvasnicka
Just open a document (http://www.intel.com/design/pentiumiii/sml/245044.htm) and try to recompile the code on section 5.2 C Code in General Case with Streaming SIMD Extensions (page 10)...
I compiled and tested that function with a couple of modifications:
[cpp]#define EPSILON ( 1.19209289e-07f ) // Single-Precision ( 23-bit ) ... void PIII_Invert_6x6( float *src ); void PIII_Invert_6x6( float *src ) { ... __m128 tmp1 = { 0.0f }, tmp2 = { 0.0f }, tmp3 = { 0.0f }; ... } [/cpp]
Here are results of my test case:
[cpp] // Matrix A float fMatrix[6][6] = { { 1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 2.0f, 0.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 0.0f, 3.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, 0.0f, 0.0f, 4.0f, 0.0f, 0.0f }, { 0.0f, 0.0f, 0.0f, 0.0f, 5.0f, 0.0f }, { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 6.0f }, }; PIII_Invert_6x6( &fMatrix[0][0] ); // Inverse of Matrix A // 0.9999999403953 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.4999999701976 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.3333333134651 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.2499999850988 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.1999999880790 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.1666666567325 DisplayMatrixF( &fMatrix[0][0], 6, 6, RTtrue ); // Verification ( Inv(A) * A = I ) // 0.9999999403953 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.9999999403953 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.9999999403953 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.9999999403953 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.9999999403953 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.9999999403953 // If a rounding applied it is an Identity matrix // 1.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 1.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 1.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 1.0000000000000 0.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 1.0000000000000 0.0000000000000 // 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 0.0000000000000 1.0000000000000 [/cpp]Best regards,
Sergey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sergey ... thank you very much!!!
So, the last question is: Is possible to tranform this code from single to double precision? I am not sure if the SIMD extensions support double precision or not.
Do you know and could you recomend me any other C/C++ implementation of highly optimized 6x6 matrix (general, double precision) inversion code. The 6x6 inversion matrix function plays crucial role in our simulation code. I am looking for any suitable option on this topic to deploy maximum computing power provided by current Xeon CPUs.
Michal
So, the last question is: Is possible to tranform this code from single to double precision? I am not sure if the SIMD extensions support double precision or not.
Do you know and could you recomend me any other C/C++ implementation of highly optimized 6x6 matrix (general, double precision) inversion code. The 6x6 inversion matrix function plays crucial role in our simulation code. I am looking for any suitable option on this topic to deploy maximum computing power provided by current Xeon CPUs.
Michal
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting Michal Kvasnicka
Sergey ... thank you very much!!!
So, the last question is: Is possible to tranform this code from single to double precision? I am not sure if the SIMD extensions
support double precision or not.
[SergeyK] I think it is possible.
Do you know and could you recomend me any other C/C++ implementation of highly optimized 6x6 matrix (general, double precision)
inversion code.
[SergeyK] Intel IPP library has a set of highly optimized functions in a Matrix Processing domain ( ippm ).
Another option to consideris aGNU Scientific Library ( GSL ).
So, the last question is: Is possible to tranform this code from single to double precision? I am not sure if the SIMD extensions
support double precision or not.
[SergeyK] I think it is possible.
Do you know and could you recomend me any other C/C++ implementation of highly optimized 6x6 matrix (general, double precision)
inversion code.
[SergeyK] Intel IPP library has a set of highly optimized functions in a Matrix Processing domain ( ippm ).
Another option to consideris aGNU Scientific Library ( GSL ).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting Michal Kvasnicka
Sergey ... thank you very much!!!
So, the last question is: Is possible to tranform this code from single to double precision?..
So, the last question is: Is possible to tranform this code from single to double precision?..
Yes. Take into account thatcode changesfor the2nd and 3rd functions will be significant. In a couple of words,
you will need to change a basic type from '__m128' to '__m128d'( only 2 double-precision values ) and all
the rest 200 - 300 code lines...
I wouldmodifythe 1st C-like function ( 'double' type instead of 'float') and than I would try to optimize it as better as possible.
Best regards,
Sergey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
For a double-precision type an epsilon is:
[cpp]#define EPSILON ( 2.2204460492503131e-016L ) // Double-Precision ( 64-bit )
[/cpp]
[cpp]#define EPSILON ( 2.2204460492503131e-016L ) // Double-Precision ( 64-bit )
[/cpp]

Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page