Consider the following code:

double* a;

size_t n;

a[0:n] = log(a[0:n]);

The compiler reports that _log cannot be vectorized, and it reports the same for the pow() function. However, switching to functions such as exp() or sin() allows vectorization. I thought that log() and pow() were vectorizable functions, as listed in http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-E98D4E0.... Does anyone know the cause?

Thanks.

**log** function.

I can try those workarounds, but I don't know what is different about log() and pow(). This is the output from level 6:

vectorization support: call to function _log cannot be vectorized.

The same occurs with pow(), where the message references the _pow function.

Your example would require #include <math.h> and possibly a change from size_t to int.

Actually, I have #include <mathimf.h> in the file. Isn't that what is required? And why int? It is the same size as size_t on 32-bit builds, and I believe it is technically wrong for 64-bit builds.

Again, these would not explain why other math functions work.

**_log** cannot be vectorized. Is that a macro or a C-like function? Use a debugger to verify.

I figured out what is preventing vectorization: the use of /fp:precise. However, I don't understand why that switch affects only certain functions while others can still be vectorized.

Michael Hlavinka wrote:

I figured out what is preventing vectorization: the use of /fp:precise. However, I don't understand why that switch affects only certain functions while others can still be vectorized.

So you can see why everyone has been asking for a reproducer.

In recent compilers, an increasing number of SVML function invocations are disabled by /fp:precise. That some of them slipped by in the past may have been an oversight. SVML functions aren't designed to permit capturing exceptions on individual operands. If you wish to override this effect on math function vectorization, you may set /Qfast-transcendentals.

In principle, you may also need to consider the /Qimf- options. The SVML default "guarantees" accuracy only within 4 ULPs (although it is usually better), which is not consistent with what users expect from /fp:precise. exp() and pow() functions (and their relatives) are notoriously difficult to vectorize while maintaining full accuracy for corner cases. /Qimf-... allows you to request higher-precision (slower) or lower-precision (faster) functions if they exist.

I noticed a case this week where disabling SVML vectorization with -fp-model source doesn't affect vec-report. Apparently, the decision not to report the difference between full vectorization with /Qcomplex-limited-range and partial vectorization without it has been carried over to /Qfast-transcendentals.

I had to revise my recommendation for options to observe parentheses while allowing maximum optimization to include fast-transcendentals:

/fp:source /Qftz /Qfast-transcendentals [/Qprec-div- /Qprec-sqrt-]

This still disables vectorization of sum and indexed max/min reductions.
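The reason sum reductions are affected is that vectorizing them reorders the additions, and floating-point addition is not associative. The sketch below contrasts a strict in-order sum with a four-lane partial-sum version of the kind a 4-wide vectorized reduction would use. Both function names are hypothetical, and the lane count of 4 is an assumption chosen for illustration only.

```c
#include <stddef.h>

/* Strict left-to-right sum: the evaluation order that a
 * value-safe floating-point model has to preserve. */
double sum_in_order(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

/* Four independent partial sums, combined at the end, mimicking
 * what a 4-wide vectorized reduction would compute. */
double sum_four_lanes(const double *x, size_t n)
{
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (size_t k = 0; k < 4; ++k) {
            lane[k] += x[i + k];
        }
    }
    double s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i) {  /* remainder elements */
        s += x[i];
    }
    return s;
}
```

For inputs whose partial sums are all exactly representable, the two functions agree. For inputs that mix very large and very small magnitudes, such as {1e17, 1.0, -1e17, 1.0}, they can return different values because each addition rounds, which is why a value-safe mode refuses to make this transformation on its own.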

Thanks for the information everyone. Do you still want a repro case as all I did was extract this from a much larger program? My repro case really doesn't do anything more than here.

Tim, do you know the accuracy of the VC++ library in /fp:precise and /fp:fast mode? Since part of my application is compiled with it, I suspect I may need similar accuracies for the various modules.

Microsoft /fp:fast vs. /fp:precise don't affect their math libraries, as far as I know. Most of them, particularly if based on x87 code, should be what Intel calls "high" accuracy. I don't believe there are any vector math functions in the Microsoft libraries. If it's critical, you may want /Qimf-precision:high versions for Intel vector libraries (high accuracy is the default for the scalar functions). Although ICL /fp:source is roughly equivalent to Microsoft /fp:fast, the more aggressive ICL default /fp:fast affects math function accuracy only when it promotes vectorization and imf-precision is set to medium (default) or low (where double "low" is barely better than float high precision).

By the way, /Qimf-precision also affects vectorized divide and sqrt, but /Qprec-div /Qprec-sqrt will force those independently to full accuracy.

**Sub-Test 5.1** - Calculates Product of 0.1 * 0.1 - RTfloat

    {
        CrtPrintf( RTU("Sub-Test 5.1 - RTfloat\n") );

        RTfloat fVal = 0.1f;
        RTfloat fRes = 0.0f;

        uiControlWordx87 = CrtControl87( _RTFPU_PC_24, _RTFPU_MCW_PC );
        fRes = fVal * fVal;
        CrtPrintf( RTU("24-bit : [ %1.1f * %1.1f = %.17f ]\n"), fVal, fVal, fRes );

        uiControlWordx87 = CrtControl87( _RTFPU_PC_53, _RTFPU_MCW_PC );
        fRes = fVal * fVal;
        CrtPrintf( RTU("53-bit : [ %1.1f * %1.1f = %.17f ]\n"), fVal, fVal, fRes );

        uiControlWordx87 = CrtControl87( _RTFPU_PC_64, _RTFPU_MCW_PC );
        fRes = fVal * fVal;
        CrtPrintf( RTU("64-bit : [ %1.1f * %1.1f = %.17f ]\n"), fVal, fVal, fRes );

        uiControlWordx87 = CrtControl87( _RTFPU_CW_DEFAULT, _RTFPU_MCW_PC );
        fRes = fVal * fVal;
        CrtPrintf( RTU("Default : [ %1.1f * %1.1f = %.17f ]\n"), fVal, fVal, fRes );
    }

**Sub-Test 5.2** - Calculates Product of 0.1 * 0.1 - RTdouble

    {
        CrtPrintf( RTU("Sub-Test 5.2 - RTdouble\n") );

        RTdouble dVal = 0.1L;
        RTdouble dRes = 0.0L;

        uiControlWordx87 = CrtControl87( _RTFPU_PC_24, _RTFPU_MCW_PC );
        dRes = dVal * dVal;
        CrtPrintf( RTU("24-bit : [ %1.1f * %1.1f = %.17f ]\n"), dVal, dVal, dRes );

        uiControlWordx87 = CrtControl87( _RTFPU_PC_53, _RTFPU_MCW_PC );
        dRes = dVal * dVal;
        CrtPrintf( RTU("53-bit : [ %1.1f * %1.1f = %.17f ]\n"), dVal, dVal, dRes );

        uiControlWordx87 = CrtControl87( _RTFPU_PC_64, _RTFPU_MCW_PC );
        dRes = dVal * dVal;
        CrtPrintf( RTU("64-bit : [ %1.1f * %1.1f = %.17f ]\n"), dVal, dVal, dRes );

        uiControlWordx87 = CrtControl87( _RTFPU_CW_DEFAULT, _RTFPU_MCW_PC );
        dRes = dVal * dVal;
        CrtPrintf( RTU("Default : [ %1.1f * %1.1f = %.17f ]\n"), dVal, dVal, dRes );
    }
    ...

You will need to comment out all calls to the **CrtControl87** CRT function; the **FPU** settings then need to be set at compile time using the **/fp:[ mode ]** option.

**Notes:** CrtControl87 = _control87, CrtPrintf = _tprintf, RTU = _T, RTfloat = float, RTdouble = double, etc.

Forum topic: **Support of 'long double' floating point data type on Intel CPUs ( A collection of threads )** Web-link: software.intel.com/en-us/node/375459

Forum topic: **Mathimf and Windows** Web-link: software.intel.com/en-us/forums/topic/357759

Forum topic: **Support of Extended or Quad IEEE FP formats** Web-link: software.intel.com/en-us/forums/topic/358472

Forum topic: **Using 'long double' in Parallel Studio?** Web-link: software.intel.com/en-us/forums/topic/266290

Forum topic: **Why function printf does not support long double?** Web-link: software.intel.com/en-us/forums/topic/372720

Forum topic: **Mixing of Floating-Point Types ( MFPT ) when performing calculations. Does it improve accuracy?** Web-link: software.intel.com/en-us/forums/topic/361134

Forum topic: **Test results for CRT-function 'sqrt' for different Floating Point Models** Web-link: software.intel.com/en-us/forums/topic/368241

Sergey, thanks for the info. I'll look into it.
