maratyszcza

Beginner

05-01-2013
04:06 PM

228 Views

Haswell RCPPS/RSQRTPS implementation

Hi,

I work on code which targets AVX2 + FMA3 and depends on the accuracy of VRCPPS/VRSQRTPS. Should I expect the implementation of these instructions on Haswell to be the same as on Ivy Bridge?

Regards,

Marat

16 Replies

SergeyKostrov

Valued Contributor II

05-02-2013
07:35 AM

TimP

Black Belt

05-02-2013
08:22 AM

I suppose he is asking whether the instructions maintain the same numerical behavior, which is what I'd expect.

I haven't heard anything about the once-proposed addition or substitution of corresponding instructions with enough accuracy to support double precision with 2 iterations, such as early AMD SSE CPUs had. Current Intel implementations of the iterative divide and sqrt methods are specified to maintain 49-bit precision unless the options to accept less are set.
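To make the "2 iterations" remark concrete, here is a minimal sketch of how Newton-Raphson refinement of a reciprocal squares the relative error at each step. It uses exact rational arithmetic, so rounding is ignored entirely. The helper names `newton_step` and `error_bits` are made up for this illustration. The 12-bit start is in line with the bound Intel documents for RCPPS (relative error at most 1.5 × 2^-12); the 14-bit start is a hypothetical, slightly more accurate approximation.

```python
from fractions import Fraction

def newton_step(x, y):
    """One exact Newton-Raphson step for 1/x: y' = y * (2 - x*y)."""
    return y * (2 - x * y)

def error_bits(x, start_bits, steps):
    """Accuracy in bits after each exact step, starting from an
    estimate of 1/x whose relative error is 2**-start_bits."""
    y = (1 + Fraction(1, 2**start_bits)) / x
    bits = []
    for _ in range(steps):
        y = newton_step(x, y)
        rel_err = abs(x * y - 1)   # exactly 2**-k in this setup
        bits.append(rel_err.denominator.bit_length() - 1)
    return bits

print(error_bits(Fraction(3), 12, 3))   # [24, 48, 96]
print(error_bits(Fraction(3), 14, 2))   # [28, 56]
```

In this idealized model a 12-bit start reaches only 48 bits after two steps, short of the 53-bit double significand, while a 14-bit start passes it. Real code also has to account for rounding in every step.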

maratyszcza

Beginner

05-02-2013
09:50 AM

Thanks, Tim. However, I'm concerned not only about accuracy but also about convergence to the correctly rounded result. A reciprocal computed with FMA converges to the correctly rounded result, but this requires that the initial approximation overestimate 1/x when x is a power of two. The RCPPS implementation in Ivy Bridge does not overestimate 1/x in these cases, so a reciprocal computed with FMA from the Ivy Bridge RCPPS result will not be correctly rounded when x is a power of two (e.g. rcp(0x1.FFFFFFFFFFFFFp-1) will converge to 0x1.0000000000000p+0 instead of 0x1.0000000000001p+0).

Thus I wonder if RCPPS/RSQRTPS on Haswell produce numerically different results than on Ivy Bridge.
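The stall in the example above can be reproduced with a small sketch. It emulates a fused multiply-add with exact rationals, so it runs without FMA hardware. The helpers `fma` and `refine` are hypothetical names for this illustration. The starting value 1.0 is an assumption standing in for an initial approximation that does not overestimate 1/x; it is not the value any particular CPU returns.

```python
from fractions import Fraction

def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c computed exactly, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def refine(x, y, steps):
    """Newton-Raphson reciprocal refinement using two FMAs per step."""
    for _ in range(steps):
        e = fma(-x, y, 1.0)   # residual 1 - x*y with a single rounding
        y = fma(y, e, y)      # y + y*e with a single rounding
    return y

x = float.fromhex("0x1.FFFFFFFFFFFFFp-1")   # operand from the post
y0 = 1.0                                     # assumed start, not above 1/x

print(refine(x, y0, 4).hex())   # 0x1.0000000000000p+0 (stalled)
print((1.0 / x).hex())          # 0x1.0000000000001p+0 (correctly rounded)
```

With this start the residual is exactly 2^-53, so `y + y*e` lands on the midpoint between two doubles and rounds back to 1.0 every time. Starting the same step from the correctly rounded value, which lies slightly above 1/x, leaves it unchanged.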

TimP

Black Belt

05-02-2013
01:09 PM

Interesting point. I haven't heard of any changes, but others here are more expert on that.

BRET_T_Intel

Employee

05-02-2013
04:33 PM

SergeyKostrov

Valued Contributor II

05-02-2013
10:11 PM

I could run a verification on an Ivy Bridge system if you provide a test case.

maratyszcza

Beginner

05-03-2013
09:38 AM

Thank you, TimP and BRET T., this is what I was looking for.

Sergey Kostrov, I already have an Ivy Bridge system, and it has the issue I described above.

zalia64

New Contributor I

05-07-2013
11:21 PM

Please help an outsider: what is the instruction VRCPPS? It does not appear in the Intel documentation: Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2 (2A, 2B & 2C), Instruction Set Reference, A-Z, Jan. 2013.

From the discussion above, it looks like a higher-precision version of RCPPS.

Could you name a reference for that (and probably other) new instructions? I would surely like a high-precision version of RSQRTPS, too.

Thanks

Bernard

Black Belt

05-08-2013
12:22 AM

Hi amos,

Glad to see someone from Israel :)

VRCPPS computes the approximate reciprocals of eight 32-bit floating-point values. It is an AVX instruction.

Bernard

Black Belt

05-08-2013
12:24 AM

zalia64

New Contributor I

05-08-2013
12:31 PM

Thanks, iliyapolak. I should have guessed, with the 'V' at the start.

I am an old hand with assembly language, but new to 64-bit. My MASM does not support AVX instructions, so I will have to look somewhere else for AVX support (perhaps WASM).

Would you happen to know if 'Visual C 2012 professional' supports AVX in the integrated debugger? That is, when debugging and stepping through the code, will the 'disassembly window' show the AVX instructions? Otherwise, debugging is a big problem...

I have to convert a program from Matlab into C+ASM, for speed. AVX could be a great help, IF and ONLY IF I can find a full-size assembler and debugger. Would you suggest a development environment for native AVX? Native AVX, not C mnemonics.

Bernard

Black Belt

05-08-2013
01:36 PM

Bernard

Black Belt

05-08-2013
01:47 PM

Bernard

Black Belt

05-08-2013
09:27 PM

Hi amos,

You mentioned in your post that you are converting a Matlab program into, presumably, a C/inline assembly version, so I would like to ask: do you write scientific software? I have a few projects, mainly a large library of special functions, which I am trying to optimize. Are you interested in it?

Thank you in advance

zalia64

New Contributor I

05-08-2013
11:01 PM

Hi iliya,

Yes, I develop scientific software, usually to run algorithms in Machine Vision, Image Enhancement, On-Line quality control and other problems with noisy and cluttered inputs. Real-time working systems, not mathematical models.

About optimising your library: perhaps I could be of help. Please contact me by private message.

Bernard

Black Belt

05-08-2013
11:47 PM
