Intel® ISA Extensions

Links to instruction documentation

Thomas_W_Intel
Employee
1,568 Views
0 Kudos
38 Replies
SergeyKostrov
Valued Contributor II
408 Views
>>...If a hardware engineer gives me a single number on this, I am certain that is not a complete picture...

Could you get that number? I'm sorry, but let us decide what to do next. As I've said several times: "...The latest edition of "Intel Optimization Reference Manual" ( 04.2012 ) has lots of details about these two instructions but for some unexplained reason latencies are not specified..."

I also don't see any logic in your statements:

...David Chaiken's recommendation on algorithm...
...I am certain that is not a complete picture...
...A number in CPU core cycle will certainly be useless...
...If your algorithm is able to deal with a range of values...
...If you can replace the longer journey with shorter ones...

Shiv, we would like to see just two numbers ( !!! ), that is, the latencies for two Intel instructions and nothing else. Do you understand this?
Thomas_W_Intel
Employee
408 Views
Sergey,

as Shiv has pointed out, the latency depends on several factors. Therefore, we need some information about the system that you are using. What is the core architecture that you are using? Which platform do you have? What is the core and uncore frequency? What DIMMs are you using (speed and rank) and how are they populated? Do you need the loaded latency and, if so, what is the bandwidth that you have?

Kind regards

Thomas
SergeyKostrov
Valued Contributor II
408 Views
Hi everybody,

>>...
>>What is the core architecture that you are using? Which platform do you have? What is the core and uncore frequency? What DIMMs
>>are you using (speed and rank) and how are they populated?

Here are some technical specs for my system:

Dell Precision Mobile M4700
Intel Core i7-3840QM ( Ivy Bridge / 4 cores / 8 logical processors )( http://ark.intel.com/compare/70846 )
16GB RAM
320GB HDD
NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
Windows 7 Professional 64-bit

Best regards,
Sergey
Bernard
Black Belt
408 Views

>>>Once again, where could I find latencies for MOVNTDQ and VMOVNTDQ instructions?>>>

The latency of MOVNTDQ is given in Agner Fog's instruction tables; it is ~400 cycles on Haswell CPUs.

paul_l_2
Beginner
408 Views

hi,

I find that the C++ compiler doesn't generate AVX2 assembly when I write AVX2 intrinsics or inline assembly, although it does generate correct AVX assembly, so I am confused.

Some samples follow:

//////----intrinsic

b = _mm256_stream_load_si256(&a);
011E10A2  lea         eax,
011E10A8  db          c4h 
011E10A9  loop        wmain+118h (11E1128h)
011E10AB  sub         al,...


//////----inline assembly

__asm
 {
  vmovntdqa ymm0, a;
011E110A  db          c4h 
011E110B  loop        _wmain+17Ah (11E118Ah)
011E110D  sub         al,byte ptr
  vmovntpd b, ymm0;
011E1113  db          c5h 
011E1114  std             
011E1115  sub        ...





}

 

thank you very much!

Sergio_J__C_
Beginner
408 Views

Very nice, thanks bro

adel_s_1
Beginner
408 Views

Information is very valuable

Amir_K_2
Beginner
408 Views

Thanks for sharing the links

Best Regards 
Amir

Islam_A_
Beginner
408 Views

Thomas,

Is there a downloadable PDF of the Optimization Reference Manual? I'm not finding it.

Also, is there any published data on the expected performance of the various AVX intrinsics relative to SSE, broken down by cache level? For example, vmulps is 2X faster in L1, 1.8X faster in L2, etc. Maybe that's a dumb question, but it's hard to tell if code is optimal without some idea of the ideal hardware throughput.

Thanks for the pointers,
 

McCalpinJohn
Black Belt
408 Views

The best way to find the Intel Optimization Reference Manual is to search on the document number. For example, with Google, the search would be "248966 site:intel.com". The PDF should be one of the first results.
Searching for "248966" using the Intel website internal search engine also gets the result quickly.

The most recent update is revision 033, dated June 2016.

To help make these searches easier, I typically rename the PDF files on my system to include both a descriptive name and the full document number (including revision).  Then I don't have to open the document to look up the number when I do my periodic checks for new versions.

Anton_R_
Beginner
408 Views

Hello,

Does Intel perhaps have the instruction set reference in some formal format suitable for reading programmatically, e.g. in XML? Could I get it?

Thank you,

Anton

Thomas_W_Intel
Employee
408 Views

Anton,

Unfortunately, I'm not aware of an instruction set reference that is easily parsable by programs.

Kind regards

Thomas

sirrida
Beginner
408 Views

You can obtain a machine-readable instruction set reference from the NASM sources, e.g. at http://www.nasm.us/pub/nasm/snapshots/latest/. It is the source file insns.dat, found for example in nasm-2.12.01rc1-20160308.zip. It should be quite complete and up to date.
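As a rough illustration of reading such a file programmatically, here is a minimal C++ sketch. It assumes each entry in insns.dat is a single line of four whitespace-separated fields (mnemonic, operand list, a bracketed encoding, and a comma-separated flag list) and that lines starting with ';' are comments. That layout is an assumption about the file and should be checked against the NASM snapshot you download. The names InsnEntry and parse_insn_line are hypothetical, introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// One parsed entry. Field names are hypothetical, chosen for this sketch.
struct InsnEntry {
    std::string mnemonic;
    std::string operands;
    std::string encoding;  // text between '[' and ']'
    std::string flags;     // comma-separated tags after the encoding
};

// Parses one line of the ASSUMED layout:
//   MNEMONIC  operands  [encoding bytes]  flags
// Returns false for blank lines, ';' comments, and lines that do not
// contain a bracketed encoding.
bool parse_insn_line(const std::string& line, InsnEntry& out) {
    const std::size_t first = line.find_first_not_of(" \t");
    if (first == std::string::npos || line[first] == ';') return false;

    const std::size_t open = line.find('[', first);
    if (open == std::string::npos) return false;
    const std::size_t close = line.find(']', open);
    if (close == std::string::npos) return false;
    assert(open < close);  // invariant: ']' was searched starting at '['

    InsnEntry e;
    // Everything before '[' should hold exactly the mnemonic and operands.
    std::istringstream head(line.substr(first, open - first));
    if (!(head >> e.mnemonic >> e.operands)) return false;

    e.encoding = line.substr(open + 1, close - open - 1);

    // The first token after ']' is taken as the flag list.
    std::istringstream tail(line.substr(close + 1));
    tail >> e.flags;  // stays empty if nothing follows the encoding

    out = e;
    return true;
}
```

Calling parse_insn_line on each line read with std::getline and collecting the entries for which it returns true would give a simple in-memory table. Entries that do not follow the assumed layout are skipped rather than guessed at.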

sirrida
Beginner
408 Views

The sources of NASM (http://nasm.us/) contain a machine-readable instruction set reference.

james_l_3
Beginner
408 Views

Hi Sergey

The best advice I could offer is to borrow from an article I read about David Chaiken's recommendation on the algorithm.

To design a suitable algorithm, think about its performance model underneath.

If a hardware engineer gives me a single number on this, I am certain that is not a complete picture, and it would be a disservice to publish a number, given the complexity of the situations that software can be deployed into across a wide variety of platforms.

A number in CPU core cycles will certainly be useless, considering the core operates in a different clock domain. I believe the DRAM subsystem may bring another clock domain into the picture.
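To make the clock-domain point concrete, here is a small C++ sketch of the arithmetic. It only converts a cycle count into nanoseconds for a given core frequency. The function name cycles_to_ns and the frequencies mentioned afterwards are illustrative assumptions, not measurements of any particular system.

```cpp
#include <cassert>

// Convert a latency expressed in core clock cycles to nanoseconds.
// core_ghz is the core frequency in GHz, i.e. cycles per nanosecond.
// Hypothetical helper for illustration only.
double cycles_to_ns(double cycles, double core_ghz) {
    assert(core_ghz > 0.0);  // a zero or negative frequency is meaningless
    return cycles / core_ghz;
}
```

With this helper, a count of 400 core cycles corresponds to 200 ns at an assumed 2.0 GHz core clock but to 100 ns at an assumed 4.0 GHz. Conversely, a memory access that takes a roughly fixed time in nanoseconds shows up as a different cycle count whenever the core frequency changes, which is one reason a single number in core cycles does not transfer between systems.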


If your software gets deployed on a multi-socket platform, what kind of complications will snoop bring?

Note__Mark
Beginner
408 Views

Thanks. Read it, very informative :-)
 

DANNIE__SANG
Beginner
408 Views

Brijender Bharti (Intel) wrote:

Hi,
Please use the following link:
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia...

It will open the reading pane. In the top right-hand corner there is a down-arrow button that means download (next to print).

Hi Thanks for the tip :)

Podjachev__Evgeny
408 Views

Hi, the site https://software.intel.com/sites/landingpage/IntrinsicsGuide doesn't work. It loads but doesn't show any intrinsics. Can it be fixed? Or is there a PDF version of it?
