Win7 Pro x64 PS Pro V16.0 update 1.
CPU E5-2620 v2
The compiler with updates was running quite well until recently (today). The host system has AVX (not AVX2). However, the runtime code linked in is using AVX2 instructions.
...
000007FEE29B98D6  call        __intel_avx_rep_memset (7FEE29CBF60h)
...
__intel_avx_rep_memset:
000007FEE29CBF60  push        rdi
000007FEE29CBF61  push        r15
000007FEE29CBF63  mov         r11,rcx
000007FEE29CBF66  mov         r10,r11
000007FEE29CBF69  mov         rax,101010101010101h
000007FEE29CBF73  movzx       r9,dl
000007FEE29CBF77  imul        r9,rax
000007FEE29CBF7B  lea         rdx,[7FEE29CCB80h]
000007FEE29CBF82  vmovd       xmm0,r9
000007FEE29CBF87  vpbroadcastd ymm0,xmm0
The vpbroadcastd is an AVX2 instruction.
I am not sure what caused the symptom; it had been building and running successfully before. The only thing I can think of is that, after many builds and runs (both debug and release), I performed some release builds with targeted instruction sets. The last of these was /QxCORE-AVX2, which I know cannot run on my development system.
After this AVX2 targeted release build, I have now switched back to /QxHost, and then subsequently /QxAVX and both builds are now linking in __intel_avx_rep_memset... that uses an AVX2 instruction.
Is there a way to undo this behavior (IOW, an undocumented option to not use __intel_avx_rep_memset)?
.OR. is there an updated library that corrects this bug?
Note, I am waiting for a License update in order to load/use the V16.0 update 2.
Jim Dempsey
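The AVX versus AVX2 distinction in the post above comes down to separate CPUID feature bits. As a minimal sketch (Python is used only for illustration), the functions below decode raw CPUID register values that are assumed to have been read elsewhere. The bit positions are those documented for CPUID leaf 1 and leaf 7; the function names and sample values are hypothetical, and a complete check would also inspect XCR0 via XGETBV, which is omitted here.

```python
# CPUID feature bits:
#   leaf 1, ECX bit 27 = OSXSAVE, ECX bit 28 = AVX
#   leaf 7 subleaf 0, EBX bit 5 = AVX2
OSXSAVE_BIT = 1 << 27
AVX_BIT = 1 << 28
AVX2_BIT = 1 << 5


def has_avx(leaf1_ecx: int) -> bool:
    """True when the CPU reports AVX and the OS has enabled XSAVE."""
    return bool(leaf1_ecx & AVX_BIT) and bool(leaf1_ecx & OSXSAVE_BIT)


def has_avx2(leaf1_ecx: int, leaf7_ebx: int) -> bool:
    """AVX2 requires the AVX conditions plus the leaf 7 AVX2 bit."""
    return has_avx(leaf1_ecx) and bool(leaf7_ebx & AVX2_BIT)


if __name__ == "__main__":
    # Hypothetical register values, not read from real hardware.
    avx_only = (AVX_BIT | OSXSAVE_BIT, 0)
    avx2_capable = (AVX_BIT | OSXSAVE_BIT, AVX2_BIT)
    for name, (ecx, ebx) in (("AVX only", avx_only), ("AVX2", avx2_capable)):
        print(name, has_avx(ecx), has_avx2(ecx, ebx))
```

On a processor that reports AVX but not AVX2, such as the one described above, executing the register-source form of vpbroadcastd would be expected to raise an illegal instruction fault.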
I'm looking into this....
The routine would be linked in regardless, but there is CPU-dispatching code that would call it for appropriate instruction-set capable Intel processors. I looked at the source for this and it correctly dispatches to the "avx" routine only for AVX2-capable Intel processors (despite not saying AVX2 in the name.) See later reply.
Sorry for all the thrash on this, but I keep thinking about it and realizing I made errors in earlier replies.
How you build the program should have no effect on the run-time dispatch of _intel_fast_memset, which is internal to the run-time library. You haven't said exactly what goes wrong when it fails. As best as I can tell, when this routine is executed on an E5-2620 V2 (Ivy Bridge microarchitecture), the routine containing the AVX2 instruction would not be called (even though it will be linked in.)
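The run-time dispatch described above can be pictured with a small sketch. This is not the library's actual code: the function names and the feature-set representation are hypothetical stand-ins, and it only illustrates the idea that a routine carrying "avx" in its name might be selected solely when AVX2 is detected.

```python
from typing import Callable, Set


def memset_generic(buf: bytearray, value: int) -> str:
    """Hypothetical baseline variant that is safe on any CPU."""
    buf[:] = bytes([value]) * len(buf)
    return "generic"


def memset_avx_rep(buf: bytearray, value: int) -> str:
    """Hypothetical stand-in for the "avx"-named routine that needs AVX2."""
    buf[:] = bytes([value]) * len(buf)
    return "avx_rep"


def select_memset(features: Set[str]) -> Callable[[bytearray, int], str]:
    """Pick an implementation from the detected CPU feature names."""
    if "avx2" in features:
        return memset_avx_rep
    return memset_generic


if __name__ == "__main__":
    buf = bytearray(8)
    # An AVX-only feature set should never reach the AVX2-dependent routine.
    print(select_memset({"sse2", "avx"})(buf, 0))
```

Under this model, the AVX2-dependent routine is always present in the binary, but an AVX-only machine never executes it unless something bypasses the selection step.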
I get an illegal instruction fault (the PC points to the vpbroadcastd instruction).
__intel_avx_rep_memset
is the name of the function. Note it is not named
__intel_avx2_rep_memset
Jim
I understand the name is confusing, but the code dispatches there only on AVX2 systems - or at least that's how I read the code. It is selected by "4th generation core", which has AVX2. Is this being called from _intel_fast_memset?
In the opening post, the call to __intel_avx_rep_memset is made directly from the Fortran compiled source code. There is no call to the dispatcher __intel_fast_memset.
Recall I am compiling with /QxHost or /QxAVX. This is a targeted build for (first gen) AVX, and thus would (should) not call the dispatcher.
If I compile without /QxHost or /QxAVX, meaning the test and dispatcher routines are used, wouldn't the AVX2 version have a different name from the AVX version (either that, or a different library would have to be loaded as a .DLL containing routines with the same names but different code)?
I will build and test a version using the dispatcher and see where it ends up. I would like to bypass the dispatcher (though I could specify 2 or 3 targets).
Jim Dempsey
It's called directly from the compiled code? Really? One of the posts I deleted suggested that you shouldn't use /QxHost if you're going to run on a different system. I still believe that, but didn't think it relevant. It might be that the compiler saw that the host supported AVX2 and used that, but I'm pretty sure that routine is not called directly from compiled code. Can you show me the .asm file with the call? I'll note that when the "avx" routine is entered it is not called, so the call stack may be misleading.
This dispatch is entirely within the run-time library, not in your compiled code. In this it's like the math library or MKL.
Steve,
The point of /QxAVX (or other targeted /Qx...) is to remove the dispatch code, and its overhead.
That aside, I believe I located the source of the problem. The wrong .dll was being loaded (IOW, the AVX2-targeted one). This was an oversight on my part: too many library load paths are involved when a hybrid of C# (managed), a C++ DLL, and a Fortran DLL is put together.
Thanks for your attention to my issue.
The odd part was I could step through the source code with the debugger, and step into the disassembly of the avx2 code.
Jim Dempsey
As Steve pointed out, /QxAVX doesn't remove internal ISA dependent dispatch from math library or MKL. Those are controlled by /Qimf-arch-consistency:true and MKL conditional numerical reproducibility. It gets difficult to remember all this, even in the absence of those additional factors Jim mentioned. Even more of a problem than running into illegal instruction is the possibility sometimes encountered in the past of unexpected results on some platform which wasn't tested. If you stepped into the runtime, the execution path presumably would depend on the platform you are running on.
It's not evident whether there is any control over ISA dispatch for memset and the like, but those wouldn't have unexpected numerical effects.
Tim,
I get that. I am not using MKL. This call was made from
Array = 0.0
Which calls the appropriate __intel_..._memset depending on the compiler option. If none is specified, it will call the one with the dispatcher test. If a single one is specified (/QxAVX), then the dispatch code is omitted and the appropriate routine (entry point) is linked in.
The cause of the problem was that the wrong DLL was loaded (my fault). The solution is quite screwy and does some goofy things. It is a mixed-language solution where the Fortran library can be built with multiple (12 targeted) configurations, but the main C# has two configurations (Debug and Release). To resolve the association of which Fortran DLL gets used with which C# and C++ build, there are pre-build events and post-build events where the appropriate files (.lib, .dll and .pdb) are copied to the correct place. Unfortunately, if a specific project build isn't performed (it's up to date), then the pre- and/or post-build events are not executed, and subsequently the desired .dll's do not get copied. The situation cannot be resolved through dependencies (without exploding the number of C# builds).
Jim Dempsey
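One way to catch the stale-copy situation described above is to compare the deployed DLL against the intended build output before running. The following is a minimal sketch, assuming such a check is run as a separate script; the function names and file names are hypothetical and not part of the solution described in the thread.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def deployed_matches(built: Path, deployed: Path) -> bool:
    """True only if the deployed copy exists and is byte-identical to the build output."""
    return deployed.is_file() and file_digest(built) == file_digest(deployed)


if __name__ == "__main__":
    # Demo with throwaway files; the file names are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        built = Path(tmp) / "avx_build.dll"
        deployed = Path(tmp) / "deployed.dll"
        built.write_bytes(b"AVX build")
        deployed.write_bytes(b"AVX2 build")  # stale copy from an earlier configuration
        print(deployed_matches(built, deployed))
```

A content comparison of this kind does not depend on whether the build system considered the project up to date, which is the case where the copy events were skipped.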
This particular dispatch is not controllable with an option. I am curious though as to what actually called the routine. Maybe there was C++ code that did manual dispatch? I can't see a way that Fortran-compiled code would call this (though I could be mistaken.)