Since I use a custom DLL with custom dispatching, I wanted to check the new S8 (Atom) CPU code. It turns out that ippimerged_t.lib does not contain any S8 code. The Linkage/MergedLib sample does not contain any references to S8 either.
The Bin folder, however, contains a file ippis8-6.1.dll, which obviously holds S8 code.
Also, looking into ippimerged_t.lib, I see a new cpu code "g9".
- ippmerged.c must be updated with S8 code.
- g9 in ippimerged_t.lib must be documented.
- Is g9 = s8?
- userguide_win_ia32.pdf must be updated.
The s8 library (Atom-optimized) is not present in the static libraries, only in the dynamic libraries. However, IPP applications built with the static library will run on an Atom processor with very good to equivalent performance using the v8 library (which is automatically selected, so you don't need to do anything special for an Atom).
The fundamental difference between the v8 and s8 libraries is the set of compiler options used to build them, which accommodates the differences in the pipelines of the two processors. My understanding is that if the processor is very busy, the differences in pipeline architecture are not that significant. Since the IPP functions tend to keep the processor busy, these two variations of the library (v8 and s8) give nearly identical performance on an Atom. Also, the s8 version of the IPP library does not use any Atom-unique instructions, so nothing is lost by using the v8 slice.
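To picture why a statically linked application ends up on v8 when running on an Atom, here is a minimal sketch of a "first available slice" fallback. It is only an illustration, not IPP's actual dispatcher: the function `pick_slice`, the preference order, and the lists of available slices are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an IPP function.
 * Returns the first entry of `preferred` that also appears in `available`,
 * or NULL if none of the preferred slices is available. */
static const char *pick_slice(const char *const *preferred, size_t n_pref,
                              const char *const *available, size_t n_avail)
{
    assert(preferred != NULL && available != NULL);
    for (size_t i = 0; i < n_pref; ++i) {
        for (size_t j = 0; j < n_avail; ++j) {
            if (strcmp(preferred[i], available[j]) == 0) {
                return preferred[i];
            }
        }
    }
    return NULL;
}
```

With an assumed Atom preference order of s8, v8, w7, px, a library that offers only px, w7 and v8 yields v8, while one that also offers s8 yields s8. That matches the behaviour described above for the static and dynamic libraries respectively.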
See this article for more info regarding the library slices:
Since I use a custom DLL with my own custom CPU dispatching code, the dynamic library mechanics are not available to me.
I have created a dll with common code, and with a cpu dispatcher, loading a cpu-specific dll, containing only a subset of IPP.
I do this to create the smallest footprint, as loading time from a network share is important.
I then link in two IPP cores. The first DLL links the static unthreaded core so that it can detect the CPU type. After detecting the CPU, I dynamically load my second DLL, which contains code for a single CPU only. This second DLL also contains an IPP core, but it links statically to the threaded libraries, and I have it link directly to _v8_ippCopy_xxx() etc. Since I link directly, I really must know which CPU prefix (px_, w7_, etc.) to link to.
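For readers following along, the second stage of this scheme might look roughly like the sketch below: map the detected CPU class to a slice prefix, then build the name of the CPU-specific DLL to load. The enum `cpu_class`, the functions `slice_prefix` and `cpu_dll_name`, and the base name `mysubset` are all hypothetical; only the px, w7 and v8 prefixes come from this thread, and mapping Atom to v8 follows the reply above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical CPU classes; a real dispatcher would derive these from
 * the CPU detection in the statically linked IPP core. */
typedef enum { CPU_GENERIC, CPU_SSE2, CPU_SSSE3, CPU_ATOM } cpu_class;

/* Map a CPU class to the slice prefix used for direct linking. */
static const char *slice_prefix(cpu_class cpu)
{
    switch (cpu) {
    case CPU_SSE2:  return "w7";
    case CPU_SSSE3: return "v8";
    case CPU_ATOM:  return "v8"; /* static libraries carry no s8 slice */
    default:        return "px"; /* generic fallback */
    }
}

/* Write "<base>_<prefix>.dll" into buf.
 * Returns 0 on success, -1 if the name did not fit. */
static int cpu_dll_name(cpu_class cpu, const char *base, char *buf, size_t size)
{
    assert(base != NULL && buf != NULL);
    int n = snprintf(buf, size, "%s_%s.dll", base, slice_prefix(cpu));
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The resulting name would then be passed to the platform loader (LoadLibrary on Windows). The point of the table is that an unknown prefix such as g9 has no row until someone documents which CPU it targets.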
Therefore, it is important that I know what code an Atom should use, and also what g9_ means.
Maybe this is a call for Intel to supply me/us with a framework to create new dynamic IPP dlls with only a specified subset. With that, I wouldn't have to do it myself, and I'd be content with the smaller footprint.