IPP 7.0.1a static libraries

Mark_Rubelmann · ‎12-05-2010

I'm glad to see the generic base libraries have made their way back into IPP but I'm a little puzzled about how to use the statically linked version. This is from the update notification email I just got:

"The generic dynamic library files will automatically integrate with the Intel IPP library dispatch mechanism; the generic static library files do not integrate with the Intel IPP dispatcher."

So how do we make sure our code will run on systems that require the generic library if we've statically linked IPP? Write our own dispatcher? That would kind of suck.

Thanks,
Mark

j_miles · ‎12-06-2010

I had the exact same thoughts when I read the update notification email. If the static generic libraries do not integrate with the dispatcher, it really would make it much more difficult and cumbersome to utilize them. Please clarify and provide us with a solution.

Thanks.

Regards,

- Jay

Thomas_Jensen1 · ‎12-06-2010

I think they mean that the code in ipp-samples\advanced-usage\linkage\mergedlib does not include PX, so if you have custom code based on merged static libraries, you'd have to manually add PX.

PaulF_IntelCorp · ‎12-08-2010

Hello Mark,

If you want to have a statically linked app that runs on a "less than SSE2 machine" you'll have to build two versions of your application: one linked against the separately downloaded PX library and one linked against the standard distribution library (which supports SSE2 thru AVX). Then you'll have to use some "pre-execute" code to decide at runtime which one of your two applications should be executed, or dispatched. You can use the ippInit functions to help you determine which application to use -- if the ippInit call fails, use the PX version of your app; otherwise, start the 7.0 version of your app.
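The "pre-execute" decision described above could be sketched roughly as follows. This is only an illustration: the executable names are hypothetical, and `init_succeeded` is a stand-in for the outcome of the ippInit check, so the sketch compiles without IPP installed.

```c
#include <assert.h>

/* The two builds of the same application. */
enum build {
    BUILD_GENERIC,  /* statically linked against the PX (or MX) library */
    BUILD_STANDARD  /* statically linked against the standard 7.0 library */
};

/* init_succeeded is nonzero when the IPP initialization check passed on
 * this CPU. It is a stand-in here; IPP itself is not called. */
enum build select_build(int init_succeeded)
{
    return init_succeeded ? BUILD_STANDARD : BUILD_GENERIC;
}

/* Hypothetical file names. A real launcher would start the returned
 * executable with CreateProcess, execv or a similar call. */
const char *build_path(enum build b)
{
    assert(b == BUILD_GENERIC || b == BUILD_STANDARD);
    return b == BUILD_STANDARD ? "myapp_ipp7.exe" : "myapp_px.exe";
}
```

A small launcher would call `build_path(select_build(ok))` once at startup and hand control to that executable, which keeps the decision out of both application builds.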

(Note, for a 64-bit app the generic library is called "MX" and the standard distribution library supports SSE3 thru AVX, not SSE2 thru AVX.)

If you are building an application that utilizes the standard dynamic libraries, the dispatch mechanism will work as expected with the "generic" optimization layer libraries. Just add the generic (PX/MX) dynamic library files to the directories containing the other dynamic IPP library files and everything will work as it did with the 6.1 version of the library.

The dispatch mechanism for the dynamic libraries is (somewhat) dynamic. It checks at library initialization time to see which SIMD optimization level is appropriate for the current CPU and then looks for the best dynamic lib(s) to satisfy that condition. To test this behavior you can remove all the SSE3, SSE4, etc. libraries from the dynamic library search path so that all that remains are the PX/MX library files. Run your app and it will use the PX library, even though it would have been appropriate to use some higher level optimization. For example, if you've got an SSE4.x machine running 32-bit Windows, remove the g9, p8, s8 and v8 libraries. Your application will then run the SSE2 (w7) libraries, even though you are running it on an SSE4 (p8) machine. Then add back each level of libraries to the library directory and do it again.
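The fallback behavior in that experiment can be modeled with a short sketch. It is not the IPP dispatcher: the function names are hypothetical, the linear ordering of the layer codes is an assumption made for illustration, and s8 is left out to keep the ordering simple.

```c
#include <assert.h>

/* 32-bit layer codes named in this thread, lowest to highest.
 * The strict ordering is an assumption for illustration only. */
enum { LAYER_PX, LAYER_W7, LAYER_V8, LAYER_P8, LAYER_G9, LAYER_COUNT };

static const char *const layer_codes[LAYER_COUNT] = { "px", "w7", "v8", "p8", "g9" };

/* Two-letter code for a layer index. */
const char *layer_code(int layer)
{
    assert(layer >= 0 && layer < LAYER_COUNT);
    return layer_codes[layer];
}

/* present[i] is nonzero when the dynamic library for layer i was found on
 * the search path. Returns the highest present layer that does not exceed
 * what the CPU supports, or -1 when no usable library is present. */
int pick_layer(int cpu_max, const int present[LAYER_COUNT])
{
    assert(cpu_max >= 0 && cpu_max < LAYER_COUNT);
    for (int layer = cpu_max; layer >= 0; --layer) {
        if (present[layer]) {
            return layer;
        }
    }
    return -1;
}
```

With the v8, p8 and g9 entries cleared, a CPU at the p8 level resolves to w7, which mirrors the SSE4 machine running the SSE2 libraries in the example above.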

In the case of the static library dispatcher every available optimization level for the functions your application is using has been bound to your executable. The dispatcher is "hard coded" to only recognize and search for those optimization layers that were included with the static library you linked against. In other words, it doesn't check for, nor does it know about, any optimization layers other than those that were available when the static library was built at the factory. The static initialization routine only checks to see what sort of processor your application is running on and then assumes that only the optimization layers that were shipped as part of the static library will be available.

The last time a mainstream x86-compatible processor supporting less than SSE2 shipped was approximately nine years ago! (There are a few specialty low-end embedded processors that don't fit this profile.) Most of those older systems that will not run SSE2 instructions do not have sufficient memory, or speed, to be candidates for the typical application that is built to use the Intel IPP library.

Hope that answers your question,

Paul

PaulF_IntelCorp · ‎12-08-2010

Hello Thomas,

Unfortunately, the generic PX/MX library files won't work with the technique described in the ipp-samples\advanced-usage\linkage\mergedlib sample because the px_/mx_ prefixes needed to make the technique described by that sample work are missing. The PX/MX static libraries are truly "generic." They do not contain a dispatcher, so a call to each function goes directly to the generic

Unless there is a technique available to allow you to wrap a namespace identifier around the generic library so that you can distinguish between two identically named entry points within the generic library and the standard library, I'm not sure how you could build a custom static library that would combine the generic and standard library files into a single application.

Paul

p.s. The static generic (PX/MX) libraries are single-threaded libraries; they do not utilize OpenMP or any other multi-threading technology.

Thomas_Jensen1 · ‎12-08-2010

So, while making the effort to reinstate PX/MX support, why was the library split up in two, and why was OpenMP removed?

Why was the PX/MX prefix removed from the function names? It renders mergedlibs useless for PX/MX.

While I'm happy that PX/MX is back, I'm unhappy that Intel chose to burden me with a much more complex implementation of it, when PX/MX was just a cpu-member before.

After my rants, I think that I can handle PX/MX without cpu prefix names, since I was forced to create a complex subset cpu DLL framework anyway, that allows me to keep a small footprint, while still offering the user to utilize existing cpu hardware fully.

j_miles · ‎12-09-2010

Hello Paul,

For the static libraries that is not a usable solution. Period.

For the dynamic libraries (that we do not use), it sounds like the right, elegant solution - as it should be.

Some, like Thomas, might be able to awkwardly get by in their solution but others have different solutions and needs. It was requested to still have these generic static libraries but including the dispatching functionality! I doubt that anyone (e.g. the people participating in thread #75428) did not expect that it would still be a part of the dispatching chain (somehow). The dispatching is/was one of the many good aspects of IPP, i.e. that the automatic (or controlled) dispatching allowed us to deploy a single version of an application. So instead of you doing the extra (certainly not much) bit of work once, you now force a lot of work, maintenance and cumbersome deployment issues on all ISVs that need this feature (which is really an integral/essential part of IPP). How one can choose such a solution amazes me... (This does smell like it simply seemed the fastest/easiest thing to do - for you.)

We can live with having to deal with separate installation of the IPP libraries for the developers. We can live with having to link with these generic libraries specially. We can even live with having to do a more elaborate initialization routine to get it to work, if that was what would be needed. But we cannot accept a situation where we have to build/deploy things in a dramatically different way, e.g. two separate versions of an application.

If the main problem is the static locked definition of the dispatcher, maybe a way ahead could be to either override the dispatcher when one includes the generic static libraries in the linking process, or add an extra step to the initialization process that still allows both the generic and the optimized variants to be included in the same single application.

Please, seriously consider a different solution to the generic static libraries that allows proper dispatching for the next update release of IPP 7.0. The current solution is very prohibitive and may have the effect that many will not / cannot upgrade to v. 7.0, and it might even affect the reputation of IPP.

Sorry, if this got the sound of a rant. I did try to keep my tone calm...

Regards,

- Jay

PaulF_IntelCorp · ‎12-09-2010

Hello Jay,

Thanks for your feedback. I have asked engineering to review this difficult dispatching situation to see if it can be addressed for the static generic add-on library. This is not a guarantee that a fix is possible or will be made available.

I would be surprised if new installations of your application are being deployed on anything less than a Pentium 4 (which supports SSE2 -- Pentium III machines do not support SSE2). See the following article for some pointers regarding which processors support SSE2:

http://en.wikipedia.org/wiki/SSE2

SSE2 was introduced on Intel processors in 2001 and on AMD processors in 2003. Today Windows 7 64-bit will not work properly under some conditions if you utilize the older MMX/FP registers -- which means we're having to remove support for floating point and MMX instructions from the 64-bit version of the library to avoid those problems.

Please also note that the SSE2 introduction corresponds roughly to the release of Windows XP. If your customers are running Windows XP or higher there is a very high likelihood that they are also running on machines that support SSE2.

We would be very interested to understand the makeup of your customer platform processors, if you are able to provide that information.

Regards,

Paul

j_miles · ‎12-14-2010

Hello Paul,

Thanks for listening. I was not aware of the Windows 7 64-bit problem, so thanks for that info.

I can surely understand your reasoning with the relative age of the Pentium 4 processors (although they have been on the market until like 2008). What I am mostly worried about is not as much removing explicit optimization levels (as was done for Pentium III) but removing the base generic layer that allows running IPP on any x86-compatible processor incl. AMD. This is a strong point of IPP, in our view. The worry is also not towards people deploying new systems but rather customers upgrading from one version of the software to another. Surely, we would recommend that our customers deploy new installations using modern/recent processors, whereas previous installations should (to some extent) be able to upgrade to new versions without a penalty in performance or hardware incompatibility. At some point, between major releases, we can choose to enforce a shift in hardware (and even OS) support.

Another worry is the timing: As I have mentioned more than once, we would have liked a somewhat longer transition window before the Pentium III optimizations were pulled out, and similarly for such things happening with IPP 7.0 (even more so, as the changes have a more dramatic impact). It is perfectly ok to phase out some things but we just need a reasonable time to cope with it. For this change, I think the window of transition is too short. The beta window is not long enough. How about announcing it when releasing one major version (sort of as a deprecation), and then really removing it in one of the subsequent major releases?

Obviously, our applications have other prerequisites, and we may actually be getting closer to a point where other factors determine that the minimum system requirements should change, making this a less important point. It would be nice, though, if we as the ISVs could determine to a certain extent the pace and timing of such things as they fit with our applications and our customer base. Or at least allow us reasonable time and version adaptation to cope with it. The pace of changing system requirements cannot always be in sync with when we need certain new features (and new processor support) from new versions of IPP.

I cannot really provide any info of our customer platforms because it is quite varied. We also have applications that need to run on any unknown machine (ah well, we can set certain requirements but cannot set the bar too high). Think of it as a standalone viewer-like application that needs to be able to run on "any" system. This is also the reason that we do not want to handle two (or more) deployments to cope with all the different installations.

All in all, yes, we can at some point change the requirements and disable support for older processors, but we would like a longer transition window with full automatic dispatching capabilities for the static libs during that period to enable us to plan ahead and prepare the market. The removal of the base generic layers is a loss of one of the top IPP attributes, I think. Am I really the only one..?

Thanks.

Regards,

- Jay

Thomas_Jensen1 · ‎12-14-2010

Jay,

I fully agree with your text.

- Full dispatching including PX
- PX base support
- SSE2+ full support (OpenMP, Optimized code).
- Single deployment

Thomas

PaulF_IntelCorp · ‎12-14-2010

Jay and Thomas:

Thank you for your feedback. We are in the process of defining the next generation of the product and will take your feedback into account as part of that process. We are definitely trying to find a better way to provide more notice.

Regarding the PX dispatch support in 7.0 -- at this point we must await a response from engineering to determine what sort of solution, if any, is available. Going forward, SSE2 is our commitment for the base 32-bit library (SSE3 for the base 64-bit library), since SSE2 is supported by virtually all the processors produced by Intel and AMD since about 2003.

Please note that as a licensed IPP developer you can continue to use the 6.1.6 library indefinitely within your product, even if you are also using the 7.0 product. Obviously, this means you may have to build a "legacy" version of your product(s) to satisfy some of the pre-SSE2 systems out there -- not necessarily an ideal solution for all -- to paraphrase a popular quotation, you cannot satisfy all the people all the time, which is a crossroad we are having to deal with.

Paul

Mark_Rubelmann · ‎12-14-2010

Paul,

Indeed you can't satisfy everyone all the time, but I doubt hooking the static PX library up to the dispatcher would dissatisfy anyone. (Excluding your developers that is...) ;)

Anyway, in case it wasn't obvious from the fact that I started this thread, I just wanted to point out that I'm in the same boat as Jay and Thomas in that we've still got customers using pre-SSE2 computers, probably running Windows 2000 and Linux. They're not the majority but I'd still really hate for an upgrade to make the program unusable for them. Having AVX support is well and good but my prediction is that it won't mean much for our business for at least another 2 or 3 years. Our target PC right now is a Core 2 Duo which has been around for about 4 years IIRC.

-Mark

oxydius · ‎12-14-2010

Regarding the Win7 64-bit incompatibility with MMX/FP registers, doesn't that only affect 64-bit IPP? All 64-bit x86 processors support at least SSE2 (AMD) or at least SSE3 (Intel) so it's perfectly fine to remove the base layer from 64-bit.

32-bit IPP should be the only library whose concern is backward-compatibility. I don't see why px would be so hard to dispatch statically aside from the fact that it was removed and re-added so late in the development cycle, which is painful for developers. Personally I compile my whole apps with SSE2 as a baseline so I'm happy to see px go and shrink our libraries, but I understand why others can be vocal about the lost convenience. Old MMX shouldn't be expected but some sort of "at least it runs" slow layer can be nice while still getting AVX support.

P.S.: Thanks again for re-adding SSE2 optimized support.

Mark_Rubelmann · ‎12-14-2010

This is getting off topic but I'd like to know more about this Win7 MMX issue and haven't seen it mentioned anywhere else. Do any of you have links to anything with more info? We've got some legacy MMX code kicking around.

-Mark

Thomas_Jensen1 · ‎12-14-2010

I think, that if there are processors where IPP will not run properly, Intel should add a document with a compatibility matrix (IPP version vs 32/64 vs processor type).

PaulF_IntelCorp · ‎12-15-2010

Mark,

The Win7 MMX issue only affects 64-bit drivers and other code that runs in 64-bit Windows kernel mode. It's an OS and compiler driven issue. I do not know if a similar register usage restriction exists for the Linux kernel. Here's a link to an MSDN article that describes these Win64 restrictions:

http://msdn.microsoft.com/en-us/library/ff545910%28VS.85%29.aspx

If your legacy code only runs in Win64 user space or on a Win32 machine there is no issue. If your code will run in Win64 kernel space, you may have an issue.

Paul

PaulF_IntelCorp · ‎12-15-2010

Hello Thomas,

Providing a complete matrix of supported processors would require a validation process that we cannot support, given the large number and variety of competing processors out there. We do test on some non-Intel processors, but the variety of processors we are able to test against is not sufficient to publish a compatibility matrix, and the mix of tested processors changes frequently enough that we cannot commit them to publication.

Call the ippInit() function to see if a processor will support the IPP library before your application uses the library:

http://software.intel.com/en-us/articles/ipp-dispatcher-control-functions-ippinit-functions/

The ippInit() function determines the level of SSE instructions supported by the processor using the CPUID instruction and then initializes the dispatcher for that level of SIMD instructions. The manufacturer string returned by the CPUID instruction is not used as part of this init test; however, the CPUID results are interpreted according to Intel processor conventions. This means that if a manufacturer reports the SIMD instructions in ways that are compatible with an Intel processor, the test passes (assuming the reported SIMD level is supported by the library); if not, the test fails. As far as I know, all manufacturers report their SSE2 and SSE3 support in a fashion that is compatible with Intel processors. After SSE3 the SIMD instruction sets diverge across manufacturers.
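As an illustration of the kind of check described above, the sketch below maps the feature flags from CPUID leaf 1 to a SIMD level. It is not the IPP implementation, and the enum and function names are hypothetical. The bit positions are the documented ones for leaf 1 (EDX bit 26 for SSE2; ECX bits 0, 9, 19, 20 and 28 for SSE3, SSSE3, SSE4.1, SSE4.2 and AVX). A real AVX check would also have to confirm operating-system support for the YMM state, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum simd_level {
    SIMD_NONE,   /* below SSE2: only generic code would run */
    SIMD_SSE2,
    SIMD_SSE3,
    SIMD_SSSE3,
    SIMD_SSE41,
    SIMD_SSE42,
    SIMD_AVX
};

/* Feature bits reported by CPUID leaf 1. */
#define EDX_SSE2  (UINT32_C(1) << 26)
#define ECX_SSE3  (UINT32_C(1) << 0)
#define ECX_SSSE3 (UINT32_C(1) << 9)
#define ECX_SSE41 (UINT32_C(1) << 19)
#define ECX_SSE42 (UINT32_C(1) << 20)
#define ECX_AVX   (UINT32_C(1) << 28)

/* Walk up the levels and stop at the first missing one, so a level is
 * reported only when every lower level is also present. */
enum simd_level simd_level_from_cpuid(uint32_t ecx, uint32_t edx)
{
    static const uint32_t ecx_bits[] = {
        ECX_SSE3, ECX_SSSE3, ECX_SSE41, ECX_SSE42, ECX_AVX
    };
    int level;

    if (!(edx & EDX_SSE2)) {
        return SIMD_NONE;
    }
    level = SIMD_SSE2;
    for (size_t i = 0; i < sizeof ecx_bits / sizeof ecx_bits[0]; ++i) {
        if (!(ecx & ecx_bits[i])) {
            break;
        }
        ++level;
    }
    assert(level <= SIMD_AVX);
    return (enum simd_level)level;
}
```

Because the function only inspects feature bits and never the vendor string, any processor that sets the same bits is classified the same way, which is the behavior described for the init test.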

We have no way of ensuring that all manufacturers guarantee that all of their processors will report their SIMD instruction set support in an Intel compatible fashion. To the best of our knowledge, the Intel IPP library will run on those processors that report SSE2 or SSE3 instruction set support, and we strive to ensure equivalent or better performance on non-Intel processors when compared to competing library solutions. We have and will address compatibility issues that arise for non-Intel processors.

Paul

Refer to our Optimization Notice for more information regarding performance and optimization choices in Intel software products.

j_miles · ‎03-31-2011

Hi Paul,

What is the current status of this issue? We would like to move ahead to IPP 7.0 for various reasons but really do have a problem with the removal of the base "runs-on-everything" PX layer (using static libraries). We are not going to do double deployment so we would really like a way to handle this in a single deployment. We still have a hard time understanding the choices behind this. It seems that the removal is generating some hassles for your ISVs (e.g. forum thread #80925), which would be much simpler to solve by just re-introducing the PX base layer.

Just a quick thought: Would it be possible to somehow have two separate libraries (one containing all except PX, and the other library PX-only) and then when linking (and possibly when StaticInit'ing) choose whether or not to include the base PX layer library? I have no idea of how/if this could be done but it would be able to cater to both camps and satisfy all ISVs...

Thanks.

- Jay

Thomas_Jensen1 · ‎03-31-2011

Reading Optimization Notice does not help at all.

Reading between the lines, it is clear that Intel want us all to use only Intel products.
That is also a sound way of positioning an Intel-product, such as IPP.

However, the world is more than just Intel.

I am close to saying goodbye to IPP because of the above.

I don't mind Intel removing SSE3 because SSE2 is actually faster.

I do mind Intel screwing up PX. I (we) must invest a lot of new hours getting PX to work in version 7, when it worked just fine in version 6.

PX just doesn't work any more, when static linking is used.
With "doesn't work", I mean that a lot of extra hours have to be put in to make it work.

My road to make it work is a (tiresome) road that does one extra useful thing, namely a custom dispatcher (ippmerged.c) that additionally splits (a subset of) IPP into several different DLL files (one master DLL being called from my APP, and additional DLLs, one for each CPU-library).

I was hoping for a free ride with Intel implementing this model for me (for us), but I had to do it myself.

PaulF_IntelCorp · ‎04-01-2011

Hello Jay,

We do provide the px/mx layers as a separate download (look for the "generic" download). Unfortunately, the static link version of that separate download cannot be used to implement a single application that could dispatch internally; you have to create two separate executables -- this is not a problem with the dynamic link version of the files. If the static link file was provided with the px/mx prefixes it would be possible to build a single executable -- but even then you'd have to manually dispatch internally (within your application) between the generic library and the automatic dispatch library -- there is no simple way to make the dispatcher work with the static link model without having the library be fully integrated. I have requested that the px/mx prefixes be restored so that you can, at minimum, dispatch manually within your application, but I have no commitment that these prefixes will be restored.
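If prefixed generic entry points were available, the manual dispatch mentioned above might look roughly like the sketch below. `px_sum` and `std_sum` are hypothetical stand-ins defined locally so the example compiles without IPP; they are not IPP functions. `standard_init_ok` stands in for the result of the library's initialization check.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a prefixed entry point from the generic (PX) library. */
static int px_sum(const int *src, size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; ++i) {
        total += src[i];
    }
    return total;
}

/* Stand-in for the same routine from the standard library, which would
 * dispatch internally among the SSE2-and-later layers. */
static int std_sum(const int *src, size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; ++i) {
        total += src[i];
    }
    return total;
}

typedef int (*sum_fn)(const int *src, size_t len);

/* Chosen once at startup; every later call goes through this pointer. */
static sum_fn active_sum = NULL;

void select_sum(int standard_init_ok)
{
    active_sum = standard_init_ok ? std_sum : px_sum;
}

int app_sum(const int *src, size_t len)
{
    assert(active_sum != NULL); /* select_sum() must be called first */
    return active_sum(src, len);
}
```

The application calls only `app_sum`, so the choice between the two libraries is made in one place. The cost is one such wrapper per library function used, which is the manual effort this thread is complaining about.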

Please note that we are not planning, at this time, to restore the generic optimization layer to the Intel IPP library product. We are unable to cover all the bases for every customer and periodically have to make some difficult choices regarding what we can continue to optimize, test and validate. Given that the SSE2 instruction set has been supported by nearly every x86-compatible processor for almost a decade, the number of platforms that cannot run an IPP-enabled application today is very, very small.

If you are having trouble locating the px/mx generic downloads, please let me know and I'll get you more detailed instructions.

Paul