Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

How to create even smaller footprint custom DLLs

Thomas_Jensen1
Beginner
634 Views

Since I only use a subset of IPP, I have created a single DLL, only containing the code I use. It is compiled with four CPUs PX/W7/T7/V8, and it contains the waterfall code to dynamically select the proper variant. It also is compiled with OMP support. This setup works fine.

However, if my app is placed on a Windows network share, the loading time is rather high since my single DLL must be loaded through the network. I want to split my single DLL up into four separate DLLs, so only the variant with the compatible CPU is loaded. I need guidance for this.

I have looked at IPP samples advanced-usage, but I only seem to find what I already implemented (a single dll).

For instance, the file ippiw7-6.1.dll contains all functions for CPU W7.

I would like to create the same file, for W7 (and the others as well), but with less code in it: Small_ippiw7.dll

Can you give me a pointer in the right direction how to implement this?

PS. I only require IPP domains i, s, j, cc, and cv.

0 Kudos
1 Solution
matthieu_darbois
New Contributor III
634 Views

Hi,

I modified the mergedlib sample to do what you wanted. Let me know if this works as you expect.

You'll see I added a cpp file which is used to build an executable that generates .def files for all DLLs

The code is not "clean" as I didn't take time to keep only what was needed.

Regards,

Matthieu

0 Kudos
12 Replies
Chao_Y_Intel
Moderator
634 Views

Hi Thomas,

Would you like to create a library for each domain and each CPU?

That could give you 20 DLLs (4 CPU-specific variants x 5 domains). It would certainly reduce the size of each DLL.

You can also split the library by CPU only. For example, you can create small_ippw7.dll, which contains only the w7 code for all domains. Then you only need to create 4 DLLs (PX/W7/T7/V8).

Creating these DLLs is similar to building the single DLL, except that you only need to link with *merged.lib. You do not need to link with *emerged.lib, and you do not need to call the CPU detection code in DllMain.

Also, the entry points in each DLL will look like w7_ippsCopy_8u(), w7_ippiMalloc_8u_C1().

When calling the functions in the CPU-specific DLLs, you also need to use the CPU-specific names (e.g. w7_ippsCopy_8u).

The following article may help you understand how IPP dispatches for different CPUs:

http://cache-www.intel.com/cd/00/00/21/93/219301_linkage_models.pdf

Thanks,

Chao

0 Kudos
Thomas_Jensen1
Beginner
634 Views

I would like to create one small dispatcher DLL "ipp.dll" plus 4 CPU DLLs (ipp_w7.dll etc). From my application, I want to call ipp.dll using ippsCopy_8u(). Looking at linking, today I use the magic in ippmerged.c, with my customized funclist.h.

My problem is that I somehow must use funclist.h to generate ipp.dll and ipp_w7.dll etc, because it is close to impossible to do that by hand, as my funclist.h is quite large.

Is it not possible that I rebuild all the files in IA32\Bin, using a special Intel funclist.h to reduce the number of used functions?

0 Kudos
Chao_Y_Intel
Moderator
634 Views

Hi Thomas,

If you only want to include the functions used in your application in the DLLs, you need to provide a list of the functions used in the application.

IPP has a function list for each domain in the include folder (\ia32\include), but that list is probably far more than your application needs.

Thanks,

Chao

0 Kudos
Thomas_Jensen1
Beginner
634 Views

Well, that is what I already do today, with funclist.h as per IPP specifications.

However, that gives me ONE dll (with all cpus), whereas I would like to get FOUR dlls (each with one cpu).

0 Kudos
matthieu_darbois
New Contributor III
633 Views

Hi,

From what I saw in the IPP samples, there's no "easy" way to do this.

I guess what you could do is build the 4 DLLs (CPU-specific code) using static linking without dispatching, and create your own dispatching mechanism in a 5th DLL. Your dispatch DLL would load the correct library and update jump pointers so that ipp* in your DLL jumps to w7_ipp* in the w7 DLL, for example (I think this is how the dispatch library works in IPP). If you have a large number of functions, you can probably create a script that generates the C file needed for this.

Also, the dispatch DLL needs to contain the "core" components of IPP, and your 4 other DLLs need to link with it in order to access core functionality.
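
The jump-pointer idea can be sketched in portable C, without any IPP or Windows calls. Everything in it is a hypothetical stand-in: copy_8u, the px_/w7_ variants and select_variant are made up for illustration, and the use_w7 flag stands for whatever CPU detection the real dispatcher would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two CPU-specific builds of the same function. */
static int px_copy_8u(const unsigned char *src, unsigned char *dst, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src[i];                 /* generic path */
    return 0;
}

static int w7_copy_8u(const unsigned char *src, unsigned char *dst, int len)
{
    assert(len >= 0);
    memcpy(dst, src, (size_t)len);       /* pretend this is the tuned path */
    return 0;
}

typedef int (*copy_8u_fn)(const unsigned char *, unsigned char *, int);

/* The jump pointer; it starts on the generic variant. */
static copy_8u_fn d_copy_8u = px_copy_8u;

/* Called once at start-up. use_w7 would come from real CPU detection. */
void select_variant(int use_w7)
{
    d_copy_8u = use_w7 ? w7_copy_8u : px_copy_8u;
}

/* The un-prefixed entry point that the application calls. */
int copy_8u(const unsigned char *src, unsigned char *dst, int len)
{
    return d_copy_8u(src, dst, len);
}
```

The application only ever sees copy_8u; which variant runs is decided once by select_variant. In the split-DLL setup the two variants would live in separate DLLs and the pointer would be filled at run time instead of at compile time.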

Regards,

Matthieu

0 Kudos
Thomas_Jensen1
Beginner
634 Views

Your understanding of my problem is perfect.

I see an "AdressBook" in ippmerged.c.

I think I must compile a dispatch dll with the address book automatically filled with funclist.h.

I must also export all used functions in the dispatch dll. Each function must be declared, and the code must use the address book to jump to the proper cpu dll.

Indeed, the cpu dlls must also link to the dispatch dll to access common core functions.

Intel, since you already constructed ippmerged.c, can't you construct a framework for this case?

0 Kudos
matthieu_darbois
New Contributor III
634 Views

Hi again,

I didn't see this sample earlier.

The only difference, I think, with the static version is that you can't use a static initializer for AdressBook (that is, if you want only one of the CPU-specific DLLs to be loaded at runtime, as I understand it). Instead, you have to initialize it with LoadLibrary/GetProcAddress and remove all declarations for the px, w7... functions. This shouldn't be too complicated using macros to load the appropriate functions. If you use DllMain to initialize your library, it shouldn't call LoadLibrary according to the Microsoft documentation... I wonder how it is done inside IPP.

By the way, the code provided works only in x86 32-bit mode, as inline asm is not supported by the x64 compiler provided by Microsoft.

Regards,

Matthieu

0 Kudos
matthieu_darbois
New Contributor III
634 Views
Sorry, a proxy error made me double post this... Any way to remove a post?
0 Kudos
Thomas_Jensen1
Beginner
634 Views

Yes, I surely must use GetProcAddress for each function.

It's unfortunate that I'm not so fluent in C++ and its macros (I'm very fluent in Delphi...).

I must reuse my lengthy funclist.h without changing it.

For instance, the function ippiGetLibVersion (also cpu specific), declared in my funclist.h :

IPPAPI( const IppLibraryVersion*, ippiGetLibVersion, (void) )

If, in a block with GetProcAddress, I redeclare IPPAPI to my own definition

Old: #define IPPAPI( type,name,arg ) type __STDCALL name arg;
New: #define IPPAPI( type,name,arg ) type __STDCALL P##name = (type)GetProcAddress("name");

That would give:

const IppLibraryVersion* __STDCALL PippiGetLibVersion = GetProcAddress("ippiGetLibVersion");

And automatically the same for all the other functions in funclist.h.

Then ippiGetLibVersion in the dispatch dll would do this:

const IppLibraryVersion* ippiGetLibVersion(void)
{
return (*PippiGetLibVersion)();
}

I guess this could work without too much handwriting. The dispatch dll has handwritten functions for each used IPP function, and the cpu dlls can be compiled without any handwriting.
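
A minimal sketch of this "redefine IPPAPI and reuse the list" idea, written as plain C so it can be tried without IPP or Windows. The assumptions: FUNCLIST stands in for #include "funclist.h", the signatures are simplified, and fake_get_proc with its w7_ helpers stands in for GetProcAddress on the w7 DLL. One detail matters for the macro: a parameter written inside a string literal ("name") is not substituted by the preprocessor, so the sketch uses the # operator (#name).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Stand-in for funclist.h: one IPPAPI line per function (simplified types). */
#define FUNCLIST \
    IPPAPI(int, ippsCopy_8u, (const unsigned char *src, unsigned char *dst, int len)) \
    IPPAPI(const void *, ippiGetLibVersion, (void))

/* Pass 1: one index per listed function. */
#define IPPAPI(type, name, arg) IDX_##name,
enum { FUNCLIST IDX_COUNT };
#undef IPPAPI

/* Pass 2: the exported names as strings, via #name. */
#define IPPAPI(type, name, arg) #name,
static const char *const names[IDX_COUNT] = { FUNCLIST };
#undef IPPAPI

static generic_fn slots[IDX_COUNT];      /* filled by resolve_all */

/* Hypothetical contents of the w7 DLL. */
static int w7_copy(const unsigned char *src, unsigned char *dst, int len)
{
    memcpy(dst, src, (size_t)len);
    return 0;
}
static const void *w7_version(void) { return "w7 stand-in"; }

/* Stand-in for GetProcAddress(hModule, symbol). */
static generic_fn fake_get_proc(const char *symbol)
{
    if (strcmp(symbol, "w7_ippsCopy_8u") == 0) return (generic_fn)w7_copy;
    if (strcmp(symbol, "w7_ippiGetLibVersion") == 0) return (generic_fn)w7_version;
    return NULL;
}

/* Resolves prefix+name for every listed function; -1 if one is missing. */
int resolve_all(generic_fn (*lookup)(const char *), const char *prefix)
{
    char symbol[128];
    for (int i = 0; i < IDX_COUNT; i++) {
        snprintf(symbol, sizeof symbol, "%s%s", prefix, names[i]);
        slots[i] = lookup(symbol);
        if (slots[i] == NULL) return -1;
    }
    return 0;
}

/* One forwarding wrapper, written by hand as in the post above. */
int ippsCopy_8u(const unsigned char *src, unsigned char *dst, int len)
{
    typedef int (*fn)(const unsigned char *, unsigned char *, int);
    assert(slots[IDX_ippsCopy_8u] != NULL);   /* resolve_all must have run */
    return ((fn)slots[IDX_ippsCopy_8u])(src, dst, len);
}
```

The name table and the slots come from the list automatically; only the forwarding wrappers still need a body per function, because plain C macros cannot forward an arbitrary argument list.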

0 Kudos
matthieu_darbois
New Contributor III
634 Views

You don't have to handwrite any of the exported functions:

From the sample, locate these lines:
#undef IPPAPI
#define IPPAPI(type,name,arg) \
static FARPROC d##name=INIT_NAME(name); \
__declspec(naked) void __STDCALL name arg { __asm {jmp d##name } }
#include "funclist.h"

Replace this with:

#undef IPPAPI
#define IPPAPI(type,name,arg) \
static FARPROC d##name; \
__declspec(naked) type __STDCALL name arg { __asm {jmp d##name } }
#include "funclist.h"

Then, the init function should do something like:

HMODULE hModule = LoadLibrary("w7_myipp.dll");
if (!hModule)
{
    /* try next optimized version */
}
else
{
#undef IPPAPI
#define IPPAPI(type,name,arg) \
    d##name = GetProcAddress(hModule, "w7_" #name); \
    if (!d##name) { /* try next optimized version */ }
#include "funclist.h"
}

That should work with minimal handwriting.
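
The "try next optimized version" part could look roughly like the sketch below. It is portable C with a stand-in loader, not real Windows code: fake_load and the available list are hypothetical, the DLL names follow the w7_myipp.dll pattern used above, and the v8, t7, w7, px order is my assumption about which variant to prefer. A real loader callback would also have to check that the CPU supports the variant and that every GetProcAddress call succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the files that are actually present and usable. */
static const char *const available[] = { "w7_myipp.dll", "px_myipp.dll" };

/* Stand-in for "LoadLibrary succeeded and all symbols resolved". */
static int fake_load(const char *dll)
{
    for (size_t i = 0; i < sizeof available / sizeof available[0]; i++)
        if (strcmp(available[i], dll) == 0)
            return 1;
    return 0;
}

/* Candidates from most to least specialised (the order is an assumption). */
static const char *const candidates[] = {
    "v8_myipp.dll", "t7_myipp.dll", "w7_myipp.dll", "px_myipp.dll"
};

/* Returns the first candidate the loader accepts, or NULL if none works. */
const char *pick_variant(int (*load)(const char *))
{
    assert(load != NULL);
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
        if (load(candidates[i]))
            return candidates[i];
    return NULL;
}
```

With the stand-in loader, pick_variant(fake_load) skips the v8 and t7 names and settles on the w7 one.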

Regards,

Matthieu

0 Kudos
matthieu_darbois
New Contributor III
635 Views

Hi,

I modified the mergedlib sample to do what you wanted. Let me know if this works as you expect.

You'll see I added a cpp file which is used to build an executable that generates .def files for all DLLs

The code is not "clean" as I didn't take time to keep only what was needed.
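
For readers without the attachment, here is a rough idea of how such a .def generator can be driven by the same function list. This is not the attached code, only a sketch under assumptions: FUNCLIST stands in for #include "funclist.h" with simplified signatures, and make_def and the ipp_w7 library name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for funclist.h; the two entries are only examples. */
#define FUNCLIST \
    IPPAPI(int, ippsCopy_8u, (const unsigned char *, unsigned char *, int)) \
    IPPAPI(const void *, ippiGetLibVersion, (void))

/* Collect the function names as strings via the # operator. */
#define IPPAPI(type, name, arg) #name,
static const char *const func_names[] = { FUNCLIST };
#undef IPPAPI

/* Writes module-definition text for one DLL into out.
   prefix is "" for the dispatcher or e.g. "w7_" for a CPU-specific DLL.
   Returns the number of characters written, or -1 if out is too small. */
int make_def(char *out, size_t cap, const char *library, const char *prefix)
{
    size_t used;
    int n;

    assert(out != NULL && cap > 0);
    n = snprintf(out, cap, "LIBRARY %s\nEXPORTS\n", library);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < sizeof func_names / sizeof func_names[0]; i++) {
        n = snprintf(out + used, cap - used, "    %s%s\n", prefix, func_names[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling make_def once per prefix ("", "px_", "w7_", ...) and writing each buffer to its own file would give one .def per DLL, all generated from the one list.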

Regards,

Matthieu

0 Kudos
Thomas_Jensen1
Beginner
634 Views

Hmm, you certainly have been a great help for my question. Now I just have to find time to wrap it all up.

0 Kudos