Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippiFilterGauss_8u_C1R 5x5 crashes on Core Duo/Xeon

daren_tyminski
Beginner
698 Views

Hi ALL,

ippiDemo.exe crashes in ippiFilterGauss_8u_C1R with the 5x5 kernel on Core Duo/Xeon. This bug affects our customers. It crashes every time. Has anyone else seen this?

Are there any newer updates than IPP 5.1.1?

Can you force the IPP runtime to install and use the PX DLLs on Core Duo/Xeon as a workaround?

(The PX DLLs work fine on the above CPUs.)

Thanks!

15 Replies
Vladimir_Dudnik
Employee

Hello,

Could you please provide more info about your issue: what image sizes and what parameters do you use in the function call? I've just tried this function on a Pentium M with the 256x256 8u C1 Jaehne image and detected no problem. Could you also tell us which IPP component versions are reported in the Help->About dialog?

I will also try on a Core 2 Duo...

Regards,
Vladimir

Vladimir_Dudnik
Employee

Tested ippiDemo.exe on an Intel Core 2 Duo 2.66 GHz, Win32, IPP v5.1.1, Jaehne image, 256x256, 8u, C1, ippiFilterGauss_8u_C1R, 5x5 kernel: no problem detected.

Vladimir

daren_tyminski
Beginner

Hi Vladimir,

Thanks for the response.

The image size is 640x480, mono 8-bit. It seems to crash with any image size. I have two customers reporting the crash: one has a T2500 2 GHz CPU, the other has a Xeon (details below). Both run IPP in DLL form (they use the IPP 5.1.1 runtime install). I have reproduced the crash on the Xeon. The crash goes away if you:

In windows\system32, locate "ippiw7-5.1.dll" and rename it to "ippiw7-5.1-org.dll".

In windows\system32, locate "ippipx-5.1.dll" and copy it under the old name "ippiw7-5.1.dll".

This seems to be a clear-cut bug in ippiw7-5.1.dll. Please try on the T2500 or the Xeon.

Thanks!

CPU-Z 1.37 report file

Processor(s)
Number of processors 2
Number of cores 1 per processor
Number of threads 2 (max 2) per processor
Name Intel Xeon
Code Name Prestonia
Specification Intel Xeon CPU 2.40GHz
Package Socket 604 mPGA
Family/Model/Stepping F.2.5
Extended Family/Model F.2
Brand ID 11
Core Stepping M0
Technology 0.13 um
Core Speed 2399.2 MHz
Multiplier x Bus speed 18.0 x 133.3 MHz
Rated Bus speed 533.2 MHz
Stock frequency 2400 MHz
Instruction sets MMX, SSE, SSE2
L1 Data cache 8 KBytes, 4-way set associative, 64-byte line size
Trace cache 12 Kuops, 8-way set associative
L2 cache 512 KBytes, 8-way set associative, 64-byte line size
Chipset & Memory
Northbridge ServerWorks ID0017 rev. 32
Southbridge ServerWorks ID0225 rev. 00
Memory Type
Memory Size 1024 MBytes
System
System Manufacturer Dell Computer Corporation
System Name PowerEdge 1600SC
System S/N 382PM21
BIOS Vendor Dell Computer Corporation
BIOS Version A12
BIOS Date 10/19/2004
Memory SPD
Software
Windows Version Microsoft Windows XP Professional Service Pack 2 (Build 2600)
DirectX Version 9.0c

Vladimir_Dudnik
Employee

Well, theoretically it could be a bug, but we need to reproduce it somehow. I've tried the V8 and W7 DLLs on a Core 2 Duo with your image size and can't reproduce the issue.

Could you also check whether there is an old version of IPP on your system? What is reported in the ippiDemo Help->About dialog?

Vladimir

daren_tyminski
Beginner

Hi Vladimir,

The image size is 640x480. Any ROI size crashes. The IPP version is the 5.1.1 runtime (the latest on your web site). Systems: T2500 (2 GHz) and dual-CPU Xeon 2.4 GHz. OS: XP Pro.

In ippiDemo I do the following:

1. Load image

2. Select rectangular ROI

3. Select crashing filter.

4. Observe crash.

I'll try to get the ippiDemo Help->About dialog info. I don't have access to the system with the problem right now, so I need to ask the customer to provide it.

The crash was initially reported with our product that uses IPP, then reproduced with ippiDemo.

Thank you.

Vladimir_Dudnik
Employee

Thanks for the additional info. I recommend you also submit this issue to Intel Premier Support, so that you will be notified when an update becomes available. As a workaround, you might remove ippiw7-5.1.dll from the system. In that case the application will automatically choose a lower optimized version (A6 in that case), so you should not lose performance dramatically.

Vladimir

Vladimir_Dudnik
Employee

Please also make sure that there is no conflict between two versions of IPP on the target system (or a conflict with some other software). We are still not able to reproduce the issue, so I can't confirm that there is a bug in IPP v5.1.1.

Vladimir

daren_tyminski
Beginner

Thanks Vladimir.

Vladimir_Dudnik
Employee

After further investigation I've found that there was a similar issue with another filtering function (ippiFilterLaplace_8u_C1R). The issue was related to OpenMP threading inside the IPP DLLs. These functions use a similar approach to internal threading, so it might be the same issue.

Until IPP v5.2 beta, where this issue is fixed, you have several options to work around it:

1. Use lower optimized DLLs (A6 or PX). In this case you will lose performance, especially with PX.

2. If your application was compiled with the Intel C/C++ compiler, you can recompile it with the following changes (in this case you will not lose any performance):

The bug in the code of ippiFilterLaplace_8u_C1R has been found. It will be fixed in the next library version. In the meantime we can suggest a temporary solution for the customer.
For the application to run successfully, insert these three lines at the beginning of the program:

int numThreads;
ippGetNumThreads(&numThreads);
omp_set_num_threads(numThreads);

To build successfully, also include omp.h:

#include <omp.h>

3. If your application was built with a non-Intel compiler, you can disable IPP internal threading by calling ippSetNumThreads(1). In this case only one thread will be launched, but the most appropriate optimized code will still be run.

4. You can also disable IPP internal threading without recompiling your application. In that case you need to set the environment variable OMP_NUM_THREADS=1 before launching the application.

Regards,
Vladimir

daren_tyminski
Beginner

All makes sense now. Thanks again.

Albert3
Beginner

This is a life saver. I was forced to use IPP5.1.0....

Will solution #2 work with VC2005, which supports OpenMP?

Albert3
Beginner

Since the Mac OS X dylibs also have a problem with threading, I strongly suggest a bug fix release for IPP 5.1.1, since it is pretty much no good for high-performance platforms such as our dual Xeon systems and Mac Pro.

I understand that IPP 5.2 is coming, but I would be reluctant to use it right away in our product release until it is proven stable. I think many of your customers would really appreciate a bug fix release for IPP 5.1.1.

Best regards,

Albert

Vladimir_Dudnik
Employee

Hello Albert,

I've submitted your request to Intel Premier Support, so you can expect someone to contact you soon.

Regarding your previous question, I think that option #2 should work with VC2005.

By the way, do you have any other comments regarding IPP? How do you find its functionality, performance, and usability? Do you see any missing functionality that you would like to have in IPP?

Regards,
Vladimir

Albert3
Beginner

Hi Vladimir,

Thanks for the response and for asking for my comments!

For me, one of the most important things is the internal threading in the major image processing functions. This really gets high performance out of a multiprocessor system without complicating the development. I hope more effort goes into making it more efficient.

I am using IPP 5.1 on both Windows and Mac. I cannot make the threading work on a dual Xeon system or a Mac Pro. I tried your fix; it does NOT work for me.

Does IPP 5.2 have max/min(a,b), where a and b are images? How about color space conversion for data types other than _8u?

Since more and more processors are being added to high-end systems, I don't know how much improvement you can get by threading single functions. For example, ippiAdd on large images must be bound more by memory access than by computation. I suspect many of the image processing functions cannot take advantage of the cache because of the large image size.

If you can find a way to group functions together, performance may increase significantly. Say I need to compute C=k*A+B/A, where A, B, C are large images:

ippiMulC_(k, A, C); // C = k*A

ippiDiv_ (A, B, TMP); // TMP = B/A, TMP is a temp image

ippiAdd_(TMP, C); // C = C + TMP

Since the images are large and the three operations each run through the whole image one after another, the data is pretty much always outside the cache, and the images are accessed 7 times. Threading MulC and Div will not help much.

If you can have a function:

ippiFunction op[3];

op[0] = ippiMulC_;

op[1] = ippiDiv_;

op[2] = ippiAdd_;

ippiApplyFunctions(A,B,C, op..);

Then you would have many ways to apply the functions internally, at pixel level or tile level, so that memory is only accessed 3 times instead of 7.

Just an idea, I don't know if you can even do it this way...
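To make the tile-level idea concrete, here is a rough sketch in plain C. It does not use the IPP API: ordinary loops stand in for the MulC, Div and Add steps, and the names unfused_three_pass, fused_tiled and TILE are made up for illustration. It assumes contiguous float images with no zero pixels in A. The point is only that the big images are streamed from main memory once each (A and B read, C written), while the temporary is one small tile that should stay in cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size in pixels; 4096 floats is 16 KB per buffer. */
enum { TILE = 4096 };

/* Baseline: three full passes over the images plus a full-size temporary. */
void unfused_three_pass(const float *A, const float *B, float *C,
                        float *TMP, size_t n, float k)
{
    size_t i;
    assert(A && B && C && TMP);
    for (i = 0; i < n; ++i) C[i] = k * A[i];       /* stands in for MulC */
    for (i = 0; i < n; ++i) TMP[i] = B[i] / A[i];  /* stands in for Div  */
    for (i = 0; i < n; ++i) C[i] += TMP[i];        /* stands in for Add  */
}

/* Same three steps, but applied tile by tile so each tile is still in
   cache when the next step touches it. The temporary is one tile only. */
void fused_tiled(const float *A, const float *B, float *C,
                 size_t n, float k)
{
    float tmp[TILE];
    size_t start, i;
    assert(A && B && C);
    for (start = 0; start < n; start += (size_t)TILE) {
        size_t left = n - start;
        size_t len = (left < (size_t)TILE) ? left : (size_t)TILE;
        const float *a = A + start;
        const float *b = B + start;
        float *c = C + start;
        for (i = 0; i < len; ++i) c[i] = k * a[i];      /* C tile = k*A  */
        for (i = 0; i < len; ++i) tmp[i] = b[i] / a[i]; /* tmp = B/A     */
        for (i = 0; i < len; ++i) c[i] += tmp[i];       /* C tile += tmp */
    }
}
```

Both functions should give the same result for the same inputs; only the order in which memory is touched differs. Whether this is actually faster depends on the image size, the cache size and the chosen tile size, so it would need to be measured on the target machine.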

Albert

Intel_C_Intel
Employee

Hi, Albert,

I can answer your second question. We are now working on an add-on IPP feature that will allow executing sequences of image processing operations. The code for a sequence of operations will look close to natural notation, and for big images (larger than the L2 cache) it will run 2-3 times faster.

I hope to see this feature in the next release. For an out-of-cycle delivery, please ask via premier.intel.com.

Thanks,

Alexander
