I have compiled OpenCV using the Intel C++ Compiler, with the following compilation flags:
-O3 -xAVX -march=corei7-avx -openmp -parallel -ipp -mkl -tbb -opt-matmul -std=c++0x -g0
The problem: whenever I run the OpenCV unit tests, some tests fail due to accuracy problems. The unit test output for the OpenCV core module is attached to this post.
Because of these problems, I have read about floating-point operations in the compiler's reference guide.
As I learned from the reference guide, the default fp-model is fast, so I used the following option to get consistent floating-point behavior: -fp-model precise. This option does not help either.
I work on a 32-bit Ubuntu distro, and I set up the 32-bit compiler environment variables with the following command:
source /path/to/compiler/compilervars.sh -ia32
What do you suggest?
The only -fp-model option I tried was -fp-model precise. I have listed all the flags I used in my post.
When I inspect the failed unit tests, I realize that they fail not only for floating-point arithmetic, but also for integer arithmetic. For example, the unit test for OpenCV's reduce function fails for all OpenCV types, e.g. with the following arguments: srcType = CV_8UC1, dstType = CV_8UC1, opType = CV_REDUCE_MAX, dim = COLS. This operation should involve only integer operations, yet the unit test reports a bad accuracy error.
I have searched for a reported bug related to these unit tests, but I could not find any.
I have inspected the unit tests with nearly 15 different sets of compiler flags, and I have realized that these bad-accuracy problems appear whenever I switch from -O0 to -O2. For example, the following two flag sets are interesting:
-O0 -fp-model fast=2 (here I deliberately tried to provoke some floating-point accuracy problems, but all unit tests pass; these are all the flags I used)
-O2 -fp-model fast=2 (unit tests fail; these are all the flags I used)
(In the attached unit test result file, you can see detailed results for the test marked with ###OPENCV REDUCE FUNTION###)
With these results at hand, I have started to suspect that the unit test problems are related to vectorization or other compiler optimization procedures. But how can integer arithmetic be affected by these procedures?
While researching this issue, I realized that some people have difficulties with CMake and icc, e.g. undetected compilers. Here is my compilation procedure:
1. source /path/to/compiler/bin/iccvars.sh ia32
2. export CC=icc
3. export CXX=icpc
4. cmake -i /path/to/source
I have checked that this procedure should be correct, but I am sharing it so that you can verify it as well.
However, I have detected one suspicious item: in CMake's verbose mode, I realized that the compiler was passed a flag named -fsigned-char although I had not specified it. I could not resolve this. I am mentioning it here since the issues may be related to it. I will raise this in a CMake-related community and share the results in a later response.
I am working on a 32-bit Ubuntu machine, with gcc 4.6 and icc 13.1. The OpenCV version is 126.96.36.199.
Sorry, the first post was incomplete. I have now completed it.
Try to be as specific as possible. There are tens of different Intel integer instructions, and your generic explanations do not help pin down a possible reason for all these problems.
But I expect you will find this revised post too generic as well. Could you please tell me what specific results I should provide?
Kostrov, thank you so much for your detailed interest.
I have just read your last post and I understand the case now. In order to debug OpenCV, I will create an Eclipse project. Please give me some time for preparation.
Before reading your last post, I had prepared some source code for you related to the unit tests. I chose the OpenCV function reduce; in my last post I pointed to failures related to this function. I would appreciate it if you could find some time to inspect it.
You will find three files attached to this post.
ts.cpp contains the BaseTest class implementation, which is the base class for all test case classes.
test_mat.cpp contains the reduce test implementation. I have simplified this source file so that it only contains the test case implementations for the reduce function.
reduce.cpp contains the original implementation of the reduce function in OpenCV. You can find detailed information about OpenCV's reduce at this page: http://docs.opencv.org/modules/core/doc/operations_on_arrays.html?highlight=reduce#cv.Reduce
I have tried to make the inspection as easy as possible. To that end, I have added comments to the source files; you will find some labels that indicate the order you should follow while inspecting. You should start from test_mat.cpp line 317, at the label "a".
Thanks a lot.
Here are some additional technical details. Before moving to Eclipse, I modified test_mat.cpp to print the outputs of the operations.
The attached files contain the matrix outputs for all phases. There is a naming convention for the attached files:
There are three types of outputs:
The first category of output files starts with the "Generated_" prefix, the second with the "NonOptimized_" prefix, and the third with the "Optimized_" prefix. The file names then continue with the following pattern:
The file name Optimized_1_1_COLUMN_8UC1_8UC1.yml means: output of the original reduce function, using a randomly generated input matrix of type 8UC1; the output is of type 8UC1 and the reduce function performs a COLUMN reduction.
All three categories of files are placed in folders named accordingly.
I have not inspected the results yet; I found it reasonable to share them with you immediately, since these outputs may help your investigation. I am going to inspect them after posting this message.
I have integrated the Intel compiler into Eclipse and created an OpenCV Eclipse project. I have tried to debug, but there is a problem: when I put a breakpoint in the source code, the program stops at a different position and never hits my breakpoint. Hence I cannot debug the OpenCV library.
I have tried to find a solution to this problem without success.
I will inform you whenever I solve the problem.
I have debugged the reduce test and seen where the problems are. Alternative implementations for these problematic areas solve the accuracy problem of the reduce test. Let me explain what I obtained from debugging with a printf approach:
First of all, the optimized implementation of the reduce function, in other words the original implementation, works accurately. The test case fails because there are problems in the non-optimized implementation of the reduce function and in the code that compares the optimized and non-optimized matrices. I will explain the problematic areas below, but first I can say that the problems occur in primitive OpenCV operations. You will understand better when I explain the problems and their solutions.
Three work-arounds solve the accuracy problem for reduce function.
1. Matrix initialization:
In the non-optimized function implementation, I have realized that the matrices generated for min, max, and sum are initialized incorrectly. The original code uses the following implementation for matrix initialization:
cv::Mat sum; //Declaring matrix sum
sum.create (1,100, CV_64F); //Creating a 1x100 matrix with 64-bit double precision floating point elements
sum.setTo(0); //Set all elements to 0.0
I have realized that some elements are not set to 0; some elements are very large numbers and some are very small.
As a work-around I implemented an alternative solution:
cv::Mat sum; //Declaring matrix sum
sum = cv::Mat::zeros(1,100,CV_64F); //Creating it and filling all elements to 0.
This way, the sum matrix is created correctly.
Moreover, creating a matrix and filling all its elements with DBL_MAX or -DBL_MAX is also problematic:
cv::Mat min, max;
This usage (the same create()/setTo() pattern as above, but with DBL_MAX and -DBL_MAX) generates wrong matrices, so I have changed the implementation as follows:
min = cv::Mat(1,100,CV_64F, DBL_MAX);
max = cv::Mat(1,100,CV_64F, -DBL_MAX);
Implementing it this way solved the problem.
2. Matrix comparison
As stated before, the results of the two different reduce implementations (the optimized and the non-optimized one) are compared to check whether the optimized one generates correct results. But I have realized that the comparison code block does not work correctly.
The original implementation was as follows:
Assume that opRes and dst are the results of the optimized and non-optimized reduce implementations, respectively, and diff is the difference matrix between opRes and dst. All of these are of type cv::Mat.
absdiff( opRes, dst, diff );
bool check = false;
if (dstType == CV_32F || dstType == CV_64F)
    check = countNonZero(diff > eps*dst) > 0;
else
    check = countNonZero(diff > eps) > 0;
In this implementation I have realized that diff > eps*dst and diff > eps generate correct results, i.e. all elements are 0, but countNonZero produces wrong output: it reports that some elements are different from zero, which is not the case. Hence the test fails. A primitive OpenCV function, countNonZero(), generates incorrect results.
The work-around is as follows:
After obtaining the diff matrix, I iterate over it in a for loop and check in each iteration whether the current element is bigger than the specified threshold. With this, the comparison phase of the test generates correct results.
3. Average computation
The reduce function can calculate the average of rows or columns, and I have realized that something goes wrong with the average calculation.
The original version in the non-optimized reduce function is as follows:
sum.convertTo( avg, CV_64FC1 ); //write all elements of sum to avg as CV_64FC1
avg = avg * (1.0 / (dim==0 ? (double)src.rows : (double)src.cols)); //calculate the average; the denominator depends on whether this is a row or a column operation
But I have realized that the outputs of this operation are wrong.
The work-around for this problem is as follows:
In a for loop I iterate over all elements of sum, and in each iteration the result of sum / denominator is written to avg. This way, the average is calculated correctly.
Here, the primitive OpenCV operation operator*() generates wrong output.
All three work-arounds together solve the problem, and the accuracy tests now pass. But there must be a different problem somewhere; I don't think the implementations of these primitive OpenCV functions are at fault. I cannot understand where the problem is.
In the attachment to this post you can find detailed results and detailed explanations for work arounds.
Yes, there are problems with OpenCV functions, but I don't think the OpenCV functions themselves are buggy; they should be bug-free because they are primitive ones. I think there is a problem with my compilation phase. I am suspicious of the CMake configuration phase, so I have written to the CMake mailing list, but I could not get any help there. You can find the thread I opened here: http://www.cmake.org/pipermail/cmake/2013-August/055425.html
I have tried the compiler flags -O3 -xAVX -march=corei7-avx -openmp -parallel -ipp -mkl -tbb -opt-matmul -std=c++0x -g0, and there is no problem with the reduce function.
Actually, if you inspect the test results you will see that the following tests are failing:
These are primitive operations; note that the CountNonZero test also fails. If you remember, one work-around for the reduce accuracy test was implementing an alternative method for CountNonZero. I will inspect why the tests above are failing.