Multiple Problems with ICL

mtlroom · ‎08-07-2009

Hello everybody,

I ported the ffmpeg library so that it compiles with the Intel C++ compiler for Windows. The only reason I did so was to be able to debug ffmpeg-related code in MS Visual Studio.

While porting the code I noticed some problems with the compiler and had to add workarounds. Overall, ffmpeg binaries compiled with icl work OK, but the biggest problem is that the debug info is somewhat broken.

I'll list some of the problems with the Intel compiler that I encountered while compiling the ffmpeg library.

So I decided to post a question here on the forum and, at the same time, check what's new. The good news is that some of the problems I encountered appear to be fixed.

The biggest problem is the broken debug info: very often, variables show wrong values in the debugger.

For example:

[cpp]struct some_context * h;
...
h->function_ptr(a,b,c);
h->var1 = 123;
decode(h, 1, 2, 3);
[/cpp]

and then, if I hit an assertion inside decode(...), I see the value of h as 0, which is impossible considering the code that executed before the call to decode(...).

Ffmpeg cannot be compiled without compiler optimization, so this kind of error could be related to optimizations, but I highly suspect that's not the case. Whenever I see a variable with a wrong value, I just switch to another function in the call stack, and there the variable has the correct value. In the example above, inside decode(...) the passed value of h may show as 0, but if I go one frame up the stack, inside the function that calls decode, I see the correct value of h.

This is one of the problems, the other one is more serious: I get completely broken call stack.

Sometimes an assert fires, I examine the call stack, and I can tell for sure that it can't be real: I see something like av_malloc calling some encoding function, which obviously cannot happen. So I tried putting a breakpoint before that function. Everything is fine until the moment I step over the function (F10); then the call stack in the debugger window becomes garbage (the functions lower in the list become completely different).

Any info on that??

While debugging some problems, I found out that they were related to variable length arrays. Replacing them with alloca calls fixed the problem. I see from some of the posts that this has been fixed in the latest build. While debugging the variable length array problem I was also getting a similarly weird call stack in the debugger window. Could it be related?
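To illustrate the kind of replacement meant here, a minimal sketch follows. It is not actual ffmpeg code: `sum_with_scratch` and `STACK_ALLOC` are made-up names, and it assumes alloca is available as `_alloca` from malloc.h on Windows and as `alloca` from alloca.h elsewhere.

```c
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
#include <malloc.h>              /* assumption: _alloca lives here on Windows */
#define STACK_ALLOC(n) _alloca(n)
#else
#include <alloca.h>              /* assumption: alloca lives here elsewhere */
#define STACK_ALLOC(n) alloca(n)
#endif

/* Hypothetical helper: copies n ints into a stack scratch buffer and sums them. */
static int sum_with_scratch(const int *src, size_t n)
{
    /* The C99 variable length array form would have been:  int scratch[n];  */
    int *scratch = (int *)STACK_ALLOC(n * sizeof *scratch);
    int total = 0;
    size_t i;

    memcpy(scratch, src, n * sizeof *scratch);
    for (i = 0; i < n; i++)
        total += scratch[i];

    /* No free() is needed: the buffer is released when the function returns. */
    return total;
}
```

As with a variable length array, the buffer lives on the stack of the calling function, so this only suits sizes that are known to be small.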

====

The next problem is related to performance/optimizations.

In ffmpeg there are a lot of constants that are used in different encoders/decoders. Many of these are 64-bit constants. The issue with the Intel compiler is that in most cases it emits such bad code for them that it would be hard to make it any slower than what icl produces.

I had a test app a while ago, but I can't find it at the moment. Basically, I had a 64-bit const that I assigned to an MMX register. My intention was to get code that loads the 64-bit data into the MMX register directly from the static const variable. Instead, the Intel compiler emits code that pushes two 32-bit ints onto the stack and then loads the MMX register through the stack pointer. I'm not sure if it's clear what I'm saying... in short, here's a code example:

[cpp]// Instead of
	movq _some64bit_const, mmx_reg   // _some64bit_const is {0x12345678, 0x56781234}
// it would emit junk like
	push 0x12345678
	push 0x56781234
	movq (esp), mmx_reg              // load the 64-bit value back through the stack pointer
[/cpp]

At first, when I saw that bloat, I thought the Intel compiler probably knew what it was doing and was producing faster code than making the processor load a variable located God knows where outside the function body. In practice, though, pushing onto the stack and loading through the stack pointer turns out to be much slower and bigger in size, which is unacceptable for highly optimized encoder code. To avoid this "optimization" I had to combine most of the 64-bit static consts into arrays:

[cpp]static const int64_t var1 = 0x0101010101010101;
static const int64_t var2 = 0x0202020202020202;

// became:

static const int64_t var1_var2[] = {0x0101010101010101, 0x0202020202020202};
[/cpp]

In this case icl doesn't try to "optimize" static consts anymore.

====

Another problem is that icl reports some 3DNow! inline asm instructions as invalid. This is known and was reported, I think, on these forums. For that reason some projects chose not to support the Intel compiler at all.

====

The last problem is one I ran into only recently.

Somewhere on the web I read a post saying that the performance of a math-heavy library dropped several-fold when built against Intel's math library instead of the one that ships with the MS compiler. I was using mathimf.h instead of math.h, so I decided to test this. The reason I used mathimf.h was that Microsoft's math.h lacks some of the C99 math functions that ffmpeg uses: rint, lrint, lrintf, lrintl, isnan, isinf, etc. So I tested these functions, and the results are astonishing (garbage code generated).

I did this test just a while ago, and here is a complete example program to reproduce it.

///main.c

[cpp]//#include "mathimf.h"
int main(int argv, char**argc)
{
	double d;
	int x;

	d = 1.123;
	x = lrintf(d);

	return x;
}
[/cpp]

Guess what the value of x is at exit? That's right, x is ... -2147483648! icl complains that the function "lrintf" is declared implicitly, but it still links and runs, with garbage results (presumably because an implicitly declared function is assumed to return int, and d is passed as a double rather than a float). I stepped through, and it goes into libmmd anyway. If you uncomment the first line to include mathimf.h, then the value of x at exit is correct: 1. BUT when I stepped through the generated assembly, icl definitely generates a big load of gunk: multiple tests of some variables, a cpuid instruction, something I can't identify, and only then the one thing that actually needed to be done, a fistpl instruction. Instead I opted to use code from gcc, which works fine for me:

[cpp]static __inline long lrintf(float x)
{
	long retval;
	__asm__ __volatile__("fistpl %0"  : "=m" (retval) : "t" (x) : "st");
	return retval;
}[/cpp]

TimP · ‎09-04-2009

It's difficult to persuade Microsoft or Intel Windows tools to work with AT&T or gcc style asm syntax. Microsoft took a position several years ago to provide full support for the SSE intrinsics but not for in-line asm.
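For reference, the lrintf helper posted above can probably be expressed with intrinsics instead of inline asm. A minimal sketch, assuming an x86 target with SSE enabled; the name `lrintf_sse` is made up for illustration, and the 32-bit result matches long only where long is 32 bits wide, as on Windows:

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE enabled */

/* Hypothetical replacement for the inline-asm helper: converts a float to an
 * integer using the current SSE rounding mode, which is round-to-nearest-even
 * unless the program has changed it. */
static long lrintf_sse(float x)
{
    /* _mm_set_ss places x in the low lane; _mm_cvtss_si32 converts it to int. */
    return (long)_mm_cvtss_si32(_mm_set_ss(x));
}
```

With the default rounding mode, `lrintf_sse(1.123f)` gives 1 and `lrintf_sse(2.5f)` gives 2, since ties round to the even neighbour.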

mtlroom · ‎10-08-2009

Quoting - mtlroom

Any info on that code? Isn't it what I was talking about?

PUBLIC _set_mm0:

sub esp, 8

mov eax, 1145311505

mov edx, 305419896

mov DWORD PTR [esp], eax

mov DWORD PTR [4+esp], edx

movq mm0, QWORD PTR [esp]

add esp, 8

ret

ALIGN 16

As an update: this issue seems to be fixed as of Version 11.1 Build 20090903, Package ID: composer_update2.066

aholzinger · ‎10-22-2009

Hi mtlroom,

I'm currently trying exactly the same thing: compiling ffmpeg with icl under Windows to be able to debug under VC9. This is (so far) my only reason to buy/use icl.

I'm having problems linking with the MS linker, so I would be very interested in how you managed to completely compile and link ffmpeg with icl.

It would be nice if you could contact me at a_holzinger_at_gmx_dot_de (remove all the underscores and replace at with @ and dot with .).

Cheers
aholzinger