Application crashes with the use of any optimisation flags. Best way to find out why this occurs?

Alex_K_8 · ‎12-10-2015

Hi everyone,

When building my application with no optimisation flags in use, the application will run.

When building the application with any optimisation flags (such as /O1 or /O2 for example), the application will crash at startup.

This is a large codebase with thousands of commits, so it's hard to track down which change made optimised builds crash. Does anyone have any advice on the best way to find out which files are generating bad output? I've tried debugging and stepping through (since it doesn't take long for the application to crash), but from what I can see all the data being processed is valid, and the crash ends up in one of Microsoft's CRT library files.

I do apologise if this question isn't very coherent or is lacking in-depth information. If there's anything else I can include to help, just let me know :-)

Using Intel C++ compiler 16 Update 1 on Windows 10 with Visual Studio 2013 Update 5.

TimP · ‎12-10-2015

One might suspect uninitialized variables or data overlap. If you can build a full set of objects with each set of compile flags, you should be able to start a binary search for the failure by linking combinations.

KitturGanesh · ‎12-10-2015

I agree, Tim. A binary search (replacing the object file compiled with no optimization with the corresponding file compiled with -O1, along the path of the crash trace) is the way to go until the culprit object is located.

_Kittur
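
For readers following along, the object-file bisection suggested above could be sketched roughly as below. This is only an illustration, not a tool anyone in the thread used: `find_culprit` and the `CrashesWith` callback are hypothetical names, and the callback stands in for the manual step of relinking with a mix of optimized and unoptimized objects and running the result. The sketch also assumes a single object file is responsible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: relinks the application using the optimized build of
// objects[lo, hi) and the unoptimized build of everything else, runs it, and
// returns true if the crash reproduces.
using CrashesWith = std::function<bool(std::size_t lo, std::size_t hi)>;

// Narrows the crash down to one object file by halving the optimized range.
// Assumes exactly one object is responsible and that the crash reproduces
// when every object in the list is optimized.
std::string find_culprit(const std::vector<std::string>& objects,
                         const CrashesWith& crashes_with) {
    assert(!objects.empty());
    std::size_t lo = 0;
    std::size_t hi = objects.size();  // the culprit is somewhere in [lo, hi)
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (crashes_with(lo, mid)) {
            hi = mid;  // crash reproduces with only the first half optimized
        } else {
            lo = mid;  // otherwise the culprit must be in the second half
        }
    }
    return objects[lo];
}
```

With N object files this needs about log2(N) relink-and-run cycles. If more than one object is involved, the single-culprit assumption breaks and the result only points at one of them.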

Vladimir_P_1234567890 · ‎12-10-2015

I suggest running Inspector and looking for memory initialization problems.

--Vladimir

KitturGanesh · ‎12-11-2015

That's a great suggestion, Vladimir. _Kittur

Alex_K_8 · ‎12-15-2015

Hi guys,

Thanks for the advice, but it looks like the issue is much more complicated than I initially thought.

The unoptimized build hits WindowsCrtPatch::Init(); the optimized build doesn't. I know for sure XRE_DONT_SUPPORT_XPSP2 isn't defined, but I even removed the guards around WindowsCrtPatch::Init(); and the line still doesn't get hit!

Here's what an unoptimized step-through is like, where the program executes correctly and hits WindowsCrtPatch::Init():

https://vimeo.com/149082096

And here is what an optimized step-through is like, where it completely skips it:

https://vimeo.com/149082097

Here is what gets thrown when the optimized build hits the main function:

First-chance exception at 0x00007FFCA78F1F08 (KernelBase.dll) in firefox.exe: 0x0000071A: The remote procedure call was canceled, or if a call time-out was specified, the call timed out.

First-chance exception at 0x00007FFCA78F1F08 (KernelBase.dll) in firefox.exe: 0x40080201: WinRT originate error (parameters: 0x0000000080004005, 0x000000000000006D, 0x0000008A249FDE70).

I've stepped through and had every file the step-through passes through either compiled with VC or built with the optimization flags removed completely, and the result is still the same.

Any idea why having optimization flags is causing this to happen? I've tried using Inspector but I can't seem to find anything relevant. Results here.

Anything else I could upload to help figure this out?

Thanks!

Bernard · ‎12-17-2015

Looks like the compiler removed the argc variable and operator delete is trying to free either unallocated or zeroed memory. Just guessing from the optimized screenshot.

Alex_K_8 · ‎12-17-2015

iliyapolak wrote:

Looks like the compiler removed the argc variable and operator delete is trying to free either unallocated or zeroed memory. Just guessing from the optimized screenshot.

I thought so too, but I had that specific file compiled with no optimization flags and also separately with MSVC and the result is still the same.

Bernard · ‎12-17-2015

Can you post the full call stack?

jimdempseyatthecove · ‎12-20-2015

As a hack fix "work around" (assuming argc is a local int), see what happens when you make argc also volatile. The object of the experiment is to ensure argc is not registerized. BTW, the only way I could conceive of argc getting registerized is if "main" is visible to the compiler (as opposed to externally by the linker). Note that with multi-file interprocedural optimizations, the linker may reinvoke the compiler to inline (or look at) main, thus potentially permitting it to registerize argc.

Then, if volatile int argc fixes the issue, experiment with removing the volatile and compiling the code shown in such a manner that it is not a candidate for multi-file IPO.

One or both of these methods may get you going (and please report the issue to Premier).

Jim Dempsey
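
A minimal sketch of the volatile experiment, for illustration only. The thread does not show the real code, so `convert_args` is a hypothetical stand-in for whatever consumes argc. It also uses a volatile copy rather than declaring the parameter itself volatile, which is a small variation on the suggestion above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the argument-handling code discussed in the
// thread; the real function and its body are not shown there.
std::vector<std::string> convert_args(int argc, char** argv) {
    // Copy argc into a volatile local. Each read of `count` below must be
    // performed as a real access, so the optimizer cannot fold the reads
    // away or reuse a cached value.
    volatile int count = argc;

    std::vector<std::string> converted;
    for (int i = 0; i < count; ++i) {
        converted.emplace_back(argv[i]);
    }
    return converted;
}
```

If the crash disappears with the volatile in place, that only suggests the optimizer's handling of the variable is involved; it does not by itself say whether the compiler or the code is at fault.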

Alex_K_8 · ‎12-23-2015

I appreciate the help so far. So I've managed to find the patch (after going through thousands of commits...grim) causing the exit and it looks like the debugger would've never shown the relevant information relating to this patch.

Here's the patch. Here are the new header files included: 1, 2.

Could ICL have been overaggressive with the change from:

  static const int size = 8;
  char buf[size];

to:

  char buf[8];

Or is that just completely irrelevant? The codebase is going to start replacing all instances of PR_snprintf with snprintf_literal, so it would be good to find the root cause and know whether the rest of the codebase will be affected when those changes land.
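
As a point of reference on the question above: in standard C++ the two declarations produce the same type, char[8], because a static const int initialized with a constant is usable as an array bound. The sketch below checks that with a static_assert. `format_into` is a hypothetical helper that deduces the buffer size from the array type; it is only assumed to resemble how a helper like snprintf_literal might work and is not the project's actual implementation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <type_traits>

// Hypothetical helper: deduces the buffer size N from the array type instead
// of taking it as a separate argument. Not the project's real code.
template <std::size_t N>
int format_into(char (&buf)[N], const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    const int written = std::vsnprintf(buf, N, fmt, args);
    va_end(args);
    return written;
}

// Returns true if both buffer forms end up holding the same text.
bool buffers_behave_identically() {
    static const int size = 8;
    char buf_named[size];  // form before the patch
    char buf_literal[8];   // form after the patch

    static_assert(std::is_same<decltype(buf_named), decltype(buf_literal)>::value,
                  "both declarations have type char[8]");

    format_into(buf_named, "%d", 12345);
    format_into(buf_literal, "%d", 12345);
    return std::strcmp(buf_named, buf_literal) == 0;
}
```

So at the language level the buffer declaration change on its own should not alter behaviour. That does not rule out a difference in how a particular compiler optimizes the surrounding code, or a problem elsewhere in the same patch.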

Now that I know the root cause of this issue, what's the best way of using the debugger to find this issue (working backwards I suppose)?

jimdempseyatthecove wrote:

As a hack fix "work around" (assuming argc is a local int), see what happens when you make argc also volatile. The object of the experiment is to ensure argc is not registerized. BTW, the only way I could conceive of argc getting registerized is if "main" is visible to the compiler (as opposed to externally by the linker). Note that with multi-file interprocedural optimizations, the linker may reinvoke the compiler to inline (or look at) main, thus potentially permitting it to registerize argc.

Then, if volatile int argc fixes the issue, experiment with removing the volatile and compiling the code shown in such a manner that it is not a candidate for multi-file IPO.

One or both of these methods may get you going (and please report the issue to Premier).

Jim Dempsey

Should I still report the issue to Premier now that I've got the above information?

iliyapolak wrote:

Can you post the full call stack?

With the new info above, would it still be helpful for me to post the full call stack to help work out what the actual issue was?

Bernard · ‎12-24-2015

Usually with a debugger you are supposed to work backwards in order to find the culprit. In your case I would set a breakpoint on the dynamic allocation of the argConverted array of pointers. The most important thing should be tracking the argc variable.

Alex_K_8 · ‎01-09-2016

iliyapolak wrote:

Usually with a debugger you are supposed to work backwards in order to find the culprit. In your case I would set a breakpoint on the dynamic allocation of the argConverted array of pointers. The most important thing should be tracking the argc variable.

I've followed argc through but can't find anything wrong with it. It gets declared and used correctly.

I'm at my wits' end now. Even though the application will start, if a user who doesn't have an existing profile runs it, it crashes straight away again. So many variables in the debugger show the same <Error reading register value> or <Unable to read memory> message that it feels impossible to pinpoint any one issue.

Don't suppose anyone has any suggestions? I'm not quite sure what I should submit to Intel Premier Support, if there is anything I can submit to them at all.

KitturGanesh · ‎01-12-2016

Hi Alex,
The issue can only be filed with the developers if there's a reproducer. If you're able to reproduce it with a smaller test, you can file the issue in Premier Support and attach a preprocessed file (*.i files generated using /P), or the project solution if the test case is small. That way the support triage engineer can communicate with you there and request more info, and the content will be secure. The exact file/phase can then be traced through iterations using internal options so the bug (if it's a bug) can be fixed. Without a reproducer it's hard for the support engineer to file the issue with the developers as well.
_Kittur

Bernard · ‎01-13-2016

@Alex K,

I do not know if this advice will be helpful, but you may give it a try anyway. In your debugging effort, use WinDbg; it is more powerful than the VS debugger. Try to reproduce the same steps as in post #13.

Application crashes with the use of any optimisation flags. Best way to find out why this occurs?