Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

IPO & PGO possibly treat a function's varying return value as a constant

vush
Novice
Recently I tried to build Ruby 1.9.3-p0 with ICC, with PGO and IPO enabled.
After working around the known issues, everything compiled and the tests passed.

But when I use Ruby's Gem to install something, the program gets stuck in an infinite loop.

I enabled the debug options and analysed the problem down to the assembly code.
Finally, I found that the problem is in transcode.c.

[cpp]
VALUE rb_econv_substr_append(rb_econv_t *ec, VALUE src, long off, long len, VALUE dst, int flags)
{
    unsigned const char *ss, *sp, *se;
    unsigned char *ds, *dp, *de;
    rb_econv_result_t res;
    int max_output;

    if (NIL_P(dst)) {
        dst = rb_str_buf_new(len);
        if (ec->destination_encoding)
            rb_enc_associate(dst, ec->destination_encoding);
    }

    if (ec->last_tc)
        max_output = ec->last_tc->transcoder->max_output;
    else
        max_output = 1;

    res = econv_destination_buffer_full;
    while (res == econv_destination_buffer_full) {
        long dlen = RSTRING_LEN(dst);
        if (rb_str_capacity(dst) - dlen < (size_t)len + max_output) {
            unsigned long new_capa = (unsigned long)dlen + len + max_output;
            if (LONG_MAX < new_capa)
                rb_raise(rb_eArgError, "too long string");
            rb_str_resize(dst, new_capa);
            rb_str_set_len(dst, dlen);
        }
        ss = sp = (const unsigned char *)RSTRING_PTR(src) + off;
        se = ss + len;
        ds = (unsigned char *)RSTRING_PTR(dst);
        de = ds + rb_str_capacity(dst);
        dp = ds += dlen;
        res = rb_econv_convert(ec, &sp, se, &dp, de, flags);
        off += sp - ss;
        len -= sp - ss;
        rb_str_set_len(dst, dlen + (dp - ds));
        rb_econv_check_error(ec);
    }

    return dst;
}
[/cpp]
At the start of the while loop, rb_str_capacity is called once and its return value is stored at [EBP-20].
This call is even placed ahead of the loop condition.
On every iteration, every place that needs the current capacity of dst only reads the constant stored at [EBP-20].
Worse, the stale value used in place of the second rb_str_capacity call leads to an infinite loop. :(

rb_str_capacity seems to be just an ordinary function:
size_t rb_str_capacity(VALUE);

The problem above appears when these two sets of options are used together: -Qprof-use -Qprof-dirE:\\Profile and -Qipo.

During profiling, rb_str_capacity most likely returned a different value on each call, so why does ICC treat it as a constant?
This kind of optimization seems risky; I don't know whether other places are potentially affected by the same issue.

As a workaround, I manually renamed rb_str_capacity so that PGO ignores it, and that got it working.
jimdempseyatthecove
Honored Contributor III
I do not use Ruby, but your problem is symptomatic of VALUE being declared as/with the "const" attribute, meaning it is not subject to change, and thus the optimizer assumes it is constant. Your code, though, bypasses the const-ness (via rb_str_resize). If my assumption is correct (inspect the definition of VALUE), then you will have to adjust your code accordingly:
[cpp]
unsigned long new_capa = rb_str_capacity(dst); // old capacity
while (res == econv_destination_buffer_full) {
    long dlen = RSTRING_LEN(dst);
    if (new_capa - dlen < (size_t)len + max_output) {
        new_capa = (unsigned long)dlen + len + max_output;
        if (LONG_MAX < new_capa)
            rb_raise(rb_eArgError, "too long string");
        rb_str_resize(dst, new_capa);
        rb_str_set_len(dst, dlen);
    }
    ss = sp = (const unsigned char *)RSTRING_PTR(src) + off;
    se = ss + len;
    ds = (unsigned char *)RSTRING_PTR(dst);
    de = ds + new_capa;
    dp = ds += dlen;
    res = rb_econv_convert(ec, &sp, se, &dp, de, flags);
    off += sp - ss;
    len -= sp - ss;
    rb_str_set_len(dst, dlen + (dp - ds));
    rb_econv_check_error(ec);
}
[/cpp]
Jim Dempsey
vush
Novice

Hi Jim

I checked the reference: VALUE is typedef'd as "unsigned long" without "const", so the parameter "dst" is actually "unsigned long dst".

http://rxr.whitequark.org/mri/source/include/ruby/ruby.h?v=1.9.3#088


But even assuming dst were const, the function's return value would not necessarily stay the same.
It is like passing a const handle pointer to query the latest status.
This kind of optimization, in my opinion, goes far beyond aggressive.
Intel needs to correct this.

BTW, I adjusted the code along the same lines as your suggestion. Thanks for your guidance.

Regards
V.E.O
Georg_Z_Intel
Employee
Hello,

I'm currently looking into your original problem description. Please help me to understand the problem:

  1. What does "I manually rename rb_str_capacity to ignore PGO to get this working." mean? What did you rename it to?
  2. PGO requires running the data collection step multiple times with different sets of input data and environment settings, so that the application exercises all important code paths.
    How many runs does your profile data come from? What happens if you also collect data while executing the failing case above? Does it still fail after PGO optimization?
Best regards,

Georg Zitzlsberger
vush
Novice
Hi Georg,

My description may have been confusing. Renaming rb_str_capacity is just a way to make the function differ from the existing profiling data, so that the compiler does not apply profile-guided optimization to it. The new name can be anything, as long as it differs from the profiled name.

I actually expanded rb_str_capacity to its definition (a single operation that reads one member variable from one struct). It builds fine and runs fine.


I collected the PGO data mainly through ordinary usage (running Ruby scripts).
I found that rb_econv_substr_append is called when writing to files.

The while loop seems to just do the conversion when generating Windows text files.

I searched the profile data, and this path is executed many times.

I am certain that no errors happened here during profiling.
Suppose rb_str_capacity always returned the same value during the profiling runs, so that PGO considers it a constant.
But it cannot really have been constant: if it were, the infinite loop would have occurred during profiling too. So the values must have differed at least once. Why does PGO still consider it a constant?

If you would like the PGO data, I can provide it. It is huge, though; maybe a filtered subset of the related data would still reproduce the problem.


Best regards
V.E.O
Georg_Z_Intel
Employee
Hello,

Thank you for your answer. Indeed, it sounds like something I should have a closer look at.
Hence I'm trying to reproduce what you did. Could you please provide the steps that led to this result (starting with the build), if possible?
I'd also need to know which compiler version you are using.

Again, thank you in advance,

Georg Zitzlsberger
vush
Novice
Hi,

I used the latest evaluation version of Parallel Studio XE 2011
to compile Ruby 1.9.3-p0.
I added the -Qipo and -Qprof-gen e:/profile options.
After the compilation finished, I used the build for a lot of typical Ruby work (installing gems and Rails, running a Rails app).
Then I went back and changed -Qprof-gen to -Qprof-use; the profiling data is about 2 GB.

During the recompilation, the bootstrap build of Ruby (miniruby) already stopped working because of the issue above.
I substituted a normally built miniruby.exe, and the compilation completed.

But I was mistaken: the new ruby has the same problem as its minimized version.

When I removed either one of those two options (Qipo, Qprof), it worked fine.

I guess it is rather hard to reproduce. I can provide filtered profiling data together with the Ruby configuration files if you need them.

Georg_Z_Intel
Employee
Hello,

I'm trying to reproduce your sighting. As soon as I have some results I'll let you know. Thank you for providing me the necessary information!

Best regards,

Georg Zitzlsberger