gpseek
Beginner
395 Views

Bit test intrinsics functions wanted!

BT, BTS and BTC instructions are fast again in Core 2. :-) Could your compiler guys implement those bit test intrinsics? I think the BT instruction should be implemented at least.

The benefits of such intrinsics are quite obvious. Suppose we want to test whether bit i is set in an integer bitmap; we usually do this in C/C++:

if (bitmap & (1 << i))

........

The problems with the above C test are:

1. more instructions are generated, and

2. register cl is needed, which increases register pressure. More register swap/save instructions are often needed as well, because rcx/ecx is often used as a function parameter.

Another plus for _bit_test(integer, index) is that it reduces code size.

One additional suggestion for the compiler's optimizer:

Sometimes (not always) bitmap & (1 << i) should be compiled as a BT instruction.
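For reference, here is a minimal portable sketch of the bit test discussed above. The helper names `bit_test_portable` and `bit_test_mask` are hypothetical and are not compiler intrinsics; whether a given compiler turns either form into a BT instruction depends on the compiler and its options, so treat this only as an illustration of the source-level idiom.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a compiler intrinsic): returns 1 if bit i of
   bitmap is set, 0 otherwise. Shifting the bitmap right keeps the result
   in {0, 1} and avoids the signed overflow of (1 << 31). */
static inline int bit_test_portable(uint32_t bitmap, unsigned i)
{
    assert(i < 32);              /* shifting by >= 32 is undefined in C */
    return (int)((bitmap >> i) & 1u);
}

/* The same test written as a mask, as in the post, but with an
   unsigned 32-bit constant so that i == 31 is well defined. */
static inline int bit_test_mask(uint32_t bitmap, unsigned i)
{
    assert(i < 32);
    return (bitmap & (UINT32_C(1) << i)) != 0;
}
```

Both helpers return the same value for any valid index; the second one mirrors the `bitmap & (1 << i)` pattern that the suggestion asks the compiler to recognize.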

Thanks!

31 Replies
TimP
Black Belt
312 Views

As you appear to be referring to a specific compiler version, you should specify which one. If you have specific suggestions, there are appropriate places for you to post the details (your premier.intel.com account for Intel C++, gcc bugzilla for gnu). In either case, you would want to test a version which is currently being enhanced (usually, the latest available).
gpseek
Beginner
312 Views

Tim,

Thank you for the reply.

I think I tested the latest official version of the Intel C++ Compiler 9.1 for Windows, which I downloaded last week.

What's a premier.intel.com account? I don't think I have one. :-O

I just want to suggest that your compiler developers implement intrinsics like these, which can improve Core 2 CPU performance. :-)

TimP
Black Belt
312 Views

At the end of your installation, you should have been invited to set up a support account on premier.intel.com. Also, you should be able to go to https://registrationcenter.intel.com to open an account or get updates. This is how you would submit bug reports and feature requests.
gpseek
Beginner
312 Views

Tim,

I assume you are a member of the C/C++ compiler team. I'm happy as long as someone on your team knows about this request.

I've tested a few things with ICL 9.1. It's a good compiler that can beat the MS one in most cases. :-)

However, I'm sure you can make it even better. :-D

TimP
Black Belt
312 Views

No, I'm not on the compiler team, but I do a lot of testing and work with customers. If you want your requests to go forward, someone has to submit them. It may as well be you, so that you get progress reports.
gpseek
Beginner
312 Views

Tim, Thanks again!

I don't have such an account to file the request. I'm still an evaluation user. :( Could you please send this thread to them as a feature request? I think this feature would definitely boost Core 2 processors' SpecInt2000 or similar science benchmark results a little bit. And the implementation is not difficult at all if you consider the fact that they have already implemented _bit_scan_forward, _bit_scan_reverse and even a _popcnt! :-)

JenniferJ
Moderator
312 Views

When you get the eval, it asks whether you'd like free support. If you select "Yes", you'll have an account with Premier Support at "http://www.intel.com/software/products/support", and you can submit issues or feature requests.

About "bitmap & (1 << i) should be compiled as a BT instruction": it's a good suggestion for our future compiler.

About the _bit_test, will the following intrinsics work better for your case? If yes, I'll submit the feature for you.

int _bit_test(int val, int cnt); // returns 0 or 1: the value of the bit in val specified by cnt
int _bit_test_and_set(int *val, int cnt); // returns 0 or 1: the value of the bit in *val specified by cnt. That bit is then set.
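To make the intended semantics of the two proposed signatures concrete, here is a rough portable sketch. The names `sketch_bit_test`, `sketch_bit_test_and_set` and `INT_BITS` are hypothetical stand-ins, not the intrinsics themselves, and the range check on `cnt` is an assumption of this sketch rather than something the proposal specifies.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in an int on this platform (illustrative constant). */
enum { INT_BITS = (int)(sizeof(int) * CHAR_BIT) };

/* Hypothetical stand-in for the proposed _bit_test:
   returns 0 or 1, the value of bit cnt in val. */
static inline int sketch_bit_test(int val, int cnt)
{
    assert(cnt >= 0 && cnt < INT_BITS);   /* assumption: index must be in range */
    return (int)(((unsigned)val >> cnt) & 1u);
}

/* Hypothetical stand-in for the proposed _bit_test_and_set:
   returns the old value (0 or 1) of bit cnt in *val, then sets that bit.
   This version is NOT atomic. */
static inline int sketch_bit_test_and_set(int *val, int cnt)
{
    assert(cnt >= 0 && cnt < INT_BITS);
    unsigned old = (unsigned)*val;
    /* Converting back to int is implementation-defined when the top bit
       is set; common compilers simply keep the bit pattern. */
    *val = (int)(old | (1u << cnt));
    return (int)((old >> cnt) & 1u);
}
```

Calling `sketch_bit_test_and_set` twice on the same bit returns 0 the first time (if the bit was clear) and 1 the second time, which is the behavior described in the comments on the proposed declarations.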

Thanks,

Jennifer

gpseek
Beginner
312 Views

Jennifer, thanks a lot.

int _bit_test(int val, int bit_index) looks much better than the 2nd one, which is Microsoft syntax. _bit_test returns 0 if the specified bit is not set. _bit_test returns non-zero if the bit is set (it does not necessarily have to return 1, because the compiler may actually generate a conditional (CF) jump when it is used in a condition clause).

The _bit_test intrinsic should map to the BT instruction as closely as possible. I think you can safely assume that intrinsics users are at least assembly-aware programmers who know what they are doing. The MS version is quite inefficient: it generated a dummy memory read the last time I checked a piece of 32-bit code produced by MS 8.0.

jimdempseyatthecove
Black Belt
312 Views

Each function has its strengths and weaknesses. On a multi-threaded single-processor system you would use bit_test_and_set; on an SMP system you would use the interlocked version of the intrinsic. Lacking that, you would have to use a critical section or a spinlock, which is much more costly than using a memory temp.
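As an illustration of the interlocked variant, here is a minimal sketch using standard C11 atomics rather than any vendor intrinsic. The function name `atomic_bit_test_and_set` is hypothetical. A compiler may lower `atomic_fetch_or` on x86 to a locked instruction, but the exact code generated is not guaranteed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically sets bit i of *word and returns the
   previous value of that bit (0 or 1). Safe to call from several
   threads on the same word, unlike a plain read-modify-write. */
static inline int atomic_bit_test_and_set(_Atomic uint32_t *word, unsigned i)
{
    assert(i < 32);                               /* shift must stay in range */
    uint32_t mask = UINT32_C(1) << i;
    uint32_t old = atomic_fetch_or(word, mask);   /* returns the value before the OR */
    return (old & mask) != 0;
}
```

A typical use is a one-shot claim flag: the thread that gets 0 back is the one that set the bit first, and every later caller sees 1.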

Jim

gpseek
Beginner
312 Views

Have these intrinsics been implemented in version 10? :-O

JenniferJ
Moderator
312 Views

Thanks for checking back. But sorry. It's not in 10.0. :(

I've sent a note to the engineer.

JenniferJ
Moderator
312 Views

The new intrinsic "_bittest" and some others like "_bittestandset" will be added later this year.

Once they are available, I'll post the news here.

JenniferJ
Moderator
312 Views

I have to add that the new intrinsic "_bittest" and the others will be added later this year, but these intrinsics may not meet your requirements. A better version will be added afterwards, and that will take some more time. Again, I'll post the news here.

gpseek
Beginner
312 Views

Thanks, Jennifer, for the update and the communication. :-)

I like Core, which is much better than NetBurst. :(

The current compilers still carry on the tradition of avoiding certain instructions that are slow on the P4.

Glad to hear we are going to get new intrinsics. Thanks again!

ILevi1
Valued Contributor I
312 Views

How about naming them shorter to save some typing? For example _bt instead of _bittest and _bts instead of _bittestandset which is awkward to type?
gpseek
Beginner
312 Views

That's fine and makes perfect sense. However, if you consider that they already use the names _bit_scan_forward for bsf and _bit_scan_reverse for bsr, the longer names keep things consistent. (H)
ILevi1
Valued Contributor I
312 Views

So you are suggesting that they also rename _mm_move_ps to _move_aligned_packed_single_precision_floating_point for consistency with those longer names?

I would rather introduce short names for _bit_scan_forward and _bit_scan_reverse, and leave the old ones as aliases for compatibility reasons. As you see, consistency can be satisfied both ways.

gpseek
Beginner
312 Views

Definitely not. :-D

I too prefer short names that are the same as their asm counterparts with a leading underscore. The big plus of short names is that you can remember them easily because you already know the asm instructions. So you have a really good idea: introduce short names for existing awkward long names like _bit_scan_forward and still keep consistency. Over time, the long ones would become deprecated.

_bsf makes more sense to me.

gpseek
Beginner
312 Views

Bump :-)

Any update on this?

Thanks!

JenniferJ
Moderator
169 Views

Sorry, not yet in the product. I'll keep pressing.