Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.
2465 Discussions

Noisy compile warning (error?) in enumerable_thread_specific.h

Andrei_Vermel
Beginner
696 Views
In tbb22_009oss, MSVC2009, instantiation of enumerable_thread_specific causes

1>x:\3rdParty\inc3rd\tbb/enumerable_thread_specific.h(478) : warning C4200: nonstandard extension used : zero-sized array in struct/union
1> Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array
1> x:\3rdparty\inc3rd\tbb\tbb_allocator.h(113) : see reference to class template instantiation 'tbb::enumerable_thread_specific::padded_element' being compiled
1> with
....

The problem line is
char padding[ ( (sizeof(U) - 1) / internal::NFS_MaxLineSize + 1 ) * internal::NFS_MaxLineSize - sizeof(U) ];

And the template argument U, which is my template argument to enumerable_thread_specific happens to have sizeof == 128. The expression in brackets becomes 0 and the compiler gets annoyed.
0 Kudos
8 Replies
RafSchietekat
Valued Contributor III
696 Views

I haven't looked at the specifics, but it should be easy enough to substitute a bitfield 8 times the size. 8 times 0 is of course still 0, but, for a bitfield, that is a valid size.

(Added 2009-12-14) Some commas.

0 Kudos
Andrei_Vermel
Beginner
696 Views
Sorry, this was of course MSVC2008 (SP1), not MSVC2009.
0 Kudos
ARCH_R_Intel
Employee
696 Views
Thanks for reporting the issue. I'll post a patch when I have it ready.
0 Kudos
ARCH_R_Intel
Employee
696 Views
Attached is a patch for review, generated by "svn diff". It's against our current sources, but I don't expect any problem with applying it to earlier sources. Here is what the patch utility reported when I applied the diff to the tbb22_009oss sources:
[cpp]patching file enumerable_thread_specific.h
Hunk #1 succeeded at 458 (offset 8 lines).
Hunk #3 succeeded at 506 (offset 8 lines).[/cpp]
It would have been miore elegant to write the primary template as:
[cpp]stemplate
struct ets_element {[/cpp]

so that the modulo stuff would not be exposed at the point of use of ets_element, but alas the VS.NET 2003 compiler choked on that variation.

- Arch

0 Kudos
RafSchietekat
Valued Contributor III
696 Views
Why waste a full NFS_MaxLineSize where an empty bitfield suffices?

Also, why align at 128 instead of 64?
0 Kudos
ARCH_R_Intel
Employee
696 Views
Quoting - Raf Schietekat
Why waste a full NFS_MaxLineSize where an empty bitfield suffices?

Also, why align at 128 instead of 64?

An empty bitfield "I :0" aligns the next field to the natural alignment of integral type I. In enumerable_thread_specific, the alignment is being done to avoid false sharing, and thus needs to be larger than what an empty bitfield can accomplish.

You are right that on modern IA-32 hardware, the cache line size is 64. The value 128 is a legacy of the Intel Netburst Microarchitecture (e.g. Intel Pentium D) where 64-byte lines are paired into 128-byte sectors. When a line in a sector is fetched, the hardware automatically fetches the other line in the sector too. So from a false sharing perspective, the effective line size is 128 bytes on the Netburst processors. More on this obscure bit of history can be found by looking for "sector" in http://www.intel.com/Assets/PDF/manual/248966.pdf .
0 Kudos
RafSchietekat
Valued Contributor III
696 Views
"An empty bitfield "I :0" aligns the next field to the natural alignment of integral type I. In enumerable_thread_specific, the alignment is being done to avoid false sharing, and thus needs to be larger than what an empty bitfield can accomplish."
As suggested in #1 the bitfield would only be empty where #4 would add a full 128 bytes; otherwise, it would be exactly the same number of bytes long.

"So from a false sharing perspective, the effective line size is 128 bytes on the Netburst processors."
So they're the ones whomess things up for all the others! Hmm... who has an idea how to easily adapt at run time?
0 Kudos
Alexey-Kukanov
Employee
696 Views
Quoting - Raf Schietekat
"So from a false sharing perspective, the effective line size is 128 bytes on the Netburst processors."
So they're the ones whomess things up for all the others! Hmm... who has an idea how to easily adapt at run time?

Might be today we could just make Netburst users to pay the tax, instead of all others.
I'm afraid however that changing that constant to 64 can break backward binary compatibility in weird ways.

0 Kudos
Reply