Intel® oneAPI Threading Building Blocks

Intel TBB 4.4 U2 is out!

Vladimir_P_1234567890

I'm glad to announce that Intel TBB 4.4 U2 is out! 

Intel TBB 4.4 Update 2

TBB_INTERFACE_VERSION == 9002

Changes (w.r.t. Intel TBB 4.4 Update 1):

- Improved interoperability with Intel(R) OpenMP RTL (libiomp) on Linux:
    OpenMP affinity settings do not affect the default number of threads
    used in the task scheduler. Intel(R) C++ Compiler 16.0 Update 1
    or later is required.
- Added a new flow graph example with different implementations of the
    Cholesky Factorization algorithm.

Preview Features:

- Added template class opencl_node to the flow graph API. It allows a
    flow graph to offload computations to OpenCL* devices.
- Extended join_node to use type-specified message keys. It simplifies
    the API of the node by obtaining message keys via functions
    associated with the message type (instead of node ports).
- Added static_partitioner that minimizes overhead of parallel_for and
    parallel_reduce for well-balanced workloads.
- Improved template class async_node in the flow graph API to support
    user settable concurrency limits.

Bugs fixed:

- Fixed a possible crash in the GUI layer for library examples on Linux.

----------------------------

I'm glad to announce that Intel TBB 4.4 U1 is out! 

Intel TBB 4.4 Update 1
TBB_INTERFACE_VERSION == 9001

Changes (w.r.t. Intel TBB 4.4):

- Added support for Microsoft* Visual Studio* 2015.
- Intel TBB no longer performs dynamic replacement of memory allocation
    functions for Microsoft Visual Studio 2005 and earlier versions.
- For GCC 4.7 and higher, the intrinsics-based platform isolation layer
    uses __atomic_* built-ins instead of the legacy __sync_* ones.
    This change is inspired by a contribution from Mathieu Malaterre.
- Improvements in task_arena:
    Several application threads may join a task_arena and execute tasks
    simultaneously. The amount of concurrency reserved for application
    threads at task_arena construction can be set to any value between
    0 and the arena concurrency limit.
- The fractal example was modified to demonstrate class task_arena
    and moved to examples/task_arena/fractal.
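
The difference between the legacy __sync_* and the newer __atomic_* built-ins mentioned above can be sketched as follows. This assumes GCC 4.7 or later (or Clang); the wrapper names are hypothetical and are not TBB's internal functions. The practical difference is that the legacy built-ins always act as full barriers, while the newer ones take an explicit memory order.

```cpp
#include <cassert>

// Legacy style: __sync_* built-ins always imply a full memory barrier.
inline long fetch_add_legacy(long* p, long v) {
    return __sync_fetch_and_add(p, v);
}

// Newer style (GCC 4.7+): the memory order is stated explicitly.
inline long fetch_add_relaxed(long* p, long v) {
    return __atomic_fetch_add(p, v, __ATOMIC_RELAXED);
}

inline long load_acquire(const long* p) {
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

inline void store_release(long* p, long v) {
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
}

// Small single-threaded demonstration (hypothetical helper).
inline long demo_counter() {
    long counter = 0;
    fetch_add_legacy(&counter, 1);   // returns the old value, 0
    fetch_add_relaxed(&counter, 1);  // returns the old value, 1
    assert(load_acquire(&counter) == 2);
    return load_acquire(&counter);
}
```

Which memory orders TBB itself selects is not stated in the release notes; the orders used here are only examples.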

Bugs fixed:

- Fixed a deadlock during destruction of task_scheduler_init objects
    when one of the destructors is set to wait for worker threads.
- Added a workaround for a possible crash on OS X* when dynamic memory
    allocator replacement (libtbbmalloc_proxy) is used and memory is
    released during application startup.
- Usage of mutable functors with task_group::run_and_wait() and
    task_arena::enqueue() is disabled. An attempt to pass a functor
    whose operator()() is not const will produce compilation errors.
- Makefiles and environment scripts now properly recognize GCC 5.0 and
    higher.
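
To illustrate the mutable-functor restriction above: a functor is acceptable only if it can be invoked through a const reference. The sketch below checks that property at compile time with a hand-written trait. The names callable_as_const, mutable_counter, shared_counter and invoke_const are hypothetical and exist only for this illustration; none of them is part of TBB.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `const F&` can be called with no arguments.
template <typename F>
struct callable_as_const {
private:
    template <typename G>
    static auto check(int)
        -> decltype(std::declval<const G&>()(), std::true_type());
    template <typename G>
    static std::false_type check(...);
public:
    static constexpr bool value = decltype(check<F>(0))::value;
};

// Non-const operator(): the kind of functor the release note says is rejected.
struct mutable_counter {
    int n = 0;
    void operator()() { ++n; }
};

// Const operator(): the state lives outside the functor, so the call is const.
struct shared_counter {
    int* n;
    void operator()() const { ++*n; }
};

static_assert(!callable_as_const<mutable_counter>::value,
              "mutable functor cannot be called through a const reference");
static_assert(callable_as_const<shared_counter>::value,
              "const functor can be called through a const reference");

// Invokes the functor through a const reference and returns the shared count.
inline int invoke_const(const shared_counter& f) {
    assert(f.n != nullptr);
    f();
    return *f.n;
}
```

A common way to adapt a mutable functor is the one shown by shared_counter: keep the state outside the functor and capture a pointer or reference to it.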

Open-source contributions integrated:

- Improved performance of parallel_for_each for inputs allowing random
    access, by Raf Schietekat.

----

I'm glad to announce that Intel TBB 4.4 is out! Check out our nest to find new licensing options available for you!

https://software.intel.com/sites/campaigns/nest/

More info:

Intel TBB 4.4
TBB_INTERFACE_VERSION == 9000

- The following features are now fully supported:
    tbb::flow::composite_node;
    additional policies of tbb::flow::graph_node::reset().
- Platform abstraction layer for Windows* OS updated to use compiler
    intrinsics for most atomic operations.
- The tbb/compat/thread header updated to automatically include
    C++11 <thread> where available.
- Fixes and refactoring in the task scheduler and class task_arena.
- Added key_matching policy to tbb::flow::join_node, which removes
    the restriction on the type that can be compared-against.
- For tag_matching join_node, tag_value is redefined to be 64 bits
    wide on all architectures.
- Expanded the documentation for the flow graph with details about
    node semantics and behavior.
- Added dynamic replacement of C11 standard function aligned_alloc()
    under Linux* OS.
- Added C++11 move constructors and assignment operators to
    tbb::enumerable_thread_specific container.
- Added hashing support for tbb::tbb_thread::id.
- On OS X*, binaries that depend on libstdc++ are not provided anymore.
    In the makefiles, libc++ is now used by default; for building with
    libstdc++, specify stdlib=libstdc++ in the make command line.
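
For context on the aligned_alloc() item above, here is a minimal sketch of the kind of call that the allocator replacement would intercept on Linux when the proxy library is in use. It assumes a C library (such as glibc) that declares aligned_alloc in <stdlib.h>; the helper name alloc_cache_aligned and the 64-byte alignment are illustrative choices, not anything prescribed by TBB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical helper: allocate at least `bytes` bytes aligned to 64 bytes.
inline void* alloc_cache_aligned(std::size_t bytes) {
    const std::size_t alignment = 64;
    // C11 as published asks for the size to be a multiple of the alignment,
    // so round the request up.
    const std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    void* p = aligned_alloc(alignment, rounded);
    // On success the returned address must honor the requested alignment.
    assert(p == nullptr ||
           reinterpret_cast<std::uintptr_t>(p) % alignment == 0);
    return p;  // release with free()
}
```

The calling code does not change when the replacement is active; only the allocator that services the call differs.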

Preview Features:

- Added a new example, graph/fgbzip2, that shows usage of
    tbb::flow::async_node.
- Modification to the low-level API for memory pools:
    added a function for finding a memory pool by an object allocated
    from that pool.
- tbb::memory_pool now does not request memory till the first allocation
    from the pool.

Changes affecting backward compatibility:

- Internal layout of flow graph nodes has changed; recompilation is
    recommended for all binaries that use the flow graph.
- Resetting a tbb::flow::source_node will immediately activate it,
    unless it was created in inactive state.

Bugs fixed:

- Failure at creation of a memory pool will not cause process
    termination anymore.

Open-source contributions integrated:

- Supported building TBB with Clang on AArch64 with use of built-in
    intrinsics by David A.

--Vladimir

e4lam

In tbbmalloc/proxy.cpp, I see this line commented out:

//    __TBB_ORIG_ALLOCATOR_REPLACEMENT_CALL_RELEASE(ucrtbase)

Does this mean that tbbmalloc_proxy doesn't support Visual Studio 2015?

 

Vladimir_P_1234567890

We did not have a chance to test this on the RTM build of Visual Studio 2015, so this replacement was disabled in TBB 4.4 gold. It is enabled again in the release branch. For example, it is available for testing in the tbb44_20150928oss development release.

--Vladimir

Shayan_Y_

Is there any documentation for static_partitioner?

Marc_R_5

Hello, 

There seems to be a problem with set_async_gateway(...). If the include file is included multiple times (in more than one source file), the linker reports:

multiple definition of `tbb::flow::interface8::internal::set_async_gateway(...) {}'

Thanks,

Marc

 

Alexandr_K_Intel1

Yes, this is a bug. You can add inline to

void set_async_gateway(...) { }

in _flow_graph_impl.h to fix it. Sorry for the problem.

Marc_R_5

Hello, 

It looks like tbb::flow::split_node does not work when TBB_PREVIEW_FLOW_GRAPH_NODES is defined.

Just compiling the sample from the split_node documentation with this define fails:

#define TBB_PREVIEW_FLOW_GRAPH_NODES  1
#include "tbb/flow_graph.h"
using namespace tbb::flow;

int main() {
    graph g;

    queue_node<int> first_queue(g);
    queue_node<int> second_queue(g);
    split_node< tbb::flow::tuple<int, int> > my_split_node(g);
    output_port<0>(my_split_node).register_successor(first_queue);
    make_edge(output_port<1>(my_split_node), second_queue);

    for (int i = 0; i < 1000; ++i) {
        tuple<int, int> my_tuple(2 * i, 2 * i + 1);
        my_split_node.try_put(my_tuple);
    }
    g.wait_for_all();
}

Alexei_K_Intel
Employee

Hi Marc,

I tried your example with the Microsoft compiler (Visual Studio 2013) and the Intel compiler (Intel Parallel Studio XE 2015), and it compiled successfully with both. Have you already solved the issue? If not, which compiler and options do you use to compile the example?

Alexei_K_Intel
Employee

Hi,

In this thread, two issues were reported. The first one is the "set_async_gateway" related issue and it was confirmed by Alexandr. However, I failed to reproduce the second issue related to "the sample from split_node documentation when TBB_PREVIEW_FLOW_GRAPH_NODES is defined."

Your description seems to be related to the first one, and it is indeed not yet fixed in the latest TBB.

 

paul_o_1

Alexandr Konovalov (Intel) wrote:

Yes, this is a bug. You can add inline to

void set_async_gateway(...) { }

in _flow_graph_impl.h to fix it. Sorry for the problem.

It seems that this issue still exists in the latest TBB with VS2015 (I have not tried other compilers). I also looked into _flow_graph_impl.h, and it contains the definition you refer to. I am using the previous version of async_node in conjunction with composite_node, which compiles and works well, but I would like to upgrade to the latest version; this issue is preventing that. I have a simple test program that replicates the problem.

#if __TBB_PREVIEW_ASYNC_NODE
template< typename T, typename = typename T::async_gateway_type >
void set_async_gateway(T *body, void *g) {
    body->set_async_gateway(static_cast<typename T::async_gateway_type *>(g));
}

void set_async_gateway(...) { }
#endif

 

Yes, I tried what you recommended and it does not fix the problem. I have attached that file. I still get the compiler error messages. Is there another workaround?

Alexei_K_Intel
Employee

Oh, I missed the idea of your first message.

Alexandr suggested adding the "inline" keyword to the "void set_async_gateway(...) {}" function, i.e. replacing the line

void set_async_gateway(...) {}

with

inline void set_async_gateway(...) {}

You just need to add "inline".
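
Why "inline" helps can be sketched as follows. A non-template function defined in a header gets one definition in every source file that includes the header, and the linker then reports a multiple definition; marking it inline permits identical definitions in several translation units. The code below mirrors the overload pair from this thread inside a hypothetical namespace "sketch" with made-up body types. It illustrates the pattern only and is not the actual contents of _flow_graph_impl.h.

```cpp
#include <cassert>

namespace sketch {  // hypothetical stand-in for a header-only helper

struct gateway {};

// A body type that advertises a gateway type, so the template overload applies.
struct body_with_gateway {
    typedef gateway async_gateway_type;
    async_gateway_type* gw = nullptr;
    void set_async_gateway(async_gateway_type* g) { gw = g; }
};

// A body type without async_gateway_type: only the fallback is viable.
struct plain_body {};

// Function templates may be defined in headers without breaking the
// one-definition rule, so this overload needs no extra keyword.
template <typename T, typename = typename T::async_gateway_type>
void set_async_gateway(T* body, void* g) {
    body->set_async_gateway(static_cast<typename T::async_gateway_type*>(g));
}

// Non-template fallback: without "inline", each source file that includes
// the header would emit its own external definition of this function.
inline void set_async_gateway(...) {}

// Exercises both overloads; returns true if the gateway was installed.
inline bool demo_dispatch() {
    gateway g;
    body_with_gateway with;
    plain_body without;
    set_async_gateway(&with, &g);     // picks the template overload
    set_async_gateway(&without, &g);  // picks the fallback, which does nothing
    assert(with.gw == &g);
    return with.gw == &g;
}

}  // namespace sketch
```

A single file cannot reproduce the linker error itself; that requires including the header from at least two source files that are linked together.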

 

paul_o_1

Alex Katranov (Intel) wrote:

Oh, I missed the idea of your first message.

Alexandr suggested adding the "inline" keyword to the "void set_async_gateway(...) {}" function, i.e. replacing the line

void set_async_gateway(...) {}

with

inline void set_async_gateway(...) {}

You just need to add "inline".

 

Perfect.  Thank you for the clarification. 

Steve_H_3

I would like to suggest that the subdirectories of bin/ia32 be named using the $(PlatformToolset) value from Visual Studio. Currently, these subdirectories are named:

vc10, vc11, vc11_ui, vc12, vc12_ui, and vc14.

These names do not correspond to any version value in Visual Studio. As a result, developers cannot simply specify a macro like $(PlatformToolset) in a Visual Studio project and automatically pick up the correct library. The developer must either rename the subdirectories after unzipping or hard-code the appropriate subdirectory name in the project. Either way requires manual intervention, which breaks automated updates and builds.

 

 
