Intel® oneAPI Threading Building Blocks

How do I add an overloaded function to the following code?


Hello Friends,

I am trying to develop programs with Intel Threading Building Blocks. I found a sample program that shows how to use parallel_reduce.


#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cctype>
#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/tick_count.h"
#include <iostream>
using namespace tbb;

// Left over from the original sample; auto_partitioner is passed explicitly below.
#define AUTO_GRAIN

// Reduction body: accumulates a sub-range of floats into a double.
struct Sum {
    double value;
    Sum() : value(0) {}
    // Splitting constructor: a new body starts from an empty partial sum.
    Sum( Sum& s, split ) {value = 0;}
    void operator()( const blocked_range<float*>& range ) {
        double temp = value;
        for( float* a=range.begin(); a!=range.end(); ++a ) {
            temp += *a;
        }
        value = temp;
    }
    // Merge the partial sum computed by another body.
    void join( Sum& rhs ) {value += rhs.value;}
};

class CPU {
public:
    double ParallelSum( float array[], size_t n );
    // Declared only; this is the overload the question is about.
    double ParallelSum( double array[], size_t n );
};

double CPU :: ParallelSum( float array[], size_t n ) {
    Sum total;
    parallel_reduce( blocked_range<float*>( array, array+n ),
                     total, auto_partitioner() );
    return total.value;
}

//! Problem size
const int N = 600000000;

//! Number of threads to use.
static int MaxThreads = 8;

//! If true, print out bits of the array
static bool VerboseFlag = false;

//! Fill the input, run the parallel sum once, and report the time.
int main( int argc, char* argv[] ) {
    CPU c;
    float *input=(float *)malloc(sizeof(float)*N);
    if( !input ) return 1;   // the array needs about 2.4 GB
    for(int i=0;i<N;i++)
        input[i] = 1.0f;     // placeholder fill; the original loop body is missing from the post
    task_scheduler_init init(MaxThreads);

    tick_count t0 = tick_count::now();
    double output = c.ParallelSum(input, N);
    tick_count t1 = tick_count::now();

    printf("Output: %10.2lf\n", output);
    printf("%2d threads: %.3f msec\n", MaxThreads, (t1-t0).seconds()*1000);
    free(input);
    return 0;
}


But suppose I want to perform the same operation on a different data type, as with function overloading. How can I achieve this with this code without breaking its optimization capability?

0 Kudos
5 Replies
Valued Contributor II

The struct Sum above is a bit of a hodge-podge, combining both doubles and floats in its implementing code. It would be very hard to use this common class from both the float-based and double-based implementations of ParallelSum above. If you're just playing around, you might try turning the array element type into a template parameter in the definition of Sum, then define ParallelSum in terms of that template parameter and instantiate the template for each ParallelSum type you want to create. That would at least clean up some of the type mismatches that inhabit this code.
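A minimal sketch of that suggestion follows. To keep it compilable without TBB, it uses stand-ins that are assumptions and not part of the library: `split_tag` plays the role of `tbb::split`, `PtrRange<T>` the role of `tbb::blocked_range<const T*>`, and `reduce_sketch` is a hypothetical serial driver that only mimics the split/join protocol of `parallel_reduce`. The names `SumBody` and the `Acc` accumulator parameter are also illustrative.

```cpp
#include <cstddef>

// Stand-in for tbb::split (assumption, so the sketch builds without TBB).
struct split_tag {};

// Stand-in for tbb::blocked_range<const T*>: a half-open pointer range.
template <typename T>
struct PtrRange {
    const T* first;
    const T* last;
    const T* begin() const { return first; }
    const T* end() const { return last; }
};

// Reduction body with the element type T as a template parameter.
// Acc is the accumulator type; double matches the original Sum.
template <typename T, typename Acc = double>
struct SumBody {
    Acc value;
    SumBody() : value(0) {}
    SumBody(SumBody&, split_tag) : value(0) {}  // splitting constructor
    void operator()(const PtrRange<T>& r) {
        Acc temp = value;
        for (const T* a = r.begin(); a != r.end(); ++a) temp += *a;
        value = temp;
    }
    void join(SumBody& rhs) { value += rhs.value; }
};

// Hypothetical serial driver: halves the range down to `grain` elements,
// gives the right half a split-constructed body, then joins the results.
template <typename T, typename Body>
void reduce_sketch(const PtrRange<T>& r, Body& body, std::size_t grain = 4) {
    const std::size_t n = static_cast<std::size_t>(r.last - r.first);
    if (n <= grain) { body(r); return; }
    const T* mid = r.first + n / 2;
    Body right(body, split_tag{});
    reduce_sketch(PtrRange<T>{r.first, mid}, body, grain);
    reduce_sketch(PtrRange<T>{mid, r.last}, right, grain);
    body.join(right);
}

// One definition serves float, double, int, and so on.
template <typename T>
double ParallelSum(const T* array, std::size_t n) {
    SumBody<T> total;
    reduce_sketch(PtrRange<T>{array, array + n}, total);
    return total.value;
}
```

Calling `ParallelSum(input, n)` with a `float*` or a `double*` instantiates a separate, fully typed version for each. In a real TBB build the intended mapping would be to swap the three stand-ins for `tbb::split`, `tbb::blocked_range<const T*>`, and `tbb::parallel_reduce`; that substitution is not verified here.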

0 Kudos

Hello Robert,

I have written this struct, which performs a parallel sum over different data types such as int, long, float, and double. But it requires a different variable to store the result for each data type's array. I do not want to declare a separate variable for the result of each data type's array, because I will use only one type of array at a time, so the other result variables go unused. I think that is a waste of memory in my program.

How can I solve this? Please help me.

Thank you for your response yesterday.

0 Kudos
Valued Contributor III

Robert's suggestion addresses exactly that: use a class template (instead of overloading).
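To illustrate the memory point, here is a small sketch. `FourResults` is a hypothetical reconstruction of the struct described in the previous post (the actual struct was not shown), and `SumOf` is an illustrative name. With a class template, each instantiation carries exactly one result member of the type it is used with.

```cpp
#include <cstddef>

// Hypothetical version of the approach described in the question:
// one result field per supported element type, most of them unused.
struct FourResults {
    int    int_result;
    long   long_result;
    float  float_result;
    double double_result;
};

// Class template: a single result member whose type follows T.
template <typename T>
struct SumOf {
    T value;
    SumOf() : value(0) {}
    void add(const T* a, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) value += a[i];
    }
};

// Each instantiation stores only its own result.
static_assert(sizeof(SumOf<float>) < sizeof(FourResults),
              "SumOf<float> is smaller than the four-field struct");
static_assert(sizeof(SumOf<double>) < sizeof(FourResults),
              "SumOf<double> is smaller than the four-field struct");
```

Only the instantiations a program actually uses are generated, so a program that sums only float arrays never contains a `SumOf<double>` at all.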

0 Kudos

Does function overloading affect performance?

What are the drawbacks of using function overloading instead of a class template? Or, what are the advantages of class templates?

0 Kudos
Valued Contributor III

That's probably not in the scope of this forum. There should be lots of other resources to learn about class templates like the ones you see in the C++ Standard Template Library: a vector<int> is for integers, a vector<float> is for floating-point numbers, etc. It's one of the concepts you should understand as a prerequisite to use TBB.
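As a rough illustration of how the two relate, the sketch below uses hypothetical names (`sum_impl`, `sum`). Both overload resolution and template instantiation happen at compile time, so neither adds a runtime dispatch step; the practical difference is that overloads repeat the function once per type, while a template states the code once.

```cpp
#include <cstddef>
#include <vector>

// Function template: one definition, instantiated per element type.
template <typename T>
double sum_impl(const T* a, std::size_t n) {
    double total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}

// Overloads: the compiler picks one from the argument type.
// Here they are thin wrappers, so the loop is still written only once.
double sum(const float* a, std::size_t n)  { return sum_impl(a, n); }
double sum(const double* a, std::size_t n) { return sum_impl(a, n); }

// Container-based variant in the spirit of vector<int> / vector<float>.
template <typename T>
double sum(const std::vector<T>& v) { return sum_impl(v.data(), v.size()); }
```

A call such as `sum(v)` with a `std::vector<int>` works without anyone having written an `int` overload, which is the main convenience of the template form.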

0 Kudos