Question on tbb::concurrent_queue - Page 2

ksgokul · ‎11-16-2010

Hi,

I have a requirement wherein a thread has a lot of requests (in the hundreds) to enqueue. I was wondering whether I could form an array or something similar and enqueue it directly; that should be faster than touching the shared resource again and again. But I can't find an API like that in the reference documentation. Even if I enqueue the requests as an array, they should still be dequeued element by element.

Can someone explain what would be the best approach in solving a problem like this?

Thanks in advance,

Gokul.
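
To make the batching idea in the question concrete, here is a minimal sketch. It is not the tbb::concurrent_queue API: `BatchQueue`, `push_batch` and `example_usage` are hypothetical names made up for illustration, and the queue is a plain mutex-protected std::deque, assumed here only because it keeps the example self-contained. It shows the shape being asked for: the producer touches the shared resource once per batch, while consumers still dequeue one element at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <numeric>
#include <vector>

// Hypothetical helper, NOT part of TBB: a mutex-protected FIFO queue
// whose push_batch() pays for synchronization once per batch.
template <typename T>
class BatchQueue {
public:
    // Enqueue one element; takes the lock once per call.
    void push(const T& value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(value);
    }

    // Enqueue a whole range under a single lock acquisition.
    template <typename InputIt>
    void push_batch(InputIt first, InputIt last) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.insert(items_.end(), first, last);
    }

    // Dequeue one element; returns false if the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<T> items_;
};

// Example: 300 requests are enqueued with one lock acquisition,
// then drained element by element in FIFO order.
inline std::vector<int> example_usage() {
    std::vector<int> requests(300);
    std::iota(requests.begin(), requests.end(), 0);  // 0, 1, ..., 299

    BatchQueue<int> queue;
    queue.push_batch(requests.begin(), requests.end());
    assert(queue.size() == requests.size());

    std::vector<int> drained;
    int item = 0;
    while (queue.try_pop(item)) drained.push_back(item);
    return drained;
}
```

With a concurrent queue that only offers per-element insertion, a similar effect can be had by enqueueing a whole std::vector of requests as a single element and letting the consumer iterate over it, at the cost of coarser-grained work distribution among consumers.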

aminer10 · ‎11-21-2010

Intel: 1,000-core Processor Possible:

http://www.pcworld.com/article/211238/intel_1000core_processor_possible.html

Sincerely,
Amine Moulay Ramdane.

aminer10 · ‎11-21-2010

Dmitry wrote:
>That's pretty cool!

>Unfortunately, I can't say the same about my code, I don't have a 100-core machine for tests...

Even if every thread or process that uses my Parallel Sort Library or Parallel
Compression Library etc. will not scale beyond 16 cores, that's still fun!

You have to *UTILIZE* your hundred or thousand cores, so run as many of your
processes or threads as possible on your system, even if they don't scale very well!

That's still cool ...

Sincerely,
Amine Moulay Ramdane.

aminer10 · ‎11-21-2010

I wrote:
>Even if my lock-free ParallelQueue - that I am using inside my Object Pascal
>Thread Pool Engine - doesn't scale beyond 100 cores or 200 cores ...
[..]
>For example, a parallel thread that uses my
>Parallel Sort Library will scale to 100 cores and another thread or
>process that uses my Parallel Compression Library will scale to 100 cores etc. etc.
>[...]

But you know very well, Dmitriy, that it was just a hypothetical example that I gave;
it is not the reality yet ...

Look for example at the following application; it does not scale beyond 8 or 10 cores
on a 32-core system:
http://software.intel.com/en-us/blogs/2010/11/18/benchmarks-of-a-haskell-model-checking-application-running-on-intels-manycore-testing-lab-2/

But as I said before, even if every thread or process that uses my
Parallel Sort Library or Parallel Compression Library etc. will not scale
beyond 16 cores, that's still fun!

You have to *UTILIZE* your hundred or thousand cores, so run as many of your
processes or threads as possible on your future 100- or 1000-core system, even if they
don't scale very well!

That's still cool ...

It's good to make it scale 'better', but, in reality, perfect scaling is much harder to realize...

Sincerely,
Amine Moulay Ramdane.

Dmitry_Vyukov · ‎11-21-2010

> Look for example at the following application; it does not scale beyond 8 or 10 cores
> on a 32-core system

Yeah, I saw this one. The funny thing is that the type of simulation they do should easily scale to 32 cores.

> It's good to make it scale 'better', but, in reality, perfect scaling is much harder to realize...

The problem is not sub-linear scalability; the real problem is negative scalability. Check out the graphs in the article you pointed to: what number of threads would you choose for the program, 32 or 16?

aminer10 · ‎11-21-2010

Dmitry:
>The problem is not sub-linear scalability, the real problem is negative scalability.
>Check out graphs in the article you pointed to, what number of threads would you
>choose for the program - 32 or 16?

You are right...

And they didn't say on the webpage what factor limits the scalability and even makes it negative?

Do you have any idea about this simulation program, Dmitriy?

Sincerely,
Amine.

Dmitry_Vyukov · ‎11-21-2010

There are a lot of possible problems. I don't know how they parallelize it, and don't know details of Haskell scheduler and runtime, so I have no idea wrt negative scalability.

aminer10 · ‎11-21-2010

Dmitry wrote:
>There are a lot of possible problems. I don't know how they parallelize it,
>and don't know details of Haskell scheduler and runtime, so I have no idea
>wrt negative scalability

I understand.

They have not talked about it, and have not explained to us what
was happening exactly inside their code...

Sincerely,
Amine.

ksgokul · ‎11-21-2010

>> In the lock-free array-based queue there is *NO* CONTENTION on the locks
>> inside a freelist or a memory manager.

Can you point me to some article on this? So does that mean it wouldn't matter if I do the batch inserts as multiple single inserts?

Thanks,

Gokul.