<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic: Parallel Compression 1.0 in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855382#M2068</link>
    <description>&lt;BR /&gt;&lt;BR /&gt;Hello again,&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Now, as I have stated before:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;[3] If there is LESS contention THEN the algorithm will &lt;/P&gt;&lt;P&gt;scale better. This is because S (the serial part) &lt;/P&gt;&lt;P&gt;becomes smaller with less contention, and as N becomes bigger, &lt;/P&gt;&lt;P&gt;the result - the speed of the program/algorithm - of &lt;/P&gt;&lt;P&gt;Amdahl's equation 1/(S+(P/N)) becomes bigger.&lt;/P&gt;&lt;BR /&gt;And, as you have noticed, I have followed this theorem [3] when &lt;BR /&gt;I constructed my Thread Pool Engine etc.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Now there is another theorem that I can state like this:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;[4] You have latency and bandwidth, so IF you use &lt;BR /&gt;one or both of them - latency and bandwidth - efficiently, &lt;BR /&gt;your algorithm will be more efficient.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;It is why you must not start too many threads in my &lt;BR /&gt;Thread Pool Engine, so that you will not context switch a lot, &lt;BR /&gt;because, when you context switch a lot, the latency will grow and &lt;BR /&gt;this is not good for efficiency.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;You have to be smart.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;And as I have stated before: &lt;/P&gt;&lt;P&gt;IF you follow and base your reasoning on those theorems &lt;BR /&gt;- or laws or true propositions or good patterns, like theorems [1], [2], [3], [4]... - &lt;BR /&gt;THEN you will construct a model that will be much more CORRECT &lt;BR /&gt;and EFFICIENT. &lt;/P&gt;&lt;BR /&gt;&lt;BR /&gt;Take care...&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Sat, 03 Apr 2010 21:26:58 GMT</pubDate>
    <dc:creator>aminer10</dc:creator>
    <dc:date>2010-04-03T21:26:58Z</dc:date>
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855375#M2061</link>
      <description>&lt;P&gt;&lt;BR /&gt;Skybuck wrote:&lt;BR /&gt;&amp;gt; What if people wanna roll there own versions ? ;) &lt;BR /&gt;&amp;gt; They would much better be "served" by algorithms/pseudo &lt;BR /&gt;&amp;gt; code than real code which could be system/language specific ;)&lt;/P&gt;&lt;P&gt;It's easy to EXTRACT algorithms from Object Pascal code...&lt;/P&gt;&lt;P&gt;Look for example inside pbzip.pas: I am using this in the &lt;BR /&gt;main body of my program: &lt;/P&gt;&lt;P&gt;name:='msvcr100.dll'; &lt;/P&gt;&lt;P&gt;It's the 'test' file that I am using - it's also inside the &lt;BR /&gt;zip file. Once you compile and execute pbzip.pas it &lt;BR /&gt;will generate a file msvcr100.dll.bz. And as you have &lt;BR /&gt;noticed, I am using a - portable - compound filesystem; &lt;BR /&gt;look at ParallelStructuredStorage.pas inside the zip file.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;After that I am opening it with:&lt;/P&gt;&lt;P&gt;fstream1:=TFileStream.create(name, fmOpenReadWrite);&lt;/P&gt;&lt;P&gt;and I am reading chunks of streams and 'distributing' them &lt;BR /&gt;to my Thread Pool Engine to be compressed - in parallel - &lt;BR /&gt;by the myobj.BZipcompress method; look at:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;for i:=0 to e&lt;BR /&gt;do &lt;BR /&gt; begin&lt;BR /&gt; &lt;BR /&gt; if (i=e) and (r=0) then break; &lt;BR /&gt; stream1:=TMemoryStream.create;&lt;BR /&gt; if (r &amp;gt; 0) and (i=e) &lt;BR /&gt; then stream1.copyfrom(fstream1,r)&lt;BR /&gt; else stream1.copyfrom(fstream1,d);&lt;BR /&gt; stream1.position:=0;&lt;BR /&gt; obj:=TJob.create;&lt;BR /&gt; obj.stream:=stream1;&lt;BR /&gt; obj.directory:=directory;&lt;BR /&gt; obj.compressionlevel:=9;&lt;BR /&gt; obj.streamindex:=inttostr(i);&lt;BR /&gt; obj.r:=r;&lt;BR /&gt; obj.number:=e; &lt;BR /&gt; &lt;BR /&gt; TP.execute(myobj.BZipcompress,pointer(obj));&lt;/P&gt;&lt;P&gt; end;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I am doing the same thing in pzlib.pas...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;A href="http://pages.videotron.com/aminer/"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Apr 2010 07:17:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855375#M2061</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-02T07:17:44Z</dc:date>
    </item>
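The pbzip.pas loop above reads fixed-size chunks of the input file and hands each one to the Thread Pool Engine for parallel bzip compression. The same chunk-and-compress idea can be sketched in Python with the standard bz2 and concurrent.futures modules; this is only a minimal stand-in (the chunk size and function names are illustrative, not from the original code):

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

def parallel_bzip_compress(data: bytes, chunk_size: int = 65536) -> list:
    """Split data into chunks and compress each chunk as an independent job.

    Mirrors the pbzip.pas loop: each chunk is one job for the pool,
    and map() preserves chunk order, like obj.streamindex does there.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda c: bz2.compress(c, 9), chunks))

def parallel_bzip_decompress(blocks: list) -> bytes:
    """Decompress the per-chunk blocks in parallel and reassemble in order."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(bz2.decompress, blocks))
```

Because each chunk is compressed independently, the last (possibly short) chunk needs no special case here, unlike the `r`/`d` handling in the Pascal loop.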
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855376#M2062</link>
      <description>&lt;SPAN style="font-size: x-small;"&gt;&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;Hello again,&lt;BR /&gt;&lt;BR /&gt;And after that I am reading those compressed files&lt;BR /&gt;from the compound filesystem - look inside pzlib.pas - &lt;BR /&gt;and I am 'distributing' those compressed files, as streams,&lt;BR /&gt;to my Thread Pool Engine to be decompressed &lt;BR /&gt;by the myobj.Zlibdecompress method; look at:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;--------------------------------------------&lt;BR /&gt;&lt;BR /&gt;names:=TStringList.create;&lt;/P&gt;&lt;P&gt;storage.foldernames('/',names);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;len:=strtoint(names[0]);&lt;/P&gt;&lt;P&gt;if r=0 then len:=len+1;&lt;/P&gt;&lt;P&gt;for i:=0 to len&lt;/P&gt;&lt;P&gt;do &lt;/P&gt;&lt;P&gt;begin&lt;/P&gt;&lt;P&gt;if (i=len) and (r=0) then break; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;obj:=TJob.create;&lt;/P&gt;&lt;P&gt;obj.directory:=directory;&lt;/P&gt;&lt;P&gt;obj.streamindex:=inttostr(i);&lt;/P&gt;&lt;P&gt;obj.index:=i;&lt;/P&gt;&lt;P&gt;obj.number:=e; &lt;/P&gt;&lt;P&gt;obj.r:=r;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;TP.execute(myobj.Zlibdecompress,pointer(obj));&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;end;&lt;BR /&gt;&lt;BR /&gt;--------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;/SPAN&gt;</description>
      <pubDate>Fri, 02 Apr 2010 07:28:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855376#M2062</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-02T07:28:28Z</dc:date>
    </item>
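The pzlib.pas loop above walks the stored stream indices and submits each compressed stream to the pool for decompression. A Python sketch of the same pattern, assuming a simple index-to-blob mapping in place of the compound filesystem folders that storage.foldernames('/', ...) enumerates:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def decompress_store(store: dict) -> bytes:
    """Decompress indexed zlib streams in parallel, reassemble in order.

    `store` maps a stream index (int) to one zlib-compressed chunk -
    a stand-in for the folders of the compound filesystem that
    pzlib.pas enumerates before dispatching Zlibdecompress jobs.
    """
    with ThreadPoolExecutor() as pool:
        futures = {i: pool.submit(zlib.decompress, blob)
                   for i, blob in store.items()}
    # Join by index so chunk order matches the original file,
    # like obj.index / obj.streamindex in the Pascal loop.
    return b"".join(futures[i].result() for i in sorted(futures))
```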
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855377#M2063</link>
      <description>&lt;P&gt;&lt;BR /&gt;I wrote:&lt;BR /&gt;&amp;gt; And as you have noticed i am using a portable &lt;BR /&gt;&amp;gt; compound filesystem, look at ParallelStructuredStorage.pas &lt;BR /&gt;&amp;gt; inside the zip file. &lt;/P&gt;&lt;P&gt;Why ?&lt;/P&gt;&lt;P&gt;Because you can parallel compress your files, store &lt;BR /&gt;the resulting .zlb (zlib) or .bz (bzip)&lt;BR /&gt;compressed files in a portable compound filesystem,&lt;BR /&gt;and after that distribute your compound filesystem...&lt;/P&gt;&lt;P&gt;And of course you can uncompress files - or all the &lt;BR /&gt;content of your compound file system - from your compound &lt;BR /&gt;file system. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;And of course that's easy with Parallel Compression 1.0 :)&lt;/P&gt;&lt;P&gt;&lt;A href="http://pages.videotron.com/aminer/"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Apr 2010 08:43:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855377#M2063</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-02T08:43:44Z</dc:date>
    </item>
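The post describes packing the per-chunk .bz/.zlb files into one portable compound filesystem that can then be distributed as a single file. The actual format of ParallelStructuredStorage.pas is not shown in this thread; as a rough stand-in, a ZIP container with stored (uncompressed) members plays the same role, since the payloads are already compressed:

```python
import io
import zipfile

def pack_compound(entries: dict) -> bytes:
    """Pack already-compressed members into one distributable blob.

    ZIP_STORED is used because the .bz/.zlb payloads are already
    compressed; recompressing them would waste time for no gain.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, payload in entries.items():
            zf.writestr(name, payload)
    return buf.getvalue()

def unpack_compound(blob: bytes) -> dict:
    """Read every member back out of the compound blob."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```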
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855378#M2064</link>
      <description>&lt;P&gt;&lt;BR /&gt;Skybuck wrote:&lt;BR /&gt;&amp;gt;[...] an algorithm really ;) &lt;BR /&gt;&amp;gt;What's so special about it ? &lt;/P&gt;&lt;P&gt;Parallel bzip and zlib are not just pbzip.pas and pzlib.pas; &lt;BR /&gt;the parallel bzip and zlib algorithm includes my Thread Pool Engine &lt;BR /&gt;algorithm + Parallel Queue algorithm ... &lt;/P&gt;&lt;P&gt;I am calling it an algorithm because it uses a finite number of &lt;BR /&gt;instructions and rules to resolve a problem - parallel compression &lt;BR /&gt;and decompression. &lt;/P&gt;&lt;P&gt;Do you understand ?&lt;/P&gt;&lt;P&gt;And as I said, you can parallel compress your files, store &lt;BR /&gt;the resulting .zlb (zlib) or .bz (bzip) &lt;BR /&gt;compressed files in a portable compound filesystem, &lt;BR /&gt;and after that distribute your compound filesystem... &lt;BR /&gt;And of course you can uncompress files - or all the &lt;BR /&gt;content of your compound file system - from your compound &lt;BR /&gt;file system. &lt;/P&gt;&lt;P&gt;&amp;gt; I see a whole bunch of pascal/delphi files thrown together, &lt;BR /&gt;&amp;gt;a whole bunch of dll's and god-forbid ms*.dll files... &lt;/P&gt;&lt;P&gt;Those dlls are mandatory for now...&lt;/P&gt;&lt;P&gt;and you can easily write a batch file etc. and reorganize ... &lt;/P&gt;&lt;P&gt;&amp;gt; I see some "test programs" which are described as "modules" which they &lt;BR /&gt;&amp;gt; simply are not... &lt;/P&gt;&lt;P&gt;It's VERY easy to convert pzlib.pas and pbzip.pas &lt;BR /&gt;to units, and that's what I will do in the next step... &lt;/P&gt;&lt;P&gt;Parallel Compression will still be enhanced in the future... &lt;/P&gt;&lt;P&gt;&amp;gt; It shouldn't be that hard... set your editor to "use tab character" (turn &lt;BR /&gt;&amp;gt; tabs to spaces off) &lt;/P&gt;&lt;P&gt;I am not using the Delphi editor, just notepad.exe or write.exe... &lt;BR /&gt;and I am compiling from the DOS prompt... 
&lt;/P&gt;&lt;P&gt;&amp;gt;So far it seems like you are inserting your &lt;BR /&gt;&amp;gt;threads/syncronizations &lt;BR /&gt;&amp;gt;everywhere in single-thread-design algorithms ? &lt;/P&gt;&lt;P&gt;No, it's not just inserting threads/synchronizations. &lt;/P&gt;&lt;P&gt;I have reasoned - and used logic. Look for example at &lt;BR /&gt;parallelhashlist.pas inside the zip file: I am using MREWs etc. &lt;BR /&gt;carefully in the right places, and I have also slightly &lt;BR /&gt;modified the serial code... It uses a hash-based method, &lt;BR /&gt;with an array of MREW locks...&lt;/P&gt;&lt;P&gt;The Thread Pool Engine I have constructed from zero&lt;BR /&gt;- and I have used my ParallelQueue - an efficient lock-free queue - etc. &lt;/P&gt;&lt;P&gt;The parallel bzip and zlib I have constructed by using &lt;BR /&gt;my Thread Pool Engine construction etc. &lt;/P&gt;&lt;P&gt;That's not just 'inserting' threads/synchronizations. &lt;/P&gt;&lt;P&gt;&amp;gt;But my estimate would be that for now on low core systems... the &lt;BR /&gt;&amp;gt;"compression" would take far more time... &lt;/P&gt;&lt;P&gt;No. pbzip.pas, for example, scales to 3.3x on 4 cores...&lt;/P&gt;&lt;P&gt;&lt;A href="http://pages.videotron.com/aminer/ParallelCompression/parallelbzip.htm"&gt;http://pages.videotron.com/aminer/ParallelCompression/parallelbzip.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Skybuck also wrote: &lt;BR /&gt;&amp;gt; [...] or anything extraordinary... &lt;/P&gt;&lt;P&gt;Don't be stupid, Skybuck. &lt;/P&gt;&lt;P&gt;It's in fact: &lt;/P&gt;&lt;P&gt;1 - Useful &lt;BR /&gt;2 - A good thing for educational purposes. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Amine Moulay Ramdane.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Apr 2010 15:11:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855378#M2064</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-02T15:11:18Z</dc:date>
    </item>
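The post mentions that parallelhashlist.pas uses a hash-based method with an array of MREW (multi-read exclusive-write) locks. That is the lock-striping pattern: one lock per stripe of buckets instead of one global lock. A minimal Python sketch of the idea - Python's stdlib has no reader-writer lock, so a plain mutex per stripe stands in for each MREW here:

```python
import threading

class StripedHashMap:
    """Hash-based lock striping: one lock per stripe of buckets.

    A toy model of the parallelhashlist.pas idea, not its code:
    two threads touching keys in different stripes never contend.
    """
    def __init__(self, stripes: int = 16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._buckets = [{} for _ in range(stripes)]

    def _stripe(self, key) -> int:
        # The key's hash picks the stripe, and therefore the lock.
        return hash(key) % len(self._locks)

    def put(self, key, value):
        i = self._stripe(key)
        with self._locks[i]:
            self._buckets[i][key] = value

    def get(self, key, default=None):
        i = self._stripe(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)
```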
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855379#M2065</link>
      <description>&lt;P&gt;&lt;BR /&gt;Skybuck wrote:&lt;BR /&gt;&amp;gt;The thread pool concept is retarded. &lt;BR /&gt;&amp;gt;Any good delphi programmer is capable of creating an array of threads. &lt;BR /&gt;&amp;gt;So my advice to you: &lt;BR /&gt;&amp;gt;1. Delete your thread pool, because it's junk. &lt;BR /&gt;&amp;gt;2. Write a serious/big application that uses many threads, &lt;BR /&gt;&amp;gt;and simply derive from TThread to see how easy it is. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;How can you be so stupid ?&lt;/P&gt;&lt;P&gt;My Thread Pool Engine is not just an array of threads:&lt;BR /&gt;it uses efficient lock-free queues - for example the lock-free ParallelQueue - &lt;BR /&gt;for each worker thread, and it uses work-stealing - for more&lt;BR /&gt;efficiency - etc.&lt;/P&gt;&lt;P&gt;And it eases the work for you - you can 'reuse' the TThreadPool class... - &lt;BR /&gt;and it is very useful...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Please read again:&lt;/P&gt;&lt;P&gt;&lt;A href="http://pages.videotron.com/aminer/threadpool.htm"&gt;http://pages.videotron.com/aminer/threadpool.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Apr 2010 18:00:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855379#M2065</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-02T18:00:58Z</dc:date>
    </item>
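The design the post defends - a queue per worker thread plus work-stealing when a worker's own queue runs dry - can be sketched in Python. This is a toy model of that pattern only, not the author's Thread Pool Engine: the real engine uses lock-free queues, while this sketch guards the deques with a single mutex for simplicity:

```python
import threading
from collections import deque

class WorkStealingPool:
    """Each worker owns a deque of jobs; an idle worker steals from others."""

    def __init__(self, workers: int = 4):
        self._queues = [deque() for _ in range(workers)]
        self._lock = threading.Lock()
        self._next = 0
        self._threads = [threading.Thread(target=self._run, args=(i,))
                         for i in range(workers)]

    def execute(self, fn, *args):
        """Round-robin a job onto one worker's queue."""
        with self._lock:
            self._queues[self._next % len(self._queues)].append((fn, args))
            self._next += 1

    def _take(self, i):
        with self._lock:
            if self._queues[i]:
                return self._queues[i].pop()      # own queue: take newest
            for q in self._queues:                # else steal oldest job
                if q:
                    return q.popleft()
        return None

    def _run(self, i):
        while True:
            job = self._take(i)
            if job is None:
                return  # all queues drained (jobs are enqueued before start)
            fn, args = job
            fn(*args)

    def run_to_completion(self):
        """Start the workers and wait until every queued job has run."""
        for t in self._threads:
            t.start()
        for t in self._threads:
            t.join()
```

Taking from one's own queue at one end and stealing from the other end is the classic work-stealing discipline: it keeps each worker on its own recent work and makes thieves grab the jobs least likely to be needed soon by the owner.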
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855380#M2066</link>
      <description>&lt;P&gt;&lt;BR /&gt;Skybuck wrote in alt.comp.lang.borland-delphi:&lt;/P&gt;&lt;P&gt;&amp;gt; "My Thread Pool Engine is not just an array of threads"&lt;BR /&gt;&amp;gt;&lt;BR /&gt;&amp;gt; To me it is.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;You really don't know what you are talking about...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;The principal threat to scalability in concurrent applications &lt;BR /&gt;is the exclusive resource lock.&lt;/P&gt;&lt;P&gt;And there are three ways to reduce lock contention:&lt;/P&gt;&lt;P&gt;1 - Reduce the duration for which locks are held&lt;/P&gt;&lt;P&gt;2 - Reduce the frequency with which locks are requested&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;3 - Replace exclusive locks with coordination mechanisms that &lt;BR /&gt; permit greater concurrency.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;With low, moderate AND high contention, my ParallelQueue&lt;BR /&gt;offers better scalability - and I am using it inside my &lt;BR /&gt;Thread Pool Engine.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Because my ParallelQueue is using a hash-based method &lt;BR /&gt;- and lock striping - and just a LockedInc(), &lt;BR /&gt;I am REDUCING the duration for which locks are held AND REDUCING &lt;BR /&gt;the frequency with which locks are requested; hence I am &lt;BR /&gt;REDUCING the contention A LOT, so it's very efficient.&lt;/P&gt;&lt;P&gt;And as I stated before, and this is a law or theorem to apply: &lt;/P&gt;&lt;P&gt;[3] If there is LESS contention THEN the algorithm will &lt;BR /&gt; scale better. This is because S (the serial part)&lt;BR /&gt; becomes smaller with less contention, and as N becomes bigger,&lt;BR /&gt; the result - the speed of the program/algorithm - of &lt;BR /&gt; Amdahl's equation 1/(S+(P/N)) becomes bigger. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;It's why my ParallelQueue has scored 7 million pop() &lt;BR /&gt;transactions per second... 
better than flqueue and RingBuffer. &lt;/P&gt;&lt;P&gt;Look at: &lt;A href="http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm"&gt;http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, my Thread Pool Engine uses efficient lock-free queues -&lt;BR /&gt;for example the lock-free ParallelQueue - for each worker thread &lt;BR /&gt;- to reduce and minimize the contention - and it uses work-stealing, &lt;BR /&gt;so my Thread Pool Engine is very efficient...&lt;/P&gt;&lt;P&gt;And it eases the work for you - you can 'reuse' the TThreadPool &lt;BR /&gt;class... - and it is very useful... &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So, don't be stupid, Skybuck...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;A href="http://pages.videotron.com/aminer/"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2010 18:34:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855380#M2066</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-03T18:34:15Z</dc:date>
    </item>
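The post's recipe - a hash-based method with lock striping plus a single atomic LockedInc() to pick the stripe - can be illustrated with a small Python sketch. This is only a model of the striping idea: the real ParallelQueue is lock-free, whereas here each stripe has its own mutex, and itertools.count stands in for LockedInc() (its next() is effectively atomic under CPython's GIL):

```python
import itertools
import threading
from collections import deque

class StripedQueue:
    """FIFO spread over several sublists, each guarded by its own lock.

    A shared counter picks the stripe, so concurrent pushers and
    poppers usually hit different locks and contend far less than
    they would on a single global lock.
    """
    def __init__(self, stripes: int = 8):
        self._stripes = [deque() for _ in range(stripes)]
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._push_seq = itertools.count()   # LockedInc() stand-in
        self._pop_seq = itertools.count()

    def push(self, item):
        i = next(self._push_seq) % len(self._stripes)
        with self._locks[i]:
            self._stripes[i].append(item)

    def pop(self):
        """Return the next item from the chosen stripe, or None if empty."""
        i = next(self._pop_seq) % len(self._stripes)
        with self._locks[i]:
            if self._stripes[i]:
                return self._stripes[i].popleft()
        return None
```

Because push and pop walk the stripes with the same round-robin counter, a balanced push/pop workload distributes its lock acquisitions evenly, which is exactly how striping shortens lock hold times and request frequency per lock.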
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855381#M2067</link>
      <description>&lt;P&gt;&lt;BR /&gt;I wrote:&lt;BR /&gt;&amp;gt; Because my ParallelQueue is using an hash based method&lt;BR /&gt;&amp;gt; - and lock striping - and using just a LockedInc() , so,&lt;BR /&gt;&amp;gt; i am REDUCING the duration for which locks are held AND REDUCING&lt;BR /&gt;&amp;gt; the frequency with which locks are requested, hence i am&lt;BR /&gt;&amp;gt; REDUCING A LOT the contention, so it's very efficient.&lt;/P&gt;&lt;P&gt;With low, moderate AND high contention, my ParallelQueue &lt;BR /&gt;offers better scalability - and I am using it inside my &lt;BR /&gt;Thread Pool Engine. &lt;/P&gt;&lt;P&gt;And as you have noticed, I am using low to medium contention &lt;BR /&gt;in the following test:&lt;/P&gt;&lt;P&gt;&lt;A href="http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm"&gt;http://pages.videotron.com/aminer/parallelqueue/parallelqueue.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;But I predict that under HIGH contention the push() and pop() will&lt;BR /&gt;score even better than that... &lt;/P&gt;&lt;P&gt;Why ?&lt;/P&gt;&lt;P&gt;Because my ParallelQueue is using a hash-based method &lt;BR /&gt;- and lock striping - and just a LockedInc(), &lt;BR /&gt;I am REDUCING the duration for which locks are held AND REDUCING &lt;BR /&gt;the frequency with which locks are requested; hence I am &lt;BR /&gt;REDUCING the contention A LOT, so it's very efficient. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;And as I stated before, and this is a law or theorem to apply: &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;[3] If there is LESS contention THEN the algorithm will &lt;BR /&gt; scale better. This is because S (the serial part) &lt;BR /&gt; becomes smaller with less contention, and as N becomes bigger, &lt;BR /&gt; the result - the speed of the program/algorithm - of &lt;BR /&gt; Amdahl's equation 1/(S+(P/N)) becomes bigger. 
&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 03 Apr 2010 18:52:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855381#M2067</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-03T18:52:52Z</dc:date>
    </item>
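Theorem [3] above leans on the Amdahl expression 1/(S + P/N) with P = 1 - S. It can be checked numerically with a few lines of Python, which also makes the "less contention, smaller S, bigger speedup" claim concrete:

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Speedup 1/(S + P/N) with P = 1 - S, per Amdahl's law.

    serial_fraction is S, the part of the run that cannot be
    parallelized (here: time lost to lock contention etc.);
    n is the number of cores.
    """
    s = serial_fraction
    return 1.0 / (s + (1.0 - s) / n)

# Shrinking S (less contention) raises the speedup for the same N:
# amdahl_speedup(0.30, 4) gives about 2.1x,
# amdahl_speedup(0.05, 4) gives about 3.5x.
```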
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855382#M2068</link>
      <description>&lt;BR /&gt;&lt;BR /&gt;Hello again,&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Now, as I have stated before:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;[3] If there is LESS contention THEN the algorithm will &lt;/P&gt;&lt;P&gt;scale better. This is because S (the serial part) &lt;/P&gt;&lt;P&gt;becomes smaller with less contention, and as N becomes bigger, &lt;/P&gt;&lt;P&gt;the result - the speed of the program/algorithm - of &lt;/P&gt;&lt;P&gt;Amdahl's equation 1/(S+(P/N)) becomes bigger.&lt;/P&gt;&lt;BR /&gt;And, as you have noticed, I have followed this theorem [3] when &lt;BR /&gt;I constructed my Thread Pool Engine etc.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Now there is another theorem that I can state like this:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;[4] You have latency and bandwidth, so IF you use &lt;BR /&gt;one or both of them - latency and bandwidth - efficiently, &lt;BR /&gt;your algorithm will be more efficient.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;It is why you must not start too many threads in my &lt;BR /&gt;Thread Pool Engine, so that you will not context switch a lot, &lt;BR /&gt;because, when you context switch a lot, the latency will grow and &lt;BR /&gt;this is not good for efficiency.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;You have to be smart.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;And as I have stated before: &lt;/P&gt;&lt;P&gt;IF you follow and base your reasoning on those theorems &lt;BR /&gt;- or laws or true propositions or good patterns, like theorems [1], [2], [3], [4]... - &lt;BR /&gt;THEN you will construct a model that will be much more CORRECT &lt;BR /&gt;and EFFICIENT. &lt;/P&gt;&lt;BR /&gt;&lt;BR /&gt;Take care...&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Sat, 03 Apr 2010 21:26:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855382#M2068</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-03T21:26:58Z</dc:date>
    </item>
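The practical upshot of theorem [4] above - don't start more runnable threads than the hardware can keep busy, or context switches inflate latency - is simply a pool sized to the machine. A small Python illustration of that policy (the cpu_count cap is an illustrative rule of thumb for CPU-bound jobs, not a rule taken from the original engine):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def make_pool() -> ThreadPoolExecutor:
    """Size the worker pool to the hardware, not to the number of jobs.

    With at most one runnable worker per core, CPU-bound jobs queue up
    instead of forcing the scheduler to time-slice among excess threads,
    which is where the extra context-switch latency would come from.
    """
    workers = os.cpu_count() or 1  # cpu_count() can return None
    return ThreadPoolExecutor(max_workers=workers)
```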
    <item>
      <title>Parallel Compression 1.0 .....</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855383#M2069</link>
      <description>&lt;P&gt;&lt;BR /&gt;Skybuck wrote:&lt;/P&gt;&lt;P&gt;&amp;gt;Some notes/pointers: &lt;/P&gt;&lt;P&gt;&amp;gt;1. The adventage of "blocking locks" on windows is that it doesn't consume &lt;BR /&gt;&amp;gt;so much cpu (?) when it's waiting... this leads to lower cpu temperatures. &lt;/P&gt;&lt;P&gt;&amp;gt;2. Nowadays I want to make my programs run as "cold" as possible to save the &lt;BR /&gt;&amp;gt;system from overheat death. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&amp;gt;3. Spinlocks make the cpu run hot and there is bad ?! Unless very maybe the &lt;BR /&gt;&amp;gt;blocking scenerio would be worse, but that needs to be proven first. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Read carefully:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;"The worker threads enters in a wait state when there is &lt;BR /&gt;no job in the lock-free queues - for more efficiency -"&lt;/P&gt;&lt;P&gt;&lt;A href="http://pages.videotron.com/aminer/threadpool.htm"&gt;http://pages.videotron.com/aminer/threadpool.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;So, my Thread Pool Engine doesn't consume any CPU when &lt;BR /&gt;there is no job in the queues , and this leads to lower &lt;BR /&gt;cpu temperatures.&lt;/P&gt;&lt;P&gt; :) &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Sincerely,&lt;BR /&gt;Amine Moulay Ramdane.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 04 Apr 2010 06:29:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Parallel-Compression-1-0/m-p/855383#M2069</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2010-04-04T06:29:38Z</dc:date>
    </item>
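The behavior quoted above - workers enter a wait state when there is no job, instead of spinning - is what a blocking queue gives you. In Python, queue.Queue.get() blocks on a condition variable, so an idle worker consumes no CPU; this sketch models that idea only, it is not the engine's lock-free design:

```python
import queue
import threading

def worker(jobs: queue.Queue):
    """Run jobs from the queue; sleep inside get() while it is empty.

    get() parks the thread on a condition variable (no busy-waiting),
    unlike a spinlock that burns cycles - and heats the CPU - while
    it waits for work to arrive.
    """
    while True:
        job = jobs.get()      # blocks without spinning
        if job is None:       # sentinel: shut the worker down
            return
        job()

results = []
jobs = queue.Queue()
t = threading.Thread(target=worker, args=(jobs,))
t.start()
for n in (1, 2, 3):
    jobs.put(lambda n=n: results.append(n * n))
jobs.put(None)                # tell the worker to exit
t.join()
```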
  </channel>
</rss>

