topic Please, one specialist in Intel® Moderncode for Parallel Architectures
https://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078734#M7051
<P>Please, could a specialist respond to this topic?</P>Fri, 18 Dec 2015 09:54:23 GMTAhmed_S_12015-12-18T09:54:23ZAbout my scalable conjugate gradient linear system solver library...
https://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078733#M7050
<P><BR />
Hello...<BR />
<BR />
<BR />
Today, ladies and gentlemen, I will talk a little about my scalable conjugate gradient linear system solver library.<BR />
<BR />
The important thing to understand is that it is NUMA-aware and scalable on NUMA architectures. Most of the work is done by two functions that multiply a matrix by a vector, so I distribute the memory allocation of the matrix rows equally across the NUMA nodes, and<BR />
I have made the algorithm cache-aware. In addition, I use a probabilistic mechanism that keeps the contention points to a minimum, and this is what makes the algorithm scalable on NUMA architectures.<BR />
<BR />
I hope you will be happy with my new scalable algorithm and my scalable parallel library. Frankly, I think I would have to write something like a PhD paper to explain the algorithm in more detail, but I will leave it as it is for now; perhaps I will do that in the near future.<BR />
<BR />
This scalable parallel library is designed especially for large-scale industrial engineering problems, such as industrial finite element problems. It has been ported to FreePascal, to all the Delphi XE versions, and even to Delphi 7. I hope you will find it really good.<BR />
<BR />
Below I describe the simulation program that uses the probabilistic mechanism I have talked about, which I use to show that my algorithm is scalable:<BR />
<BR />
If you look at my scalable parallel algorithm, you will see that it divides each array of the matrix into parts of 250 elements. Two functions, atsub() and asub(), consume the greater part of the CPU time, and inside them I use a probabilistic mechanism to make the algorithm scalable on NUMA architectures: I scramble the array parts using a probabilistic function, and I have found this mechanism to be very efficient. To show this, please look at the following simulation, which uses a variable that holds the number of NUMA nodes. For example, set the "NUMA_nodes" variable to 4 and the array size to 250: the simulation below then reports a number of contention points of about a quarter of the array. If that quarter is treated as the serialized part, Amdahl's law bounds the throughput gain at close to 4X on four NUMA nodes, and the same reasoning gives almost perfect scalability as more NUMA nodes are added, so my parallel algorithm is scalable on NUMA architectures.<BR />
<BR />
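The scalability argument above can be worked through as a short calculation. This is only a sketch under assumptions that are not stated in the original post: each scrambled part is taken to be equally likely to sit on any of the N NUMA nodes, the contended fraction s is treated as the serial fraction in Amdahl's law, a is the number of array parts, and p is the number of cores.<BR />

```latex
\begin{align*}
P(\text{contention at position } i) &\approx \frac{1}{N}, \\
E[\text{contention points}] &\approx \frac{a}{N} = \frac{250}{4} = 62.5, \\
S(p) &= \frac{1}{s + \frac{1 - s}{p}}, \qquad s = \frac{1}{N} = 0.25, \\
S(16) &= \frac{1}{0.25 + \frac{0.75}{16}} \approx 3.37, \qquad
\lim_{p \to \infty} S(p) = \frac{1}{s} = N = 4.
\end{align*}
```

Under these assumptions the bound 1/s grows with N, which is the sense in which adding NUMA nodes raises the achievable throughput.<BR />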
Here is the simulation that I have done; please run it and you will see for yourself that my parallel algorithm is scalable on NUMA architectures.<BR />
<BR />
Here it is:<BR />
<BR />
---<BR />
program test;<BR />
<BR />
uses math;<BR />
<BR />
var tab,tab1,tab2:array of integer;<BR />
a,n1,k,i,n2,tmp,j,numa_nodes:integer;<BR />
begin<BR />
<BR />
a:=250; { number of array parts }<BR />
numa_nodes:=4;<BR />
j:=0; { contention counter, must start at zero }<BR />
<BR />
{ tab2[i] is the NUMA node that holds part i (round-robin) }<BR />
setlength(tab2,a);<BR />
<BR />
for i:=0 to a-1<BR />
do<BR />
begin<BR />
<BR />
tab2[i]:=i mod numa_nodes;<BR />
<BR />
end;<BR />
<BR />
{ tab is the first scrambled order of the parts }<BR />
setlength(tab,a);<BR />
<BR />
randomize;<BR />
<BR />
for k:=0 to a-1<BR />
do tab[k]:=k;<BR />
<BR />
n2:=a; { random(n2) returns a value in 0..n2-1, so every index can be picked }<BR />
<BR />
for k:=0 to a-1<BR />
do<BR />
begin<BR />
n1:=random(n2);<BR />
tmp:=tab[k];<BR />
tab[k]:=tab[n1];<BR />
tab[n1]:=tmp;<BR />
end;<BR />
<BR />
{ tab1 is the second scrambled order of the parts }<BR />
setlength(tab1,a);<BR />
<BR />
for k:=0 to a-1<BR />
do tab1[k]:=k;<BR />
<BR />
for k:=0 to a-1<BR />
do<BR />
begin<BR />
n1:=random(n2);<BR />
tmp:=tab1[k];<BR />
tab1[k]:=tab1[n1];<BR />
tab1[n1]:=tmp;<BR />
end;<BR />
<BR />
{ a contention occurs when both orders touch the same NUMA node at step i }<BR />
for i:=0 to a-1<BR />
do<BR />
if tab2[tab[i]]=tab2[tab1[i]] then<BR />
begin<BR />
inc(j);<BR />
writeln('A contention at: ',i);<BR />
<BR />
end;<BR />
<BR />
writeln('Number of contention points: ',j);<BR />
setlength(tab,0);<BR />
setlength(tab1,0);<BR />
setlength(tab2,0);<BR />
end.<BR />
---<BR />
<BR />
<BR />
You can download my Scalable Parallel Conjugate gradient solver library from:<BR />
<BR />
<A href="https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware" target="_blank">https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-cache-aware</A><BR />
<BR />
<BR />
Thank you for your time.<BR />
<BR />
<BR />
<BR />
Amine Moulay Ramdane.<BR />
<BR />
</P>Sat, 21 Nov 2015 20:16:07 GMThttps://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078733#M7050aminer102015-11-21T20:16:07ZPlease, one specialist
https://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078734#M7051
<P>Please, could a specialist respond to this topic?</P>Fri, 18 Dec 2015 09:54:23 GMThttps://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078734#M7051Ahmed_S_12015-12-18T09:54:23ZDownloaded. Very interesting.
https://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078735#M7052
<P>Downloaded. Very interesting. Thanks a lot. :)</P>Sat, 04 Apr 2020 16:50:22 GMThttps://community.intel.com/t5/Intel-Moderncode-for-Parallel/About-my-scalable-conjugate-gradient-linear-system-solver/m-p/1078735#M7052ArthurRatz2020-04-04T16:50:22Z