This is my test code:
$ cat ca_check.f90
program z
implicit none
integer :: x(10)
Insert a SYNC ALL after x=img and see what happens.
Jim Dempsey
Anton,
I'll take a look at this. What compiler version are you using?
ron
Ron, any progress?
Thanks
Anton
Our mutual friend Stephen reminded me to revisit this post.
First, a little status on Intel's CAF implementation: Our initial goal was to get a functional CAF implementation that conforms strictly to the Standard. Performance has not been fully addressed at this time and may take a while to get to acceptable levels for production purposes - particularly for distributed memory systems.
But next, I see some errors in the test and question what it is you're timing. In particular, let's visit correctness first. Removing the timing and writes you have this code on Image 1:
do i=1,imgs
  x = x(:)[i]
  x(:)[i] = x
end do
The problem here is that CAF remote reads and writes are inherently asynchronous, or one-sided. The remote read into X may not have completed by the time you use X on the RHS of the assignment in the next statement, so the results are unpredictable. And a minor point: do you want to test self-read and self-write on Image 1? The do loop goes from 1 to imgs, but do we care about image 1 reading and writing itself in shared memory? What I think you want is something like a neighbor exchange, perhaps like this for the read:
do i = 2,imgs
  sync all
  if ( img == 1 ) then
     !...start timer here
     x = x(:)[i]    !...remote read from image i
     sync images(i) !...wait for remote read to complete
     !...finish timer here, print result?
  else if ( img == i ) then
     sync images(1) !...sync point with image 1
  end if
end do
Remember that image control statements (like SYNC IMAGES) imply a SYNC MEMORY. Maybe I should have used SYNC MEMORY instead, but so it goes.
So the next question is, what do you want to time? Do you want to find the time for the data transfer, as we're doing above? Or do you want to time how long the statement takes, to see whether it's truly asynchronous, synchronous, or just darn inefficient? In the above we're also capturing the time for the 1-1 synchronization, so it's not a good measure of throughput. Also, note I had a SYNC ALL at the top of the loop to make sure all the images execute the I iterations in lock step. Thinking about this, I believe it would be OK to remove that. Then each remote image would quickly drop into its SYNC IMAGES(1) and wait for image 1's matching SYNC IMAGES(i). That would be faster, obviously.
Tricky stuff. I might suggest rethinking this experiment to see if we can derive a better test. ALSO, don't use cpu_time, as it gathers the sum of thread times for the process; with threads running in the background to do the IO, that might give too much time. I use a wall-clock timer instead, like this contained procedure mytime():
program foo
use ISO_FORTRAN_ENV
implicit none
integer, parameter :: dp = REAL64
real (kind=dp) :: tstart, tstop, ttime
!... ready to time a block of code
tstart = mytime()
!...do something
tstop = mytime()
ttime = tstop - tstart
contains
  function mytime()  result (tseconds)
    real (dp)       :: tseconds
    integer (INT64) ::  count, count_rate, count_max
    real (dp)       :: tsec, rate
    CALL SYSTEM_CLOCK(count, count_rate, count_max)
    tsec = count
    rate = count_rate
    tseconds = tsec / rate
  end function mytime 
end program foo
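One possible way to restructure the experiment along these lines is sketched below. It is only a sketch under stated assumptions, not a vetted benchmark: the program name ca_time_sketch, the repetition count nrep, and the variables tsync and tread are made up for illustration. Image 1 first times a bare SYNC IMAGES pair as a baseline, then times a batch of nrep remote reads followed by one SYNC IMAGES, and subtracts the baseline before averaging. The subtraction is approximate, and a compiler could in principle merge the repeated identical reads, so treat the numbers as rough estimates.

```fortran
program ca_time_sketch
  use iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: nrep = 1000   ! hypothetical repetition count
  integer :: x(10)[*]                 ! coarray with the same shape as in the test
  integer :: img, nimgs, i, k
  real(real64) :: t0, t1, t2, tsync, tread

  img   = this_image()
  nimgs = num_images()
  x     = img
  sync all                            ! x is defined everywhere before any remote read

  do i = 2, nimgs
    if (img == 1) then
      t0 = mytime()
      sync images (i)                 ! baseline: pairwise synchronization only
      t1 = mytime()
      do k = 1, nrep
        x = x(:)[i]                   ! remote read from image i
      end do
      sync images (i)                 ! segment boundary after the batch of reads
      t2 = mytime()
      if (any(x /= i)) error stop "unexpected data from remote image"
      tsync = t1 - t0
      tread = ((t2 - t1) - tsync) / nrep
      write (*, "(a,i0,2(a,es12.4))") "img ", i, ": sync, s: ", tsync, &
                                      "  per read, s: ", tread
    else if (img == i) then
      sync images (1)                 ! matches the baseline synchronization
      sync images (1)                 ! matches the synchronization after the reads
    end if
  end do

  if (img == 1) write (*, "(a,i0,a)") "done: checked ", nimgs - 1, " remote image(s)"

contains
  ! Wall-clock time in seconds, same idea as mytime() above.
  function mytime() result (tseconds)
    real(real64)   :: tseconds
    integer(int64) :: count, count_rate
    call system_clock(count, count_rate)
    tseconds = real(count, real64) / real(count_rate, real64)
  end function mytime
end program ca_time_sketch
```

Run on a single image, the loop body never executes and only the final "done" line is printed; the timings become meaningful only with two or more images.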
ok, maybe you are right. With your modification, the times are reasonable:
LD_LIBRARY_PATH:  /cm/shared/apps/intel-cluster-studio/impi/4.1.0.024/intel64/lib:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/compiler/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/mpirt/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/ipp/../compiler/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/ipp/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/compiler/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/mkl/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/composer_xe_2013_sp1.0.080/tbb/lib/intel64/gcc4.4:/cm/shared/apps/ParaView-4.0.1/ParaView-4.0.1-Linux-64bit/lib:/cm/shared/apps/torque/4.2.4.1/lib:/cm/shared/apps/moab/7.2.2/lib:/cm/shared/tools/subversion-1.8.4/lib:/cm/shared/languages/Intel-Compiler-XE-14/compiler/lib:/cm/shared/languages/Intel-Compiler-XE-14/compiler/lib/intel64:/cm/shared/languages/Intel-Compiler-XE-14/lib:/cm/shared/languages/Intel-Compiler-XE-14/lib/intel64:/cm/shared/apps/intel-cluster-studio/impi/4.1.0.024/intel64/lib:/cm/shared/apps/intel-cluster-studio/composer_xe_2013.1.117/compiler/lib:/cm/shared/apps/intel-cluster-studio/composer_xe_2013.1.117/compiler/lib/intel64:/cm/shared/apps/intel-cluster-studio/composer_xe_2013.1.117/lib:/cm/shared/apps/intel-cluster-studio/composer_xe_2013.1.117/lib/intel64
	which mpirun:  /cm/shared/apps/intel-cluster-studio/impi/4.1.0.024/intel64/bin/mpirun
	node46-009_46215 (10.131.1.33)
	node46-012_41759 (10.131.1.36)
	node46-010_52398 (10.131.1.34)
	node46-011_56689 (10.131.1.35)
	Remote read took, s :     .1471042633056641E-03
	img: 2x:2222222222
	Remote write took, s :     .2789497375488281E-04
	img: 2x:2222222222
	Remote read took, s :     .7867813110351562E-05
	img: 3x:3333333333
	Remote write took, s :     .1621246337890625E-04
	img: 3x:3333333333
	Remote read took, s :     .9059906005859375E-05
	img: 4x:4444444444
	Remote write took, s :     .1502037048339844E-04
	img: 4x:4444444444
	Remote read took, s :     .9059906005859375E-05
	img: 5x:5555555555
	Remote write took, s :     .1478195190429688E-04
	img: 5x:5555555555
	Remote read took, s :     .9059906005859375E-05
	img: 6x:6666666666
	Remote write took, s :     .1502037048339844E-04
	img: 6x:6666666666
	Remote read took, s :     .8106231689453125E-05
	img: 7x:7777777777
	Remote write took, s :     .1502037048339844E-04
	img: 7x:7777777777
	Remote read took, s :     .9059906005859375E-05
	img: 8x:8888888888
	Remote write took, s :     .1502037048339844E-04
	img: 8x:8888888888
	Remote read took, s :     .1001358032226562E-04
	img: 9x:9999999999
	Remote write took, s :     .2384185791015625E-04
	img: 9x:9999999999
	Remote read took, s :     .1192092895507812E-04
	img: 10x:10101010101010101010
	Remote write took, s :     .2217292785644531E-04
	img: 10x:10101010101010101010
	Remote read took, s :     .1001358032226562E-04
	img: 11x:11111111111111111111
	Remote write took, s :     .2384185791015625E-04
	img: 11x:11111111111111111111
	Remote read took, s :     .1096725463867188E-04
	img: 12x:12121212121212121212
	Remote write took, s :     .2288818359375000E-04
	img: 12x:12121212121212121212
	Remote read took, s :     .1001358032226562E-04
	img: 13x:13131313131313131313
	Remote write took, s :     .2408027648925781E-04
	img: 13x:13131313131313131313
	Remote read took, s :     .8821487426757812E-05
	img: 14x:14141414141414141414
	Remote write took, s :     .2193450927734375E-04
	img: 14x:14141414141414141414
	Remote read took, s :     .9059906005859375E-05
img: 15x:15151515151515151515
	Remote write took, s :     .2312660217285156E-04
	img: 15x:15151515151515151515
	Remote read took, s :     .1001358032226562E-04
	img: 16x:16161616161616161616
	Remote write took, s :     .2193450927734375E-04
	img: 16x:16161616161616161616
	Remote read took, s :     .1902410984039307
	img: 17x:17171717171717171717
	Remote write took, s :     .1580715179443359E-03
	img: 17x:17171717171717171717
	Remote read took, s :     .4100799560546875E-04
	img: 18x:18181818181818181818
	Remote write took, s :     .1480579376220703E-03
	img: 18x:18181818181818181818
	Remote read took, s :     .4005432128906250E-04
	img: 19x:19191919191919191919
	Remote write took, s :     .1471042633056641E-03
	img: 19x:19191919191919191919
	Remote read took, s :     .3600120544433594E-04
	img: 20x:20202020202020202020
	Remote write took, s :     .1480579376220703E-03
	img: 20x:20202020202020202020
	Remote read took, s :     .4196166992187500E-04
	img: 21x:21212121212121212121
	Remote write took, s :     .1480579376220703E-03
	img: 21x:21212121212121212121
	Remote read took, s :     .4315376281738281E-04
	img: 22x:22222222222222222222
	Remote write took, s :     .1480579376220703E-03
	img: 22x:22222222222222222222
	Remote read took, s :     .3886222839355469E-04
	img: 23x:23232323232323232323
	Remote write took, s :     .1478195190429688E-03
	img: 23x:23232323232323232323
	Remote read took, s :     .4196166992187500E-04
	img: 24x:24242424242424242424
	Remote write took, s :     .1480579376220703E-03
	img: 24x:24242424242424242424
	Remote read took, s :     .3504753112792969E-04
	img: 25x:25252525252525252525
	Remote write took, s :     .1192092895507812E-03
	img: 25x:25252525252525252525
	Remote read took, s :     .3695487976074219E-04
	img: 26x:26262626262626262626
	Remote write took, s :     .1280307769775391E-03
	img: 26x:26262626262626262626
	Remote read took, s :     .4291534423828125E-04
	img: 27x:27272727272727272727
	Remote write took, s :     .1170635223388672E-03
	img: 27x:27272727272727272727
	Remote read took, s :     .4315376281738281E-04
	img: 28x:28282828282828282828
	Remote write took, s :     .1189708709716797E-03
	img: 28x:28282828282828282828
	Remote read took, s :     .4291534423828125E-04
	img: 29x:29292929292929292929
	Remote write took, s :     .1189708709716797E-03
	img: 29x:29292929292929292929
	Remote read took, s :     .4291534423828125E-04
	img: 30x:30303030303030303030
	Remote write took, s :     .1161098480224609E-03
	img: 30x:30303030303030303030
	Remote read took, s :     .4196166992187500E-04
	img: 31x:31313131313131313131
	Remote write took, s :     .1189708709716797E-03
	img: 31x:31313131313131313131
	Remote read took, s :     .5912780761718750E-04
	img: 32x:32323232323232323232
	Remote write took, s :     .1139640808105469E-03
	img: 32x:32323232323232323232
	Remote read took, s :     .5888938903808594E-04
img: 33x:33333333333333333333
	Remote write took, s :     .1428127288818359E-03
	img: 33x:33333333333333333333
	Remote read took, s :     .4220008850097656E-04
	img: 34x:34343434343434343434
	Remote write took, s :     .1418590545654297E-03
	img: 34x:34343434343434343434
	Remote read took, s :     .4601478576660156E-04
	img: 35x:35353535353535353535
	Remote write took, s :     .1418590545654297E-03
	img: 35x:35353535353535353535
	Remote read took, s :     .4506111145019531E-04
	img: 36x:36363636363636363636
	Remote write took, s :     .1428127288818359E-03
	img: 36x:36363636363636363636
	Remote read took, s :     .4386901855468750E-04
	img: 37x:37373737373737373737
	Remote write took, s :     .1420974731445312E-03
	img: 37x:37373737373737373737
	Remote read took, s :     .4506111145019531E-04
	img: 38x:38383838383838383838
	Remote write took, s :     .1409053802490234E-03
	img: 38x:38383838383838383838
	Remote read took, s :     .4792213439941406E-04
	img: 39x:39393939393939393939
	Remote write took, s :     .1471042633056641E-03
	img: 39x:39393939393939393939
	Remote read took, s :     .4196166992187500E-04
	img: 40x:40404040404040404040
	Remote write took, s :     .1418590545654297E-03
	img: 40x:40404040404040404040
	Remote read took, s :     .4887580871582031E-04
	img: 41x:41414141414141414141
	Remote write took, s :     .1149177551269531E-03
	img: 41x:41414141414141414141
	Remote read took, s :     .3910064697265625E-04
	img: 42x:42424242424242424242
	Remote write took, s :     .1330375671386719E-03
	img: 42x:42424242424242424242
	Remote read took, s :     .3790855407714844E-04
	img: 43x:43434343434343434343
	Remote write took, s :     .1099109649658203E-03
	img: 43x:43434343434343434343
	Remote read took, s :     .3910064697265625E-04
	img: 44x:44444444444444444444
	Remote write took, s :     .1099109649658203E-03
	img: 44x:44444444444444444444
	Remote read took, s :     .4220008850097656E-04
	img: 45x:45454545454545454545
	Remote write took, s :     .1099109649658203E-03
	img: 45x:45454545454545454545
	Remote read took, s :     .4291534423828125E-04
	img: 46x:46464646464646464646
	Remote write took, s :     .1101493835449219E-03
	img: 46x:46464646464646464646
	Remote read took, s :     .4100799560546875E-04
	img: 47x:47474747474747474747
	Remote write took, s :     .1111030578613281E-03
	img: 47x:47474747474747474747
	Remote read took, s :     .4386901855468750E-04
	img: 48x:48484848484848484848
	Remote write took, s :     .1118183135986328E-03
	img: 48x:48484848484848484848
	Remote read took, s :     .5507469177246094E-04
	img: 49x:49494949494949494949
	Remote write took, s :     .1418590545654297E-03
	img: 49x:49494949494949494949
	Remote read took, s :     .4196166992187500E-04
	img: 50x:50505050505050505050
	Remote write took, s :     .1440048217773438E-03
	img: 50x:50505050505050505050
	Remote read took, s :     .4816055297851562E-04
	img: 51x:51515151515151515151
	Remote write took, s :     .1430511474609375E-03
	img: 51x:51515151515151515151
	Remote read took, s :     .4506111145019531E-04
	img: 52x:52525252525252525252
	Remote write took, s :     .1418590545654297E-03
	img: 52x:52525252525252525252
	Remote read took, s :     .4506111145019531E-04
	img: 53x:53535353535353535353
	Remote write took, s :     .1420974731445312E-03
	img: 53x:53535353535353535353
	Remote read took, s :     .4386901855468750E-04
	img: 54x:54545454545454545454
	Remote write took, s :     .1428127288818359E-03
	img: 54x:54545454545454545454
	Remote read took, s :     .4506111145019531E-04
	img: 55x:55555555555555555555
	Remote write took, s :     .1440048217773438E-03
	img: 55x:55555555555555555555
	Remote read took, s :     .4696846008300781E-04
	img: 56x:56565656565656565656
	Remote write took, s :     .1420974731445312E-03
	img: 56x:56565656565656565656
	Remote read took, s :     .4506111145019531E-04
	img: 57x:57575757575757575757
	Remote write took, s :     .1139640808105469E-03
	img: 57x:57575757575757575757
	Remote read took, s :     .4506111145019531E-04
	img: 58x:58585858585858585858
	Remote write took, s :     .1301765441894531E-03
	img: 58x:58585858585858585858
	Remote read took, s :     .4196166992187500E-04
	img: 59x:59595959595959595959
	Remote write took, s :     .1120567321777344E-03
	img: 59x:59595959595959595959
	Remote read took, s :     .3790855407714844E-04
	img: 60x:60606060606060606060
	Remote write took, s :     .1120567321777344E-03
	img: 60x:60606060606060606060
	Remote read took, s :     .3886222839355469E-04
	img: 61x:61616161616161616161
	Remote write took, s :     .1130104064941406E-03
	img: 61x:61616161616161616161
	Remote read took, s :     .4100799560546875E-04
	img: 62x:62626262626262626262
	Remote write took, s :     .1130104064941406E-03
	img: 62x:62626262626262626262
	Remote read took, s :     .4100799560546875E-04
	img: 63x:63636363636363636363
	Remote write took, s :     .1120567321777344E-03
	img: 63x:63636363636363636363
	Remote read took, s :     .4196166992187500E-04
	img: 64x:64646464646464646464
	Remote write took, s :     .1130104064941406E-03
	img: 64x:64646464646464646464
The fragment in question was this:
sync all
do i = 2, nimgs
  if ( img .eq. 1 ) then
    time1 = mytime()
    x = x(:)[i]
    sync images ( i )
    time2 = mytime()
    write (*,"(a,g)") "Remote read took, s : ", time2-time1
    write (*,"(a,i0,a,10(i0))") "img: ", i, "x:", x(:)
  else if ( img .eq. i ) then
    sync images( 1 )
  end if
  if ( img .eq. 1 ) then
    time1 = mytime()
    x(:)[i] = x
    sync images ( i )
    time2 = mytime()
    write (*,"(a,g)") "Remote write took, s : ", time2-time1
    write (*,"(a,i0,a,10(i0))") "img: ", i, "x:", x(:)
  else if ( img .eq. i ) then
    sync images( 1 )
  end if
end do
sync all
Thanks
Anton
Ron is out of the office today - I've asked him to revisit this when he returns. I saw Bill Long's explanation.