So, two questions:
1. Does MKL have an API for doing fast 3D DCTs?
2. A (real->real) DCT should ideally be about 4x faster than a complex->complex FFT on the same number of elements. Does MKL exhibit this behavior? How efficiently does it reduce the DCT to an FFT?
thanks a lot!!
-Nikunj.
--
thanks for the reply!
-Nikunj.
--
Are there any updates on the question in this thread?
Is the 3D DCT available in MKL? If yes, please give an example.
Thanks
Andriy
Hi Andriy,
MKL does not provide a 3D DCT directly, but you can build one by combining 1D DCTs along each axis, for example:
tt_type_x = tt_type_y = tt_type_z = MKL_COSINE_TRANSFORM;
d_init_trig_transform(&n_x,&tt_type_x,ipar_x,dpar_x,&ir);
d_init_trig_transform(&n_y,&tt_type_y,ipar_y,dpar_y,&ir);
d_init_trig_transform(&n_z,&tt_type_z,ipar_z,dpar_z,&ir);
d_commit_trig_transform(f,&handle_x,ipar_x,dpar_x,&ir);
d_commit_trig_transform(f,&handle_y,ipar_y,dpar_y,&ir);
d_commit_trig_transform(f,&handle_z,ipar_z,dpar_z,&ir);
Loop over j,k {d_backward_trig_transform(f(:,j,k),&handle_x,ipar_x,dpar_x,&ir);}
Loop over i,k {d_backward_trig_transform(f(i,:,k),&handle_y,ipar_y,dpar_y,&ir);}
Loop over i,j {d_backward_trig_transform(f(i,j,:),&handle_z,ipar_z,dpar_z,&ir);}
free_trig_transform(&handle_x,ipar_x,&ir);
free_trig_transform(&handle_y,ipar_y,&ir);
free_trig_transform(&handle_z,ipar_z,&ir);
If you would like to see 3D DCT functionality in MKL, you can file a feature request at https://premier.intel.com
With best regards,
Alexander
associated with OpenMP? I have a 3D variable, and two of its dimensions need to be trig_transformed, so I want to
use such subroutines with OpenMP.
Could you please give me an example? I would appreciate that. I have tried many forms, but I have only succeeded in the case without OpenMP.
thanks!
Hi!
There is no 2D version of TT with OpenMP parallelization, but one can construct it from the 1D TT. For example:
tt_type_x = tt_type_y = MKL_COSINE_TRANSFORM;
d_init_trig_transform(&n_x,&tt_type_x,ipar_x,dpar_x,&ir);
d_init_trig_transform(&n_y,&tt_type_y,ipar_y,dpar_y,&ir);
ipar_x[9] = number_of_threads;
ipar_y[9] = number_of_threads;
d_commit_trig_transform(f,&handle_x,ipar_x,dpar_x,&ir);
d_commit_trig_transform(f,&handle_y,ipar_y,dpar_y,&ir);
#OMP parallel for
Loop over j {d_backward_trig_transform(f(:,j,k),&handle_x,ipar_x,dpar_x,&ir);}
#OMP end parallel
#OMP parallel for
Loop over i {d_backward_trig_transform(f(i,:,k),&handle_y,ipar_y,dpar_y,&ir);}
#OMP end parallel
free_trig_transform(&handle_x,ipar_x,&ir);
free_trig_transform(&handle_y,ipar_y,&ir);
ipar[9] specifies the number of OpenMP threads used to run TT routines in the OpenMP
environment of the Poisson Library.
Does this variant suit your needs?
With best regards,
Alexander Kalinkin
I did get it to work with 8 threads, but only with variable f(33,33,10); for f of other sizes, the results
are incorrect, which is strange. Could you please help me with some hints?
At the same time, I notice that the overhead for trig_transform is serious: with 8 threads, it
is even slower than with a single thread! Surprising! Maybe I am doing something wrong here?
thanks, zhigang
The machine model I am using is: Intel Xeon CPU E5430 @ 2.66GHz
In fact, I was delighted by the performance of trig_transform at first. Then I wanted to see how it performs
with OpenMP. I would appreciate it if you could help me with this. I have tried many ways but failed.
thanks!