Assume I have the following:
Ipp64fVector my_all_real_vector
Ipp64fcVector my_complex_vector
and I want to multiply them.
Is it faster to split the complex vector, do two real multiplies, and rebuild the complex vector, or to convert the real vector to complex and do one complex multiply?
Option 1 (split, two real multiplies, rebuild):
ippsCplxToReal_64fc(my_complex_vector, my_real_parts, my_imag_parts, length);
ippsMul_64f_I(my_all_real_vector, my_real_parts, length);
ippsMul_64f_I(my_all_real_vector, my_imag_parts, length);
ippsRealToCplx_64f(my_real_parts, my_imag_parts, my_complex_vector, length);
Option 2 (convert real to complex, one complex multiply):
ippsRealToCplx_64f(my_all_real_vector, NULL, my_real_but_complex_vec, length);
ippsMul_64fc_I(my_real_but_complex_vec, my_complex_vector, length);
I did a quick test to compare the two, and the timings came out close. I expected the real multiplies to beat the complex multiply, but maybe the overhead of repacking is significant. What is the recommended best practice in this scenario? (My vector lengths range from ~1k to ~26k.)
It is a shame there is not a 64-bit implementation of ippsMul_32f32fc_I() as that appears to be what you need!
Maybe submit a feature request to Intel? Alternatively, is there a way you could represent your data with 32-bit precision instead?
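In the meantime, a 64-bit counterpart of ippsMul_32f32fc_I() could be written by hand as a plain loop. Here is a minimal sketch in plain C, with no IPP calls. The names cplx64 and mul_64f64fc_inplace are made up for illustration and are not part of IPP; the struct only assumes the same side-by-side re/im layout as Ipp64fc. Whether the compiler vectorizes this as well as an IPP primitive would is something to measure, not assume.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Ipp64fc: real and imaginary parts side by side. */
typedef struct { double re; double im; } cplx64;

/* Hypothetical helper (not an IPP function): scale each complex element
   of src_dst in place by the matching real element of src. */
static void mul_64f64fc_inplace(const double *src, cplx64 *src_dst, size_t len)
{
    assert(src != NULL && src_dst != NULL);
    for (size_t i = 0; i < len; ++i) {
        src_dst[i].re *= src[i];
        src_dst[i].im *= src[i];
    }
}

/* Small self-check: returns 1 if the result matches hand-computed values. */
static int mul_64f64fc_self_check(void)
{
    const double r[3] = { 2.0, -1.0, 0.5 };
    cplx64 c[3] = { { 1.0, 3.0 }, { 4.0, -2.0 }, { 8.0, 6.0 } };

    mul_64f64fc_inplace(r, c, 3);

    return c[0].re ==  2.0 && c[0].im == 6.0 &&
           c[1].re == -4.0 && c[1].im == 2.0 &&
           c[2].re ==  4.0 && c[2].im == 3.0;
}
```

This does two multiplies per complex element and needs no temporary buffer, which is the main attraction over converting the real vector to complex first.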
I think the appropriate scenario for this case is this:
ippsMul_64f_I(my_all_real_vector, (Ipp64f *)my_complex_vector, length);
ippsMul_64f_I(my_all_real_vector, (Ipp64f *)my_complex_vector + length, length);
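My scenario above is not appropriate: Ipp64fc stores the real and imaginary parts interleaved, so those two calls pair the wrong elements.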
Perhaps the right scenario is this:
ippsRealToCplx_64f(my_all_real_vector, my_all_real_vector, tmp_complex_vector, length);
ippsMul_64f_I((Ipp64f *)tmp_complex_vector, (Ipp64f *)my_complex_vector, length+length);
I'm sorry once more.
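To make the difference between the two suggestions concrete, here is a small plain C sketch that runs both on a three-element example. It assumes the complex data is stored interleaved (re0, im0, re1, im1, ...), as in Ipp64fc. The helpers flat_mul_inplace and interleave are made-up stand-ins for ippsMul_64f_I and ippsRealToCplx_64f, not the IPP calls themselves, so this only checks the indexing and says nothing about speed.

```c
#include <assert.h>
#include <stddef.h>

enum { LENGTH = 3 };

/* Made-up stand-in for ippsMul_64f_I: src_dst[i] *= src[i]. */
static void flat_mul_inplace(const double *src, double *src_dst, size_t len)
{
    assert(src != NULL && src_dst != NULL);
    for (size_t i = 0; i < len; ++i)
        src_dst[i] *= src[i];
}

/* Made-up stand-in for ippsRealToCplx_64f: writes re[i], im[i] side by side. */
static void interleave(const double *re, const double *im, double *dst, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        dst[2 * i]     = re[i];
        dst[2 * i + 1] = im[i];
    }
}

/* Returns 1 if flat holds {2, 3, 4}[i] * (1 + 5i) for every i, else 0. */
static int matches_wanted(const double *flat)
{
    static const double wanted[2 * LENGTH] = { 2.0, 10.0, 3.0, 15.0, 4.0, 20.0 };
    for (size_t i = 0; i < 2 * LENGTH; ++i)
        if (flat[i] != wanted[i])
            return 0;
    return 1;
}

/* First suggestion: two flat multiplies, one per half of the buffer. */
static int first_attempt_matches(void)
{
    const double real_vec[LENGTH] = { 2.0, 3.0, 4.0 };
    double cplx_flat[2 * LENGTH] = { 1.0, 5.0, 1.0, 5.0, 1.0, 5.0 };

    flat_mul_inplace(real_vec, cplx_flat, LENGTH);
    flat_mul_inplace(real_vec, cplx_flat + LENGTH, LENGTH);
    /* cplx_flat is now 2, 15, 4, 10, 3, 20: the pairing is off */
    return matches_wanted(cplx_flat);
}

/* Corrected suggestion: duplicate each real value, then one flat multiply. */
static int corrected_attempt_matches(void)
{
    const double real_vec[LENGTH] = { 2.0, 3.0, 4.0 };
    double cplx_flat[2 * LENGTH] = { 1.0, 5.0, 1.0, 5.0, 1.0, 5.0 };
    double tmp[2 * LENGTH];

    interleave(real_vec, real_vec, tmp, LENGTH);   /* 2, 2, 3, 3, 4, 4 */
    flat_mul_inplace(tmp, cplx_flat, 2 * LENGTH);  /* 2, 10, 3, 15, 4, 20 */
    return matches_wanted(cplx_flat);
}
```

Here first_attempt_matches() returns 0 and corrected_attempt_matches() returns 1. Note that the corrected version needs a temporary buffer of 2 * length doubles, so the duplication step is extra memory traffic that should be included in any timing comparison.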