Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Question about best performance using the IPP CrossCorrNorm function

DoS
Beginner

Hi,

 

I have a question about the CrossCorrNorm function, which is well documented here.

 

1. Is it true that using different RAM types affects the computing speed of this function?

 

For example.

Using Intel i9-12900K

DDR4 64GB 3200 MHz CL18   vs  DDR5 64GB 4800MHz CL38

 

In the example above, will the computation be faster if I use DDR5 memory?

 

 

2. Which Intel processor has the fastest computing speed per thread?

 

 

ShanmukhS_Intel
Moderator

Hi,

 

Thank you for posting on Intel Communities.

 

We are working on your issue internally. We will get back to you soon with an update.

 

Best Regards,

Shanmukh.SS

 

Abhinav_S_Intel
Moderator

We can answer your questions better if you provide some more information: the input image size, the input template size, the data type, and which function options are used, that is, the algorithm (auto, FFT, or direct) and whether the correlation is valid or full. Please post as much information as you can with the query.

DoS
Beginner
Hi Abhinav_S,

thanks for your answer.

I use:

function = ippsCrossCorrNorm_32f
method = ippAlgFFT
mode =  ippiROIValid
normmode = ippiNorm
data type = float32
values for input image and template = random from 0 to 20


As for the templates, their sizes can be generated in Python like this:
```
ref_template =  [np.asarray(np.random.randint(5000, 50000)) for _ in range(50000)]
```

I attached a txt file with the template sizes I generated (one size per line).

You can load it and generate the values of each template this way:
```
with open(txt_file) as f:
    lines = f.read().splitlines()
ref_templ =  [np.random.randint(low=0, high=20, size=int(ref_size)).astype(np.float32) for ref_size in lines]
```


input image:
```
input_image = np.random.randint(low=0, high=20, size=(500000)).astype(np.float32)
```
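For checking results on a small case, the setup above can be mirrored with a plain NumPy reference. This is only a sketch: `ncc_valid` is a hypothetical helper name, and it assumes that the normalization divides each lag by the product of the L2 norms of the template and of the signal window under it. It is not IPP's implementation, and it uses direct correlation, so it is only practical for short inputs.

```python
import numpy as np

def ncc_valid(signal, template):
    """Normalized cross-correlation over 'valid' lags (hypothetical reference helper)."""
    signal = np.asarray(signal, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    m = template.size
    # Raw cross-correlation at every lag where the template fully overlaps the signal.
    numerator = np.correlate(signal, template, mode="valid")
    # Energy of each signal window of length m, computed with a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(signal * signal)))
    window_energy = csum[m:] - csum[:-m]
    denominator = np.sqrt(window_energy * np.dot(template, template))
    # Leave the result at zero where the denominator is zero.
    result = np.zeros_like(numerator)
    np.divide(numerator, denominator, out=result, where=denominator > 0)
    return result

# Small example: the template is cut out of the signal at offset 100.
rng = np.random.default_rng(0)
small_signal = rng.integers(0, 20, size=1000).astype(np.float32)
small_template = small_signal[100:150].copy()
scores = ncc_valid(small_signal, small_template)
best_lag = int(np.argmax(scores))
```

The output has `len(signal) - len(template) + 1` entries, one per valid lag, and the score at the offset the template was cut from is 1 up to rounding.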
Abhinav_S_Intel
Moderator

How many threads are effective has to be analyzed for each particular combination of {image size, template size, data type}. The function is not threaded in the main IPP library, but a threaded implementation is available in the TL (Threading Layer) libraries. The number of threads depends on the ratios image.width/tpl.width and image.height/tpl.height: the higher the ratio, the more threads can be used. We can't say exactly how many threads can be used.

 

Whether the memory type (memory speed) matters depends on the load size (the input images). If the data does not fit into about half of the LLC (last-level cache), there will be a lot of memory-to-memory operations, so the higher the memory speed relative to the CPU frequency, the faster the execution.
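As a rough illustration of the half-of-LLC rule of thumb, the sketch below adds up the bytes of the source, template, and valid-mode output buffers and compares the total with half of an assumed cache size. The helper names and the 30 MB LLC figure (used here for the i9-12900K) are assumptions for the example, and any internal work buffers the FFT algorithm allocates are not counted.

```python
def working_set_bytes(src_len, tpl_len, itemsize=4):
    """Bytes held by source, template, and valid-mode output (float32 by default)."""
    dst_len = src_len - tpl_len + 1  # number of valid lags
    return (src_len + tpl_len + dst_len) * itemsize

def fits_in_half_llc(src_len, tpl_len, llc_bytes, itemsize=4):
    """True if the three buffers together stay within half of the last-level cache."""
    return working_set_bytes(src_len, tpl_len, itemsize) <= 0.5 * llc_bytes

# Assumed LLC size for the example; check the value for the actual CPU.
LLC_BYTES = 30 * 1024 * 1024

# Sizes from the question: 500000-sample input, templates of up to 50000 samples.
single_call_bytes = working_set_bytes(500_000, 50_000)
single_call_fits = fits_in_half_llc(500_000, 50_000, LLC_BYTES)
```

By this count a single call touches about 4 MB, which is below half of the assumed cache. Internal FFT buffers and the full set of 50000 templates are not included, so the real footprint may be considerably larger.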
