Intel® oneAPI Math Kernel Library

Data thinning



Can someone recommend routine(s) that would help with data thinning based on the Euclidean distance between points? The locations may be given as, e.g., latitudes and longitudes.




I don't know of any such routines, but it appears to me that no special routine is needed. Please expand on what you mean by "data thinning", and why you wish to use Euclidean distance instead of geodesic distance.

If you really wish to use Euclidean distance, use the elementary transformation from spherical polar coordinates to rectangular coordinates. If you only wish to rank items by distance, the squared distance (distance^2) is equally good and more convenient for sorting, since it avoids the square root.
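As a rough illustration of the transformation and of ranking by squared distance, here is a minimal Python sketch. It assumes a unit sphere and input in degrees; the function names (to_unit_vector, squared_chord, rank_by_distance) are made up for this example and are not MKL routines.

```python
import math

def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to (x, y, z) on the unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def squared_chord(p, q):
    """Squared straight-line (chord) distance between two 3-D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def rank_by_distance(ref, points):
    """Sort (lat, lon) pairs by squared chord distance from ref; no sqrt needed."""
    r = to_unit_vector(*ref)
    return sorted(points, key=lambda pt: squared_chord(r, to_unit_vector(*pt)))
```

Because the chord length grows monotonically with the great-circle angle, sorting by the squared chord gives the same order as sorting by the true distance on the sphere.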

If the distances are large (relative to the radius of the sphere), you really should use the great circle distance, since the flat-earth approximation is quite inaccurate. See the Wikipedia article on the haversine formula. Again, if ranking is the goal, it may be sufficient to rank by sin^2(distance/diameter) instead of the distances. You may also consider pre-computing and storing a table of haversines.
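A minimal Python sketch of the haversine formula follows, assuming a spherical Earth with a mean radius of 6371 km (my assumption, not something stated in this thread). The quantity returned by haversine_term equals sin^2(distance/diameter), so it can be used directly as a sort key; the function names are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_term(lat1, lon1, lat2, lon2):
    """Return sin^2(d / (2R)) for two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    return (math.sin(dphi / 2) ** 2
            + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance in km via the haversine formula."""
    h = min(1.0, haversine_term(lat1, lon1, lat2, lon2))  # guard against rounding
    return 2.0 * radius_km * math.asin(math.sqrt(h))
```

For ranking alone, haversine_term is enough, because it increases monotonically with the great-circle distance and skips the asin and sqrt calls.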

The distance calculation is more complicated if you wish to account for deviations from sphericity.


Thanks mecej4 for your response.

Yes, I will use the geodesic distance but wanted to make it simple since I thought more people would understand what I mean.

The thinning is for satellite observations, which might be available at very high spatial resolution (i.e. at very small distances from each other), but such density is not required for certain applications. So I am looking for code that would retain a sample of these observations. E.g. the data come in pixels that are 1 km apart, but keeping just pixels that are 50 km apart is sufficient for this purpose. There are some thinning algorithms, but writing my own code that includes parallel processing can be quite tedious. So I am looking for a shortcut: existing code, possibly written for a similar purpose. Thanks for any suggestions.
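To make the request concrete, one simple form of distance-based thinning is a greedy pass that keeps an observation only if it lies at least a minimum separation from every observation already kept. The Python sketch below is serial and illustrative only: it assumes a spherical Earth of radius 6371 km, and the names thin_by_distance and great_circle_km are hypothetical, not taken from MKL or any existing package.

```python
import math

def great_circle_km(a, b, radius_km=6371.0):
    """Haversine distance in km between two (lat, lon) pairs in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlam = math.radians(b[1] - a[1])
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(min(1.0, h)))

def thin_by_distance(points, min_sep_km):
    """Greedy thinning: keep a point only if it is at least min_sep_km
    away from every point kept so far. The result depends on input order."""
    kept = []
    for p in points:
        if all(great_circle_km(p, q) >= min_sep_km for q in kept):
            kept.append(p)
    return kept

# Example: points every 0.1 degree (about 11 km) along the equator.
track = [(0.0, i / 10) for i in range(11)]
sample = thin_by_distance(track, 50.0)
```

Each candidate is compared against all kept points, so the cost grows as (number of points) x (number kept). For large swaths, binning the points into grid cells of roughly the separation distance and checking only neighbouring cells would cut this down, and independent tiles could then be processed in parallel.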