- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I'm currently working on the CLPP library (http://code.google.com/p/clpp/) and mainly on the CPU implementation of the 'sort' algorithm.
I currently have a 'special' sort algorithm for the CPU only but after launching the benchmark I have
see that the std::sort algorithm is faster than the OpenCL one.
So, it simply mean that it remain some work to have a really fast sort on the CPU. I would like to know if the Intel Engineers has some councils to give, some papers to reference etc...
An important part would be use this algorithm on the Sandy Bridge too !
Thanks
Krys
Link Copied
2 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
Which sorting algorithm do you use? There's a lot of literature abouve parallel sorting algorithms and their relative merits.
Which sorting algorithm do you use? There's a lot of literature abouve parallel sorting algorithms and their relative merits.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
for SIMD machines (like CPUs) BitonicSort is typically used. Try one from Intel OCL SDK, there is a dedicated sample there.
Also from the general point, the specific sorting algo to select should match the task, e.g. BubbleSort is fastest approachif the input sequence is almost sorted, demostrating O(N). While the same BubbleSort is the sloweset with O(N*N), when the input sequence is simply inversed.
From this side the Bitonic exhibits the same garaunteed complexity of n*log(n) and fits most tasks.
for SIMD machines (like CPUs) BitonicSort is typically used. Try one from Intel OCL SDK, there is a dedicated sample there.
Also from the general point, the specific sorting algo to select should match the task, e.g. BubbleSort is fastest approachif the input sequence is almost sorted, demostrating O(N). While the same BubbleSort is the sloweset with O(N*N), when the input sequence is simply inversed.
From this side the Bitonic exhibits the same garaunteed complexity of n*log(n) and fits most tasks.
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page