I wanted to ask for feedback on an emulation of the VP2INTERSECT instructions:
https://arxiv.org/abs/2112.06342
The emulation is faster than the native instructions when only one of the output masks is returned. I consider the following three applications of VP2INTERSECT instructions:
1. computing the intersection (common elements) of two arrays of integers (whether sorted or unsorted),
2. computing the size of the intersection of two arrays of integers,
3. removing common elements from two arrays of integers.
Only application 3 requires both output masks; applications 1 and 2 need just one.
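To make the mask usage concrete, here is a minimal scalar sketch in C. It is only a reference model of the two output masks, not the emulation from the paper and not real AVX-512 code. The function names (`p2intersect_model`, `intersect_block`, `intersect_size_block`, `remove_common_block`) are hypothetical, the block size is capped at 16 lanes, and the size computation assumes no duplicates within `a`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the two masks (hypothetical helper):
   bit i of *k1 is set if a[i] equals some b[j],
   bit j of *k2 is set if b[j] equals some a[i]. */
void p2intersect_model(const int32_t *a, const int32_t *b, size_t n,
                       uint16_t *k1, uint16_t *k2)
{
    assert(n <= 16); /* one mask bit per lane, at most 16 lanes */
    uint16_t m1 = 0, m2 = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (a[i] == b[j]) {
                m1 |= (uint16_t)(1u << i);
                m2 |= (uint16_t)(1u << j);
            }
        }
    }
    *k1 = m1;
    *k2 = m2;
}

/* Application 1: the intersection only needs k1 (keep matching lanes of a). */
size_t intersect_block(const int32_t *a, const int32_t *b, size_t n,
                       int32_t *out)
{
    uint16_t k1, k2;
    p2intersect_model(a, b, n, &k1, &k2);
    (void)k2; /* second mask is not needed here */
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (k1 & (1u << i))
            out[count++] = a[i];
    return count;
}

/* Application 2: the size only needs the number of set bits in k1
   (assuming a has no duplicate values). */
size_t intersect_size_block(const int32_t *a, const int32_t *b, size_t n)
{
    uint16_t k1, k2;
    p2intersect_model(a, b, n, &k1, &k2);
    (void)k2;
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (k1 >> i) & 1u;
    return count;
}

/* Application 3: removing common elements from both arrays needs
   k1 (to filter a) and k2 (to filter b). */
void remove_common_block(const int32_t *a, const int32_t *b, size_t n,
                         int32_t *a_out, size_t *na,
                         int32_t *b_out, size_t *nb)
{
    uint16_t k1, k2;
    p2intersect_model(a, b, n, &k1, &k2);
    *na = 0;
    *nb = 0;
    for (size_t i = 0; i < n; i++) {
        if (!(k1 & (1u << i)))
            a_out[(*na)++] = a[i];
        if (!(k2 & (1u << i)))
            b_out[(*nb)++] = b[i];
    }
}
```

In this model the first two functions discard `k2` entirely, which is the situation where an emulation returning a single mask would suffice.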
Since the instructions are named VP2INTERSECT, I presume that the main application is the first one (possibly the second), in which case a fast emulation could be useful.
But I may be wrong, so I would like to ask: are the first two cases above (computing the intersection, or the size of the intersection, of two arrays of integers) the intended, or expected most frequent, use cases for these instructions?
Thank you.