Sounds reasonable. I did not really expect to find such an operation either. I thought that there might be a routine that could, e.g., load a 32-bit value from 4 memory locations at the same time, but that might not be that easy to implement. Doing this with a regular for loop works OK though.
What I am doing is this:
iKhat = (int)floor(pTi
iKhat = max(0,iKhat); iKhat = min(iKhat,pLength-2);
iTfractional = pTi
= iCoefs[iKhat+1]*iTfractional + iCoefs[iKhat]*(1.0f-iTfractional);
The flooring, max/min, subtraction, addition and multiplication can all be done using SIMD, but it's the retrieving of the coefs from the remapped positions kHat which is not possible to do using some kind of SIMD. But this might not degrade performance that much.