Hmm ... I have just found a simple solution for operator "==".
bool operator == (const F32vec4 &a)
{
    F32vec4 v = _mm_cmpeq_ps(vec, a);
    return v.x && v.y && v.z && v.w;
}
I don't know if it's much more efficient than using ordinary C code.
I guess doing just
bool operator == (const F32vec4 &a)
{
    return x == a.x && y == a.y && z == a.z && w == a.w;
}
is probably faster & simpler, but finding the SIMD way was educational :).
4 Replies
The former case (v.x && v.y && v.z && v.w) works on lanes that each hold either 0. or Junk.Junk (_mm_cmpeq_ps sets a lane to all one bits on a match, which is a NaN bit pattern when read as a float). The return is therefore effectively either
return 0.;
or
return Junk.Junk;
But the return type is bool, so an implicit conversion will be attempted where 0. implies FALSE and Junk.Junk implies TRUE. IMHO the first operator does NOT return what you want it to.
You are on the right track, though, in that you want to perform the 4xDWORD xor and then test for all 0's.
Jim
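The "4xDWORD xor, then test for all 0's" idea could be sketched roughly as follows. This is only an illustration under assumptions: it works on raw __m128 values rather than the thread's vector class, it needs SSE2 for the integer compare, and the names bitwise_equal and bitwise_equal_demo are made up for this example. Note that it tests bit-for-bit equality, which is not quite the same as float equality: +0.0 and -0.0 compare equal as floats but differ in their sign bit.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2; also pulls in the SSE float intrinsics

// Hypothetical helper: true when all 128 bits of a and b are identical.
inline bool bitwise_equal(__m128 a, __m128 b)
{
    // XOR leaves a zero bit wherever the two inputs agree.
    __m128i diff = _mm_castps_si128(_mm_xor_ps(a, b));
    // Every 32-bit lane that is entirely zero becomes all ones.
    __m128i lanes_zero = _mm_cmpeq_epi32(diff, _mm_setzero_si128());
    // One bit per byte: 0xFFFF means all four lanes were zero.
    return _mm_movemask_epi8(lanes_zero) == 0xFFFF;
}

inline void bitwise_equal_demo()
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);
    __m128 b = _mm_setr_ps(1.0f, 0.0f, 3.0f, 4.0f);
    assert(bitwise_equal(a, a));
    assert(!bitwise_equal(a, b));
    // Signed zeros are numerically equal but not bitwise equal.
    assert(!bitwise_equal(_mm_set1_ps(0.0f), _mm_set1_ps(-0.0f)));
}
```

If float semantics matter (signed zeros, NaN), a compare intrinsic such as _mm_cmpeq_ps is probably the better starting point than a raw xor.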
Yeah, you are right, Jim.
Vectors whose values were all equal or all different yielded the correct result.
However, vectors where some values are equal and some are not gave the wrong result (for example, v1 = 1,0,3,1 and v2 = 1,3,4,1 would give a bad result).
I have changed the comparison to be
inline bool _vec4f::operator == (const _vec4f &vec)
{
    _vec4f v = _mm_cmpeq_ps(vec, m);
    return (int)v.x & (int)v.y & (int)v.z & (int)v.w;
}
or, in the case of the != operator:
inline bool _vec4f::operator != (const _vec4f &vec)
{
    _vec4f v = _mm_cmpneq_ps(vec, m);
    return (int)v.x | (int)v.y | (int)v.z | (int)v.w;
}
which seems to give the correct result in all cases.
However, that code doesn't seem like something that benefits from SSE optimization :).
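One caveat with the (int) casts above: a matching lane holds an all-ones mask, which is a NaN when read as a float, and converting NaN to int is undefined behavior in C++. A possible alternative, sketched here under assumptions, is to read the lanes as integer bit patterns instead of converting their float values. The helper names (compare_lanes, all_lanes_equal, compare_lanes_demo) are invented for this example, and it uses raw __m128 values with SSE2 rather than the _vec4f class.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2; also pulls in the SSE float intrinsics

// Hypothetical helper: copy the four comparison lanes out as 32-bit integers.
inline void compare_lanes(__m128 a, __m128 b, std::uint32_t out[4])
{
    __m128 eq = _mm_cmpeq_ps(a, b);  // per lane: all ones if equal, else 0
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_castps_si128(eq));
}

// All four lanes matched only if the AND of the masks is still all ones.
inline bool all_lanes_equal(__m128 a, __m128 b)
{
    std::uint32_t m[4];
    compare_lanes(a, b, m);
    return (m[0] & m[1] & m[2] & m[3]) == 0xFFFFFFFFu;
}

inline void compare_lanes_demo()
{
    // The mixed case from the post: v1 = 1,0,3,1 and v2 = 1,3,4,1.
    __m128 v1 = _mm_setr_ps(1.0f, 0.0f, 3.0f, 1.0f);
    __m128 v2 = _mm_setr_ps(1.0f, 3.0f, 4.0f, 1.0f);
    std::uint32_t m[4];
    compare_lanes(v1, v2, m);
    assert(m[0] == 0xFFFFFFFFu && m[1] == 0u);
    assert(m[2] == 0u && m[3] == 0xFFFFFFFFu);
    assert(!all_lanes_equal(v1, v2));
    assert(all_lanes_equal(v1, v1));
}
```

This still goes through memory to inspect the lanes, so it is meant to show what the comparison produces rather than to be fast.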
Message Edited by memory_leak on 11-05-2005 03:09 PM
Message Edited by memory_leak on 11-05-2005 03:10 PM
Message Edited by memory_leak on 11-05-2005 03:11 PM
For Christ's sake, where did those smileys come from??? :D
Use _mm_movemask_ps, which creates a 4-bit mask from the comparison result (one bit per lane). All four lanes compared equal only when every bit is set, so you can then check the mask like this:
if(_mm_movemask_ps(_mm_cmpeq_ps(v, zero)) == 0xF)
    return true;
else
    return false;
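Applied to the operators discussed earlier in the thread, the movemask approach might look something like this. It is a minimal sketch, assuming a simple wrapper struct: Vec4 and its member m are hypothetical stand-ins for the thread's _vec4f class, whose full definition is not shown.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: _mm_cmpeq_ps, _mm_cmpneq_ps, _mm_movemask_ps

// Hypothetical stand-in for the thread's vector class.
struct Vec4 { __m128 m; };

// Equal only if all four lanes compare equal (all four mask bits set).
inline bool operator==(const Vec4& a, const Vec4& b)
{
    return _mm_movemask_ps(_mm_cmpeq_ps(a.m, b.m)) == 0xF;
}

// Different if at least one lane compares unequal (any mask bit set).
inline bool operator!=(const Vec4& a, const Vec4& b)
{
    return _mm_movemask_ps(_mm_cmpneq_ps(a.m, b.m)) != 0;
}
```

As with scalar floats, a lane holding NaN never compares equal, so a vector containing NaN is != to itself under this definition.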