As hinted here: https://github.com/MetaScale/nt2/commit/f398ddb886cd4c9526276431dcadbeb066c9fd00
Code using _mm256_unpacklo_pd/_mm256_unpackhi_pd in conjunction with _mm256_permute2f128_pd is miscompiled. Moreover, the exact same code pattern, used elsewhere with a different constant mask for the permute, works correctly. The workaround using volatile of course underperforms.
The same code compiles and generates correct code on gcc and clang.
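For anyone trying to reproduce this, here is a minimal sketch of the kind of pattern involved. It is not the code from the linked commit: the interleave shape, the masks 0x20 and 0x31, and every function name below are assumptions chosen for illustration. It pairs a scalar model of the three intrinsics (following their documented lane semantics) with the intrinsic version, so the vector result can be checked against the scalar one on any compiler.

```c
#include <assert.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Scalar model of _mm256_unpacklo_pd: {a0, b0, a2, b2}. */
static void ref_unpacklo_pd(const double a[4], const double b[4], double out[4]) {
    out[0] = a[0]; out[1] = b[0]; out[2] = a[2]; out[3] = b[2];
}

/* Scalar model of _mm256_unpackhi_pd: {a1, b1, a3, b3}. */
static void ref_unpackhi_pd(const double a[4], const double b[4], double out[4]) {
    out[0] = a[1]; out[1] = b[1]; out[2] = a[3]; out[3] = b[3];
}

/* Scalar model of _mm256_permute2f128_pd. Each 4-bit control field picks a
   128-bit half (0 = a.lo, 1 = a.hi, 2 = b.lo, 3 = b.hi); bit 3 zeroes it. */
static void ref_permute2f128_pd(const double a[4], const double b[4],
                                int imm, double out[4]) {
    const double *halves[4] = { a, a + 2, b, b + 2 };
    assert(out != a && out != b); /* the model does not support aliasing */
    for (int half = 0; half < 2; ++half) {
        int ctl = (imm >> (4 * half)) & 0xF;
        const double *src = halves[ctl & 3];
        out[2 * half]     = (ctl & 8) ? 0.0 : src[0];
        out[2 * half + 1] = (ctl & 8) ? 0.0 : src[1];
    }
}

/* Hypothetical interleave built from the pattern described in the post. */
static void ref_interleave(const double a[4], const double b[4],
                           double first[4], double second[4]) {
    double lo[4], hi[4];
    ref_unpacklo_pd(a, b, lo);
    ref_unpackhi_pd(a, b, hi);
    ref_permute2f128_pd(lo, hi, 0x20, first);  /* {a0, b0, a1, b1} */
    ref_permute2f128_pd(lo, hi, 0x31, second); /* {a2, b2, a3, b3} */
}

#ifdef __AVX__
/* Same pattern with the real intrinsics; only built when AVX is enabled. */
static void avx_interleave(const double a[4], const double b[4],
                           double first[4], double second[4]) {
    __m256d va = _mm256_loadu_pd(a);
    __m256d vb = _mm256_loadu_pd(b);
    __m256d lo = _mm256_unpacklo_pd(va, vb);
    __m256d hi = _mm256_unpackhi_pd(va, vb);
    _mm256_storeu_pd(first,  _mm256_permute2f128_pd(lo, hi, 0x20));
    _mm256_storeu_pd(second, _mm256_permute2f128_pd(lo, hi, 0x31));
}
#endif

static int same4(const double x[4], const double y[4]) {
    for (int i = 0; i < 4; ++i)
        if (x[i] != y[i]) return 0;
    return 1;
}

/* Returns 1 when the scalar model gives the expected lanes and, if AVX is
   enabled at compile time, the intrinsic version agrees with it. */
int check_interleave(void) {
    const double a[4] = {0, 1, 2, 3}, b[4] = {10, 11, 12, 13};
    const double want_first[4]  = {0, 10, 1, 11};
    const double want_second[4] = {2, 12, 3, 13};
    double first[4], second[4];
    ref_interleave(a, b, first, second);
    int ok = same4(first, want_first) && same4(second, want_second);
#ifdef __AVX__
    double vfirst[4], vsecond[4];
    avx_interleave(a, b, vfirst, vsecond);
    ok = ok && same4(vfirst, first) && same4(vsecond, second);
#endif
    return ok;
}
```

Calling check_interleave() from main with AVX enabled and optimizations on should return 1 on a compiler that handles the pattern correctly. Whether this particular shape triggers the bug reported here is untested; it may take the exact masks and surrounding code from the linked commit.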