Sorry, I erroneously posted this on the HPC forum, but I think this is the right section.
Dear Intel developers,
I'm trying to cache-align an array of structs defined as follows:
[bash]struct complex_32 {
    float32 r;
    float32 i;
};
typedef struct complex_32 complex32;

static complex32 **traces;

/* array of row pointers */
traces = (complex32 **)malloc(*num_elems * sizeof(complex32 *));
/* one row of samples per element */
for (i = 0; i < *num_elems; i++)
    traces[i] = (complex32 *)malloc(*num_samples * sizeof(complex32));[/bash]
I want 16-byte alignment. What is the right syntax using __declspec(align(16))?
Currently, when I use _mm_malloc instead of malloc, the code crashes on the first _mm_load_ps intrinsic.
Thanks in advance for the help.
5 Replies
__declspec affects only variables created on the stack or at global scope. When you allocate memory with malloc(), the function has no idea what the memory is for, so __declspec does not affect it.
Normally, though, malloc returns pointers to blocks that are 16-byte aligned.
Using _mm_malloc in place of your first malloc, where you allocate the array of POINTERS, is not a good idea. The second malloc can probably be replaced, though. I cannot say for sure because I haven't found _mm_malloc in my help files.
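To illustrate the suggestion above, here is a minimal sketch that keeps plain malloc for the array of pointers and uses _mm_malloc only for the rows that will be read with _mm_load_ps. It assumes an x86 compiler that provides _mm_malloc and _mm_free through <xmmintrin.h>, and it uses plain float in place of the poster's float32 typedef. The helper names alloc_traces, free_traces, is_aligned16 and sum_first_two are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <xmmintrin.h> /* _mm_malloc, _mm_free, _mm_load_ps, _mm_storeu_ps */

/* Assumption: float32 in the original post is a typedef for float. */
typedef struct { float r; float i; } complex32;

/* Nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Allocate num_elems rows of num_samples values; every row is
 * 16-byte aligned. Returns NULL on failure. */
static complex32 **alloc_traces(size_t num_elems, size_t num_samples)
{
    /* The pointer array itself needs no special alignment. */
    complex32 **traces = malloc(num_elems * sizeof *traces);
    if (traces == NULL)
        return NULL;

    for (size_t i = 0; i < num_elems; i++) {
        traces[i] = _mm_malloc(num_samples * sizeof(complex32), 16);
        if (traces[i] == NULL) {
            /* Undo the rows allocated so far. */
            while (i > 0)
                _mm_free(traces[--i]);
            free(traces);
            return NULL;
        }
        assert(is_aligned16(traces[i]));
    }
    return traces;
}

/* Rows come from _mm_malloc, so they must go back through _mm_free. */
static void free_traces(complex32 **traces, size_t num_elems)
{
    for (size_t i = 0; i < num_elems; i++)
        _mm_free(traces[i]);
    free(traces);
}

/* Aligned load of the first two complex values (four floats) of a row.
 * The row must be 16-byte aligned and hold at least two elements. */
static float sum_first_two(const complex32 *row)
{
    __m128 v = _mm_load_ps((const float *)row);
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

Note that only the start of each row is guaranteed to be aligned. An aligned load at an odd element index, for example &traces[i][1], lands 8 bytes past a 16-byte boundary and would fault.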
Hi archie,
thanks for the reply. My matrix has global scope. You said that malloc returns pointers that are already aligned. So when do I have to use _mm_malloc instead of standard malloc?
It's not a question of scope; it's about how you create an object of the class (structure). If you create the object as a variable, __declspec will work. For example:
complex_32 _i_ = { 0.0, 1.0 };
will use __declspec, because the compiler knows the type of the object being created.
When you use void* malloc(size_t), the function cannot know what it is allocating memory for. It just allocates it.
About _mm_malloc, as I said, I know nothing, so I just made a wild guess based on its name.
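As a rough illustration of the point about declared variables, the sketch below requests 16-byte alignment on a global and on a stack object and checks the resulting addresses. The ALIGN16 macro is an assumption of this example: it expands to __declspec(align(16)) on Microsoft-compatible compilers and to the GCC/Clang aligned attribute elsewhere. The names global_i, is_aligned16 and stack_object_is_aligned are likewise made up, and float stands in for float32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portability macro, not part of the original post. */
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 __attribute__((aligned(16)))
#endif

struct complex_32 { float r; float i; };

/* A declared object: the compiler places it, so it can honor the request. */
static ALIGN16 struct complex_32 global_i = { 0.0f, 1.0f };

/* Nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* The same request on a stack variable. */
static int stack_object_is_aligned(void)
{
    ALIGN16 struct complex_32 local = { 0.0f, 1.0f };
    return is_aligned16(&local);
}
```

Nothing in this declaration reaches malloc, which only receives a byte count. Heap blocks need an aligned allocator or a manual adjustment of the pointer.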
Beware that you can't count on malloc returning 16-byte-aligned memory; see for example this thread:
http://software.intel.com/en-us/forums/showthread.php?t=74181&o=d&s=lr
I would never blindly rely on any assumptions about malloc. I only said that it tends to return aligned pointers, and I understand that this can change at any time. Anyway, it's trivial to request 16 bytes more than necessary, check the alignment, and adjust the pointer if necessary.
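The over-allocate-and-adjust approach described above could look roughly like the following sketch. The function name malloc_aligned16 and its raw out-parameter are assumptions of this example. The caller has to keep the raw pointer, because that is the one that must be passed to free().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes plus 16 spare bytes and return the first
 * 16-byte-aligned address inside the block. The pointer actually
 * returned by malloc is stored in *raw and must be given to free(). */
static void *malloc_aligned16(size_t size, void **raw)
{
    assert(raw != NULL);

    *raw = malloc(size + 16);
    if (*raw == NULL)
        return NULL;

    /* Round the address up to the next multiple of 16. If malloc
     * already returned an aligned block, the address is unchanged. */
    uintptr_t addr = ((uintptr_t)*raw + 15u) & ~(uintptr_t)15u;
    return (void *)addr;
}
```

A typical call would be float *row = malloc_aligned16(n * sizeof(float), &raw); followed later by free(raw). Passing the adjusted pointer to free() would be undefined behavior whenever the two pointers differ.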