James,
The inline assembly syntax is really a property of the native compiler, not of the Intel architecture. You should consult the Microsoft docs for inline assembly details on Windows, and the GCC docs (AT&T assembly syntax) for gcc assembly.
The Intel Binary Compatibility Specification defines which registers must be saved by a called function; I don't believe the xmm registers are covered there. With GCC assembly it is possible to specify constraints, and it may be possible to tell the compiler that particular xmm registers are clobbered (I have never tried this, which is why I say "may").
Another option is to use intrinsics, see xmmintrin.h and emmintrin.h in our include directory (/opt/intel_cc_80/include). Using intrinsics, you get the benefits of inline asm, but allow the compiler to be aware of the registers in use.
Max
Max,
Thanks for the reply. Unfortunately, I don't have access to current Microsoft assembler documentation (I work on Linux clusters these days, and am of course using the Intel compiler), and while I have written a good bit of assembly, that stopped about the time the 486 came out. As for Gnu assembly... well, it's like anything else those people do when not constrained by the necessity of matching an existing interface: totally incomprehensible.
I wouldn't mind translating the code into intrinsics, IF I understood what it's doing. I agree with you on the benefits, but I don't see how to do it without some understanding of the code. I've had a message posted here for several weeks, requesting background or source for the algorithm, but have gotten no useful response.
In any case, my question really isn't so much about syntax - I can figure that out with a bit of work - but about the most efficient way to save the xmm regs. I don't mind tweaking the existing assembly to save the xmm regs, but how should I do it? Can they be PUSHed & POPped, or moved to stack or temp space somehow? Is there some documentation that covers this sort of thing?
Thanks,
James
The Microsoft ABI for AMD64 (I assume Intel is working to change that designation) specifies that xmm registers are included in those which are automatically saved and restored, while the x87 fp registers are not.
Now that you say that you are working on Linux, it could make a difference which kernel and glibc you are using. I think that current 2.4.xx kernels should accomplish this, and certainly all x86-64 kernels will. The kernels would have to be built with a gcc which supports the xmm registers correctly, which rules out 2.9x and some early 3.0 versions, I think. Evidently, if you go back far enough, you will find kernels and gcc versions which were totally unaware of xmm.
Wouldn't that automatic saving only apply to task switching? This is just for a simple function call - I don't really see why (or how) the kernel would be involved.
That's all the more so because the code in the function is only about 50 instructions, hence should execute in only several tens of nanoseconds per call, where task switching granularity would, IIRC, be on the order of milliseconds.
James,
To confirm: you are using inline assembly with our Linux compiler. Correct?
With gcc-style inline assembly there is a way of specifying input and output registers, and thus of telling the compiler which registers need to be saved before entry into the inline asm block (which I believe is the core of what you want to know).
Can you post a snippet of your code, preferably code that compiles and provides some indication that it either worked or didn't work? If you do that, I can get back to you with syntax for the input-output section using xmm registers.
Thanks,
Max
James,
I played around with this for fun and have a program that shows the idea in regard to gcc style assembly.
Compile the program below and run it.
icc foo4.cpp; a.out
icc -DWORKS foo4.cpp; a.out
You will see different behavior because the xmm register gets overwritten. Take a look at the program and let me know if you have any questions. For more details on gnu style assembly see www.ibiblio.org/ldp/GCC-Inline-Assembly-HOWTO.html#s6
Max
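#include <stdio.h>   /* printf */
#include <fvec.h>    /* F32vec4, add_horizontal (Intel C++ class library) */

inline int foobar(void)
{
    int r1 = 0;   /* dummy output operand; the asm never writes it */
    __m128 y = _mm_set_ps(1.0, 1.0, 1.0, 1.0);

    /* Overwrite xmm0..xmm6 with the contents of y. */
    __asm ("movups %1, %%xmm0\n\t"
           "movups %1, %%xmm1\n\t"
           "movups %1, %%xmm2\n\t"
           "movups %1, %%xmm3\n\t"
           "movups %1, %%xmm4\n\t"
           "movups %1, %%xmm5\n\t"
           "movups %1, %%xmm6\n\t"
           : "=r" (r1)
           : "x" (y)
#ifdef WORKS
           /* Full clobber list: the compiler knows xmm0..xmm6 are overwritten. */
           : "%xmm0", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "%xmm5", "%xmm6"
#else
           /* Incomplete clobber list: xmm1..xmm6 are overwritten behind the compiler's back. */
           : "%xmm0"
#endif
          );
    return r1;
}

int main()
{
    float value = 0.0;
    F32vec4 y;

    y = F32vec4(1.0, 2.0, 3.0, 4.0);
    foobar();
    value = add_horizontal(y);
    printf("%f\n", value);
    return 0;
}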
Message Edited by mjdomeik on 04-22-2004 11:25 AM
Max,
Yes, that's correct. Inline assembly - standard format, not Gnu - with the Linux compiler. Version 8.
The code is on my home machine, so I can't post it now. (And I can't access this site from home. Because of Intel's Internet Explorer only block on access, I have to come to the lab and borrow the secretary's Windoze machine.) I'll work up something over the weekend, and post it Monday.
Well, that was interesting :-) Right in the middle of typing my message, IE thinks I've written enough, and decides to post it. Yet another reason I hate Windoze.
Anyway, the test program looks like
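#include <stdio.h>      /* printf */
#include <xmmintrin.h>  /* __m128 */
#include "AMaths.h"     /* am_exp_ps; header name assumed */

int main ()
{
    __m128 X, Y;
    float *xp, *yp;
    xp = (float *) &X;
    yp = (float *) &Y;
    /* Set all four lanes of X to 1.0. */
    *(xp) = *(xp+1) = *(xp+2) = *(xp+3) = 1.0;
    Y = am_exp_ps (X);
    printf ("%f, %f, %f, %f\n", *(yp), *(yp+1), *(yp+2), *(yp+3));
    return 0;
}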
When linked with AMaths.c, it returns approximately correct values.
James,
O.k. To recap what you are doing and your options:
You have taken the Approximate Math library and have attempted to port it from Windows to Linux.
I believe you have one of two options:
1. Keep the function call non-inline. Have you confirmed whether or not that works?
2. Define input and output registers in your gcc asm statement. I have provided an example of how to do this.
Please let me know if these two options make sense and which you decide to try.
I will also see if I can ping the author of the library and see if he has any comments.
Regards,
Max
Got sidetracked there for a while, but I think I have found the problem and a fix. The problem is the undocumented (as far as I can find, anyway) "_declspec (naked)" directive in the functions. It seems to cause the code to be compiled without a return, so the function just keeps on going until it hits some other return instruction, by which point the xmm0 register holds some other value.
The asm code had a "ret 16" instruction, but that didn't work for some reason.
Anyway, I think I can get it working right from here.
Thanks,
James
James,
Great to hear you have a route to success!
Max