If you want to ask the compiler for simple unrolling, you can use the pragma (choose your own count):
#pragma unroll(4)
The compiler's own automatic unrolling is more likely to be excessive than conservative. Suitable loops are usually unrolled by 8 already. A likely reason to dictate the unroll factor yourself is to make it divide evenly into an expected loop count. Depending on which architecture you are using, there may be various reasons why excessive unrolling prevents optimization. On an SSE architecture, you want the unrolling to fit the built-in parallelism. The Xeon trace cache also has some automatic unrolling properties; I don't know whether the compiler takes that into account.
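For illustration, here is a minimal sketch of where the pragma would sit relative to a loop. The function name `sum_array` and the choice of a simple reduction are assumptions made for the example, not something taken from the original code. Compilers that do not recognise this pragma spelling will generally ignore it, possibly with a warning.

```c
#include <stddef.h>

/* Hypothetical example: sum n floats.
   The pragma is placed immediately before the loop it applies to and
   asks the compiler to unroll that loop by a factor of 4. It is only a
   request; the result of the function is the same with or without it. */
float sum_array(const float *a, size_t n)
{
    float s = 0.0f;
#pragma unroll(4)
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}
```

Whether the requested factor was actually honoured is best checked in the compiler's optimization report or the generated assembly.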
Yes, I do know the compiler can do automatic, or pragma-specified, unrolling. What I'm doing right now is trying to get a handle on the underlying performance of the code when it's unrolled by a specified amount. In other words, how much benefit do I get if I unroll by 2, 4, 8, or whatever? Since the application I'm working on can have loop counts in the millions, there's room for a lot :-) And I ran across this very strange behavior in the process...
Which brings up an interesting thought: suppose I tell it, via the pragma, to unroll 20 times. Will the compiler merrily convert the unrolling it does to function calls? Stay tuned...
James
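As a rough sketch of the kind of hand-unrolled loop such an experiment compares against the plain version, the following shows an unroll-by-4 dot product with a cleanup loop for trip counts that are not a multiple of 4. The names `dot_rolled` and `dot_unrolled4` and the choice of a dot product are assumptions for illustration; the actual loop in question is not shown in the thread.

```c
#include <stddef.h>

/* Reference version: one element per iteration. */
double dot_rolled(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Hand-unrolled by 4 (hypothetical example).
   Four independent accumulators are used, which changes the order of
   the floating-point additions, so results can differ from the rolled
   version in the last bits for general inputs. */
double dot_unrolled4(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main body: handles the largest multiple of 4 not exceeding n. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Cleanup loop: the remaining n % 4 elements. */
    for (; i < n; ++i)
        s0 += x[i] * y[i];

    return (s0 + s1) + (s2 + s3);
}
```

Timing both versions over the same large arrays gives the unroll-by-4 benefit; the same pattern extends to factors of 2 or 8. When the loop count is known to be a multiple of the unroll factor, the cleanup loop does no work.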