We have released a new whitepaper explaining the architecture of Intel® Processor Graphics Gen7.5, specifically the components involved in running compute applications on Processor Graphics. This is the architecture that supports compute APIs such as OpenCL, DirectX Compute Shader, Renderscript, and C++ AMP.
- You can find it, 3rd link down, on this page: https://software.intel.com/en-us/articles/intel-graphics-developers-guides
- Or more directly, here: https://software.intel.com/en-us/file/compute-architecture-of-intel-processor-graphics-gen7dot5-aug4-2014pdf
We'd be grateful to hear your feedback on the whitepaper contents. Tell us what you think!
regards -stephen
Great job!
I'd be a lot more interested in Intel's OpenCL if:
1) The GPU supported fp64, ideally at a 1:2 performance ratio.
2) The CPU supported a vector fp64 rsqrt.
3) The native CPU and GPU fp64 rsqrt had OpenCL-level numerical precision.
It would be good if there was a document comparing the native double precision (in ulp) of the different hardware (Nvidia, AMD, and Intel).
My perception is that AMD and Nvidia have higher native precision, or at least that they are constantly striving to improve it; e.g., the 290X has "precision improvements to the native LOG and EXP operations" and so on.
My impression is that Intel expects you to buy a software library that has to do many more iterations to achieve the same result.
Maybe this is just a case of insufficient documentation?
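To make the ulp and iteration-count discussion concrete, here is a minimal Python sketch (it needs Python 3.9+ for `math.ulp`). It measures the error of an rsqrt approximation in ulp and shows how Newton-Raphson steps tighten a low-precision starting estimate. Everything here is an assumption for illustration: `crude_rsqrt` is a hypothetical stand-in that keeps about 12 bits of the result, not the behavior of any vendor's native instruction, and the reference value is simply `1 / math.sqrt(x)` in host double precision.

```python
import math


def ulp_error(approx: float, reference: float) -> float:
    """Absolute error of `approx`, expressed in ulps of the double `reference`."""
    return abs(approx - reference) / math.ulp(reference)


def crude_rsqrt(x: float, bits: int = 12) -> float:
    """Hypothetical low-precision estimate of 1/sqrt(x).

    Simulated by rounding the mantissa of the host result to `bits` bits.
    This is an assumption for illustration, not real hardware behavior.
    """
    mantissa, exponent = math.frexp(1.0 / math.sqrt(x))
    scale = 2.0 ** bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def refine_rsqrt(x: float, estimate: float, steps: int) -> float:
    """Newton-Raphson refinement: y <- y * (1.5 - 0.5 * x * y * y)."""
    y = estimate
    for _ in range(steps):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def errors_by_step(x: float, max_steps: int = 3) -> list:
    """ulp error after 0..max_steps refinement steps, starting from crude_rsqrt."""
    reference = 1.0 / math.sqrt(x)
    start = crude_rsqrt(x)
    return [ulp_error(refine_rsqrt(x, start, n), reference)
            for n in range(max_steps + 1)]


if __name__ == "__main__":
    for steps, err in enumerate(errors_by_step(2.0)):
        print(f"{steps} Newton step(s): {err:.3g} ulp")
```

Each Newton step roughly doubles the number of correct bits, so a less precise native estimate costs extra iterations before it reaches full double precision. Running the same kind of measurement against values read back from a real device would give the per-vendor comparison asked for above.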
Stephen,
Your Gen8 presentation and whitepaper are excellent.
The PDFs were posted here: https://intel.activeevents.com/sf14/connect/sessionDetail.ww?SESSION_ID=1312
The Gen8 IGP looks like it has become a truly general-purpose compute platform.
One question: are FP16 FMA operations available? If so, can each EU perform 16 FP16 FMAs per clock?
Architecture, yes; driver not yet.
Intel processor graphics architecture supports 16-bit float FMAs as of Gen8.
Taking OpenCL 1.2 and 2.0 as the API example, the OpenCL C “half” data type for 16-bit floats is currently an “optional” feature, enabled by the Khronos cl_khr_fp16 extension.
Perhaps as feedback for feature prioritization decisions, could you possibly say what OS platform, which compute API, and what kind of device (tablet, laptop, server…) you seek to target with FP16?
Thanks!
regards, -stephen
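Since cl_khr_fp16 is optional, an application has to check for it at run time. A minimal Python sketch of that check follows, assuming you have already obtained the device's space-separated extension string (what `CL_DEVICE_EXTENSIONS` reports). The helper names and the `HAVE_FP16` macro are hypothetical, and the sample strings in the usage section are made up for illustration, not taken from any real driver.

```python
def has_extension(extensions: str, name: str) -> bool:
    """True if `name` is a whole token in a space-separated extension string.

    Matching whole tokens avoids false positives on longer names that
    merely start with the same prefix.
    """
    return name in extensions.split()


def fp16_kernel_prefix(extensions: str) -> str:
    """Source prefix to prepend to an OpenCL C kernel before building it.

    HAVE_FP16 is a hypothetical macro the kernel could test with #if
    to choose between half and float code paths.
    """
    if has_extension(extensions, "cl_khr_fp16"):
        return ("#pragma OPENCL EXTENSION cl_khr_fp16 : enable\n"
                "#define HAVE_FP16 1\n")
    return "#define HAVE_FP16 0\n"


if __name__ == "__main__":
    # Made-up extension strings, for illustration only.
    with_fp16 = "cl_khr_icd cl_khr_fp16 cl_khr_global_int32_base_atomics"
    without_fp16 = "cl_khr_icd cl_khr_global_int32_base_atomics"
    print(fp16_kernel_prefix(with_fp16))
    print(fp16_kernel_prefix(without_fp16))
```

The same `has_extension` check works for `cl_khr_fp64` if double-precision support is what you need to detect.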
Thanks for the clarification.
I'm looking to use FP16 FMAs in an OpenCL kernel on both Windows and Linux (and someday OS X).
As far as devices, anything with a display -- tablet/laptop/workstation.
