Is anyone else seeing this question...?
Hi
We are checking on this internally. Will get back to you soon with an update.
Thanks
Arun
@H-Takeda, could you run the "clinfo" command and check the value of "Max compute units"? "clinfo" comes with the OpenCL driver.
For example, here is my clinfo output:
Platform Name              Intel(R) OpenCL HD Graphics
Number of devices          1
Device Name                Intel(R) Graphics [0x3e92]
Device Vendor              Intel(R) Corporation
Device Vendor ID           0x8086
Device Version             OpenCL 3.0 NEO
Driver Version             21.01.18793
Device OpenCL C Version    OpenCL C 3.0
Device Type                GPU
Device Profile             FULL_PROFILE
Device Available           Yes
Compiler Available         Yes
Linker Available           Yes
Max compute units          24
Max clock frequency        1200MHz
@JenniferJ OK.
I have attached the results of running clinfo as a text file.
Here is an excerpt of the important part.
Platform Name              Intel(R) OpenCL HD Graphics
Number of devices          1
Device Name                Intel(R) Gen9 HD Graphics NEO
Device Vendor              Intel(R) Corporation
Device Vendor ID           0x8086
Device Version             OpenCL 2.1 NEO
Driver Version             19.41.14441
Device OpenCL C Version    OpenCL C 2.0
Device Type                GPU
Device Profile             FULL_PROFILE
Device Available           Yes
Compiler Available         Yes
Linker Available           Yes
Max compute units          47
Max clock frequency        1050MHz
@H-Takeda, judging from your clinfo output, your driver version "19.41.14441" is quite old. Do you have to use this version? Could you try the latest driver and see whether it works?
@H-Takeda, the EU count that's reported via clinfo under OpenCL is correct. The value reported comes directly from the i915 kernel mode driver on Linux. This kernel mode driver queries fuse registers to determine which EUs are valid and functional as expected.
In some SKUs, there can be an additional EU -- sometimes two or more. Product SKUs are defined as having a minimum number of EUs but can be manufactured from a larger die. If the die has too many defects to meet the minimum requirements, it is not shipped to customers. However, in some cases, the die has no defects and can yield a 'bonus' EU or two.
This is similar to how some CPUs can be overclocked to a higher frequency than others. The minimum clock frequency reported is a product requirement, but to avoid possible errors and unhappy customers, the "true" minimum frequency is usually higher than the minimum on the box. The same happens with EU counts. In this case, it results in a 'bonus' EU for those lucky enough to receive one.
I hope this helps clarify the situation!
Regards,
Brandon Fliflet
GPU Software Architect
@Brandon_F_Intel, in this case the EU count is one less than the expected number. It should be 48, but clinfo reports 47. Does that mean one EU is bad? Thanks!
@JenniferJ, I tried to look up the official specs for Iris Plus Graphics 655, and the ones I could find didn't mention a specific EU count, so I couldn't confirm the claim that a 655 is supposed to have no fewer than 48 EUs. EUs are fused off during chip test and sorting in the manufacturing process. If an EU is bad, it is fused off, and the die is assigned to a specific SKU such as 655 or others based on what's functioning correctly before it reaches the customer. It's not uncommon to have many 'downbins' from a single die based on expected manufacturing defects. As the manufacturing process improves over time, there are fewer and fewer downbins for a given master die.
In this case, the real question becomes what the Intel product specifications for 655 are. I suspect a 655 can have either 47 EUs or, if you're lucky, 48. I suspect this isn't a defective part. @H-Takeda just wasn't one of the lucky ones -- no different than some CPUs from the same die being capable of overclocking to 4.5 GHz or 5 GHz. This happens with GPU frequency as well. In some cases, that same 48 EU part can get sorted into a higher class SKU, especially if it's also stable at a higher GPU frequency. i7s are close to perfection, i5s less so, and i3s even less. A specific SKU is guaranteed to meet its minimum requirements. I hope this helps!
Regards,
Brandon Fliflet
GPU Software Architect
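To make the binning idea above concrete, here is a toy Python sketch. Everything in it is an assumption for illustration only: the SKU names and minimum EU counts are invented, and real sorting also considers frequency, power, and other criteria not modeled here. It only shows the rule "a die ships as the highest SKU whose minimum it still meets".

```python
from typing import Dict, Optional

# Hypothetical SKU table: name -> minimum number of functional EUs.
# These names and numbers are invented and are not Intel specifications.
HYPOTHETICAL_SKU_MINIMUMS: Dict[str, int] = {
    "sku_high": 47,
    "sku_mid": 40,
    "sku_low": 23,
}


def assign_sku(functional_eus: int, sku_minimums: Dict[str, int]) -> Optional[str]:
    """Return the SKU with the highest minimum that the die still meets.

    Returns None when the die misses every minimum (it would not ship).
    """
    eligible = {
        name: minimum
        for name, minimum in sku_minimums.items()
        if functional_eus >= minimum
    }
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


for eus in (48, 47, 46, 10):
    print(eus, assign_sku(eus, HYPOTHETICAL_SKU_MINIMUMS))
```

Under these made-up thresholds, a die with 48 working EUs and a die with 47 land in the same SKU; the 48-EU die simply carries one 'bonus' EU.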
Thank you @Brandon_F_Intel
Oh...
So anyone could end up buying an Iris Plus Graphics 655 that has only 47 EUs...
If your research is right, Intel's product specs don't guarantee that it will have 48 EUs.
This is troubling to me...
Does Intel post this anywhere publicly?
Thank you, @H-Takeda, for the prompt response. The specifications for Intel(r) Core(tm) i3-8109U are publicly located at intel-core-i3-8109u-processor-4m-cache-up-to-3-60-ghz. Product binning (https://en.wikipedia.org/wiki/Product_binning) is a standard technique used by virtually all semiconductor manufacturers to offer multiple SKUs at several different price-performance points, so that as many useful products as possible reach customers. The product specifications define a minimum bar of functional and quality requirements. As Intel(r) prides itself on quality products, just clearing the minimum bar is not an option, as it could leave customers unsatisfied. In some cases, people receive extra features at no additional cost.
If you are seeing different product specifications for your processor, please let us know so we can correct any gaps in expectation. We're extremely pleased with Iris Plus Graphics 655's capabilities and want you to feel the same way.
Kind regards,
Brandon Fliflet
GPU SW Architect
Another comment regarding 'Max compute units' as reported by clinfo and OpenCL capabilities. It is often taken to be the number of EUs in Intel(r) GPUs. However, the definition of a 'compute unit' can and will change over time, and it shouldn't be equated so precisely with an execution unit (EU). A 'compute unit' as reported within OpenCL means different things to different vendors and device types, including GPUs, CPUs, FPGAs, etc. At best, it is a coarse relative indicator of compute performance across SKUs from a single vendor within a certain family. A new family could introduce a wider or narrower 'compute unit' with different instructions and capabilities. I hope this helps and look forward to your benchmarking results!
Regards,
Brandon Fliflet
GPU SW Architect
Thank you for your kind reply.
For now, I'm satisfied.
As an extra question, what kind of impact do you think missing one EU would have on performance?
(especially the impact on benchmarks).
If you have any predictions, please let me know.
Hi @H-Takeda,
The performance impact of a single missing EU (out of 48) is about 2% in a purely compute-limited workload, since 1/48 is roughly 2%. If the workload is limited by memory bandwidth, it won't have any performance impact. If the workload is limited by complex math such as transcendentals, it also won't have any impact. If you're trying to construct a benchmark, I would encourage you to focus on a specific aspect or feature in isolation, or on multiple aspects, each in isolation. Once you can do that comfortably, you can tie it to a real-world scenario. That's how we benchmark our products internally. If you encounter puzzling results, please let us know here and we can help you understand what's happening. Happy benchmarking!
Regards,
Brandon Fliflet
GPU Software Architect
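As a rough illustration of the reasoning above, here is a small Python sketch of a max(compute time, memory time) model. It is a simplification and not how Intel benchmarks internally: it assumes compute throughput scales linearly with EU count, and the workload sizes and rates are made-up placeholders. The transcendental-limited case mentioned above is not modeled.

```python
def modeled_runtime(ops, bytes_moved, eus, ops_per_eu_per_s, bytes_per_s):
    """Runtime in seconds under a simple max(compute, memory) model."""
    compute_time = ops / (eus * ops_per_eu_per_s)
    memory_time = bytes_moved / bytes_per_s
    return max(compute_time, memory_time)


def slowdown(ops, bytes_moved, expected_eus, actual_eus,
             ops_per_eu_per_s=1e9, bytes_per_s=1e9):
    """Fractional runtime increase on actual_eus compared with expected_eus."""
    baseline = modeled_runtime(ops, bytes_moved, expected_eus,
                               ops_per_eu_per_s, bytes_per_s)
    reduced = modeled_runtime(ops, bytes_moved, actual_eus,
                              ops_per_eu_per_s, bytes_per_s)
    return reduced / baseline - 1.0


# Hypothetical workloads (invented sizes, not measurements).
compute_bound = slowdown(ops=48e9, bytes_moved=1e6, expected_eus=48, actual_eus=47)
memory_bound = slowdown(ops=1e9, bytes_moved=10e9, expected_eus=48, actual_eus=47)

print(f"compute-bound slowdown: {compute_bound:.2%}")
print(f"memory-bound slowdown:  {memory_bound:.2%}")
```

In this model the compute-bound case slows down by 48/47 - 1, which is about 2%, while the memory-bound case does not change at all because memory time dominates with either EU count.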