What are all the Model/Family numbers possible for the upcoming Nehalem line of CPUs? I am currently optimizing the encoding application x264 using a pre-release Nehalem chip, but I only know of the model/family number of this chip I have here.
Is there anywhere I can find an exhaustive list that I can use for the CPU identification code?
SSE4.2 actually sounds like a decent way to do it also; detecting the presence of SSE4.2 will surely detect the presence of a Nehalem core (and, likely, all later CPUs). I'll try using that instead.
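For anyone following along, here is a minimal sketch of what that check could look like in C. It assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`. The helper names (`ecx_has_sse42`, `cpu_has_sse42`) and the guard macro are made up for illustration and are not taken from x264. The only hard fact used is that CPUID leaf 1 reports SSE4.1 in ECX bit 19 and SSE4.2 in ECX bit 20.

```c
#include <stdint.h>

/* <cpuid.h> is a GCC/Clang header; on other compilers or architectures
 * this sketch simply reports "no SSE4.2". The macro name is hypothetical. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID_H 1
#endif

/* Feature bits in ECX after executing CPUID with EAX = 1. */
#define CPUID1_ECX_SSE41 (1u << 19)
#define CPUID1_ECX_SSE42 (1u << 20)

/* Pure helper: tests the SSE4.2 bit in a leaf-1 ECX value. */
int ecx_has_sse42(uint32_t ecx)
{
    return (ecx & CPUID1_ECX_SSE42) != 0;
}

/* Queries the running CPU. Returns 1 if SSE4.2 is reported, else 0. */
int cpu_has_sse42(void)
{
#ifdef SKETCH_HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_has_sse42(ecx);
#else
    return 0;
#endif
}
```

A dispatcher would call `cpu_has_sse42()` once at startup and pick the function table accordingly. Note that this keys on the feature flag rather than on the microarchitecture, so any later CPU that reports SSE4.2 takes the same path.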
Then again, I can't commit the code until Nehalem is released anyway, since the Nehalem-specific optimizations I'm working on probably contain enough information to violate the NDA if I released them earlier.
I assume your interest in incorporating microarchitecture-specific optimization is the motivation for seeking family/model information. If I understand correctly, you might be pursuing SSE4 code in your encoding application with different code paths to optimize for Penryn versus Nehalem microarchitecture.
Table C-1 in Appendix C of the Optimization manual lists SIMD instruction set support for different processor families. However, Table C-1 only covers processor families that have launched. The Nehalem family will be rolling out soon, so we will be updating our documentation in the near future. But at the current time, that still falls under unreleased product information.
Please note that unreleased product detail is generally not covered in public docs, but may be available under a non-disclosure channel. The channel for access to unreleased product information is the same one through which you obtained your Nehalem prototype system.
Additionally, some of the newer processor families may span more than one value of "enhanced model"/"model" encoding. For example, the six-core Dunnington processor, known as Intel Xeon processor 7400 series, has a different "enhanced model"/"model" encoding than the rest of the Penryn family (Intel Xeon processor 5400 series and several product lines in the Intel Core 2 processor family). The Penryn family has a signature of DisplayFamily = 6 and DisplayModel = 17H (where enhanced model encoding is 1, model encoding is 7). Dunnington has the same DisplayFamily encoding, but its DisplayModel encoding is 1DH.
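To make the encoding concrete, here is a rough sketch in C of how DisplayFamily and DisplayModel are derived from the signature that CPUID leaf 1 returns in EAX. The function names are hypothetical. The field layout follows the public CPUID documentation: model in bits 4-7, family in bits 8-11, the extended model (the "enhanced model" above) in bits 16-19, and extended family in bits 20-27.

```c
#include <stdint.h>

/* DisplayFamily: the extended family field is only added in when the
 * base family field is 0FH. */
unsigned display_family(uint32_t signature)
{
    unsigned family     = (signature >> 8)  & 0xF;
    unsigned ext_family = (signature >> 20) & 0xFF;
    return (family == 0xF) ? family + ext_family : family;
}

/* DisplayModel: the extended model field supplies the high nibble,
 * but only when the base family field is 06H or 0FH. */
unsigned display_model(uint32_t signature)
{
    unsigned family    = (signature >> 8)  & 0xF;
    unsigned model     = (signature >> 4)  & 0xF;
    unsigned ext_model = (signature >> 16) & 0xF;
    if (family == 0x6 || family == 0xF)
        return (ext_model << 4) | model;
    return model;
}
```

With these, a signature such as 0x10676 (the low stepping nibble is arbitrary here) decodes to DisplayFamily 6 and DisplayModel 17H, and 0x106D1 decodes to DisplayFamily 6 and DisplayModel 1DH, matching the Penryn and Dunnington values above.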
For the Nehalem processor family, which will also span multiple values of DisplayFamily/DisplayModel encodings, please follow up through your contact for the Nehalem prototype system. The same channel can also provide you with access to tuning information on SSE4.2 and the Nehalem microarchitecture.
Thanks for the detailed information! I will potentially be doing both some SSE4 optimizations and a number of changes to which assembly functions are loaded, due to the changes in load latencies and instruction timings on Nehalem.
It sounds like SSE4.2 detection is the best way to go here, since, as you said, there will be a large number of models in the Nehalem series with rather complex model/family encoding.
I asked here because the channel through which I have to get information on the prototype system has been rather slow in coming, most likely because I have to go through a number of people to reach the person who has direct contact with Intel (organizational issues on my side I would suspect).
You mentioned "tuning information"; what kind of information falls under this category? I have already done in-depth analysis using mubench (a full run of --pairs for analysis of execution unit relationships, among other things); what other information is available under that category--uop breakdowns?
More details on coding guidelines for SSE4.2, along with code examples (including a strstr-equivalent implementation using SSE4.2, token extraction, and a few others), are covered there.