Summary: USB endpoints based on the USB 2.0 high-speed Cypress FX2LP (http://www.cypress.com/products/ez-usb-fx2lp) that use bulk transfers from the endpoint to the Intel PCH suffer FX2LP buffer overflows because the PCH does not poll often enough. The previous generation of Intel processors, and also "less powerful" solutions (such as Broadcom SoC based host controllers), did not show this issue. The product is already shipping in the marketplace.
More Detail: The USB endpoint is a data streaming device. The expectation is that no data will be lost, since even "small" 16-bit microcontroller based USB hosts have been shown to consume the data fast enough to prevent data loss. The theory is that some power/thermal/management/etc. process is preventing the PCH from polling fast enough to keep up with the required data rate (approx. 20MB/s). I have LeCroy USB protocol analyzer traces available which show a request with lost data, and I can provide more traces if needed.
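For a rough sense of the timing margin involved, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not a measurement. The 4 KB figure for the FX2LP's total endpoint buffer space is an assumption recalled from the datasheet (the usable amount depends on how the endpoint is configured), "20MB/s" is read as 20 × 10^6 bytes per second, and the function names are made up for illustration.

```python
# Rough estimate of how long the host may go without reading before the
# device-side buffer overflows. All constants are assumptions for illustration.

FIFO_BYTES = 4 * 1024               # assumed total FX2LP endpoint buffer space
DATA_RATE_BYTES_PER_S = 20_000_000  # "approx. 20MB/s", read as 20e6 bytes/s
MICROFRAME_US = 125.0               # USB 2.0 high-speed microframe length


def fifo_fill_time_us(fifo_bytes: int, rate_bytes_per_s: float) -> float:
    """Time in microseconds for a constant-rate producer to fill the buffer."""
    return fifo_bytes / rate_bytes_per_s * 1e6


def microframes_of_slack(fifo_bytes: int, rate_bytes_per_s: float) -> float:
    """The same budget expressed in high-speed microframes."""
    return fifo_fill_time_us(fifo_bytes, rate_bytes_per_s) / MICROFRAME_US


if __name__ == "__main__":
    t_us = fifo_fill_time_us(FIFO_BYTES, DATA_RATE_BYTES_PER_S)
    slack = microframes_of_slack(FIFO_BYTES, DATA_RATE_BYTES_PER_S)
    print(f"buffer fills in about {t_us:.1f} us ({slack:.2f} microframes)")
```

Under these assumptions the buffer fills in roughly 205 µs, which is less than two microframes. Any host-side pause longer than that would be enough to lose data.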
Our endpoint uses a "bundled" solution with the Microsoft Surface + endpoint. The issue was first found when the (Skylake based) Surface Pro 4 was tested (Surface Pro 3 - which is based on Gen5 Intel is fine). A second Skylake solution, the Lenovo P70 (Xeon), was tested and found to have the same issue. Otherwise, the solution has been tested on everything from SoC/ARM/previous gen Intel and similar failures did not occur. Therefore, the issue has been identified as specific to Skylake.
Over time, reproducing the issue has gone from trivial (failures "left and right") with earlier firmware to more difficult. Likely some changes in the SP4 firmware/BIOS/OS/etc. have improved performance. The Lenovo is not refreshed as often, so some regression testing may be called for to see whether it still fails more readily.
The request is for some guided debug that allows shutting off the throttling/state changes/etc. which are unique to Skylake and which affect the USB polling frequency. Since my solution is the endpoint and not the host controller, I do not have XDP access on the host side in order to force register changes before the BIOS takes control.
Note that "bonehead" issues have been eliminated, as we have been working this issue for several months, including with Cypress. The channel is good (not an SI issue), and the problem has been proven to be data not being extracted fast enough by the host controller, resulting in data loss, rather than any other cause. Using known test patterns sent by our endpoint, we can identify which data is lost.
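As an illustration of how a known test pattern can pinpoint lost data, here is a minimal sketch. The thread does not describe the actual pattern, so this assumes a hypothetical one: a 32-bit little-endian counter that increments once per word. The function name `find_gaps` is likewise invented for this example.

```python
import struct


def find_gaps(data: bytes, start: int = 0):
    """Scan a captured buffer holding an assumed 32-bit little-endian
    incrementing counter and report every discontinuity.

    Returns a list of (byte_offset, expected, received, words_missing).
    """
    gaps = []
    expected = start
    usable = len(data) - len(data) % 4  # ignore a trailing partial word
    for offset in range(0, usable, 4):
        (value,) = struct.unpack_from("<I", data, offset)
        if value != expected:
            missing = (value - expected) & 0xFFFFFFFF  # tolerate counter wrap
            gaps.append((offset, expected, value, missing))
        expected = (value + 1) & 0xFFFFFFFF  # resynchronise on what arrived
    return gaps


if __name__ == "__main__":
    # Simulate a capture where words 4, 5 and 6 never reached the host.
    received = [0, 1, 2, 3, 7, 8, 9]
    capture = b"".join(struct.pack("<I", v) for v in received)
    for offset, expected, got, missing in find_gaps(capture):
        print(f"offset {offset}: expected {expected}, got {got}, "
              f"{missing} word(s) missing")
```

Resynchronising on the received value after each gap means a single overflow is reported once, rather than flagging every word that follows it.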
Thanks for your communication.
We are trying to determine the best support path for your issue, as this is highly complex and will likely require sharing Intel Confidential material.
My first recommendation is to fill out the form to acquire an EDC Privileged account: http://www.intel.com/content/www/us/en/forms/design/registration-privileged.html
It is also necessary to determine exactly which Skylake processors you have been testing, so please clarify the following:
1) What is the exact model of the Surface Pro 4 that you are using?
2) You also mention a Lenovo P70; however, according to this page: http://shop.lenovo.com/ae/en/smartphones/p-series/p70/#tab-tech_specs, this is a smartphone that uses a MediaTek processor, not a Skylake Xeon processor.
3) Additionally, if you have tested on other Skylake based platforms, it would be very useful to know which models.
Also if possible provide the following information:
1) What Operating System(s) have you been using for your device?
2) Can you replicate the issue using a different Operating System?
Housekeeping: I signed up for the privileged account.
1.1) Both i5 and i7 versions of the SP4 have tested as failing. Most functional testing is done off-site at another location, but I can work to gather the details. The system I have for local debug is an i5 128GB model. (i5-6300U CPU, 4GB RAM) running Windows 10 Pro / 64b
1.2) The reuse of marketing names is unfortunate as I have to sort through the phone search hits myself. http://shop.lenovo.com/us/en/laptops/thinkpad/p-series/p70/
This model has a Xeon E3-1500M v5 w/16GB RAM running Windows 10 Pro / 64b
1.3) Thus far availability is the workstation, and two models of SP4 (i5 and i7)
2.1) Windows 10 Pro (64b). I'm going to be testing booting from a Linux CD in the future. This work was queued up for Friday.
We face exactly the same issue with our camera modules based on the Cypress FX2LP.
As soon as the CPU reaches a certain level of idle (around 2% CPU load), the USB FIFO of the FX2LP overflows due to reduced PCH polling.
We are using a Surface Pro 4 with an i5-6300U CPU @ 2.40GHz. The camera is connected to the USB port of the tablet and operates in bulk mode.
As a workaround, it seems that loading one core of the CPU brings the transfer back to normal. As this will drain the battery very quickly, I would appreciate it if you could share your findings regarding JasonHWDesign's query here in this thread.
Thank you very much in advance.
Note that CPU load is also a factor in the failures we see: running some applications concurrently with our own application reduces or eliminates the failure. The specific application did not seem to matter; the effect appears to be tied to CPU loading, as described by CamHW.
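For anyone who wants to probe the load dependence more systematically, a small load generator along the following lines might help. It is only a sketch: the function name, duty cycle and period are arbitrary choices for illustration, and nothing in this thread establishes how much load (or what kind) is actually needed to avoid the overflow.

```python
import time


def spin_with_duty_cycle(duration_s: float, duty: float = 0.25,
                         period_s: float = 0.010) -> int:
    """Keep the calling thread busy for a fraction `duty` of each period.

    Runs for at least `duration_s` seconds and returns the number of
    busy/idle periods completed. Hypothetical diagnostic helper.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0.0 and 1.0")
    end = time.perf_counter() + duration_s
    periods = 0
    while time.perf_counter() < end:
        busy_until = time.perf_counter() + duty * period_s
        while time.perf_counter() < busy_until:
            pass  # busy-wait: this is the artificial CPU load
        time.sleep((1.0 - duty) * period_s)  # yield for the rest of the period
        periods += 1
    return periods
```

Calling, for example, `spin_with_duty_cycle(60, duty=0.1)` while streaming, and then repeating with different `duty` values, would show whether a small, battery-friendlier load is enough or whether a fully loaded core is required.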
Hello CamHW and JasonHWDesign
Thanks for your notifications and contributions to this thread.
I have contacted the Product Application Engineers (PAE) for Skylake to confirm whether they are aware of the issues you are reporting, and whether any fix or workaround is available.
At the moment I can just offer general advice, please contact the manufacturers of the platforms that you are using and let them know about this issue, so they can also add more pressure to the request.
And make sure that you are working with the latest BIOS and microcode updates.
As soon as I have more information from the PAEs I will let you know.
Hello CamHW and JasonHWDesign
After consulting with the next level of support, their advice is that this case should be handled on a different support channel called Intel Premier Support.
If you already have an Intel Software Product you can get access to the channel by registering the product license: https://software.intel.com/en-us/faq/premier-support
Otherwise you can request access via an Intel Field Application Engineer (FAE).
If you don't have one appointed to you, you can fill out the following form to get in contact with an FAE:
I apologize for not being able to provide additional information for you, and I hope you find the answers you are looking for via our Intel Premier Support channel.
Thanks - I signed up for a privileged account when suggested but that has not been granted yet. It seems there is an issue with the verification system. Is Privileged == Premier?
Still, I used the link provided (https://www-ssl.intel.com/content/www/us/en/secure/forms/design-assistance.html) and filled out the information again.
I just registered via the design-assistance link. I briefly described the issue and added a link to this thread.
Hopefully we won't get different FAEs assigned to the same issue.
JasonHWDesign, it would be nice if we could sync with each other here in this thread regarding the replies/findings from the FAE.
Any mid-stream updates on your end?
Also, trading notes - what different operating systems have you been able to reproduce the issue under? Thus far, we have seen the issue with Win 10 as this is the standard shipping OS with our Skylake systems. Testing under Linux (Ubuntu - version ?) we were not able to reproduce although we did not spend a long time attempting to reproduce under Linux once we saw it working with our initial configuration. Are you able to reproduce under Win 7, Win 8? Also, any flavor(s)/release/version of Linux which also demonstrates the issue (and which one(s))?
On my end, I am still working on transferring the known information including USB traces, scope traces, etc. Previous versions of Windows have not been tested although this testing is queued.
Sorry but no news so far. I did not even receive a reply after registration to get support from an FAE.
Anyway: We have observed this issue also in Win8.1. Linux is not in our scope right now.
Under Win7/Skylake we've seen no issue so far.
It might be a pure SW issue which at the end of the day may fall in Microsoft's basket. Who knows...
Please keep me updated.
Thanks and best regards.
After investigating with the Sales Contact, you were not contacted because there is not enough information on your account to make an assessment.
You could try upgrading your account at the following link (Register for a Privileged Intel Account): https://www-ssl.intel.com/content/www/us/en/forms/design/registration-privileged.html
Could you please test your issue under the following scenarios:
1) Using Linux distribution (It can be a LiveCD)
2) Using Windows 8.1/Windows 10 fresh installations vs. using Windows 8.1/Windows 10 with the following drivers: https://downloadcenter.intel.com/download/20775/Intel-Chipset-Device-Software-INF-Update-Utility-
3) Using a Windows 7 fresh installation vs. using Windows 7 with the following drivers: https://downloadcenter.intel.com/download/20775/Intel-Chipset-Device-Software-INF-Update-Utility-
From our internal communications I'm aware that you have not observed the issue on Linux, so could you please test points #2 and #3 on your systems?
Carlos_A on behalf of AdolfoS.
As a quick update:
I have purchased an Intel NUC in order to reproduce the issue on a third Skylake platform and also to decouple the major components (BIOS, main board, CPU) from Microsoft or any other third-party vendor. This also allows us to test on a system that supports a wider range of operating systems than the Microsoft Surface Pro 4.
Our target hardware continues to be the SP4. However, I predict that if the issue is reproduced on the NUC, the fix for the NUC will translate directly to third-party platforms.
The plan is to start with Windows 7.