SCH US15W chipset - What EHCI USB controller programming might cause transaction error status?

ErikWlfn · ‎12-22-2020

I'm a software engineer with RTP Corp. We are using the Cypress Semiconductor CY7C68001 EZ-USB SX2 High Speed USB Interface Device. We have an embedded software design used for process control and have written a simple embedded USB driver. Our devices are limited to one USB port with one USB High Speed device (the CY7C68001). The USB device is on-board with the CPU chip-set (Intel SCH US15W).

We transfer data using four endpoints. Two of the endpoints transfer about 1072 bytes of output data and 245 bytes of input data continually at a 1 millisecond rate. The other two endpoints transfer 512 bytes or less of output or input data at random times to communicate with flash memory.

The USB host controller is an Intel EHCI device built into the chip-set. The software mostly polls the EHCI, and the only interrupt is the Interrupt On Completion of transfer descriptors. The interrupt service routine examines the descriptors to see which ones have completed, and sets the operating system event flags associated with the completed transfers. Then it clears the interrupt request.

So long as we use only one of the endpoints to transfer the 1 millisecond repetitive data, we see no problem. When we perform output transfers to the endpoint for flash in addition to the 1 millisecond periodic outputs, we start seeing intermittent errors on the 1 ms. output. The endpoint transferring data every millisecond gets transaction errors in the status of the USB host controller transfer descriptors. Those transaction errors appear to be recoverable: the transfer completes successfully. The transaction errors continue at random times after doing a single data transfer on the flash endpoints. Along with the intermittent but frequent transaction errors on the 1 ms. outputs, we sometimes see lost or incorrect data.

My understanding of a transaction error is that the USB device did not respond on USB or responded with an incorrect USB token.
That would indicate either a logic error in the CY7C68001, or check code failures on tokens or output data. I have not found any configuration of the host controller that explains the intermittent recoverable transaction errors.

Why would communicating with a second output endpoint cause transaction errors on a different endpoint? Is there some violation of the OUT handshaking that could explain the transaction errors on USB? For example, if an OUT FIFO in the USB device goes from not-full to full without any output data being transferred, would that cause a transaction error? In that case, the CY7C68001 would return ACK instead of NYET in response to the OUT transfer, but then later NAK the OUT data transfer instead of sending ACK as expected. Would that be considered a transaction error? Is there any other scenario that might explain a transaction error detected by the host controller?

The other odd thing about this problem is that a power cycle makes the problem stop occurring until we perform another transfer to the flash output endpoint. Doing a software restart or hardware reset does not make the problem stop occurring. Our host controller driver is doing a reset of the EHCI host controller and a USB reset to the CY7C68001 in both cases. Our hardware engineer has verified that even on a software restart, the CY7C68001 and the logic connected to it are also being reset via a hardware signal.

We have been able to verify that communicating with either one of the pairs of OUT/IN endpoints, and not the other, avoids the problem. By synchronizing the transfers for the sets of endpoints and moving the timing relationship, we can make the problem happen more or less frequently. Even when the transfers are done separately from each other in time, the problem still happens.

Since I am the software engineer, I am only somewhat familiar with the CY7C68001. I am mostly concerned about any EHCI configuration or programming errors that might explain the transaction errors.
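To make the OUT handshake question above concrete, here is a minimal sketch of the high-speed bulk OUT ping control, as I read the ping control description in the EHCI specification. The type and function names (`handshake_t`, `ping_state_t`, `ping_step`) are hypothetical and are not taken from our driver or from any library, and the handling of NYET in reply to a PING is my own assumption.

```c
#include <stdbool.h>

/* Handshakes the host can observe on a high-speed bulk OUT endpoint.
 * HS_XACT_ERR stands for a timeout, a CRC failure, or a bad PID. */
typedef enum { HS_ACK, HS_NAK, HS_NYET, HS_STALL, HS_XACT_ERR } handshake_t;

/* Ping State bit kept in the qTD token: 0 = send OUT next, 1 = send PING next. */
typedef enum { DO_OUT = 0, DO_PING = 1 } ping_state_t;

typedef struct {
    ping_state_t next;      /* Ping State after this transaction */
    bool data_accepted;     /* device took the OUT payload */
    bool transaction_error; /* host would count this against the error counter */
    bool halted;            /* endpoint answered STALL */
} ping_result_t;

/* Model one PING or OUT transaction, selected by the current Ping State. */
ping_result_t ping_step(ping_state_t state, handshake_t hs)
{
    ping_result_t r = { state, false, false, false };

    switch (hs) {
    case HS_ACK:
        /* ACK to PING: room for a packet. ACK to OUT: data taken, more room. */
        r.data_accepted = (state == DO_OUT);
        r.next = DO_OUT;
        break;
    case HS_NAK:
        /* No room. An OUT payload that was NAKed is sent again later. */
        r.next = DO_PING;
        break;
    case HS_NYET:
        if (state == DO_OUT) {
            /* Data taken, but no room for the next packet. */
            r.data_accepted = true;
        } else {
            /* Assumption: NYET is not a defined reply to PING, so this
             * sketch treats it like any other invalid response. */
            r.transaction_error = true;
        }
        r.next = DO_PING;
        break;
    case HS_STALL:
        /* Ping State is left unchanged; the endpoint halts. */
        r.halted = true;
        break;
    case HS_XACT_ERR:
        r.transaction_error = true;
        r.next = DO_PING;
        break;
    }
    return r;
}
```

In this model, the scenario I described (ACK to the PING, then NAK to the OUT data) gives `ping_step(DO_OUT, HS_NAK)`, which returns to `DO_PING` without flagging a transaction error. If that reading is right, the errors would have to come from a timeout, a CRC failure, or a bad PID rather than from an unexpected NAK, but I have not confirmed this on a bus analyzer.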
I want to verify that the simple USB driver that I wrote is not the cause of this issue. We are preserving the PING status and data toggle across transfers to each endpoint. We only use bulk transfers, either output or input, and we only use the asynchronous schedule. Other than timing, there is nothing different about the transfers at 1 ms. versus the flash transfers.

We have at most three transfers active in the asynchronous list. The 1 ms. OUT and IN transfers are started at nearly the same time, with the OUT slightly preceding the IN. The OUT transfer always completes before the IN transfer, since the input data is not placed into the CY7C68001 FIFO until the expected output data has been completely received by the CY7C68001. The third transfer that may be present in the asynchronous schedule is either an OUT or IN to one of the flash communication endpoints.

The only case where the flash transfers took significant time to complete was during a flash erase. The IN transfer would not complete until the erase completed about 300 ms. later. We changed the flash communication so that an IN completes almost immediately, within 125 microseconds or less. Changing the flash IN transfer to use less time on the USB bus did not have any effect, and an OUT to the flash endpoint still caused the other OUT endpoint to get transaction errors continually afterward.

If there are hardware considerations that might explain this problem of transaction errors and incorrect output data, I would like to pass the information on to our hardware engineer. I can see how a violation of the FIFO handshaking in the CY7C68001 could cause incorrect data, but I can't explain how that would cause USB transaction errors. Should we be looking for electrical issues on the USB bus between the host controller and the CY7C68001? Does this even sound like a problem being caught by error check codes? Any suggestions will be appreciated.
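For reference, this is a minimal sketch of how a completed transfer descriptor can be classified and how the data toggle and PING state can be carried into the next descriptor for the same endpoint. The bit positions follow the qTD token layout as I recall it from the EHCI specification and should be checked against the spec. All macro and function names (`QTD_*`, `qtd_outcome`, `qtd_next_token`) are hypothetical and are not our actual driver code. It also assumes the queue head is configured so that the toggle is taken from the qTD. If the queue head preserves the toggle itself, the toggle bit written here is ignored.

```c
#include <stdint.h>

/* qTD token fields (assumed layout, to be verified against the EHCI spec) */
#define QTD_STS_PING    (1u << 0)   /* Ping State: next OUT starts with PING */
#define QTD_STS_XACT    (1u << 3)   /* Transaction Error seen at least once */
#define QTD_STS_BABBLE  (1u << 4)
#define QTD_STS_DBE     (1u << 5)   /* Data Buffer Error */
#define QTD_STS_HALT    (1u << 6)
#define QTD_STS_ACTIVE  (1u << 7)
#define QTD_PID_SHIFT   8           /* 0 = OUT, 1 = IN, 2 = SETUP */
#define QTD_CERR_SHIFT  10          /* 2-bit error down-counter */
#define QTD_IOC         (1u << 15)  /* Interrupt On Complete */
#define QTD_BYTES_SHIFT 16
#define QTD_BYTES_MASK  0x7FFFu
#define QTD_TOGGLE      (1u << 31)

#define QTD_PID_OUT 0u
#define QTD_PID_IN  1u

typedef enum { QTD_PENDING, QTD_OK, QTD_RECOVERED, QTD_FAILED } qtd_outcome_t;

/* Classify a descriptor from its token. QTD_RECOVERED is the case described
 * above: the transfer completed, but a transaction error was recorded. */
qtd_outcome_t qtd_outcome(uint32_t token)
{
    if (token & QTD_STS_ACTIVE) return QTD_PENDING;
    if (token & QTD_STS_HALT)   return QTD_FAILED;
    if (token & QTD_STS_XACT)   return QTD_RECOVERED;
    return QTD_OK;
}

/* Bytes the controller has not yet transferred for this descriptor. */
uint32_t qtd_bytes_left(uint32_t token)
{
    return (token >> QTD_BYTES_SHIFT) & QTD_BYTES_MASK;
}

/* Build the token for the next transfer on the same endpoint, carrying over
 * the data toggle and, for OUT only, the Ping State from the previous token. */
uint32_t qtd_next_token(uint32_t prev_token, uint32_t pid,
                        uint32_t nbytes, int ioc)
{
    uint32_t token = QTD_STS_ACTIVE
                   | (pid << QTD_PID_SHIFT)
                   | (3u << QTD_CERR_SHIFT)               /* allow 3 retries */
                   | ((nbytes & QTD_BYTES_MASK) << QTD_BYTES_SHIFT);

    if (ioc)
        token |= QTD_IOC;
    token |= prev_token & QTD_TOGGLE;                     /* keep data toggle */
    if (pid == QTD_PID_OUT)
        token |= prev_token & QTD_STS_PING;               /* keep Ping State */
    return token;
}
```

A call such as `qtd_next_token(last_token, QTD_PID_OUT, 1072, 1)` would build the token for the next 1 ms. output. Counting how often `qtd_outcome` returns `QTD_RECOVERED` per endpoint is one way to correlate the errors with the flash transfers.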

CarlosAM_INTEL · ‎12-23-2020

Hello, @ErikWlfn:

Thank you for contacting Intel Embedded Community.

You should verify that your implementation fulfills the requirements stated in section 6.8 of the Intel® System Controller Hub [Intel® SCH] External Design Specification [EDS] Addendum and Specification Update Addendum for US15WP and US15WPT document # 386599. You can find it when you are logged into your Resource & Design Center (RDC) privileged account on the following website:

https://cdrdv2.intel.com/v1/dl/getContent/386599

You should fill out the RDC Account Support form to process your account update request or to report any problems with the provided sites. The form can be found on the following website:

https://www.intel.com/content/www/us/en/forms/support/my-intel-sign-on-support.html

Best regards,

@CarlosAM_INTEL.