I have been looking into the IOMMU and its IOTLB and noticed that a single IOTLB invalidation takes many CPU clock cycles (average = 1680, max = 56090, min = 516).
Note that this is purely the time the OS kernel waits for the IOMMU to complete a single IOTLB invalidation.
I went through the Intel VT-d specification looking for a reason why IOTLB invalidation takes this many clock cycles. I saw that there are paging-structure caches, which cache the intermediate entries read during a page table walk and which should be invalidated when a relevant IOTLB entry is invalidated. I can understand that this operation may take additional clock cycles.
However, I am not fully convinced that this is the only reason for IOTLB invalidation taking this many clock cycles on average.
Are there any other operations/scenarios that contribute towards the high cycle count (or high latency) of IOTLB invalidation?
My setup:
CPU: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
OS: Linux 6.2.8
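For a sense of scale, here is a small sketch that converts the cycle counts above into wall-clock time. It assumes the counts are TSC ticks and that the TSC runs at the nominal 2.20 GHz of this CPU. Both are assumptions, and the names `TSC_HZ` and `cycles_to_ns` are only illustrative.

```python
# Assumption: cycle counts are TSC ticks, and the TSC runs at the
# nominal 2.20 GHz base frequency of the Xeon Silver 4114.
TSC_HZ = 2.20e9


def cycles_to_ns(cycles, hz=TSC_HZ):
    """Convert a cycle count to nanoseconds at the given clock rate."""
    return cycles / hz * 1e9


# Cycle counts reported in the post.
samples = {"min": 516, "average": 1680, "max": 56090}

for label, cycles in samples.items():
    print(f"{label:>7}: {cycles:>6} cycles = {cycles_to_ns(cycles):9.1f} ns")
```

Under those assumptions, the wait is roughly 0.23 µs at best, about 0.76 µs on average, and about 25.5 µs in the worst case.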