Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

interrupt coalescing in Xeon processors

zhengda1936
Beginner
1,344 Views

Hello,

I'm not sure if this is the right place to ask this question; I apologize if it's not. It may not be an Intel-specific question.

I am trying to enable interrupt coalescing on an LSI controller. Right now I don't see any coalescing: I saw 1 million interrupts when issuing 1 million I/O requests. When I talked to LSI tech support, I was told that interrupt coalescing is enabled by the driver by default, but that the processor also has to support the feature for it to take effect. I use a Xeon E5-4620, so I believe the processor should support it if CPU support is required. I don't know whether the Linux kernel has any parameters that enable or disable interrupt coalescing in the processor, and I don't see any in the system's BIOS. Does anyone know how interrupt coalescing works? I'm quite confused about it.
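
For reference, one way to obtain such an interrupt count on Linux is to compare the controller's row in /proc/interrupts before and after the test run. The following is a minimal sketch in C that sums the per-CPU counters on one such row. The function name sum_irq_counts and the sample row in the usage note (including the driver name) are made up for illustration; the exact row layout can differ between kernel versions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one row of /proc/interrupts.
 * A row looks roughly like: " 45:   123   456   PCI-MSI-edge   driver".
 * Parsing stops at the first token that is not a decimal number,
 * which is assumed to be the interrupt controller name. */
unsigned long long sum_irq_counts(const char *line)
{
    assert(line != NULL);

    const char *p = strchr(line, ':');   /* counters follow the IRQ label */
    if (p == NULL)
        return 0;
    p++;

    unsigned long long total = 0;
    for (;;) {
        char *end;
        unsigned long long value = strtoull(p, &end, 10);
        if (end == p)                    /* no digits consumed: done */
            break;
        total += value;
        p = end;
    }
    return total;
}
```

Calling sum_irq_counts(" 45:   123   456   PCI-MSI-edge   megasas") on that hypothetical row gives 579. Taking the difference between two such sums, one read before the test and one after, gives the number of interrupts raised during the run.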

Thanks,
Da 

0 Kudos
18 Replies
Bernard
Valued Contributor I

AFAIK the newest Windows versions use timer coalescing, which is tied to the clock interrupt.

0 Kudos
Bernard
Valued Contributor I

IIRC interrupt coalescing is managed by the OS. I do not know if it needs hardware support.

0 Kudos
Patrick_F_Intel1
Employee

Hello Da,

It is hard to say where the problem lies. You haven't supplied much info and I'm not sure this is the correct forum anyway.

Here is a URL that briefly talks about interrupt coalescing. http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/interrupt_coal.htm

Whether interrupts get coalesced seems to depend primarily on how many packets or I/Os occur per second. So your one million I/Os and one million interrupts, if they occur over a long enough interval, might not meet the device's criteria for coalescing. In the BIOS or in the OS device configuration there should be some sort of setting for your device that specifies an 'rx int delay' (receive interrupt delay) parameter. This sets the threshold such that, if you receive more than rx_int_delay packets/sec, the system should start coalescing the interrupts.
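
To make the rate dependence concrete, here is a minimal sketch in C of one common coalescing model. It is not the LSI controller's actual logic, which is not documented in this thread. In this model the device raises an interrupt only when max_events completions are pending, or when max_delay_us has passed since the oldest pending completion. The names count_interrupts, max_events, and max_delay_us are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the interrupts raised under a simple, assumed coalescing policy:
 * fire when max_events completions are pending, or when max_delay_us has
 * elapsed since the oldest pending completion.
 * ts_us holds completion timestamps in microseconds, sorted ascending. */
size_t count_interrupts(const uint64_t *ts_us, size_t n,
                        size_t max_events, uint64_t max_delay_us)
{
    assert(max_events > 0);

    size_t interrupts = 0;
    size_t pending = 0;
    uint64_t first_pending = 0;

    for (size_t i = 0; i < n; i++) {
        /* The delay timer expired before this completion: flush the batch. */
        if (pending > 0 && ts_us[i] - first_pending > max_delay_us) {
            interrupts++;
            pending = 0;
        }
        if (pending == 0)
            first_pending = ts_us[i];
        pending++;

        /* The batch is full: flush immediately. */
        if (pending == max_events) {
            interrupts++;
            pending = 0;
        }
    }
    if (pending > 0)        /* the timer eventually flushes the tail */
        interrupts++;
    return interrupts;
}
```

With max_events = 4 and max_delay_us = 100, eight completions spaced 1 microsecond apart produce 2 interrupts, while four completions spaced 1000 microseconds apart still produce 4. That is the case where the I/Os are spread over too long an interval for any coalescing to happen.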

Check for a parameter like rx_int_delay and see whether your test case is generating more than that many packets/sec.
Good luck,

Pat

0 Kudos
Bernard
Valued Contributor I

I was not able to find any references to hardware support for interrupt coalescing. What I learned is that on Windows, coalescing of the clock interrupt is performed by kernel-mode software.

0 Kudos
zhengda1936
Beginner

Thank you for your replies.

Sorry for being unclear. I was really confused about how it works. In my test, an LSI controller can serve around 300,000 requests per second, which means around 300,000 interrupts per second. I think that's a very high rate. When I googled interrupt coalescing, I found a lot of documents about it for network interfaces. However, I'm dealing with storage devices, and I couldn't find a parameter similar to rx_int_delay.

If it is implemented in software, I imagine it works just like NAPI in Linux (when the kernel receives an interrupt, it disables interrupts and uses polling to read multiple packets from the network interface). If so, I really should ask in this forum. Sorry.
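
The interrupt-then-poll idea can be sketched as a toy simulation in C. This is not kernel code: real NAPI drivers go through kernel APIs and use a poll budget, both of which are left out here. The function name polled_interrupts and the fixed per-packet service time are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of interrupt-then-poll handling. The first packet of a burst
 * raises an interrupt; the handler then keeps the device interrupt masked
 * and polls, consuming every packet that has arrived by the time it gets
 * to it. Each packet costs service_us to process.
 * arrival_us holds packet arrival times in microseconds, sorted ascending.
 * Returns the number of interrupts taken. */
size_t polled_interrupts(const uint64_t *arrival_us, size_t n,
                         uint64_t service_us)
{
    assert(arrival_us != NULL || n == 0);

    size_t interrupts = 0;
    size_t i = 0;

    while (i < n) {
        interrupts++;                    /* interrupt for packet i */
        uint64_t now = arrival_us[i];

        /* Poll loop: interrupts stay masked while packets are waiting. */
        while (i < n && arrival_us[i] <= now) {
            now += service_us;
            i++;
        }
        /* Queue is empty: unmask and wait for the next interrupt. */
    }
    return interrupts;
}
```

With a service time of 10 microseconds, four packets arriving 1 microsecond apart cost a single interrupt, while three packets arriving 100 microseconds apart cost three. As with device-side coalescing, the saving only appears when the arrival rate is high relative to the handling time.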

Thanks,
Da 

0 Kudos
Bernard
Valued Contributor I

I agree that searching the internet mainly turns up information about coalescing NIC interrupts. To learn how it is done on Windows, please consult the Windows Internals book, 6th edition, Part 1.

0 Kudos
SergeyKostrov
Valued Contributor II
>>... I use Xeon E5-4620, so I believe the processor should support it if it requires CPU's support...

Please take a look at the datasheet for that CPU on ark.intel.com. When the web page for the CPU is displayed, all available datasheets are usually listed on the right side of the page.
0 Kudos
Bernard
Valued Contributor I

Hi zhengda1936,

If you are interested, I have found a few details about interrupt coalescing (for the clock interrupt) on Windows. The rationale behind coalescing timer interrupts is to allow longer C-state residency by reducing how often the processor has to wake up to process timer expirations, which are handled at DISPATCH_LEVEL. Not all timers are coalescable, and it is up to the driver to decide whether to use this feature. Windows exposes two functions for this: KeSetCoalescableTimer for kernel-mode drivers and SetWaitableTimerEx for user-mode applications.
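
The effect of a tolerable delay can be sketched as a toy model in C. This is not how the Windows kernel implements it; it only illustrates why a tolerance lets several expirations share one wake-up. Each timer may fire anywhere between its due time and its due time plus its tolerance, and the code counts the fewest wake-ups that serve all timers. The names toy_timer and count_wakeups are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t due_ms;        /* earliest time the timer may fire */
    uint64_t tolerance_ms;  /* extra delay the owner can tolerate */
} toy_timer;

/* Order timers by the latest moment they are allowed to fire. */
static int by_deadline(const void *a, const void *b)
{
    const toy_timer *x = a;
    const toy_timer *y = b;
    uint64_t dx = x->due_ms + x->tolerance_ms;
    uint64_t dy = y->due_ms + y->tolerance_ms;
    return (dx > dy) - (dx < dy);
}

/* Fewest wake-ups that serve every timer inside its allowed window.
 * Greedy rule: wake at the earliest deadline, serve every timer already
 * due at that moment, then repeat for the remaining timers.
 * Note: sorts the caller's array in place. */
size_t count_wakeups(toy_timer *timers, size_t n)
{
    if (n == 0)
        return 0;
    assert(timers != NULL);

    qsort(timers, n, sizeof *timers, by_deadline);

    size_t wakeups = 0;
    uint64_t wake_at = 0;
    for (size_t i = 0; i < n; i++) {
        /* Not covered by the last wake-up: schedule a new one. */
        if (wakeups == 0 || timers[i].due_ms > wake_at) {
            wake_at = timers[i].due_ms + timers[i].tolerance_ms;
            wakeups++;
        }
    }
    return wakeups;
}
```

Three timers due at 10, 12, and 15 ms need three wake-ups with zero tolerance, but a single wake-up at 20 ms if each tolerates a 10 ms delay.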

0 Kudos
zhengda1936
Beginner

Thanks.

After reading some documents, I'm more convinced that interrupt coalescing is a hardware feature and should be implemented in the I/O device.

vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
Analysis of Interrupt Coalescing Schemes for Receive-Livelock Problem in Gigabit Ethernet Network Hosts

I think it makes sense for it to be implemented in the I/O device.

0 Kudos
Bernard
Valued Contributor I

I think that you could be right. Unfortunately, there is no freely available information about the Windows implementation of clock timer interrupt coalescing. I need to check the similar implementation in Linux.

0 Kudos
Bernard
Valued Contributor I

>>>vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
Analysis of Interrupt Coalescing Schemes for Receive-Livelock Problem in Gigabit Ethernet Network Hosts>>>

Are these titles of some documentation?

0 Kudos
SergeyKostrov
Valued Contributor II
>>...If it is implemented by software, I'll imagine that it works just like NAPI in Linux ( when the kernel receiving an interrupt,
>>it disables interrupts and uses polling to read multiple packets from the network interface )...

By the way, I just realized that I used that technique many years ago ( in 1992! ) to read data from an RS-232 port in the MS-DOS operating system. We had a similar problem because it was very inefficient and time consuming ( we also had timing constraints ) to process 1024 interrupts to get, for example, 1024 bytes of data.

It worked as follows:

- Some Hardware sends a Control Character to a Processing Computer
- When an interrupt is generated, an Interrupt Handler disables all other interrupts for that RS-232 port and receives a package of 1024 bytes by reading data from the RS-232 port ( directly )
- As soon as processing is done, the system state is restored and the Processing Computer waits for another Control Character from the Hardware
0 Kudos
Bernard
Valued Contributor I

>>>We had a similar problem because it was very inefficient, time consuming ( we also had timing constraints ) to process 1024 interrupts to get, for example, 1024 bytes of data.>>>

Was it so by design? I mean interrupting the CPU for every byte of received/sent data.

0 Kudos
zhengda1936
Beginner

iliyapolak wrote:

>>>vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
Analysis of Interrupt Coalescing Schemes for Receive-Livelock Problem in Gigabit Ethernet Network Hosts>>>

Are these titles of some documentation?

These are academic papers. From their statement about interrupt coalescing, I infer that it is usually implemented inside the hardware.

0 Kudos
zhengda1936
Beginner

Sergey Kostrov wrote:

It worked as follows:

- Some Hardware sends a Control Character to a Processing Computer
- When interrupt is generated an Interrupt Handler disables all the rest interrupts for that RS-232 port and receives a package of 1024 bytes by reading data from RS-232 port ( directly )
- As soon as processing is done the system state is restored and Processing Computer waits for another Control Character from the Hardware

This sounds similar to the design of NAPI. I guess this design is common, but I believe it's not the technique called interrupt coalescing.

0 Kudos
Bernard
Valued Contributor I

zhengda1936 wrote:

Quote:

iliyapolak wrote:

>>>vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
Analysis of Interrupt Coalescing Schemes for Receive-Livelock Problem in Gigabit Ethernet Network Hosts>>>

Are these titles of some documentation?

These are academic papers. From their statement about interrupt coalescing, I infer that it is usually implemented inside the hardware.

Thank you for the info

0 Kudos
SergeyKostrov
Valued Contributor II
>>...I mean to interrupt cpu for every byte of received/sent data...

Yes, and this is how many communication programs for MS-DOS worked 20 years ago. However, our case was more complex because we needed that kind of processing for an embedded system. On Windows, the Win32 API for serial communications simplifies many things, and the processing looks like this:

    ...
    /* Clear any pending comm error and fetch the current queue status */
    if( ClearCommError( pCdp->hDevice, ( RTulong * )&uiErrorFlags, &pCdp->Cms ) == RTFALSE )
    {
        CrtPrintf( RTU("Failed to Clear Error [ Comm Device: %s ]\n"), pCdp->szDeviceName );
        break;
    }

    /* Read no more than what is already waiting in the input queue */
    uiBytesToRead = CrtMin( ( RTuint )pCdp->iSize, pCdp->Cms.cbInQue );
    bOk = ( RTbool )ReadFile( pCdp->hDevice, pCdp->pubData, uiBytesToRead, ( RTulong * )&uiBytesRead, &pCdp->Ovl );
    pCdp->uiError = SysGetLastError();

    if( uiBytesRead != 0 )
    {
        pCdp->iSize = uiBytesRead;
    ...

However, it is done at a higher level in a user application, and I didn't try to implement a special Windows driver.
0 Kudos
Bernard
Valued Contributor I

Today there is a tendency to offload unneeded work from the CPU to hardware controllers. For example, imagine that the CPU had to translate the HDD's linear addresses to physical locations, compute sector data checksums, and control the head servo mechanism: it could have a huge impact on CPU performance. AFAIK disk.sys does the low-level job of programming the HDD's MMIO registers, and the lowest-level internal disk management is done by the built-in controller.

0 Kudos