Nosh_N_

Beginner

11-06-2015
10:13 AM

How long does a 6700K take to multiply two integers?

Hi,

I just read on Wikipedia that an IBM 1620 took 17 ms to multiply two integers, and I was wondering how long a modern CPU takes to execute the same operation.

I hope I'm in the right forum. I found this question from 2008 ( https://software.intel.com/en-us/forums/intel-academic-community-forum/topic/299987 ), which, going by Google, seems to suggest that I should ask my question here.

Regardless, I'm looking forward to your answers.

3 Replies

MarkC_Intel

Moderator

11-06-2015
10:53 AM

McCalpinJohn

Black Belt

11-06-2015
01:48 PM

The comparison is especially difficult because the IBM 1620 was a variable word-length system. Fixed-point numbers could be anywhere from 2 decimal digits to tens of thousands of decimal digits (depending on the machine size).

One reference stated that the system could multiply two 10-digit numbers in 17.7 milliseconds. 10 digits is slightly larger than what can be held in a 32-bit integer, but is very easily held in a 64-bit integer, so multiplication of two 64-bit binary numbers seems like a fair comparison.
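To make the size argument concrete, here is a minimal sketch that checks where a 10-digit decimal operand and its product fit. The helper name `fits_unsigned` is just an illustration, not anything from the IBM or Intel documentation.

```python
# Largest value a 10-digit decimal operand can take.
MAX_10_DIGIT = 10**10 - 1  # 9,999,999,999


def fits_unsigned(value: int, bits: int) -> bool:
    """Return True if value fits in an unsigned integer of the given bit width."""
    return 0 <= value < 2**bits


operand_fits_32 = fits_unsigned(MAX_10_DIGIT, 32)  # 2**32 is about 4.29e9
operand_fits_64 = fits_unsigned(MAX_10_DIGIT, 64)  # 2**64 is about 1.84e19

# The full product of two such operands is just under 1e20.
product = MAX_10_DIGIT * MAX_10_DIGIT
product_fits_64 = fits_unsigned(product, 64)
product_fits_128 = fits_unsigned(product, 128)

print(operand_fits_32, operand_fits_64, product_fits_64, product_fits_128)
```

The operands need more than 32 bits but fit easily in 64, while the full product of two maximal 10-digit values needs more than 64 bits, so a 64x64 multiply with a 128-bit result is the closest match.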

According to Appendix C of the Intel Optimization Reference Manual, the "MUL" instruction can multiply two 64-bit (unsigned) values and put the 128-bit result in two 64-bit output registers with a latency of 4 cycles on recent processors (Nehalem/Westmere, Sandy Bridge/Ivy Bridge, and Haswell/Broadwell). At a "typical" frequency of 2 GHz, this is a latency of 2 nanoseconds, almost 9 million times faster than the IBM.
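The arithmetic behind that ratio can be sketched as follows. The 4-cycle latency, the 2 GHz clock, and the 17.7 ms figure are the numbers quoted above; the function and constant names are only illustrative.

```python
IBM_1620_MUL_SECONDS = 17.7e-3  # 10-digit multiply time quoted for the IBM 1620
MUL_LATENCY_CYCLES = 4          # 64-bit MUL latency quoted from Appendix C
CLOCK_HZ = 2.0e9                # the "typical" 2 GHz assumed in the post


def latency_seconds(cycles: int, clock_hz: float) -> float:
    """Convert an instruction latency in cycles to seconds at a given clock."""
    return cycles / clock_hz


modern_latency = latency_seconds(MUL_LATENCY_CYCLES, CLOCK_HZ)
latency_speedup = IBM_1620_MUL_SECONDS / modern_latency

print(f"latency: {modern_latency * 1e9:.1f} ns, speedup: {latency_speedup:.3g}x")
```

Passing a different clock rate to `latency_seconds` rescales the result linearly, so a part running at a higher frequency with the same cycle latency would come out proportionally faster.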

Unlike the IBM 1620, modern processors can also perform many of these multiplication operations concurrently, using pipelining, SIMD vectors, and multiple cores.

- Pipelining: All recent Intel processors can issue one "MUL" instruction every cycle, so with a 4-cycle latency the throughput is four times higher when at least four independent multiplications can be issued back to back.
- SIMD Vectors: There are several approaches that can be used here, depending on the data layout and the application's requirements. With 256-bit vector instructions it should be possible to get at least a 2x improvement in throughput.
- Multiple Cores: All of the cores can run independent integer multiplications concurrently.

Combining these factors (4x from pipelining, at least 2x from SIMD, and 4x from four cores) gives a peak theoretical throughput increase of at least an additional factor of ~32x on a quad-core processor. Whether this can be sustained depends on where the input and output data are located in memory, whether the code uses signed or unsigned integers, whether the code needs the full 128-bit output or just the low-order 64 bits, and so on.

Overall, a relatively inexpensive quad-core processor should be between 10 million and 300 million times as fast as the IBM 1620 for the arithmetic operation. Time required for memory accesses will likely reduce these ratios, since memory performance has not improved as much as the performance and available concurrency of the computational logic.
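As a rough check on that range, the peak factors can be multiplied out. This is only a back-of-the-envelope sketch using the figures from this post (4-cycle latency at 2 GHz, 2x SIMD, four cores); the variable names are made up for illustration, and real code will rarely hit the upper bound.

```python
IBM_1620_MUL_SECONDS = 17.7e-3   # quoted 10-digit multiply time
SINGLE_MUL_SECONDS = 4 / 2.0e9   # 4-cycle latency at an assumed 2 GHz

# Peak concurrency factors described in the list above.
PIPELINING = 4   # one MUL issued per cycle against a 4-cycle latency
SIMD = 2         # "at least 2x" with 256-bit vector instructions
CORES = 4        # quad-core processor

combined_factor = PIPELINING * SIMD * CORES

# Lower bound: one dependent multiply at a time. Upper bound: all factors sustained.
low_speedup = IBM_1620_MUL_SECONDS / SINGLE_MUL_SECONDS
high_speedup = low_speedup * combined_factor

print(f"combined factor: {combined_factor}x")
print(f"speedup range: {low_speedup:.3g}x to {high_speedup:.3g}x")
```

The two ends land at roughly 9 million and 280 million, which rounds to the 10 million to 300 million range quoted above.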

Bernard

Black Belt

12-15-2015
09:18 AM
