
Tiered Memory: Low-Power Solution for Larger Capacity

John_Hubbard

Today more than ever, infrastructure and data architects are looking for ways to increase memory capacity for data-hungry workloads, but environmental concerns such as power consumption are also important. Using tiered memory—a small amount of DRAM plus large-capacity Intel® Optane™ persistent memory (PMem)—is a great way to meet both goals.

Tiered Memory Boosts Memory Capacity

Intel Optane PMem configurations allow up to 8 TB of memory capacity in a dual-socket server (more than is possible with DRAM alone), at a significantly lower cost.[i]
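As a rough illustration of that cost difference, the pricing guidance quoted in footnote [i] works out to the savings below. This is a quick sketch only; the dollar figures are the March 2022 guidance numbers from the footnote, not current market pricing.

```python
# Savings implied by the footnote [i] pricing guidance (March 2022 figures).
configs = {
    "2 TB": {"tiered": 31_733, "dram_only": 37_610},
    "4 TB": {"tiered": 77_215, "dram_only": 120_906},
}
for capacity, cost in configs.items():
    savings = 1 - cost["tiered"] / cost["dram_only"]
    print(f"{capacity}: tiered memory costs {savings:.0%} less than DRAM-only")
# 2 TB: tiered memory costs 16% less than DRAM-only
# 4 TB: tiered memory costs 36% less than DRAM-only
```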

In a tiered memory configuration, a small amount of DRAM serves as a memory cache for very hot data, while hot data resides in Intel Optane PMem (Memory Mode). The tiering is transparent to the operating system and applications. Simply verify that the BIOS is configured for Memory Mode, currently the default for most OEMs, and the CPU’s memory controller will handle the rest.
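One simple way to sanity-check that a Linux system is actually provisioned in Memory Mode is to look at what the OS reports as total memory: in Memory Mode the PMem capacity is presented as system RAM, while the DRAM cache is managed by the memory controller and does not appear in MemTotal. The sketch below assumes a Linux host or guest, and the 1 TB expected_pmem_gib value is a hypothetical example; adjust it for your own configuration.

```python
# Minimal Memory Mode sanity check (Linux): MemTotal should roughly match the
# installed PMem capacity, because the DRAM cache is hidden from the OS.

def mem_total_gib(meminfo_path="/proc/meminfo"):
    """Return MemTotal from /proc/meminfo, converted from kB to GiB."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) / (1024 ** 2)
    raise RuntimeError("MemTotal not found")

if __name__ == "__main__":
    expected_pmem_gib = 1024  # hypothetical: 8x 128 GB PMem DIMMs
    total = mem_total_gib()
    print(f"MemTotal ≈ {total:.0f} GiB")
    if total >= 0.9 * expected_pmem_gib:
        print("Consistent with Memory Mode: the PMem tier is visible as system RAM.")
    else:
        print("Total memory looks closer to DRAM size; check the BIOS provisioning.")
```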

Typical configurations include 1 TB and 2 TB of Intel Optane PMem, employing 128 GB Intel Optane PMem DIMMs, while 4 TB and 8 TB configurations require 256 GB and 512 GB Intel Optane PMem DIMMs, respectively.[ii] At the time of this writing, DRAM DIMMs in 512 GB capacity do not exist, limiting DRAM-only configurations to 4 TB. It is a best practice to configure a tiered memory system so that the active memory footprint fits within the DRAM cache.

Caching Active Memory

Simply put, if 256 GB of DRAM is available, the active memory footprint should be less than 256 GB. A system with 256 GB of DRAM will accommodate 1 TB of Intel Optane PMem. The result is a system with a 25% cache—or a ratio of 1:4 DRAM to Intel Optane PMem—providing 1 TB of total memory. This ratio is the “sweet spot” that keeps the majority of the frequently accessed data in the DRAM, minimizing activity to Intel Optane PMem (i.e., cache misses).
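That arithmetic is easy to wrap in a small helper. The sketch below is not an Intel sizing tool, just a back-of-the-envelope check that encodes the 1:4 ratio and the rule that the active footprint should fit in the DRAM cache.

```python
# Back-of-the-envelope tiered memory sizing check (illustrative only).

def plan_tiered_memory(dram_gb: int, pmem_gb: int, active_footprint_gb: int) -> dict:
    return {
        "ratio": f"1:{pmem_gb // dram_gb}",                    # DRAM : PMem
        "cache_pct_of_capacity": round(100 * dram_gb / pmem_gb, 1),
        "active_set_fits_in_cache": active_footprint_gb <= dram_gb,
    }

print(plan_tiered_memory(dram_gb=256, pmem_gb=1024, active_footprint_gb=200))
# {'ratio': '1:4', 'cache_pct_of_capacity': 25.0, 'active_set_fits_in_cache': True}
```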

Tiered Memory Can Actually Have Lower Power Consumption than DRAM-Only Configurations

So, we’ve covered how tiered memory configurations can increase memory capacity, but how power efficient are they? Power is a key metric for data center efficiency, rivaled only by cost and performance. Regional regulations regarding power usage, along with power availability constraints, demand that data center architects and admins look closely at the projected power consumption of new infrastructure investments.

So, the logical next step would be to start looking through product spec sheets to compare the power numbers, right?

Not quite.

First, unlike Intel with Optane PMem, DRAM vendors don’t publicly disclose power usage in their product specifications.

Second, even if vendors disclosed power numbers, they wouldn’t represent real-world usage; they would typically reflect only the absolute worst-case scenario. Tallying up all the maximum power values would be analogous to measuring a vehicle’s fuel economy at maximum RPM. Yes, this would reveal the results under extreme conditions, but not what a driver should expect in normal daily use. This is why we need to measure power, in a controlled fashion, while the systems are in use.
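As one example of measuring power in a controlled fashion, many servers expose a system-level power reading through the BMC. The sketch below assumes the BMC supports DCMI power readings via ipmitool and that its output includes an "Instantaneous power reading" line; both vary by platform, so treat this as a starting point rather than a reference method.

```python
# Sketch of in-use power sampling via the BMC (assumes ipmitool + DCMI support).
import re
import subprocess
import time

def sample_system_watts() -> int:
    out = subprocess.run(
        ["ipmitool", "dcmi", "power", "reading"],
        capture_output=True, text=True, check=True,
    ).stdout
    match = re.search(r"Instantaneous power reading:\s+(\d+)\s+Watts", out)
    if not match:
        raise RuntimeError("could not parse a power reading from ipmitool output")
    return int(match.group(1))

def average_watts(duration_s: int = 7200, interval_s: int = 10) -> float:
    """Average the BMC power readings over a steady-state run (default: 2 hours)."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append(sample_system_watts())
        time.sleep(interval_s)
    return sum(samples) / len(samples)
```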

What Does the Spec Sheet Say?

The Intel Optane PMem 200 series spec sheet states that each module can consume up to 15 watts (W). However, power consumption varies from 3.5W to 15W, based on utilization. As long as there is room in the DRAM cache, cache misses to Intel Optane PMem should be negligible, minimizing utilization and the power required.
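For a sense of scale, here is the range those figures imply for a four-module PMem tier like the one in the tiered test system described below. This is a sketch only; actual draw depends on utilization, i.e., how often the DRAM cache misses.

```python
# Back-of-the-envelope power range for 4x PMem modules, using the 3.5 W and
# 15 W figures quoted above.
modules = 4
low_w, high_w = 3.5, 15.0
print(f"PMem tier: {modules * low_w:.0f} W (low utilization) to {modules * high_w:.0f} W (maximum)")
# PMem tier: 14 W (low utilization) to 60 W (maximum)
```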

Active Memory Footprint

Tiered memory is best suited for workloads that have a low active memory footprint. That is to say, the portion of memory that is actively being read or written should be less than 25% of the memory in use. Workloads that are 25% or less active will keep their very hot data in the DRAM cache, with minimal to no impact on performance compared to a DRAM-only configuration. A workload with an active memory footprint greater than 25% is considered memory-intensive and will spill out of the DRAM cache, negatively impacting performance.
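There is no single counter that reports the active memory footprint directly, but on Linux the kernel's own accounting gives a rough first approximation. The sketch below compares the kernel's "Active:" pages against memory in use as a crude proxy for the 25% rule of thumb; it is a heuristic built on that assumption, not a substitute for profiling the workload.

```python
# Crude Linux-only proxy for the active-footprint rule of thumb above.

def meminfo_kib(field: str, path: str = "/proc/meminfo") -> int:
    with open(path) as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

used_kib = meminfo_kib("MemTotal") - meminfo_kib("MemAvailable")
active_pct = 100 * meminfo_kib("Active") / used_kib
verdict = "candidate for 1:4 tiering" if active_pct <= 25 else "memory-intensive"
print(f"Active footprint ≈ {active_pct:.0f}% of memory in use ({verdict})")
```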

Comparing Power Usage

To compare power consumption, we must first choose a workload, and then the systems to test.

Workload

Intel specifically chose a memory-intensive workload, greater than 80% active, to demonstrate the impact of following and not following the best practices for power and performance. Multiple working set sizes were tested to illustrate various cache utilizations, ranging from 50% to 188%. It’s important to remember that over-subscribing the cache is not recommended; best practice is to keep cache utilization at or below 80%.
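To see how working-set size maps to cache utilization on the tiered test system (a 128 GB DRAM cache, described in the next section), the arithmetic looks like this. The working-set sizes below are illustrative round numbers chosen to land near the quoted utilizations, not the exact sizes Intel ran.

```python
# Cache-utilization arithmetic against a 128 GB DRAM cache (illustrative sizes).
dram_cache_gb = 128
for working_set_gb in (64, 102, 160, 240):
    utilization = 100 * working_set_gb / dram_cache_gb
    status = "within best practice (<= 80%)" if utilization <= 80 else "over-subscribed"
    print(f"{working_set_gb:>3} GB working set -> {utilization:.0f}% of cache ({status})")
# 64 GB -> 50%, 102 GB -> 80%, 160 GB -> 125%, 240 GB -> 188%
```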

Systems

Intel tested 512 GB configurations, both DRAM-only and tiered memory. To increase measurement accuracy, single-socket configurations were tested, with only CPU socket 0 populated. This provides 16 DIMM slots, which are populated differently for each configuration. Two DRAM-only configurations were tested, using 16x 32 GB DIMMs and 8x 64 GB DIMMs; the tiered memory configuration used 8x 16 GB DRAM DIMMs and 4x 128 GB Intel Optane PMem DIMMs.

You Can’t Have It All

When faced with choosing between price, performance and power consumption, you can’t have it all. Choosing price and performance will sacrifice power. Choosing performance and power will impact price.

It is worth noting that the test configurations compare a different number of DIMMs, potentially lending a power advantage to the tiered memory configuration. These configuration choices were based on cost, governed solely by the price differences of DRAM DIMM capacities. The configurations being tested were the lowest cost at the time of this writing.

The Results[iii]

Each experiment ran for two hours under a steady-state load. The figure below shows each system’s average energy consumption per hour, in watt-hours (Wh).[iv]

[Figure (Avg-Graph.jpg): average system energy consumption per hour, in Wh, for each configuration across the tested active working set sizes]

When following best practices with an active working set size at 80% or below, we observed the following:

  • Tiered memory consumed up to 5% less power than the lowest-cost DRAM alternative and had essentially the same power usage as the 8 DIMM DRAM configuration.
  • Tiered memory performance is within 4% of the lowest-cost DRAM alternative and 2% of the 8 DIMM DRAM configuration.

When not following best practices and comparing to the lowest-cost DRAM alternative:

  • Tiered memory performance drops between 15% and 35% with 125% and 188% active working set sizes, respectively.
  • Tiered memory consumed between 1% and 3% less power at 125% and 188% active working set sizes, respectively.

In summary, using the maximum power consumption of an Intel Optane PMem module to calculate system energy consumption leads to an overestimation of the system’s energy demand. When best practices are followed, keeping cache misses to a minimum, tiered memory configurations can have a comparable power consumption to DRAM-only while also providing comparable performance. Infrastructure and data architects can use high-density Intel Optane PMem DIMMs to enable larger capacities while still limiting their power demands.


[i] 2 TB Tiered 1:4 ratio (16x 32 GB DRAM + 16x 128 GB Intel® Optane™ PMem) – DRAM cost = $9,381.00, PMem cost = $22,352.00; Total memory cost = $31,733.00. 

2 TB DRAM-only (32x 64 GB DRAM) – Total memory cost = $37,610.00.

4 TB Tiered 1:4 ratio (16x 64 GB DRAM + 16x 256 GB Intel Optane PMem) – DRAM cost = $18,805.00, PMem cost = $58,410.00; Total memory cost = $77,215.00.

4 TB DRAM-only (32x 128 GB DRAM) – Total memory cost = $120,906.00.

Pricing source is Intel® Optane™ persistent memory advisor tool, available through your Intel representative.

Intel Optane PMem pricing and DRAM pricing referenced in total cost of ownership (TCO) calculations are provided for guidance and planning purposes only and do not constitute a final offer. Pricing guidance is subject to change and may revise up or down based on market dynamics. Please contact your original equipment manufacturer (OEM)/distributor for actual pricing. DRAM pricing as of March 2022.

[ii] In 3rd Generation Intel® Xeon® Scalable processors with Intel® Optane™ PMem 200 series.

[iii] Testing by Intel as of February 2022. Results may vary.

Common Configuration Details: Intel® Server System M50CYP2SB2U, microcode = 0x0d0002b1, BIOS = SE5C6200.86B.0022.D64.2105220049, boot drive = 1x 800 GB SSD, storage drive = 1x Intel® Optane™ SSD DC P4800X, NIC = Intel® Ethernet Converged Network Adapter X540, Host OS = VMware ESXi v7.0.2u2-17867351, Guest OS = Fedora 30, kernel = 5.6.13-100.fc30.x86_64, workload = SPECjbb2015 v1.02.

Configuration #1 (DRAM-only): 1x Intel® Xeon® Platinum 8352Y processor (32 cores, 2.4 GHz), 512 GB (8x 64 GB @ 3200 MHz).

Configuration #2 (DRAM-only): 1x Intel Xeon Gold 6336Y processor (24 cores, 2.4 GHz), 512 GB DDR4 (16x 32 GB @ 3200 MHz).

Configuration #3 (Tiered Memory): 1x Intel Xeon Gold 6336Y processor (24 cores, 2.4 GHz), 128 GB DDR4 (8x 16 GB @ 3200 MHz) + 512 GB Intel® Optane™ persistent memory (PMem) (4x 128 GB @ 3200 MHz).

[iv] Results are calculated as half of the ~700 Wh consumed by the system over the two-hour experiment, i.e., the average energy consumed per hour. The watt-hour (Wh) is not an SI unit, but it is commonly used in electrical applications.
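For clarity, the per-hour arithmetic appears to work out as in the sketch below: energy is average power multiplied by time, and the per-hour figure is the two-hour total divided by two. The 350 W average is an illustrative round number, not a measured result.

```python
# Illustrative energy arithmetic only (values are hypothetical, not measured).
avg_power_w = 350
runtime_h = 2
total_wh = avg_power_w * runtime_h      # ~700 Wh over the full run
per_hour_wh = total_wh / runtime_h      # ~350 Wh reported per hour
print(f"total = {total_wh} Wh, per hour = {per_hour_wh:.0f} Wh")
```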