While databases have been the backbone of websites, applications, and more for years, organizations are facing an unprecedented influx of data from a variety of places. Whether for AI workloads, business analytics, or applications, companies are gathering and storing more data than ever. More data means more databases and a need for higher performance to get through larger data sets. Companies often opt to run these databases in the cloud to take advantage of benefits such as flexibility, scalability, and accessibility. While the advantages of using a public cloud such as Amazon Web Services (AWS) may be obvious, determining which instances to choose and how to configure them may be less so. In this blog, I aim to highlight some of the newer Amazon Elastic Compute Cloud (EC2) instances featuring 4th Gen Intel® Xeon® Scalable processors and how different databases perform on them. Additionally, I cover some tips and tricks, and things to keep in mind when choosing the right configuration for your database.
AWS: An Overview
Amazon Web Services (AWS) is still the largest public cloud provider in the world, with 31% of the worldwide market share in February 2024.(1) Even if your company is already running workloads and applications in the AWS cloud, its vast array of services and offerings can make it overwhelming to choose the right configuration for your database workload. With multiple instance families, each with as many as 22 separate offerings, in addition to fully managed database services, it can be difficult to know how to get the best performance or the most cost-efficient option for your needs. I’ll run through a high-level overview of the options most likely to fit database workload needs, then dive into more specific database types with performance data and specifics.
First, a quick overview of Amazon EC2 instance families. At the highest level, AWS has instance categories based on the type of optimization a workload may need. These categories include General Purpose, Compute Optimized, Memory Optimized, Storage Optimized, and HPC Optimized. Depending on the type of database you’re running, you’re most likely to choose from the General Purpose, Memory Optimized, or Storage Optimized categories. Within each broad category, the next level of categorization relates to the processor type, such as Intel Xeon or AWS Graviton, and the processor generation, such as 3rd Gen or 4th Gen Intel Xeon. Finally, a few categories include instances that target special requirements. These include M7i-flex instances, which cost less but don’t guarantee maximum performance 100% of the time, and M5n instances, which offer higher network bandwidth caps.
The performance tests discussed below show the kinds of relative performance your database workload could get depending on which processor generation you choose. The final step in the decision-making process is determining the instance size, or how many vCPUs you need. We show sample database benchmark results at various instance sizes to help guide your choices. Overall, you should look at the average and peak consumption of your database workload to determine which size instance best meets your requirements.
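To make the sizing step concrete, here is a minimal Python sketch of one way to turn average and peak consumption into a vCPU count. Everything in it is an assumption for illustration: the list of sizes, the 25% headroom, the sample readings, and the function name pick_vcpus are hypothetical and do not come from AWS or from our tests.

```python
# Hypothetical menu of vCPU counts within one instance family.
SIZES_VCPUS = [2, 4, 8, 16, 32, 48, 64, 96]


def pick_vcpus(busy_vcpu_samples, headroom=0.25, sizes=SIZES_VCPUS):
    """Return (average, peak, smallest size that covers peak plus headroom)."""
    if not busy_vcpu_samples:
        raise ValueError("need at least one sample")
    average = sum(busy_vcpu_samples) / len(busy_vcpu_samples)
    peak = max(busy_vcpu_samples)
    required = peak * (1 + headroom)  # leave room above the observed peak
    for size in sorted(sizes):
        if size >= required:
            return average, peak, size
    raise ValueError("peak demand exceeds the largest size listed")


# Made-up readings of how many vCPUs the database kept busy over time.
samples = [3.0, 4.5, 6.0, 11.0, 5.5]
average, peak, size = pick_vcpus(samples)
print(f"average={average:.1f} peak={peak:.1f} suggested vCPUs={size}")
```

Sizing to the peak rather than the average is a deliberate choice here: a database that averages six busy vCPUs but peaks at eleven would stall on an 8-vCPU instance during its busiest periods.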
Another important thing to keep in mind while perusing the options to make your choice is that while cloud options may seem infinite, every choice comes with specific configuration settings, many of which are fixed. For example, your workload might require relatively few vCPUs but high network speed. In pretty much every instance family, smaller instances will have lower network bandwidth caps. Similarly, instances limit the number of disks you can attach to them as well as the storage bandwidth available. You could accidentally pay for a high-performance storage volume only to learn that the instance you chose can use only 20% of that performance. Some of the smaller instances, especially, may come with only 30 minutes of guaranteed maximum performance. When you think you’ve landed on an instance that suits your needs, be sure to read all the footnotes and fine print to ensure that you don’t accidentally hamstring your performance or unnecessarily inflate your costs.
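The storage example above comes down to simple arithmetic: the throughput you can actually use is the lower of what the volume provides and what the instance allows. The Python sketch below illustrates that idea with hypothetical numbers chosen to reproduce the 20% case; they are not the limits of any specific volume or instance.

```python
def effective_throughput(volume_mbps, instance_cap_mbps):
    """Usable throughput is bounded by the slower of the volume and the instance."""
    return min(volume_mbps, instance_cap_mbps)


def usable_fraction(volume_mbps, instance_cap_mbps):
    """Share of the provisioned volume throughput the instance can actually use."""
    return effective_throughput(volume_mbps, instance_cap_mbps) / volume_mbps


# Hypothetical figures: a volume provisioned for 1,000 MB/s attached to an
# instance whose storage bandwidth cap is 200 MB/s.
volume_mbps = 1000.0
instance_cap_mbps = 200.0
print(f"usable: {effective_throughput(volume_mbps, instance_cap_mbps):.0f} MB/s "
      f"({usable_fraction(volume_mbps, instance_cap_mbps):.0%} of what you pay for)")
```

The same min-of-two-limits check applies to network bandwidth and to the number of disks an instance can attach.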
Finally, let’s touch on managed database services such as Amazon Relational Database Service (RDS). These services are popular because they shift even more of the management responsibility of the environment to AWS. With Amazon EC2 instances, users must still perform OS updates, database installation and updates, database backups, and other related tasks. With services such as RDS, AWS takes on those responsibilities, leaving customers with more time to manage the application itself. While our performance tests did not utilize managed database services, customers can choose which instance they want to host their database within some of these services. Thus, knowing which instances perform better is still valuable. Note, however, that managed database services often offer a limited subset of Amazon EC2 instances. For example, at the time of writing this blog, when I log into the AWS console to create a PostgreSQL database with RDS, the drop-down menu shows no instances with the latest generation processor from any vendor. Whether you opt for an infrastructure-as-a-service approach or a managed database service such as RDS, read on to better understand how different instances can affect your database performance.
MySQL and PostgreSQL Performance
Both MySQL and PostgreSQL have ranked in or near the top five most popular databases over the last decade.(2) Nearly every company, regardless of industry, uses multiple transactional databases to maintain customer information, employee information, website backend data, and much more. While their uses and performance needs vary greatly, these databases typically work best with lots of memory and/or high-performance storage. For our PostgreSQL and MySQL performance testing, we chose the memory-optimized Amazon EC2 R-series instances, which provide more memory per vCPU than other EC2 instances. Testing revealed that choosing newer instances with better processors will increase transactional database performance, regardless of instance size.
The PostgreSQL tests were performed by a third party, Principled Technologies, using the HammerDB database benchmark. For full test results, see the report. In tests comparing the memory-optimized R7i instances with 4th Gen Intel Xeon Scalable processors to R6i instances with 3rd Gen Intel Xeon Scalable processors, the newer R7i instances achieved up to 1.138 times as many new orders per minute on PostgreSQL as the older instances.
Figure 1: Normalized PostgreSQL new orders per minute on Amazon EC2 R7i instances with 4th Gen Intel Xeon Scalable processors vs. R6i instances with 3rd Gen Intel Xeon Scalable processors.
In our internal testing using the same HammerDB benchmark on MySQL databases, we saw the R7i instances deliver up to 1.39 times the MySQL performance of the R6i instances.(3)
Figure 2: Normalized MySQL new orders per minute on Amazon EC2 R7i instances with 4th Gen Intel Xeon Scalable processors vs. R6i instances with 3rd Gen Intel Xeon Scalable processors.
Greater database throughput can mean several things depending on your database needs. It could mean supporting more users if your usage has increased. It could allow you to weather higher peak usage periods without slowing down. It could enable you to fit more databases on a single instance, which would lead to savings on instance costs over time. Choosing the right instance for your MySQL and PostgreSQL databases is crucial to keeping up with growing user interaction while maintaining your budget.
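As a rough illustration of how a throughput gain can translate into fewer instances or better value, consider the Python sketch below. It uses the 1.39 figure from the MySQL results above, which is an "up to" result, so treat it as a best case. The demand of 10 baseline instances and the 1.05 relative price are hypothetical placeholders, not AWS pricing.

```python
import math


def instances_needed(demand_in_baseline_units, relative_performance):
    """Instances required when each one does `relative_performance` times
    the work of a baseline instance."""
    return math.ceil(demand_in_baseline_units / relative_performance)


def performance_per_dollar_ratio(relative_performance, relative_price):
    """Values above 1.0 mean the newer instance does more work per dollar."""
    return relative_performance / relative_price


# Hypothetical: a workload that currently fills 10 older instances.
print(instances_needed(10, 1.39))
# Hypothetical: the newer instance costs 5% more per hour.
print(round(performance_per_dollar_ratio(1.39, 1.05), 2))
```

Substitute your own measured speedup and current on-demand or reserved pricing before drawing conclusions. The point is only that the comparison should combine relative performance and relative price.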
MongoDB Performance
With nearly 50,000 customers, MongoDB is a popular NoSQL database that stores data as documents, an approach that provides more flexibility than traditional table-based relational databases.(4) Users often deploy these databases in distributed clusters that provide a high level of resiliency. Because of this distribution, you can often cluster smaller instances rather than having to use one large instance, as might be necessary with a large transactional database. To simplify the environment, our tests using the Yahoo! Cloud Serving Benchmark (YCSB) did not use replication, but they reflect the typical use case by running on smaller instances with four to sixteen vCPUs. We also chose the compute-optimized C-series instances to highlight performance on instances for database workloads that may need less memory. While databases typically perform better with more memory, small databases or parts of databases in a replicated solution need less total RAM than larger databases. Thus, if your database fits in smaller amounts of RAM, the compute-optimized offerings could save you money. Our testing shows that choosing C7i instances with 4th Gen Intel Xeon Scalable processors could provide up to 1.42 times the MongoDB performance of the previous-generation C6i instances.(5)
Figure 3: Normalized MongoDB YCSB throughput on Amazon EC2 C7i instances with 4th Gen Intel Xeon Scalable processors vs. C6i instances with 3rd Gen Intel Xeon Scalable processors.
Redis Performance
Redis is an open-source, in-memory database that can function in several roles, including operating as a cache or a streaming engine. Although it can be made persistent with regular writes to storage, a Redis database is limited to the size of available memory. Memory-optimized instances, with their higher RAM-to-vCPU ratios, would be a good option, but we chose to run our Redis Memtier benchmark tests on 4-vCPU General Purpose M-series instances to represent smaller Redis applications. These could represent a smaller company or a smaller application in a larger corporation. As with the other database types, our tests show the impact newer instances can have on your performance. The m7i.xlarge instance with 4th Gen Intel Xeon Scalable processors offered 1.57 times the Redis performance of the two-generation-older m5.xlarge instance and 1.26 times the performance of the previous-generation m6i.xlarge instance.(6)
Figure 4: Normalized Redis Memtier throughput on Amazon EC2 M7i instances with 4th Gen Intel Xeon Scalable processors vs. M6i instances with 3rd Gen Intel Xeon Scalable processors and M5 instances with 2nd Gen Intel Xeon Scalable processors.
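For readers who want to see how a normalized comparison like this fits together, here is a small Python sketch that places all three instances on one scale with m5.xlarge as the 1.0 baseline. The two ratios come from the text above; the m6i.xlarge value is derived arithmetically from them rather than separately reported, and it assumes both ratios describe the same test configuration.

```python
M7I_VS_M5 = 1.57   # m7i.xlarge relative to m5.xlarge, from the text
M7I_VS_M6I = 1.26  # m7i.xlarge relative to m6i.xlarge, from the text


def normalized_to_m5(m7i_vs_m5, m7i_vs_m6i):
    """Express each instance's throughput relative to m5.xlarge = 1.0."""
    return {
        "m5.xlarge": 1.0,
        # Derived, not measured: m6i/m5 = (m7i/m5) / (m7i/m6i).
        "m6i.xlarge": m7i_vs_m5 / m7i_vs_m6i,
        "m7i.xlarge": m7i_vs_m5,
    }


scores = normalized_to_m5(M7I_VS_M5, M7I_VS_M6I)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```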
Conclusion
When it comes to hosting your database applications on the AWS cloud, there’s a lot to consider. While this blog cannot answer every question for your specific workload, we have shown that for the four databases tested by Intel or a third party (PostgreSQL, MySQL, MongoDB, and Redis), choosing the latest hardware will reap benefits for your workloads. Save money, support peak times, or ensure room for growth by choosing instances with 4th Gen Intel Xeon Scalable processors on AWS for your databases.
(2) https://db-engines.com/en/ranking_trend
(4) https://www.mongodb.com/who-uses-mongodb
Notices and Disclaimers
Performance varies by use, configuration, and other factors. Learn more on the Performance Index site.
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.
Your costs and results may vary.
Intel technologies may require enabled hardware, software, or service activation.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.