
Best Virtual Machine Size for Self-Managed MongoDB on Microsoft Azure


Authored by: Michał Prostko (Intel) and Izabella Raulin (Intel)

 

Last time, we focused on identifying the best instance type on the Google Cloud Platform for MongoDB[1]. That article has already drawn over 165k views, which reflects strong interest in instance selection. The interest is understandable, because the choice of instance directly affects both performance and ongoing costs. We therefore decided to conduct a comparable study, this time on another widely used cloud platform: Microsoft Azure.

In this post, we explore the performance of MongoDB on Microsoft Azure by examining various Virtual Machine (VM) sizes from the D-series, which is recommended for general-purpose needs.

Benchmarks were conducted on the following Linux VMs: Dpsv5, Dasv5, Dasv4, Dsv5, and Dsv4. They have been chosen to represent both the DS-Series v5 and DS-Series v4, showcasing a variety of CPU types. The scenarios included testing instances with 4 vCPUs, 8 vCPUs, and 16 vCPUs to provide comprehensive insights into MongoDB performance and performance-per-dollar across different compute capacities.

Our examination showed that, among instances with the same number of vCPUs, the Dsv5 instances consistently delivered the most favorable performance and the best performance-per-dollar advantage for running MongoDB.

 

 Keywords: Cloud, Azure, MongoDB, Performance, Performance-per-dollar, Xeon


 

 

MongoDB Leading in NoSQL Ranking

MongoDB is the clear leader in the NoSQL database category, as demonstrated by the DB-Engines Ranking[2]. Its closest competitors, Amazon DynamoDB and Databricks, trail significantly in score, so MongoDB is likely to maintain its leadership position.


Figure 1. Top 10 NoSQL databases according to the DB-Engines Ranking[2]. The snapshots were captured on May 28, 2024.

 

MongoDB Adoption in Microsoft Azure

Enterprises using Microsoft Azure can opt for a self-managed MongoDB deployment or use the cloud-native MongoDB Atlas service. MongoDB Atlas is a fully managed cloud database service that simplifies the deployment, management, and scaling of MongoDB databases. Naturally, this convenience comes with additional costs. It also imposes restrictions: for example, we cannot choose the instance type on which the service runs.

In this study, the deployment of MongoDB through self-managed environments within Azure's ecosystem was deliberately chosen to retain autonomy and control over Azure's infrastructure. This approach allowed for comprehensive benchmarking across various instances, providing insights into performance and the total cost of ownership associated only with running these instances.

 

Methodology

In the investigation into MongoDB's performance across various Microsoft Azure VMs, the same methodology was followed as in our prior study[1] conducted on the Google Cloud Platform. Below is a recap of the benchmarking procedures along with the tooling information necessary to reproduce the tests.

 

Benchmarking Software – YCSB

The Yahoo! Cloud Serving Benchmark (YCSB)[3], an open-source benchmarking tool, is a popular benchmark for testing MongoDB’s performance. The most recent release of the YCSB package, version 0.17.0, was used.

MongoDB was benchmarked with a workload comprising 90% read operations and 10% updates, which in our opinion reflects the most likely distribution of operations. To measure system performance thoroughly, we configured the YCSB utility to populate the MongoDB database with 10 million records and execute up to 10 million operations on the dataset. This was achieved by setting the recordcount and operationcount properties within YCSB. To maximize CPU utilization on the selected instances and minimize the impact of other variables such as disk and network speeds, we configured each MongoDB instance with at least 12 GB of WiredTiger cache. This ensured that the entire dataset could be loaded into the internal cache, minimizing the impact of disk access. Furthermore, 64 client threads were used to simulate concurrency. Other YCSB parameters, if not mentioned below, remained at their defaults.
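The configuration described above can be sketched as a small Python helper that renders the YCSB property overrides. This is a minimal illustration, not the exact files used in the study: the property names (recordcount, operationcount, readproportion, updateproportion, insertproportion, scanproportion, threadcount, mongodb.url) are standard YCSB properties, while the MongoDB address is a placeholder and the function names are hypothetical. The size estimate assumes the YCSB defaults of 10 fields of 100 bytes per record and ignores keys, BSON overhead, and indexes.

```python
# Values stated in the article; everything else stays at YCSB defaults.
RECORD_COUNT = 10_000_000
OPERATION_COUNT = 10_000_000
CLIENT_THREADS = 64


def ycsb_properties(mongodb_url):
    """Return YCSB property overrides for a 90% read / 10% update mix."""
    return {
        "recordcount": str(RECORD_COUNT),
        "operationcount": str(OPERATION_COUNT),
        "readproportion": "0.9",
        "updateproportion": "0.1",
        "insertproportion": "0",
        "scanproportion": "0",
        "threadcount": str(CLIENT_THREADS),
        "mongodb.url": mongodb_url,  # placeholder, not the address used in the study
    }


def render_properties(props):
    """Render the overrides in the key=value format read by YCSB's -P option."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


def raw_dataset_gib(record_count, field_count=10, field_length=100):
    """Estimate the raw field payload in GiB (assumes YCSB default field sizes)."""
    return record_count * field_count * field_length / 2**30
```

Under these assumptions, raw_dataset_gib(RECORD_COUNT) evaluates to roughly 9.3 GiB of field data, which is consistent with the statement that the dataset fits into a WiredTiger cache of at least 12 GB.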

 

Setup

Each test used a pair of VMs of identical size: one VM running MongoDB v7.0.0, designated as the Server Under Test (SUT), and one VM running YCSB, designated as the load generator. Both VMs ran in the Azure West US Region as on-demand instances, and the prices from this region were used to calculate performance-per-dollar indicators.

Figure 2. The MongoDB workload on Microsoft Azure - the diagram on the left illustrates the setup employed, while the table on the right describes in detail the configuration used.

 

 

Scenarios

MongoDB performance on Microsoft Azure was evaluated by testing various Virtual Machines from the D-series, which are part of the general-purpose machine family. These VMs are recommended for their balanced CPU-to-memory ratio and their capability to handle most production workloads, including databases, as per Azure’s documentation[4].

The objective of the study is to compare performance and performance-per-dollar metrics across different processors for the latest generation and its predecessor. Because the newer Dasv6 and Dadsv6 series are currently in preview, the v5 generation represents the latest generally available option. We selected five VM sizes that offer a representative cross-section of the general-purpose D-series: Dsv5 and Dsv4 powered by Intel® Xeon® Scalable Processors, Dasv5 and Dasv4 powered by AMD EPYC™ processors, and Dpsv5 powered by Ampere® Altra® Arm-based processors. The testing scenarios included instances with 4, 8, and 16 vCPUs.

 

Challenges in VM type selection on Azure

In Microsoft Azure, a single VM size can be backed by multiple CPU families. This means that different VMs created under the same VM size can be provisioned on different CPU types. Azure does not provide a way to specify the desired CPU during instance creation, either through the Azure Portal or through the API. The CPU type can only be determined from within the operating system once the instance is created and running. Because we required the SUT and the client instance to have the same CPU type, it took multiple attempts to obtain matching instances. We observed that larger instances (with more vCPUs) were provisioned on newer CPU generations more frequently, while smaller instances were more likely to get older ones. Consequently, for the smaller Dsv5 and Dsv4 instances we never came across VMs with 4th Generation Intel® Xeon® Scalable Processors.
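As an illustration of the check described above, the following Python sketch extracts the CPU model from the contents of /proc/cpuinfo and compares the SUT with the client. It is an assumption about how such a check could be done, not the tooling used in the study, and the function names are hypothetical. Note that the "model name" field is typical for x86 Linux guests; Arm-based VMs may not expose it, in which case another source (for example the output of lscpu) would be needed.

```python
from typing import Optional


def cpu_model_from_cpuinfo(cpuinfo_text: str) -> Optional[str]:
    """Return the first 'model name' value from /proc/cpuinfo text, if present."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None  # field missing, e.g. on some Arm guests


def same_cpu(sut_cpuinfo: str, client_cpuinfo: str) -> bool:
    """True only when both VMs report the same, known CPU model."""
    sut_model = cpu_model_from_cpuinfo(sut_cpuinfo)
    return sut_model is not None and sut_model == cpu_model_from_cpuinfo(client_cpuinfo)


def read_local_cpu_model(path: str = "/proc/cpuinfo") -> Optional[str]:
    """Read the CPU model of the machine this script runs on (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return cpu_model_from_cpuinfo(handle.read())
```

A freshly created VM pair would be kept only if same_cpu returns True for the two /proc/cpuinfo dumps; otherwise one of the VMs would be deleted and created again.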


VM-representatives.png

More details about the VM sizes used for testing are provided in Appendix A. For each scenario, at least three runs were conducted. If the results varied by more than 3%, an additional measurement was taken to eliminate outliers. The final value is the median of the three recorded values.
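The run-selection rule above can be sketched in Python as follows. The article does not define how the 3% variation is computed or how an outlier is discarded, so two assumptions are made here and should be read as such: variation is taken as (max - min) / median, and after an extra run the three mutually closest results are kept. All function names are hypothetical.

```python
from statistics import median

MAX_SPREAD = 0.03  # the 3% threshold from the article


def relative_spread(values):
    """Assumed definition of variation: (max - min) relative to the median."""
    return (max(values) - min(values)) / median(values)


def closest_three(runs):
    """Pick the three results with the smallest range (assumed outlier rule)."""
    ordered = sorted(runs)
    windows = [ordered[i:i + 3] for i in range(len(ordered) - 2)]
    return min(windows, key=lambda window: window[-1] - window[0])


def needs_another_run(runs, max_spread=MAX_SPREAD):
    """True while fewer than three runs exist or the best three still disagree."""
    if len(runs) < 3:
        return True
    return relative_spread(closest_three(runs)) > max_spread


def final_score(runs):
    """Median of the three retained results."""
    if len(runs) < 3:
        raise ValueError("at least three runs are required")
    return median(closest_three(runs))
```

For example, with hypothetical throughput values [100.0, 101.0, 110.0], needs_another_run returns True; after a fourth run of 100.5 the outlier 110.0 is dropped and final_score returns 100.5.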

 

Results

The measurements were conducted in March 2024, with Linux VMs running Ubuntu 22.04.4 LTS and kernel 6.5.0 in each case. To better illustrate the differences between the individual instance types, normalized values were computed relative to the performance of the Dsv5 instance powered by the 3rd Generation Intel® Xeon® Scalable Processor. The raw results are shown in Appendix A.
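The normalization described above, together with the performance-per-dollar metric, can be expressed as a short Python sketch. It assumes that performance is the YCSB throughput in operations per second and that cost is the monthly compute price of the VM; the article does not spell out the exact formula, so this is one plausible reading. The function names and any numbers passed to them are illustrative placeholders, not measured results.

```python
def performance_per_dollar(throughput_ops_per_sec, monthly_price_usd):
    """Throughput obtained per dollar of monthly compute cost (assumed formula)."""
    if monthly_price_usd <= 0:
        raise ValueError("price must be positive")
    return throughput_ops_per_sec / monthly_price_usd


def normalize(values, baseline):
    """Scale every value so that the baseline VM size equals 1.0."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


def normalized_report(throughput, price, baseline="Dsv5"):
    """Return normalized performance and performance-per-dollar per VM size."""
    per_dollar = {
        name: performance_per_dollar(throughput[name], price[name])
        for name in throughput
    }
    return normalize(throughput, baseline), normalize(per_dollar, baseline)
```

With placeholder inputs such as throughput = {"Dsv5": 1000.0, "other": 800.0} and price = {"Dsv5": 200.0, "other": 160.0}, the report gives a normalized performance of 0.8 for "other" and a normalized performance-per-dollar of 1.0, showing how a cheaper VM can match the baseline on value while trailing on raw performance.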

Figure 3. The normalized performance (on the left) and performance-per-dollar (on the right) across selected general-purpose Microsoft Azure VMs with 4 vCPUs for MongoDB workload.

 

Figure 4. The normalized performance (on the left) and performance-per-dollar (on the right) across selected general-purpose Microsoft Azure VMs with 8 vCPUs for MongoDB workload.

 

Figure 5. The normalized performance (on the left) and performance-per-dollar (on the right) across selected general-purpose Microsoft Azure VMs with 16 vCPUs for MongoDB workload.

 

Although the 16 vCPU Dsv4 and Dsv5 VMs are both powered by the 3rd Generation Intel® Xeon® Scalable Processor 8370C and share the same compute cost of $654.08/month, their MongoDB performance scores differ, in favor of the Dsv5 instance. This difference can be attributed to the fact that the tested 16 vCPU Dsv4, as a representative of the v4 generation of the D-series, is expected to perform in line with other VMs of its generation (see Table 1). A similar outcome can be noted for Dasv4 versus Dasv5 VMs, both powered by 3rd Generation AMD EPYC™ 7763v: in each tested case, Dasv5-series VMs outperformed Dasv4-series VMs.

 

Observations:
  • Dsv5 VMs, powered by 3rd Generation Intel® Xeon® Scalable Processors, offer both the best performance and the best performance-per-dollar among the instances tested in each scenario (4 vCPUs, 8 vCPUs, and 16 vCPUs).
  • Dasv5 is less expensive than Dsv5, yet it provides lower performance. Therefore, the Total Cost of Ownership (TCO) favors the Dsv5 instances.
  • Dpsv5 VMs, powered by Ampere® Altra® Arm-based processors, have the lowest cost among the tested VM sizes. However, they fall behind in performance, resulting in the lowest performance-per-dollar among the tested VMs.

 

Conclusion

The presented benchmark analysis covers MongoDB performance and performance-per-dollar across 4 vCPU, 8 vCPU, and 16 vCPU instances representing general-purpose VM sizes available on Microsoft Azure and powered by processors from various vendors. Results show that among the tested instances, Dsv5 VMs, powered by 3rd Generation Intel® Xeon® Scalable Processors, provide the best performance for the MongoDB benchmark and lead in performance-per-dollar.

 

Appendix A

appendixA.png

 

References

[1] MongoDB-Best-choice-of-instance-type-on-GCP

[2] https://db-engines.com/en/ranking/document+store

[3] https://github.com/brianfrankcooper/YCSB/

[4] https://azure.microsoft.com/en-us/pricing/details/virtual-machines/series/

[5] https://azure.microsoft.com/en-us/pricing/calculator

All links listed above were accessible as of May 28, 2024.

 

Disclosure Text:

Remember that MongoDB's performance can be highly dependent on factors like data structure, query patterns, indexes, and more. It's a good practice to test your application with different instance types and configurations to find the optimal setup that balances performance and cost for your specific use case.

 

Notices & Disclaimers:

Performance varies by use, configuration, and other factors. Learn more on the Performance Index site

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates.  See backup for configuration details.  No product or component can be absolutely secure. 

Your costs and results may vary.  For further information please refer to  Legal Notices and Disclaimers.

Intel technologies may require enabled hardware, software, or service activation.

© Intel Corporation.  Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.  Other names and brands may be claimed as the property of others.