
Low Precision Inference: High-Performance Fault Detection with 3D Seismic Data and the Intel® Distribution of OpenVINO™ toolkit


Authors:

Flaviano Christian Reyes, Deep Learning R&D Intern, Intel

Ravi Panchumarthy, Ph.D., Machine Learning Engineer, Intel

Manas Pathak, Ph.D., Global AI Lead for Oil & Gas, Intel

Key Takeaways

  • Learn how to reduce datatype precision to INT8 to achieve greater performance when deploying fault detection models with 3D seismic data, using the Post-Training Optimization Tool in the Intel® Distribution of OpenVINO™ toolkit.
  • Get started with open-source fault segmentation models and datasets, paired with the benchmarking and accuracy checker tools in the Intel® Distribution of OpenVINO™ toolkit.
  • Discover how this low-precision pipeline can help reduce the time to detect faults and thereby speed up oil and gas exploration.

Introduction

The success of deep learning has led to a proliferation of Artificial Intelligence (AI) applications and has advanced state-of-the-art performance in a myriad of domains. Typically, deep learning applications use 32-bit floating-point precision for training and inference. However, recent research has shown that both deep learning training and inference can be performed with lower numerical precision, maintaining the same or similar accuracy with increased performance.[1] Intel is rapidly innovating in this research space and recently introduced Intel® Deep Learning Boost (Intel® DL Boost), a new set of embedded processor technologies on 2nd generation Intel® Xeon® Scalable processors (Cascade Lake) designed to accelerate deep learning inference.[2] Intel DL Boost includes the new Vector Neural Network Instructions (VNNI), which enable INT8 deep learning inference.[3] INT8's lower precision reduces compute and memory bandwidth requirements, which increases power efficiency and produces significant performance benefits. In this blog, we showcase an over 3x performance boost with INT8 inference over FP32 inference, using a convolutional neural network (CNN) model that detects faults in a 3D seismic dataset with the Intel® Distribution of OpenVINO™ toolkit [4,5].
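To make the idea concrete, the short sketch below illustrates the basic affine mapping behind INT8 quantization: an FP32 tensor is scaled onto the signed 8-bit range and later dequantized back to an FP32 approximation. This is a minimal illustration only, not the quantization scheme the OpenVINO™ toolkit implements internally (which, for example, can choose per-channel scales from calibration statistics).

```python
import numpy as np

def quantize_int8(x_fp32):
    # Map the observed FP32 range symmetrically onto [-127, 127].
    # Illustrative only; real quantizers pick scales from calibration data.
    scale = np.abs(x_fp32).max() / 127.0 + 1e-12
    x_int8 = np.clip(np.round(x_fp32 / scale), -127, 127).astype(np.int8)
    return x_int8, scale

def dequantize_int8(x_int8, scale):
    # Recover an FP32 approximation of the original tensor.
    return x_int8.astype(np.float32) * scale

x = np.random.randn(5).astype(np.float32)
q, s = quantize_int8(x)
print(x)
print(dequantize_int8(q, s))  # matches x to within one quantization step
```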

Fault Detection in a 3D Seismic Dataset

In the current work, we used a pre-trained model from Wu et al., 2019 [6, 10] to accelerate the detection of faults in data from the F3 Dutch block [7] in the North Sea. The FaultSeg model is based on a 3D U-Net architecture, with both an input and an output shape of 128x128x128. Following the workflow established in our previous blog, “Accelerating fault detection in 3D Seismic data using OpenVINO - Reducing time to the first Oil” [8], we created docker containers to perform these benchmarks. The method to create an OpenVINO™ toolkit benchmark docker container followed previous work by Intel [9]. Benchmarks were performed on the validation dataset provided with the FaultSeg model [11] by running experiments on two different hardware configurations.

Two docker containers were created, one for each of the two hardware configurations: CPU and GPU. The benchmark script outputs the average inference time per image and the average balanced cross-entropy loss. The average inference time per image was measured in seconds and converted to milliseconds, as displayed in Figure 2. The average balanced cross-entropy loss was calculated to assess inference differences between model representations. Each model ran inference on the validation set for at least one minute.
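For reference, the following is a minimal sketch of how such a benchmark loop could be structured. The names infer_fn, volumes, and labels are hypothetical stand-ins, not the actual benchmark script, and the balanced cross-entropy weighting follows the formulation used for fault segmentation in Wu et al., 2019 [6].

```python
import time
import numpy as np

def balanced_cross_entropy(y_true, y_pred, eps=1e-7):
    # Weight the rare fault voxels by beta (the fraction of non-fault voxels)
    # so background voxels do not dominate the loss, as in Wu et al., 2019 [6].
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    beta = 1.0 - y_true.mean()
    return -np.mean(beta * y_true * np.log(y_pred)
                    + (1.0 - beta) * (1.0 - y_true) * np.log(1.0 - y_pred))

def benchmark(infer_fn, volumes, labels, min_seconds=60.0):
    # Run inference over the validation volumes for at least `min_seconds`,
    # then report average latency in milliseconds and average loss.
    times, losses, elapsed = [], [], 0.0
    while elapsed < min_seconds:
        for cube, truth in zip(volumes, labels):
            start = time.perf_counter()
            pred = infer_fn(cube)
            times.append(time.perf_counter() - start)
            losses.append(balanced_cross_entropy(truth, pred))
            elapsed += times[-1]
    return 1000.0 * np.mean(times), np.mean(losses)
```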

Benchmarking Results with the Intel® Distribution of OpenVINO™ toolkit

Built on the Ubuntu 18.04 base image, the CPU docker container uses the Intel® Distribution of OpenVINO™ toolkit 2020.4 release to run inference on the FaultSeg model representations. Prior to benchmarking, the original Keras FaultSeg model was converted to a TensorFlow frozen graph, and then to the Intermediate Representation (IR) format using the Intel® Distribution of OpenVINO™ toolkit Model Optimizer.
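The sketch below outlines this conversion path under stated assumptions: a TensorFlow 1.x environment, and illustrative file names (faultseg3d.hdf5 and so on) standing in for the actual FaultSeg artifacts [10].

```python
# Step 1: freeze the Keras model into a TensorFlow frozen graph
# (TF 1.x APIs; file names are illustrative).
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.python.framework import graph_util

model = tf.keras.models.load_model("faultseg3d.hdf5", compile=False)
sess = K.get_session()
frozen = graph_util.convert_variables_to_constants(
    sess, sess.graph_def, [out.op.name for out in model.outputs])
tf.io.write_graph(frozen, ".", "faultseg3d_frozen.pb", as_text=False)

# Step 2: convert the frozen graph to IR with the Model Optimizer, e.g.:
#   python mo_tf.py --input_model faultseg3d_frozen.pb \
#                   --input_shape [1,128,128,128,1] --data_type FP32
```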

Next, we performed model calibration with the Intel® Distribution of OpenVINO™ toolkit Post-Training Optimization Tool, using the Default Quantization algorithm [4,5] pipeline to produce a quantized IR in INT8 format. We used the FaultSeg validation dataset [11] as the calibration dataset. See Figure 1 for the end-to-end workflow of INT8 inference using the Intel® Distribution of OpenVINO™ toolkit.
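As a reference point, a Post-Training Optimization Tool configuration for Default Quantization can be written as follows. This is a hedged sketch: the structure mirrors the documented POT JSON configuration, while the model paths and the Accuracy Checker configuration file name are illustrative.

```python
import json

# Illustrative POT configuration (paths and file names are hypothetical).
pot_config = {
    "model": {
        "model_name": "faultseg3d",
        "model": "faultseg3d_fp32.xml",            # FP32 IR from the Model Optimizer
        "weights": "faultseg3d_fp32.bin",
    },
    "engine": {
        "config": "faultseg_accuracy_checker.yml"  # points at the validation dataset [11]
    },
    "compression": {
        "target_device": "CPU",
        "algorithms": [{
            "name": "DefaultQuantization",
            "params": {"preset": "performance", "stat_subset_size": 300},
        }],
    },
}

with open("faultseg_int8.json", "w") as f:
    json.dump(pot_config, f, indent=2)
# Then run the tool:  pot -c faultseg_int8.json
```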


Figure 1: End-to-end workflow showing deep learning performed on a seismic dataset. The trained model must be in one of the OpenVINO™ toolkit supported frameworks and formats. The inference task in this case is fault detection performed on the F3 seismic data.
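Once the quantized IR is produced, running it follows the standard Inference Engine flow. Below is a minimal sketch using the 2020.4-era Python API; the file names and the NCDHW input layout are assumptions for illustration.

```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="faultseg3d_int8.xml", weights="faultseg3d_int8.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

input_name = next(iter(net.input_info))
output_name = next(iter(net.outputs))

# One 128x128x128 seismic cube; NCDHW layout and normalized input assumed.
cube = np.random.randn(1, 1, 128, 128, 128).astype(np.float32)
result = exec_net.infer({input_name: cube})
fault_probability = result[output_name]  # per-voxel fault probability volume
```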

TensorFlow-TensorRT Benchmarking Method

Built on the NVIDIA CUDA 10.0 and cuDNN 7.3.6 base image, the GPU docker container uses TensorFlow-GPU v1.15.3 and TensorFlow with TensorRT v5.1.5 GA to run inference on the FaultSeg model representations. Docker container access to GPUs on internal infrastructure was handled via nvidia-docker installed on the internal server. Steps for benchmarking these models were otherwise consistent with those of the Intel® Distribution of OpenVINO™ toolkit and TensorFlow models.
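For completeness, the sketch below shows what the TF-TRT conversion step looks like in the TensorFlow 1.15 API; the frozen-graph file name and the output node name are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Load the frozen graph produced earlier (file name is illustrative).
with tf.io.gfile.GFile("faultseg3d_frozen.pb", "rb") as f:
    frozen_graph = tf.compat.v1.GraphDef()
    frozen_graph.ParseFromString(f.read())

converter = trt.TrtGraphConverter(
    input_graph_def=frozen_graph,
    nodes_blacklist=["pred/Sigmoid"],  # output node kept in TF; name assumed
    precision_mode="FP32")             # "FP16" and "INT8" are also supported
trt_graph = converter.convert()        # TensorRT-optimized GraphDef for inference
```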

Results

The benchmarks indicated that with INT8 precision, the Intel® Xeon® Gold 6252N processor using Intel® Distribution of OpenVINO™ toolkit 2020.4 produced the best inference performance when compared to TensorFlow on an NVIDIA V100 optimized by TensorRT, as shown in Figure 2. The Intel® Distribution of OpenVINO™ toolkit 2020.4 support for 3D kernels in lower precision resulted in a significant performance boost while maintaining the same accuracy. The graph in Figure 2 shows an over 3x improvement with INT8 precision. Figure 3 shows the layer-wise breakdown of execution times; a significant portion of the computation is in the Convolution 3D layers.


Figure 2: Performance graph (lower is better). The graph shows the latency comparison of TensorRT (on NVIDIA V100) vs OpenVINO (on Intel® Xeon® Gold 6252N). See the bottom of the page for configuration details. For more complete information about performance and benchmark results, visit www.Intel.com/PerformanceIndex.

Figure 3: Normalized layer-wise execution time, comparing FP32 vs. INT8 precision with the Intel® Distribution of OpenVINO™ toolkit on Intel® Xeon® Gold 6252N. See the bottom of the page for configuration details. For more complete information about performance and benchmark results, visit www.Intel.com/PerformanceIndex.
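Layer-wise timings like those in Figure 3 can be collected from the Inference Engine's per-layer performance counters. The sketch below assumes the exec_net, input_name, and cube objects from the earlier snippet; the dictionary keys follow the 2020.4-era Python API documentation.

```python
# Collect per-layer execution times (microseconds) from an inference request.
request = exec_net.requests[0]
request.infer({input_name: cube})
perf = request.get_perf_counts()  # dict: layer name -> timing/status info

conv_time = sum(v["real_time"] for v in perf.values()
                if v["layer_type"] == "Convolution")
total_time = sum(v["real_time"] for v in perf.values())
print(f"Convolution layers: {100.0 * conv_time / max(total_time, 1):.1f}% of time")
```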

Conclusion

As illustrated in Figures 1 and 2 above, the conversion process is straightforward and the performance improvements are significant, without sacrificing accuracy. The combination of 2nd generation Intel® Xeon® Scalable processors (Cascade Lake) and the Intel® Distribution of OpenVINO™ toolkit delivered the best performance for lower precision (INT8) deep learning inference with the FaultSeg model predicting faults in a 3D seismic volume. This reduces the time to detect faults and thereby speeds up oil and gas exploration.

Get the Intel® Distribution of OpenVINO™ toolkit today and start deploying high-performance deep learning applications with write-once, deploy-anywhere efficiency. If you have ideas on ways we can improve the product, we welcome contributions to the open-source OpenVINO™ toolkit. Finally, join the conversation to discuss all things deep learning and the OpenVINO™ toolkit in our community forum.

Join us on September 17th from 11:30 AM to 12:00 PM at the Embedded Vision Summit, where we will present a session on “Acceleration of Deep Learning Using OpenVINO: 3D Seismic Case Study”.

Manas Pathak

References:

1. Lower Numerical Precision Deep Learning Inference and Training: https://software.intel.com/content/www/us/en/develop/articles/lower-numerical-precision-deep-learning-inference-and-training.html

2. Increasing AI Performance and Efficiency with Intel® DL Boost: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/increasing-ai-performance-intel-dlboost.html

3. Introduction to Intel® Deep Learning Boost on Second Generation Intel® Xeon® Scalable Processors: https://software.intel.com/content/www/us/en/develop/articles/introduction-to-intel-deep-learning-boost-on-second-generation-intel-xeon-scalable.html

4. Introducing Int8 Quantization for Fast CPU Inference Using OpenVINO: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/introducing-int8-quantization-for-fast-cpu-inference-using-openvino.html

5. Enhanced low-precision pipeline to accelerate inference: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/open-vino-low-precision-pipeline.html

6. Xinming Wu, Luming Liang, Yunzhi Shi, and Sergey Fomel, (2019), "FaultSeg3D: Using synthetic data sets to train an end-to-end convolutional neural network for 3D seismic fault segmentation," GEOPHYSICS 84: IM35-IM45.

7. F3 Dutch Data Block: https://terranubis.com/datainfo/Netherlands-Offshore-F3-Block-Complete

8. Accelerating fault detection in 3D Seismic data using OpenVINO - Reducing time to the first Oil: https://www.intel.com/content/www/us/en/artificial-intelligence/posts/accelerating-fault-detection-in-3d-seismic-data-using-openvino.html

9. Scaling Edge Inference Deployments on Enterprise IoT Implementations: https://www.wwt.com/api/attachments/5e99b8b600cd970084c45ed2/file

10. FaultSeg Model: https://github.com/xinwucwp/faultSeg/

11. FaultSeg Validation dataset: https://github.com/xinwucwp/faultSeg/tree/master/data/validation/fault

 

Notices & Disclaimers:

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of dates reflected in the configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. 

Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information, see Performance Benchmark Test Disclosure.


Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #2010804

| | Config 1 | Config 2 |
| --- | --- | --- |
| Test by | Intel | Intel |
| Test date | 08/06/2020 | 08/06/2020 |
| Platform | Intel(R) Xeon(R) Gold 6252N CPU @ 2.30GHz | Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz |
| GPU | n/a | NVIDIA V100 |
| # Nodes | 1 | 1 |
| # Sockets | 2 | 2 |
| CPUs (logical) | 96 | 72 |
| Cores/socket, Threads/socket | 24/48 | 18/36 |
| ucode | 0x5002f01 | 0x5002f01 |
| HT | On | On |
| Turbo | On | On |
| BIOS version (including microcode version: cat /proc/cpuinfo \| grep microcode -m1) | 4.1.13, 0x5002f01 | 3.1, 0x5002f01 |
| System DDR Mem Config: slots / cap / run-speed | DDR4: 12 / 16GiB / 2933 MHz | DDR4: 6 / 32GiB / 2666 MHz; DDR4: 8 / 16GiB / 2666 MHz |
| System DCPMM Config: slots / cap / run-speed | n/a | n/a |
| Total Memory/Node (DDR+DCPMM) | 192 GB | 320 GB |
| Total GPU Memory | n/a | 32 GB |
| Storage - application drives | 439.56 GB | 7 TB |
| OS | Ubuntu 18.04.4 LTS | Ubuntu 16.04.6 LTS |
| Kernel | 4.15.0-108-generic | 4.15.0-106-generic |
| Mitigation variants (1,2,3,3a,4,L1TF; https://github.com/speed47/spectre-meltdown-checker) | Mitigated | Mitigated |

 

