murphl15@W10F84D6M3:~/inference_results_v4.0/closed/Intel/code/automation$ python3 run.py -n bert-99 -d ${DATA_DIR}/bert/dataset -m ${DATA_DIR}/bert/model -t ${OUTPUT_DIR} -x ${SUFFIX}
[2024-09-24 19:19:03,388][INFO] run.py:23   - Loading configurations from /home/murphl15/inference_results_v4.0/closed/Intel/code/automation/config.yaml.
[2024-09-24 19:19:03,425][INFO] run.py:141  - Parsing args:
[2024-09-24 19:19:03,426][INFO] run.py:144  - args: Namespace(accuracy_only=False, ci_run=False, compliance_only=False, container_name_suffix='cpu', dataset_dir='/data//bert/dataset', dtype='int8', implementation='pytorch-cpu', model='bert-99', model_dir='/data//bert/model', offline_only=False, output='/data/Intel/', performance_only=False, server_only=False, skip_create_container=False, skip_data_preprocess=False, skip_docker_build=False)
Path mapping:
╒════════════════════╤══════════════════════════════════════════╤═══════════════════════════════════════╕
│                    │ Local                                    │ Container                             │
╞════════════════════╪══════════════════════════════════════════╪═══════════════════════════════════════╡
│ code dir           │ /home/murphl15/inference_results_v4.0/cl │ /opt/workdir/code/bert-99/pytorch-cpu │
│                    │ osed/Intel/code/bert-99/pytorch-cpu      │                                       │
├────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────┤
│ automation kit dir │ /home/murphl15/inference_results_v4.0/cl │ /opt/workdir/code/bert-99/pytorch-    │
│                    │ osed/Intel/code/automation               │ cpu/automation                        │
├────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────┤
│ data dir           │ /data//bert/dataset                      │ /data/mlperf_data/bert/dataset        │
├────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────┤
│ model dir          │ /data//bert/model                        │ /data/mlperf_data/bert/model          │
├────────────────────┼──────────────────────────────────────────┼───────────────────────────────────────┤
│ output dir         │ /data/Intel/                             │ /output                               │
╘════════════════════╧══════════════════════════════════════════╧═══════════════════════════════════════╛
Runtime options:
╒═════════════╤═════════╕
│ Option      │ Value   │
╞═════════════╪═════════╡
│ prepare     │ True    │
├─────────────┼─────────┤
│ performance │ True    │
├─────────────┼─────────┤
│ accuracy    │ True    │
├─────────────┼─────────┤
│ compliance  │ True    │
├─────────────┼─────────┤
│ offline     │ True    │
├─────────────┼─────────┤
│ server      │ True    │
├─────────────┼─────────┤
│ sensors     │ True    │
╘═════════════╧═════════╛
[2024-09-24 19:19:03,427][INFO] run.py:281  - Collecting environment information on baremetal.
[2024-09-24 19:19:03,441][INFO] run.py:212  - Building docker image for bert-99/pytorch-cpu/int8.
[+] Building 6103.2s (28/31)                                                                      docker:default
 => [internal] load build definition from Dockerfile                                                        0.0s
 => => transferring dockerfile: 5.18kB                                                                      0.0s
 => resolve image config for docker-image://docker.io/docker/dockerfile:experimental                        0.6s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:600e5c62eedff338b3f7a0850beb7c05866e0ef27b2d2e8c02aa468e7  0.0s
 => [internal] load build definition from Dockerfile                                                        0.0s
 => [internal] load .dockerignore                                                                           0.0s
 => => transferring context: 2B                                                                             0.0s
 => [internal] load metadata for docker.io/library/rockylinux:8.7                                           0.5s
 => CACHED [dev-base 1/3] FROM docker.io/library/rockylinux:8.7@sha256:68bef3459bbb8c33841575a7f71c4de94718b7bbd103fd0417a537395d40  0.0s
 => [internal] load build context                                                                           0.0s
 => => transferring context: 7.94kB                                                                         0.0s
 => [dev-base 2/3] RUN --mount=type=cache,id=yum-dev,target=/var/cache/yum     DEBIAN_FRONTEND=noninteractive dnf install -y     c  84.0s
 => [dev-base 3/3] RUN echo "alias ll='ls -l'" >> /root/.bashrc                                             0.4s
 => [conda 1/8] RUN wget -O ~/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh  &&     chmod +x  61.0s
 => [conda 2/8] RUN /opt/conda/bin/conda config --add channels https://software.repos.intel.com/python/conda  0.6s
 => [conda 3/8] RUN /opt/conda/bin/conda install -y -c conda-forge ninja==1.10.2 cmake==3.22.2             38.1s
 => [conda 4/8] RUN /opt/conda/bin/conda install -y -c conda-forge llvm-openmp==12.0.1                      8.3s
 => [conda 5/8] RUN /opt/conda/bin/conda install -y -c https://software.repos.intel.com/python/conda mkl==2023.1.0  90.7s
 => [conda 6/8] RUN /opt/conda/bin/conda clean -ya                                                          1.7s
 => [conda 7/8] RUN cd /opt && git clone https://github.com/llvm/llvm-project.git &&     cd llvm-project && git checkout llvmorg-  545.2s
 => [conda 8/8] RUN cd /opt/llvm-project && mkdir build && cd build &&     conda list | grep ninja &&     cmake ../llvm -GNinja  2017.4s
 => [build 1/4] COPY --from=conda /opt/conda /opt/conda                                                    17.8s
 => [build 2/4] WORKDIR /opt/workdir                                                                        0.0s
 => [build 3/4] COPY ./code/bert-99/pytorch-cpu/patches patches                                             0.1s
 => [build 4/4] RUN git clone https://github.com/pytorch/pytorch.git &&     cd pytorch && git checkout v1.12.0 &&     git submod  2936.2s
 => [setup 1/6] COPY --from=build /opt/conda /opt/conda                                                    16.1s
 => [setup 2/6] WORKDIR /opt/workdir                                                                        0.0s
 => [setup 3/6] COPY ./code/bert-99 code/bert-99                                                            0.1s
 => [setup 4/6] COPY ./code/run_clean.sh code/bert-99/pytorch-cpu/run_clean.sh                              0.0s
 => [setup 5/6] COPY ./code/user_config.py code/user_config.py                                              0.0s
 => ERROR [setup 6/6] RUN cd code/bert-99/pytorch-cpu/ &&     if [ -d "inference" ];then rm -rf inference ;fi &&     git clone --  265.3s
------
 > [setup 6/6] RUN cd code/bert-99/pytorch-cpu/ &&     if [ -d "inference" ];then rm -rf inference ;fi &&     git clone --recursive https://github.com/mlcommons/inference.git  &&     cp inference/mlperf.conf . &&     cd mlperf_plugins && if [ -d "onednn" ];then rm -rf onednn ; fi && git clone https://github.com/oneapi-src/oneDNN.git onednn&&     cd onednn && git checkout v2.6 && git apply ../../patches/onednnv2_6.patch &&     cd ../../ && rm -rf /opt/conda/lib/cmake/mkl/* && mkdir build && cd build &&     cmake -DCMAKE_CXX_FLAGS="-march=native" -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DBUILD_TPPS_INTREE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="$(dirname $(python3 -c 'import torch; print(torch.__file__)'));../cmake/Modules" -GNinja -DUSERCP=ON .. &&     ninja && pip install boto3==1.34.35 tokenization==1.0.7:
0.381 Cloning into 'inference'...
132.8 Submodule 'language/bert/DeepLearningExamples' (https://github.com/NVIDIA/DeepLearningExamples.git) registered for path 'language/bert/DeepLearningExamples'
132.8 Cloning into '/opt/workdir/code/bert-99/pytorch-cpu/inference/language/bert/DeepLearningExamples'...
162.7 Submodule path 'language/bert/DeepLearningExamples': checked out 'b03375bd6c2c5233130e61a3be49e26d1a20ac7c'
162.7 Submodule 'PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server' (https://github.com/NVIDIA/tensorrt-inference-server.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'
162.7 Submodule 'PyTorch/Translation/Transformer/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'
162.7 Cloning into '/opt/workdir/code/bert-99/pytorch-cpu/inference/language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'...
173.4 Cloning into '/opt/workdir/code/bert-99/pytorch-cpu/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'...
185.5 Submodule path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server': checked out '71f0771cb8cb2a2eb1c6a9433f9a56dd1f206c96'
185.7 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass': checked out 'ed2ed4d667ce95e1371bd62db32b6a114e774336'
185.7 Submodule 'tools/external/googletest' (https://github.com/google/googletest.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'
185.7 Cloning into '/opt/workdir/code/bert-99/pytorch-cpu/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'...
189.7 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest': checked out '9077ec7efe5b652468ab051e93c67589d5cb8f85'
189.8 Cloning into 'onednn'...
247.3 Note: switching to 'v2.6'.
247.3
247.3 You are in 'detached HEAD' state. You can look around, make experimental
247.3 changes and commit them, and you can discard any commits you make in this
247.3 state without impacting any branches by switching back to a branch.
247.3
247.3 If you want to create a new branch to retain commits you create, you may
247.3 do so (now or later) by using -c with the switch command. Example:
247.3
247.3   git switch -c <new-branch-name>
247.3
247.3 Or undo this operation with:
247.3
247.3   git switch -
247.3
247.3 Turn off this advice by setting config variable advice.detachedHead to false
247.3
247.3 HEAD is now at 52b5f107dd src: cpu: x64: reorder: improve performance for shapes with large spatial
248.1 -- The C compiler identification is Clang 15.0.7
248.1 -- The CXX compiler identification is Clang 15.0.7
248.1 -- Detecting C compiler ABI info
248.2 -- Detecting C compiler ABI info - done
248.2 -- Check for working C compiler: /opt/conda/bin/clang - skipped
248.2 -- Detecting C compile features
248.2 -- Detecting C compile features - done
248.2 -- Detecting CXX compiler ABI info
248.2 -- Detecting CXX compiler ABI info - done
248.2 -- Check for working CXX compiler: /opt/conda/bin/clang++ - skipped
248.2 -- Detecting CXX compile features
248.2 -- Detecting CXX compile features - done
248.2 -- Looking for pthread.h
248.3 -- Looking for pthread.h - found
248.3 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
248.3 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
248.3 -- Looking for pthread_create in pthreads
248.4 -- Looking for pthread_create in pthreads - not found
248.4 -- Looking for pthread_create in pthread
248.4 -- Looking for pthread_create in pthread - found
248.4 -- Found Threads: TRUE
248.4 CMake Warning at /opt/conda/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
248.4   static library kineto_LIBRARY-NOTFOUND not found.
248.4 Call Stack (most recent call first):
248.4   /opt/conda/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
248.4   CMakeLists.txt:9 (find_package)
248.4
248.4
248.4 -- Found Torch: /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch.so
248.7 -- Found OpenMP_C: -fopenmp=libomp (found version "5.0")
248.7 -- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
248.7 -- Found OpenMP: TRUE (found version "5.0")
248.7 -- Using C++ compiler flags: -march=native -O3 -W -Wall
248.7 -- Using C++ standard: 14
248.7 -- Using static linker flags:
248.7 -- Using shared linker flags:
248.7 -- Using output path: /opt/workdir/code/bert-99/pytorch-cpu/build
248.7 mlperf_loadgen v4.1
248.7 -- Found PythonInterp: /opt/conda/bin/python (found version "3.8.19")
248.7 -- Using Python interpreter: /opt/conda/bin/python
249.1 CMake Warning at /opt/conda/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
249.1   static library kineto_LIBRARY-NOTFOUND not found.
249.1 Call Stack (most recent call first):
249.1   /opt/conda/lib/python3.8/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
249.1   mlperf_plugins/CMakeLists.txt:9 (find_package)
249.1
249.1
249.1 -- Found OpenMP_C: -fopenmp=libomp (found version "5.0")
249.1 -- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
249.1 -- DNNL_TARGET_ARCH: X64
249.1 -- DNNL_LIBRARY_NAME: dnnl
249.1 -- Found OpenMP_C: -fopenmp=libomp (found version "5.0")
249.1 -- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
249.1 -- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE)
249.1 -- Found PythonInterp: /opt/conda/bin/python (found suitable version "3.8.19", minimum required is "2.7")
249.1 -- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE)
249.1 -- Found Git: /usr/bin/git (found version "2.43.5")
249.1 -- Enabled workload: TRAINING
249.1 -- Enabled primitives: ALL
249.1 -- Enabled primitive CPU ISA: ALL
249.1 -- Enabled primitive GPU ISA: ALL
249.1 -- Primitive cache is disabled
249.2 -- The ASM compiler identification is Clang with GNU-like command-line
249.2 -- Found assembler: /opt/conda/bin/clang
249.2 Extra Torch C++ flags: -D_GLIBCXX_USE_CXX11_ABI=1
249.2 Extra OpenMP C++ flags: -fopenmp=libomp
249.2 Extra OpenMP C++ includes: /opt/conda/include
249.2 Extra OpenMP C++ libraries: /opt/conda/lib/libiomp5.so
249.2 Torch inlucde directories: /opt/conda/lib/python3.8/site-packages/torch/include;/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include
249.2 Torch linking libraries: torch;torch_library;/opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so
249.2 Extra Torch C++ flags: -D_GLIBCXX_USE_CXX11_ABI=1
249.2 Extra OpenMP C++ flags: -fopenmp=libomp
249.2 Extra OpenMP C++ includes: /opt/conda/include
249.2 Extra OpenMP C++ Libraries: /opt/conda/lib/libiomp5.so
249.2 Torch Inlucde directories: /opt/conda/lib/python3.8/site-packages/torch/include;/opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include
249.2 Torch linking libraries: torch;torch_library;/opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so
249.2 -- Configuring done
249.4 -- Generating done
249.4 -- Build files have been written to: /opt/workdir/code/bert-99/pytorch-cpu/build
249.8 [1/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/amx_init.cpp.o
249.8 [2/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/hw_topology.cpp.o
249.8 [3/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/__/__/version_generated.cc.o
250.0 [4/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/kmp_launcher.cpp.o
250.0 [5/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/bindings/c_api.cc.o
250.4 [6/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/early_stopping.cc.o
250.8 [7/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/cpu.cpp.o
251.0 [8/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/utils.cc.o
251.2 [9/800] Building CXX object inference/loadgen/CMakeFiles/benchmark.dir/benchmark/repro.cpp.o
251.2 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/benchmark/repro.cpp:37:52: warning: unused parameter 'samples' [-Wunused-parameter]
251.2       const std::vector& samples) override {}
251.2                                                    ^
251.2 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/benchmark/repro.cpp:39:52: warning: unused parameter 'samples' [-Wunused-parameter]
251.2       const std::vector& samples) override {}
251.2                                                    ^
251.2 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/benchmark/repro.cpp:55:11: warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
251.2     if (n > mResponses.size()) {
251.2         ~ ^ ~~~~~~~~~~~~~~~~~
251.2 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/benchmark/repro.cpp:125:27: warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
251.2         for (int i = 0; i < actualSize; i++) {
251.2                         ~ ^ ~~~~~~~~~~
251.2 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/benchmark/repro.cpp:171:11: warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [-Wsign-compare]
251.2     if (n > reponses.size()) {
251.2         ~ ^ ~~~~~~~~~~~~~~~
251.2 5 warnings generated.
251.5 [10/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/version.cc.o
253.0 [11/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/tpps/i_softmax_tpp.cpp.o
253.0 FAILED: mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/tpps/i_softmax_tpp.cpp.o
253.0 /opt/conda/bin/clang++ -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dmlperf_plugins_EXPORTS -Dusercp -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/libxsmm/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps -I/opt/workdir/code/bert-99/pytorch-cpu/build/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/src/../include -isystem /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -march=native -O3 -DNDEBUG -fPIC -Wall -isystem /opt/conda/include -Wno-unused-function -march=native -mfma -D_GLIBCXX_USE_CXX11_ABI=1 -fopenmp=libomp -std=gnu++14 -MD -MT mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/tpps/i_softmax_tpp.cpp.o -MF mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/tpps/i_softmax_tpp.cpp.o.d -o mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/tpps/i_softmax_tpp.cpp.o -c /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/i_softmax_tpp.cpp
253.0 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/i_softmax_tpp.cpp:1:
253.0 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/i_softmax_tpp.hpp:5:
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:37:31: error: unknown type name '__m256h'
253.0   static void _mm256_print_ph(__m256h a) {
253.0                               ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:44:31: error: unknown type name '__m512h'
253.0   static void _mm512_print_ph(__m512h a) {
253.0                               ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:48:35: error: use of undeclared identifier '_mm256_loadu_ph'
253.0     auto f_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)mem));
253.0                                   ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:49:35: error: use of undeclared identifier '_mm256_loadu_ph'
253.0     auto s_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)&mem[16]));
253.0                                   ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:15: error: unknown type name '__m512h'
253.0 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
253.0               ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:44: error: unknown type name '__m512h'
253.0 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
253.0                                            ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:79:12: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto a = _mm512_set1_ph(-0.2888f);
253.0            ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:80:12: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto b = _mm512_set1_ph(1.0217744f);
253.0            ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:81:12: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto c = _mm512_set1_ph(0.0962405432f);
253.0            ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:83:13: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto nb = _mm512_set1_ph(1.769f);
253.0             ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:15: error: unknown type name '__m512h'
253.0 static inline __m512h _mm512_gelu_ph(__m512h x) {
253.0               ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:38: error: unknown type name '__m512h'
253.0 static inline __m512h _mm512_gelu_ph(__m512h x) {
253.0                                      ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:105:18: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto rsqrt_2 = _mm512_set1_ph(0.70710678);
253.0                  ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:106:48: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto y = _mm512_mlperf_erf_ph(x * rsqrt_2) + _mm512_set1_ph(1);
253.0                                                ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:108:14: error: use of undeclared identifier '_mm512_set1_ph'
253.0   return x * _mm512_set1_ph(0.5f) * y;
253.0              ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:54: error: unknown type name '__m512h'
253.0 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
253.0                                                      ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:65: error: unknown type name '__m512h'
253.0 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
253.0                                                                 ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:77: error: unknown type name '__m512h'
253.0 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
253.0                                                                             ^
253.0 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:125:14: error: use of undeclared identifier '_mm512_set1_ph'
253.0   auto max = _mm512_set1_ph(127.f);
253.0              ^
253.0 fatal error: too many errors emitted, stopping now [-ferror-limit=]
253.0 20 errors generated.
253.9 [12/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/results.cc.o
253.9 [13/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/issue_query_controller.cc.o
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:480:20: warning: lambda capture 'thread_idx' is not used [-Wunused-lambda-capture]
253.9         LogDetail([thread_idx](AsyncDetail& detail) {
253.9                    ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:247:7: note: in instantiation of function template specialization 'mlperf::loadgen::IssueQueryController::IssueQueriesInternal' requested here
253.9       IssueQueriesInternal(num_threads, thread_idx);
253.9       ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:480:20: warning: lambda capture 'thread_idx' is not used [-Wunused-lambda-capture]
253.9         LogDetail([thread_idx](AsyncDetail& detail) {
253.9                    ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:300:5: note: in instantiation of function template specialization 'mlperf::loadgen::IssueQueryController::IssueQueriesInternal' requested here
253.9     IssueQueriesInternal(1, 0);
253.9     ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:480:20: warning: lambda capture 'thread_idx' is not used [-Wunused-lambda-capture]
253.9         LogDetail([thread_idx](AsyncDetail& detail) {
253.9                    ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:300:5: note: in instantiation of function template specialization 'mlperf::loadgen::IssueQueryController::IssueQueriesInternal' requested here
253.9     IssueQueriesInternal(1, 0);
253.9     ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:480:20: warning: lambda capture 'thread_idx' is not used [-Wunused-lambda-capture]
253.9         LogDetail([thread_idx](AsyncDetail& detail) {
253.9                    ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:300:5: note: in instantiation of function template specialization 'mlperf::loadgen::IssueQueryController::IssueQueriesInternal' requested here
253.9     IssueQueriesInternal(1, 0);
253.9     ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:480:20: warning: lambda capture 'thread_idx' is not used [-Wunused-lambda-capture]
253.9         LogDetail([thread_idx](AsyncDetail& detail) {
253.9                    ^
253.9 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/issue_query_controller.cc:300:5: note: in instantiation of function template specialization 'mlperf::loadgen::IssueQueryController::IssueQueriesInternal' requested here
253.9     IssueQueriesInternal(1, 0);
253.9     ^
253.9 5 warnings generated.
254.0 [14/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/test_settings_internal.cc.o
254.7 [15/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/logging.cc.o
254.7 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/logging.cc:483:61: warning: unused parameter 'completion_time' [-Wunused-parameter]
254.7                                       PerfClock::time_point completion_time,
254.7                                                             ^
254.7 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/logging.cc:601:68: warning: unused parameter 'expected_count' [-Wunused-parameter]
254.7 std::vector AsyncLog::GetTokenLatencies(size_t expected_count) {
254.7                                                                    ^
254.7 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/logging.cc:607:72: warning: unused parameter 'expected_count' [-Wunused-parameter]
254.7 std::vector AsyncLog::GetTimePerOutputToken(size_t expected_count){
254.7                                                                        ^
254.7 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/logging.cc:613:58: warning: unused parameter 'expected_count' [-Wunused-parameter]
254.7 std::vector AsyncLog::GetTokensPerSample(size_t expected_count) {
254.7                                                          ^
254.7 4 warnings generated.
255.4 [16/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/softmax.cpp.o
255.4 FAILED: mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/softmax.cpp.o
255.4 /opt/conda/bin/clang++ -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dmlperf_plugins_EXPORTS -Dusercp -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/libxsmm/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps -I/opt/workdir/code/bert-99/pytorch-cpu/build/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/src/../include -isystem /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -march=native -O3 -DNDEBUG -fPIC -Wall -isystem /opt/conda/include -Wno-unused-function -march=native -mfma -D_GLIBCXX_USE_CXX11_ABI=1 -fopenmp=libomp -std=gnu++14 -MD -MT mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/softmax.cpp.o -MF mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/softmax.cpp.o.d -o mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/softmax.cpp.o -c /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/softmax.cpp
255.4 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/softmax.cpp:5:
255.4 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/i_softmax_tpp.hpp:5:
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:37:31: error: unknown type name '__m256h'
255.4   static void _mm256_print_ph(__m256h a) {
255.4                               ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:44:31: error: unknown type name '__m512h'
255.4   static void _mm512_print_ph(__m512h a) {
255.4                               ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:48:35: error: use of undeclared identifier '_mm256_loadu_ph'
255.4     auto f_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)mem));
255.4                                   ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:49:35: error: use of undeclared identifier '_mm256_loadu_ph'
255.4     auto s_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)&mem[16]));
255.4                                   ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:15: error: unknown type name '__m512h'
255.4 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
255.4               ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:44: error: unknown type name '__m512h'
255.4 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
255.4                                            ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:79:12: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto a = _mm512_set1_ph(-0.2888f);
255.4            ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:80:12: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto b = _mm512_set1_ph(1.0217744f);
255.4            ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:81:12: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto c = _mm512_set1_ph(0.0962405432f);
255.4            ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:83:13: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto nb = _mm512_set1_ph(1.769f);
255.4             ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:15: error: unknown type name '__m512h'
255.4 static inline __m512h _mm512_gelu_ph(__m512h x) {
255.4               ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:38: error: unknown type name '__m512h'
255.4 static inline __m512h _mm512_gelu_ph(__m512h x) {
255.4                                      ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:105:18: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto rsqrt_2 = _mm512_set1_ph(0.70710678);
255.4                  ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:106:48: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto y = _mm512_mlperf_erf_ph(x * rsqrt_2) + _mm512_set1_ph(1);
255.4                                                ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:108:14: error: use of undeclared identifier '_mm512_set1_ph'
255.4   return x * _mm512_set1_ph(0.5f) * y;
255.4              ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:54: error: unknown type name '__m512h'
255.4 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
255.4                                                      ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:65: error: unknown type name '__m512h'
255.4 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
255.4                                                                 ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:77: error: unknown type name '__m512h'
255.4 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
255.4                                                                             ^
255.4 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:125:14: error: use of undeclared identifier '_mm512_set1_ph'
255.4   auto max = _mm512_set1_ph(127.f);
255.4              ^
255.4 fatal error: too many errors emitted, stopping now [-ferror-limit=]
255.4 20 errors generated.
256.3 [17/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/activation.cpp.o
259.6 [18/800] Building CXX object inference/loadgen/CMakeFiles/mlperf_loadgen.dir/loadgen.cc.o
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:137:47: warning: unused parameter 'response_cb' [-Wunused-parameter]
259.6                       const ResponseCallback& response_cb) override {
259.6                                               ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:768:77: warning: missing field 'first_token_latency_min' initializer [-Wmissing-field-initializers]
259.6   PerformanceSummary u_perf_summary{sut->Name(), u_settings, std::move(u_pr)};
259.6                                                                             ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:820:77: warning: missing field 'first_token_latency_min' initializer [-Wmissing-field-initializers]
259.6   PerformanceSummary m_perf_summary{sut->Name(), m_settings, std::move(m_pr)};
259.6                                                                             ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:918:71: warning: missing field 'first_token_latency_min' initializer [-Wmissing-field-initializers]
259.6   PerformanceSummary
perf_summary{sut->Name(), settings, std::move(pr)};
259.6                                                                       ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:989:58: warning: missing field 'first_token_latency_min' initializer [-Wmissing-field-initializers]
259.6                                        std::move(base_pr)};
259.6                                                          ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:1011:68: warning: missing field 'first_token_latency_min' initializer [-Wmissing-field-initializers]
259.6                                     std::move(base_perf_summary.pr)};
259.6                                                                    ^
259.6 /opt/workdir/code/bert-99/pytorch-cpu/inference/loadgen/loadgen.cc:1195:14: warning: lambda capture 'sut' is not used [-Wunused-lambda-capture]
259.6   LogDetail([sut, qsl, test_date_time, &sut_name,
259.6              ^~~~
259.6 27 warnings generated.
259.7 [19/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/bert_model.cpp.o
260.8 [20/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/bert_qsl.cpp.o
261.8 [21/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/torch_sut.cpp.o
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:42:74: warning: field 'nInstances_' will be initialized after field 'warmUp_' [-Wreorder-ctor]
261.8   upper_watermark_(upper_watermark), nProcsPerInstance_(intra_parallel), nInstances_(inter_parallel),
261.8   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
261.8   watermark_(watermark)              upper_watermark_(upper_watermark)   nProcsPerInstance_(intra_parallel)
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:46:8: warning: unused variable 'amx_status' [-Wunused-variable]
261.8   auto amx_status = amx_init::amx_init();
261.8        ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:126:15: warning: variable 'sample_count' set but not used [-Wunused-but-set-variable]
261.8           int sample_count = 0;
261.8               ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:127:15: warning: unused variable 'qos_count' [-Wunused-variable]
261.8           int qos_count = 0;
261.8               ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:256:39: warning: field 'nInstances_'
will be initialized after field 'warmUp_' [-Wreorder-ctor]
261.8   nProcsPerInstance_(intra_parallel), nInstances_(inter_parallel), warmUp_(warmup),
261.8   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^~~~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~
261.8   watermark_(watermark)               nProcsPerInstance_(intra_parallel) nInstances_(inter_parallel)
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:259:8: warning: unused variable 'amx_status' [-Wunused-variable]
261.8   auto amx_status = amx_init::amx_init();
261.8        ^
261.8 In file included from /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.cpp:18:
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:100:10: warning: private field 'mThreshold_' is not used [-Wunused-private-field]
261.8   size_t mThreshold_;
261.8          ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:103:10: warning: private field 'slength_' is not used [-Wunused-private-field]
261.8   size_t slength_ {384};
261.8          ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:105:10: warning: private field 'qos_pointer' is not used [-Wunused-private-field]
261.8   size_t qos_pointer {0};
261.8          ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:110:8: warning: private field 'mHt_' is not used [-Wunused-private-field]
261.8   bool mHt_;
261.8        ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:183:10: warning: private field 'mThreshold_' is not used [-Wunused-private-field]
261.8   size_t mThreshold_;
261.8          ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:185:10: warning: private field 'slength_' is not used [-Wunused-private-field]
261.8   size_t slength_ {385};
261.8          ^
261.8 /opt/workdir/code/bert-99/pytorch-cpu/csrc/torch_sut.hpp:191:8: warning: private field 'mHt_' is not used [-Wunused-private-field]
261.8   bool mHt_;
261.8        ^
261.8 13 warnings generated.
262.1 [22/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_linear.cpp.o
262.1 FAILED: mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_linear.cpp.o
262.1 /opt/conda/bin/clang++ -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dmlperf_plugins_EXPORTS -Dusercp -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/libxsmm/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps -I/opt/workdir/code/bert-99/pytorch-cpu/build/mlperf_plugins/onednn/include -I/opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/onednn/src/../include -isystem /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -march=native -O3 -DNDEBUG -fPIC -Wall -isystem /opt/conda/include -Wno-unused-function -march=native -mfma -D_GLIBCXX_USE_CXX11_ABI=1 -fopenmp=libomp -std=gnu++14 -MD -MT mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_linear.cpp.o -MF mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_linear.cpp.o.d -o mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_linear.cpp.o -c /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/amx_linear.cpp
262.1 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/amx_linear.cpp:10:
262.1 In file included from /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/i_linear_tpp.hpp:10:
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:37:31: error: unknown type name '__m256h'
262.1   static void _mm256_print_ph(__m256h a) {
262.1                               ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:44:31: error: unknown type name '__m512h'
262.1   static void _mm512_print_ph(__m512h a) {
262.1                               ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:48:35: error: use of undeclared identifier '_mm256_loadu_ph'
262.1     auto f_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)mem));
262.1                                   ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:49:35: error: use of undeclared identifier '_mm256_loadu_ph'
262.1     auto s_half = _mm512_cvtph_ps(_mm256_loadu_ph((void*)&mem[16]));
262.1                                   ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:15: error: unknown type name '__m512h'
262.1 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
262.1               ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:78:44: error: unknown type name '__m512h'
262.1 static inline __m512h _mm512_mlperf_erf_ph(__m512h x) {
262.1                                            ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:79:12: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto a = _mm512_set1_ph(-0.2888f);
262.1            ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:80:12: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto b = _mm512_set1_ph(1.0217744f);
262.1            ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:81:12: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto c = _mm512_set1_ph(0.0962405432f);
262.1            ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:83:13: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto nb = _mm512_set1_ph(1.769f);
262.1             ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:15: error: unknown type name '__m512h'
262.1 static inline __m512h _mm512_gelu_ph(__m512h x) {
262.1               ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:104:38: error: unknown type name '__m512h'
262.1 static inline __m512h _mm512_gelu_ph(__m512h x) {
262.1                                      ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:105:18: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto rsqrt_2 = _mm512_set1_ph(0.70710678);
262.1                  ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:106:48: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto y = _mm512_mlperf_erf_ph(x * rsqrt_2) + _mm512_set1_ph(1);
262.1                                                ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:108:14: error: use of undeclared identifier '_mm512_set1_ph'
262.1   return x * _mm512_set1_ph(0.5f) * y;
262.1              ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:54: error: unknown type name '__m512h'
262.1 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
262.1                                                      ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:65: error: unknown type name '__m512h'
262.1 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
262.1                                                                 ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:124:77: error: unknown type name '__m512h'
262.1 static inline __m512i _mm512_scale_minmax_gelu_i8_ph(__m512h x, __m512h vS, __m512h vS2) {
262.1                                                                             ^
262.1 /opt/workdir/code/bert-99/pytorch-cpu/mlperf_plugins/csrc/tpps/el_common_intrin.hpp:125:14: error: use of undeclared identifier '_mm512_set1_ph'
262.1   auto max = _mm512_set1_ph(127.f);
262.1              ^
262.1 fatal error: too many errors emitted, stopping now [-ferror-limit=]
262.1 20 errors generated.
263.5 [23/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_mha.cpp.o
264.0 [24/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/amx_mha_concat.cpp.o
264.0 [25/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/init.cpp.o
264.4 [26/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/normalization.cpp.o
265.0 [27/800] Building CXX object mlperf_plugins/CMakeFiles/mlperf_plugins.dir/csrc/linear.cpp.o
265.2 [28/800] Building CXX object CMakeFiles/bert_inference.dir/csrc/main.cpp.o
265.2 ninja: build stopped: subcommand failed.
------
Dockerfile:98
--------------------
  97 |     ENV CONDA_PREFIX "/opt/conda"
  98 | >>> RUN cd code/${BENCHMARK}/${IMPL}/ && \
  99 | >>>     if [ -d "inference" ];then rm -rf inference ;fi && \
 100 | >>>     git clone --recursive https://github.com/mlcommons/inference.git && \
 101 | >>>     cp inference/mlperf.conf . && \
 102 | >>>     cd mlperf_plugins && if [ -d "onednn" ];then rm -rf onednn ; fi && git clone https://github.com/oneapi-src/oneDNN.git onednn&& \
 103 | >>>     cd onednn && git checkout ${ONEDNN_VERSION} && git apply ../../patches/onednnv2_6.patch && \
 104 | >>>     cd ../../ && rm -rf /opt/conda/lib/cmake/mkl/* && mkdir build && cd build && \
 105 | >>>     cmake -DCMAKE_CXX_FLAGS="-march=native" -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DBUILD_TPPS_INTREE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="$(dirname $(python3 -c 'import torch; print(torch.__file__)'));../cmake/Modules" -GNinja -DUSERCP=ON .. && \
 106 | >>>     ninja && pip install boto3==1.34.35 tokenization==1.0.7
 107 |
--------------------
ERROR: failed to solve: process "/bin/sh -c cd code/${BENCHMARK}/${IMPL}/ &&     if [ -d \"inference\" ];then rm -rf inference ;fi &&     git clone --recursive https://github.com/mlcommons/inference.git &&     cp inference/mlperf.conf . &&     cd mlperf_plugins && if [ -d \"onednn\" ];then rm -rf onednn ; fi && git clone https://github.com/oneapi-src/oneDNN.git onednn&&     cd onednn && git checkout ${ONEDNN_VERSION} && git apply ../../patches/onednnv2_6.patch &&     cd ../../ && rm -rf /opt/conda/lib/cmake/mkl/* && mkdir build && cd build &&     cmake -DCMAKE_CXX_FLAGS=\"-march=native\" -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DBUILD_TPPS_INTREE=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=\"$(dirname $(python3 -c 'import torch; print(torch.__file__)'));../cmake/Modules\" -GNinja -DUSERCP=ON .. &&     ninja && pip install boto3==1.34.35 tokenization==1.0.7" did not complete successfully: exit code: 1