Intel® Distribution of OpenVINO™ Toolkit

SSD_Detector Training Hangs

goh__richard
Beginner

Hi,

SSD_Detector training was running fine before, but it now hangs at the point shown below (console output attached).

How do I find out which version/branch of openvino_training_extension I am running? Thanks.

 

== hangs here ===

loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
Load images in the cache: ENCODED
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
100%|██████████| 2727/2727 [00:01<00:00, 1985.76images/s, cache usage (GB)=1.17]
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
2021-08-31 07:53:08.331873: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-08-31 07:53:08.441285: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-31 07:53:08.441964: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x4fb7c00 executing computations on platform CUDA. Devices:
2021-08-31 07:53:08.441977: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2021-08-31 07:53:08.448906: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2021-08-31 07:53:08.449251: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x3b7d450 executing computations on platform Host. Devices:
2021-08-31 07:53:08.449261: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2021-08-31 07:53:08.449385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635
pciBusID: 0000:01:00.0
totalMemory: 10.76GiB freeMemory: 10.61GiB
2021-08-31 07:53:08.449395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2021-08-31 07:53:08.450032: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-31 07:53:08.450043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2021-08-31 07:53:08.450046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2021-08-31 07:53:08.450084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8815 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)

=====================================

# nvidia-smi
Tue Aug 31 07:53:41 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:01:00.0 Off |                  N/A |
| 25%   46C    P2    53W / 260W |   2650MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2941      C   python3                          2647MiB |
+-----------------------------------------------------------------------------+
#

 

 

================

# glxgears
28604 frames in 5.0 seconds = 5720.583 FPS
28682 frames in 5.0 seconds = 5734.720 FPS
28618 frames in 5.0 seconds = 5722.166 FPS
28741 frames in 5.0 seconds = 5746.703 FPS
28782 frames in 5.0 seconds = 5756.205 FPS
28761 frames in 5.0 seconds = 5750.519 FPS
28782 frames in 5.0 seconds = 5754.186 FPS
28781 frames in 5.0 seconds = 5752.511 FPS
28782 frames in 5.0 seconds = 5755.321 FPS
28782 frames in 5.0 seconds = 5753.256 FPS
28772 frames in 5.0 seconds = 5754.400 FPS
28771 frames in 5.0 seconds = 5753.643 FPS
28788 frames in 5.0 seconds = 5756.774 FPS
28925 frames in 5.0 seconds = 5782.822 FPS
28925 frames in 5.0 seconds = 5782.545 FPS
28905 frames in 5.0 seconds = 5780.795 FPS
28925 frames in 5.0 seconds = 5783.528 FPS
28782 frames in 5.0 seconds = 5754.665 FPS
28925 frames in 5.0 seconds = 5782.157 FPS
28926 frames in 5.0 seconds = 5784.623 FPS
28925 frames in 5.0 seconds = 5781.641 FPS
28886 frames in 5.0 seconds = 5774.703 FPS
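
The nvidia-smi output above shows python3 holding 2647MiB of GPU memory at 0% utilization, which suggests the process is alive but waiting on something. One way to see where it is waiting is Python's standard `faulthandler` module. The following is only a minimal sketch: the function names `enable_hang_diagnostics` and `disable_hang_diagnostics` and the 600-second default are assumptions for illustration, not part of OpenVINO™ Training Extensions, and the call would have to be added by hand near the top of the training script.

```python
import faulthandler
import signal
import sys


def enable_hang_diagnostics(timeout_s: float = 600.0, log_file=sys.stderr) -> None:
    """Make a stuck Python process report where each thread is waiting.

    Hypothetical helper; `timeout_s` is an arbitrary default.
    """
    # Dump the stack of every thread each `timeout_s` seconds until cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    # On Unix, also allow an on-demand dump with: kill -USR1 <pid>
    if hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)


def disable_hang_diagnostics() -> None:
    """Stop the periodic dumps and remove the signal handler."""
    faulthandler.cancel_dump_traceback_later()
    if hasattr(signal, "SIGUSR1"):
        faulthandler.unregister(signal.SIGUSR1)
```

If the dumps keep showing the same frame (for example a data-loading queue or a session call), that frame is a reasonable place to start looking.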

Wan_Intel
Moderator

Hi Goh__Richard,

Thanks for reaching out to us.


For your information, SSD_Detector from OpenVINO™ Training Extensions has been deprecated. You may refer to here for more information.


However, you may use the develop branch of OpenVINO™ Training Extensions to train Deep Learning models and convert them using OpenVINO™ toolkit for optimized inference.


Steps to set up OpenVINO™ Training Extensions are available on the following page:

https://github.com/openvinotoolkit/training_extensions/#setup-openvino-training-extensions
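
As for checking which branch or commit a local copy is on, a minimal sketch follows. It assumes the toolkit was obtained with `git clone` and that `git` is on the PATH. The helper names `git_output` and `describe_checkout` and the example path are hypothetical; only the git subcommands themselves are standard.

```python
import subprocess
from pathlib import Path


def git_output(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def describe_checkout(repo: Path) -> dict:
    """Collect the branch, full commit hash and nearest tag of a local clone."""
    return {
        # Yields "HEAD" when the clone is in detached-HEAD state.
        "branch": git_output(repo, "rev-parse", "--abbrev-ref", "HEAD"),
        "commit": git_output(repo, "rev-parse", "HEAD"),
        # --always falls back to the abbreviated hash when no tag is reachable.
        "describe": git_output(repo, "describe", "--tags", "--always"),
    }


# Example usage (hypothetical path; point it at the actual clone):
# for key, value in describe_checkout(Path("./training_extensions")).items():
#     print(f"{key}: {value}")
```

The same information is available directly from the shell by running `git rev-parse --abbrev-ref HEAD` and `git describe --tags --always` inside the clone.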


Regards,

Wan


Wan_Intel
Moderator

Hi Goh__Richard,

 

This thread will no longer be monitored since we have provided a suggestion.

If you need any additional information from Intel, please submit a new question.

 

Regards,

Wan

