Hi, I am using the _object_detection_demo_yolov3_async.py_ demo provided in OpenVINO 2020.1.023. I am having trouble understanding how the results are post-processed. The outputs initially obtained are:
```
#Layer | Feature map shape
#detector/yolo-v3-tiny/Conv_12/BiasAdd/YoloRegion | (1, 255, 26, 26)
#detector/yolo-v3-tiny/Conv_9/BiasAdd/YoloRegion | (1, 255, 13, 13)
```
These results are then flattened, and the confidence and coordinates are extracted in a way I cannot follow, using the call _obj_index = entry_index(params.side, params.coords, params.classes, n * side_square + i, params.coords)_
where the function is defined as:
```
def entry_index(side, coord, classes, location, entry):
    side_power_2 = side ** 2
    n = location // side_power_2
    loc = location % side_power_2
    return int(side_power_2 * (n * (coord + classes + 1) + entry) + loc)
```
Please help me understand the process. Why are the results extracted this way from the flattened blob?
Hi, thank you for the feedback. We will simplify the output postprocessing.
Back to your question: this is mostly legacy code that was written for compatibility with the V2 version. You do not need to flatten the output in your app; you can use array slices to access a given box, e.g.:
def get_box(blob, i, j, n): return blob[0, n*85:(n+1)*85, i, j]
Or you can wait a few days while we fix it here: Open Model Zoo
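To make the link between the flattened lookup and the slice explicit, here is a minimal sketch in plain Python. It assumes the blob is stored in row-major (C) order with shape (1, num * (coords + classes + 1), side, side), and that the cell index is row * side + col. The helper `flat_offset` and the sizes 13, 4, 80 and 3 are illustrative choices, not part of the demo. Under those assumptions, `entry_index` simply computes the flat offset of element `[0, n * (coords + classes + 1) + entry, row, col]`, so calling it with entry = coords picks the objectness score of box n.

```python
def entry_index(side, coord, classes, location, entry):
    # Same arithmetic as the helper quoted in the question.
    side_power_2 = side ** 2
    n = location // side_power_2
    loc = location % side_power_2
    return int(side_power_2 * (n * (coord + classes + 1) + entry) + loc)


def flat_offset(side, coord, classes, n, entry, row, col):
    # Hypothetical helper: offset of element [0, channel, row, col] in a
    # row-major blob of shape (1, num * (coord + classes + 1), side, side).
    channel = n * (coord + classes + 1) + entry
    return channel * side * side + row * side + col


# Illustrative sizes: 13x13 grid, 4 coordinates, 80 classes, 3 boxes per cell.
side, coord, classes, num = 13, 4, 80, 3

mismatches = 0
for n in range(num):
    for entry in range(coord + classes + 1):
        for row in range(side):
            for col in range(side):
                # "location" is built the same way as n * side_square + i.
                location = n * side * side + row * side + col
                a = entry_index(side, coord, classes, location, entry)
                b = flat_offset(side, coord, classes, n, entry, row, col)
                if a != b:
                    mismatches += 1

print(mismatches)
```

If the two functions agree for every combination, indexing the 4D blob directly gives the same element as indexing the flattened copy, which is why the flattening step is not needed.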
Thanks for reaching out. Can you explain how i, j and n are determined? I mean, where do they come from?
Sorry for the delay.
The YOLO output (in IR) can be described as a 3D tensor with shape [Cy, Cx, N*B] (with the batch size set to 1 for simplicity), where Cx and Cy give the grid size. For each cell the network predicts N bounding boxes (B), each containing coordinates, probabilities, etc. Here i and j are the indices of the cell, and n is the bounding box number within that cell.
You could also look at the model's description and wait for the PR with simplified postprocessing.
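As a rough illustration of where i, j and n come from, the sketch below loops over every cell and every box of one output blob and reads each box with the slice shown earlier in this thread. It assumes the channel-first layout of the blobs listed in the question, (1, 255, 13, 13), with 255 = 3 boxes of 85 values, and that each box holds 4 coordinates, then 1 objectness score, then 80 class scores. The function `candidate_boxes`, the threshold and the synthetic blob are made up for the example. Decoding the coordinates with anchors is left out.

```python
import numpy as np

NUM_BOXES = 3   # boxes per cell, assuming 255 = 3 * 85
BOX_SIZE = 85   # assumed layout: 4 coordinates + 1 objectness + 80 classes


def get_box(blob, i, j, n):
    # All values of box n in the cell at row i, column j.
    return blob[0, n * BOX_SIZE:(n + 1) * BOX_SIZE, i, j]


def candidate_boxes(blob, threshold=0.5):
    # Hypothetical helper: list (i, j, n, class_id, objectness) for every
    # box whose objectness reaches the (illustrative) threshold.
    _, _, rows, cols = blob.shape
    found = []
    for i in range(rows):               # cell row
        for j in range(cols):           # cell column
            for n in range(NUM_BOXES):  # box number within the cell
                box = get_box(blob, i, j, n)
                objectness = float(box[4])
                if objectness >= threshold:
                    class_id = int(np.argmax(box[5:]))
                    found.append((i, j, n, class_id, objectness))
    return found


# Synthetic blob with a single planted detection, for demonstration only.
blob = np.zeros((1, NUM_BOXES * BOX_SIZE, 13, 13), dtype=np.float32)
blob[0, 1 * BOX_SIZE + 4, 6, 7] = 0.9        # objectness of box 1 in cell (6, 7)
blob[0, 1 * BOX_SIZE + 5 + 16, 6, 7] = 0.8   # score of class 16 for that box

result = candidate_boxes(blob)
print(result)
```

So i and j simply range over the grid (13x13 or 26x26 for the two outputs above), and n ranges over the boxes predicted per cell; none of them is read from the network output itself.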