Retraining a model with OpenVINO Training Extensions

bartlino · 11-14-2022

Can someone give me a tip on how to retrain a model with the OpenVINO Training Extensions?

I opened a GitHub issue but haven't had much of a response, maybe because my question sounds too much like a beginner's.

https://github.com/openvinotoolkit/training_extensions/issues/1336

I have a dataset and did my image annotation with labelImg. I'm just curious whether someone can give me a tip or respond to the GitHub issue.

Are there any Google Colab or IPython notebooks that outline this process?

Thanks

Iffa_Intel · ‎11-15-2022

Hi,

The best way to retrain a model from an OpenVINO perspective is to use OpenVINO™ Training Extensions (OTE).

OTE also allows you to export and convert the models to the needed format.

You may refer to:

OpenVINO™ Training Extensions official documentation
OpenVINO Training Extensions Video tutorial

Sincerely,

Iffa

bartlino · ‎11-16-2022

Thank you! The YouTube video helps....

One thing that seems somewhat daunting is that my annotations are individual XML files, one per picture, created by the tool I use, whereas the OpenVINO format looks like a single JSON file that contains everything.

Would I need to convert ALL my XML files into a JSON format?

For example this is one XML file for one picture in my custom dataset:

<annotation>
	<folder>pexel-stillimages</folder>
	<filename>pexels-abhishek-gaurav-829552.jpg</filename>
	<path>C:\labelImg\pexel-stillimages\pexels-abhishek-gaurav-829552.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>4083</width>
		<height>2758</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>person</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>315</xmin>
			<ymin>156</ymin>
			<xmax>3493</xmax>
			<ymax>2693</ymax>
		</bndbox>
	</object>
</annotation>

And the JSON file in the OpenVINO Training Extensions example, annotations_train.json, looks something like this (trimmed-down snippet):

{
	"annotations": [{
			"id": 25,
			"image_id": 0,
			"category_id": 1,
			"segmentation": null,
			"area": 26642.0,
			"bbox": [2310.0, 1920.0, 173.0, 154.0],
			"iscrowd": 0,
			"is_occluded": true,
			"attributes": {}
		},
		{
			"id": 26,
			"image_id": 0,
			"category_id": 1,
			"segmentation": null,
			"area": 16092.0,
			"bbox": [2231.0, 1896.0, 149.0, 108.0],
			"iscrowd": 0,
			"is_occluded": true,
			"attributes": {}
		}
	],
	"images": [{
			"id": 0,
			"width": 4160,
			"height": 3120,
			"file_name": "image_000200.jpg",
			"license": null,
			"flickr_url": null,
			"coco_url": null,
			"date_captured": null,
			"image": "image_000200.jpg",
			"dataset": "Mapillary_Vistas"
		},
		{
			"id": 1,
			"width": 3984,
			"height": 2988,
			"file_name": "image_000500.jpg",
			"license": null,
			"flickr_url": null,
			"coco_url": null,
			"date_captured": null,
			"image": "image_000500.jpg",
			"dataset": "Mapillary_Vistas"
		}
	],
	"categories": [{
			"id": 0,
			"name": "bg",
			"supercategory": ""
		},
		{
			"id": 1,
			"name": "vehicle",
			"supercategory": ""
		}
	]
}

It looks like there are keys for annotations, images, and categories. Would I need to write a script to convert my XML-based annotations to this format, or is there some tool made by Intel that could do the XML-to-JSON conversion for me? Or, if there is a specific image annotation tool that outputs this format directly, it wouldn't be very hard to redo all my annotations.

Thanks for any advice.

Iffa_Intel · ‎11-16-2022

Generally, many dataset annotations are in JSON format (e.g. the COCO dataset).

The COCO dataset (once downloaded) has two directories:

an annotations folder that contains the JSON annotation files, and
a val folder that contains the images in JPG format.
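
A quick way to see whether an annotation file follows this COCO-style layout is to check its three top-level keys and the ids that link them. The sketch below assumes the structure of the annotations_train.json snippet posted earlier in this thread, with bbox given as [x, y, width, height]. check_coco_annotations is a hypothetical helper name, and whether OTE needs additional fields is not verified here.

```python
REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco_annotations(coco):
    """Return a list of problems found in a COCO-style annotation dict."""
    problems = []
    for key in REQUIRED_KEYS:
        if not isinstance(coco.get(key), list):
            problems.append(f"missing or non-list top-level key: {key}")
    if problems:
        return problems

    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}

    for ann in coco["annotations"]:
        label = f"annotation {ann.get('id')}"
        # Every annotation must point at an existing image and category.
        if ann.get("image_id") not in image_ids:
            problems.append(f"{label}: unknown image_id {ann.get('image_id')}")
        if ann.get("category_id") not in category_ids:
            problems.append(f"{label}: unknown category_id {ann.get('category_id')}")
        # bbox is [x, y, width, height], so width and height must be positive.
        bbox = ann.get("bbox") or []
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"{label}: bbox should be [x, y, width, height]")
    return problems
```

An empty list means the basic cross-references are consistent. To check a real file, load it with json.load and pass the resulting dict to the function.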

OpenVINO also uses XML files, but at a different stage.

Model Optimizer converts a trained model (TensorFlow, ONNX, Caffe, etc.) into an Intermediate Representation (IR) that consists of two crucial files, an XML file and a BIN file. This IR allows you to run inference on your model with OpenVINO.
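
The XML that labelImg writes and the XML that Model Optimizer writes are unrelated formats, and they can be told apart by their root element. The following is a minimal sketch, assuming labelImg's default Pascal VOC output (root element <annotation>, as in the file posted above) and an IR file whose root element is <net>. describe_xml is a hypothetical helper name, not part of OpenVINO.

```python
import xml.etree.ElementTree as ET


def describe_xml(xml_text):
    """Classify an XML document by its root element."""
    root = ET.fromstring(xml_text)
    if root.tag == "annotation":
        # Per-image bounding boxes: this is training data.
        return "labelImg / Pascal VOC annotation (training data)"
    if root.tag == "net":
        # Network topology written by Model Optimizer; weights live in the .bin file.
        return "OpenVINO IR model description (pairs with a .bin file)"
    return f"unknown XML (root element <{root.tag}>)"
```

Only the first kind is relevant when preparing a dataset for training. The second kind appears only after a trained model has been converted.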

NOTE: It's important to keep the same hierarchies and formats when bringing your own data into the Training Extensions environment.

Hence, referring to the video tutorial, it is best to keep the data and folders in the same format as shown in the video.
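
One part of keeping the layout consistent is making sure every image named in the annotation file is actually present in the image folder. This is a rough sketch of such a check, assuming a COCO-style JSON with an "images" list whose entries carry a "file_name" field, as in the snippet earlier in the thread. missing_images and both path arguments are hypothetical; the folder names come from your own copy of the example data.

```python
import json
from pathlib import Path


def missing_images(annotation_file, image_dir):
    """List file names referenced in the JSON that are absent from image_dir."""
    with open(annotation_file, encoding="utf-8") as f:
        coco = json.load(f)
    image_dir = Path(image_dir)
    return [
        img["file_name"]
        for img in coco["images"]
        if not (image_dir / img["file_name"]).is_file()
    ]
```

An empty result means every referenced image was found. Any names returned point to files that were renamed, moved, or left out of the folder.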

Perhaps you could choose one of the example datasets (the one most closely related to yours) and then modify it to your needs.
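
For the per-image XML files shown earlier in the thread, the conversion to a single JSON file can be scripted. Below is a minimal sketch, not an official Intel tool. It assumes the XML is labelImg's Pascal VOC output and that the target is the COCO-style layout from the annotations_train.json snippet. voc_to_coco is a hypothetical name, category ids are assumed to start at 1, and the sketch does not settle whether OTE also expects the "bg" category with id 0 that appears in the example file.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_coco(xml_paths, category_names):
    """Merge labelImg (Pascal VOC) XML files into one COCO-style dict."""
    # Assumption: ids start at 1, in the order the names are given.
    categories = [
        {"id": i, "name": name, "supercategory": ""}
        for i, name in enumerate(category_names, start=1)
    ]
    name_to_id = {c["name"]: c["id"] for c in categories}

    images, annotations = [], []
    for image_id, xml_path in enumerate(sorted(xml_paths)):
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        images.append({
            "id": image_id,
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
            "file_name": root.findtext("filename"),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")
            )
            # VOC stores two corners; COCO stores the top-left corner plus size.
            width, height = xmax - xmin, ymax - ymin
            annotations.append({
                "id": len(annotations),
                "image_id": image_id,
                # Raises KeyError if the label is not in category_names.
                "category_id": name_to_id[obj.findtext("name")],
                "segmentation": None,
                "area": width * height,
                "bbox": [xmin, ymin, width, height],
                "iscrowd": 0,
            })
    return {"annotations": annotations, "images": images, "categories": categories}
```

For the sample file above, the box (315, 156, 3493, 2693) becomes the bbox [315.0, 156.0, 3178.0, 2537.0]. The result can be written out with json.dump, for example after collecting the inputs with Path("pexel-stillimages").glob("*.xml") and passing ["person"] as the category list.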

Sincerely,

Iffa

Iffa_Intel · ‎11-22-2022

Intel will no longer monitor this thread since we have provided a solution. If you need any additional information from Intel, please submit a new question.

Sincerely,

Iffa