Intel® Distribution of OpenVINO™ Toolkit
Community assistance with the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Mask RCNN with python API

Truong__Dien_Hoa
New Contributor II
1,219 Views

Hi,

I'm trying to run a Mask R-CNN model with the Python API. The C++ version (mask_rcnn_demo) works for me, and its out.png looks OK.

However, when I run the model from Python, following segmentation_demo.py, the result is not what I expected.

I get 2 outputs, as follows:

{'masks': <openvino.inference_engine.ie_api.OutputInfo at 0x7fe0762d3eb8>,
 'reshape_do_2d': <openvino.inference_engine.ie_api.OutputInfo at 0x7fe0762d3f30>}

Shouldn't there be 5, like the outputs of Mask R-CNN? ['num_detections', 'detection_boxes', 'detection_scores', 'detection_classes', 'detection_masks']

The shape of 'masks' is (100, 90, 15, 15) and the shape of 'reshape_do_2d' is (100, 7). Why is each mask (15, 15)? I expected it to match the dimensions of the input image (800, 800).

Thank you so much in advance,

Hoa

0 Kudos
14 Replies
Truong__Dien_Hoa

After digging into the C++ source code of mask_rcnn_demo, I found that its mask output has the same shape, (100, 90, 15, 15), and yet it can mask the whole source image. It seems I have misunderstood something. I will keep reading the demo code, but I would really appreciate it if someone could point me to a tutorial about this. Is this part of the Mask R-CNN concept, or something OpenVINO introduces? Thank you in advance.
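
For readers with the same question, one way to read the two outputs: each of the 100 rows of 'reshape_do_2d' describes one candidate box, and 'masks' holds, at the same box index, one small 15x15 mask per class, defined relative to the box rather than to the whole image. Below is a minimal sketch of pairing them. It assumes each row is laid out as [image_id, label, score, x_min, y_min, x_max, y_max] and that labels are 1-based, so that channel label - 1 holds the matching mask. Both are assumptions to verify against your own model; split_outputs is a hypothetical helper and the arrays are dummy data.

```python
import numpy as np

def split_outputs(detections, masks, score_threshold=0.5):
    """Pair each confident detection row with its class-specific mask."""
    results = []
    for row, box_masks in zip(detections, masks):
        label, score = int(row[1]), float(row[2])
        if score < score_threshold:
            continue
        box = row[3:7]  # assumed order: x_min, y_min, x_max, y_max in [0, 1]
        # ASSUMPTION: labels are 1-based, so channel (label - 1) is the mask
        class_mask = box_masks[label - 1]
        results.append((label, score, box, class_mask))
    return results

# Dummy arrays with the shapes reported in the question
detections = np.zeros((100, 7), dtype=np.float32)
detections[0] = [0, 3, 0.9, 0.1, 0.2, 0.5, 0.6]
masks = np.random.rand(100, 90, 15, 15).astype(np.float32)

results = split_outputs(detections, masks)
label, score, box, class_mask = results[0]
print(label, class_mask.shape)
```

Each 15x15 mask still has to be resized to its box and pasted into a full-size image before it can be drawn.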

Ponchon__François

I asked exactly the same question a few weeks ago, but I am facing some issues too.

https://software.intel.com/en-us/forums/computer-vision/topic/801692

Truong__Dien_Hoa

Thanks @François :D Yes, quite a similar question, but you are a little ahead of me. Can I ask how you redraw your image from the masks output? You mentioned object_detection methods, but I don't know how to use them. Thank you in advance.

Hoa

Ponchon__François

I'm at work right now and I don't have my Python code on this computer.
I'll be home tonight and will send it to you then.

The idea is that the output mask is a 4x15x15 image of the mask. As far as I remember, I followed these steps:

  1. Convert the 4x15x15 image to greyscale.
  2. Resize it from (15x15) to (BboxW, BboxH).
  3. Threshold the mask output.
  4. Create an image of size (ImageW, ImageH) with the pixels in the BBox region set to the thresholded mask values (0 or 1).
  5. Use the TensorFlow object_detection API (GitHub) to draw the mask (utils.visualization_utils from there).
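
Steps 2 to 4 above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code promised in this post: it starts from a single 2-D mask (step 1 already done), uses a plain NumPy nearest-neighbour resize as a stand-in for cv2.resize, and assumes the box is given as normalised (ymin, xmin, ymax, xmax). The helper names resize_nearest and paste_mask are hypothetical.

```python
import numpy as np

def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (stand-in for cv2.resize)."""
    rows = np.arange(out_h) * mask.shape[0] // out_h
    cols = np.arange(out_w) * mask.shape[1] // out_w
    return mask[rows[:, None], cols]

def paste_mask(mask, box, image_h, image_w, threshold=0.5):
    """Resize a small box-relative mask to its box and paste it into a
    full-size binary image. box = (ymin, xmin, ymax, xmax), normalised."""
    ymin, xmin, ymax, xmax = box
    # Convert to pixel coordinates, clamped to the image
    y0 = int(np.clip(ymin * image_h, 0, image_h))
    y1 = int(np.clip(ymax * image_h, 0, image_h))
    x0 = int(np.clip(xmin * image_w, 0, image_w))
    x1 = int(np.clip(xmax * image_w, 0, image_w))
    full = np.zeros((image_h, image_w), dtype=np.uint8)
    box_h, box_w = y1 - y0, x1 - x0
    if box_h > 0 and box_w > 0:
        resized = resize_nearest(mask, box_h, box_w)               # step 2
        binary = (resized > threshold).astype(np.uint8)            # step 3
        full[y0:y1, x0:x1] = binary                                # step 4
    return full

# Example: a 15x15 mask that is fully "on", pasted into a 100x200 image
small = np.ones((15, 15), dtype=np.float32)
full = paste_mask(small, (0.25, 0.25, 0.75, 0.75), 100, 200)
print(full.shape, int(full.sum()))
```

The resulting array has the same height and width as the image, so it can be passed to whatever drawing routine is used in step 5.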

I will send you the code in about 6-7 hours if you don't have an answer by then!

Truong__Dien_Hoa

Thank you so much @François. I think I understand the concept of drawing the mask now, and I will try to implement it based on your suggestion. Can I ask one more question? How did you retrain your model? For now I am just using the original Mask R-CNN, because I get an error with my own model trained with the TensorFlow API.

I have posted my problem in this thread: https://software.intel.com/node/804892

It includes my .pb model and the config file. Could you take a look at it? Thank you in advance.

Ponchon__François

Personally, I have never retrained a model through OpenVINO.
I only used the TensorFlow Object Detection API.

To do this, I:

  1. Created a VOC-like dataset with a VOC tool. It generates PNGs with one color per class and one color per object, plus the original file.
  2. Edited dataset_tool from the TF Object Detection API so that it loads my masks.
  3. Edited the config file corresponding to my network (samples\configs directory).
  4. Put it in a "training dir".
  5. Ran the command to start training (on a powerful computer):
    python model_main.py --alsologtostderr --model_dir=training_inception_mask2019/ --pipeline_config_path=training_inception_mask2019/z_mask_rcnn_inception_v2_coco.config
  6. Generated the inference graph:
    python export_inference_graph.py --input_type image_tensor --pipeline_config_path training_inception_mask/z_mask_rcnn_inception_v2_coco.config --trained_checkpoint_prefix training_inception_mask/model.ckpt-10835 --output_directory 2018_12_22_inception_mask/
  7. Ran the Model Optimizer on the generated .pb file.
  8. Used the OpenCV Gengraph script...
  9. Ran inference with TensorFlow / OpenCV and tried OpenVINO (with bad results).

[Attached images: VOC_Dataset.JPG, Custom_Dataset_Tools.JPG, Custom_TrainingModelDir.JPG]

For the source code:
I used this code to make the OpenVINO output TF compatible:

import logging as log

import cv2
import numpy as np

def image_to_tensor(image,channels,h,w,info="",KeepRatio=False):
  # Build a (1, channels, h, w) uint8 tensor from an HWC image
  image_tensor=np.zeros(shape=(1,channels,h,w),dtype=np.uint8)
  if image.shape[:-1]!=(h,w):
    log.warning("Image {} is resized from {} to {}".format(info, image.shape[:-1],(h,w)))
  if (KeepRatio):
    # Scale so the longer side fits, then zero-pad (image stays in the top-left corner)
    h0,w0=image.shape[0],image.shape[1]
    ratio=float(max(h,w))/float(max(h0,w0))
    h1=int(ratio*h0)
    w1=int(ratio*w0)
    image=cv2.resize(image,(w1,h1))
    # HWC -> CHW; pixel values stay in [0-255], no normalisation is applied
    image=image.transpose((2,0,1))
    image_tensor[0][:,0:h1,0:w1]=image
  else:
    # Stretch to (h, w) without keeping the aspect ratio
    image=cv2.resize(image,(w,h))
    image = image.transpose((2, 0, 1))
    image_tensor[0]=image

  return image_tensor

def getMask_From_BBoxMask(Bbox,OriginalMask,original_image):
  # This method rebuilds a [0-1] mask the size of the image
  # from one box (normalised y0, x0, y1, x1) and its small mask
  [Height,Width]=original_image.shape[0],original_image.shape[1]
  Result=np.zeros((Height,Width),dtype=np.uint8)
  y0, x0, y1, x1 = Bbox
  ymin=min(y1,y0)
  ymax=max(y1,y0)
  xmin=min(x1,x0)
  xmax=max(x1,x0)

  # Convert normalised coordinates to pixels, clamped to the image
  xmin=int(min(max(xmin*Width,0),Width))
  xmax=int(max(min(xmax*Width,Width),0))
  ymin=int(min(max(ymin*Height,0),Height))
  ymax=int(max(min(ymax*Height,Height),0))

  BboxW=int(abs(xmax-xmin))
  BboxH=int(abs(ymax-ymin))

  if(BboxW>0 and BboxH>0):
    # Collapse the mask channels with a max, then scale to [0-255]
    MaxMask=(np.amax(OriginalMask,axis=0)*255).astype(np.uint8)
    ResizedMask = cv2.resize(MaxMask, dsize=(BboxW, BboxH), interpolation=cv2.INTER_CUBIC)
    # Threshold at 128 and paste into the box region
    Result[ymin:ymin+BboxH,xmin:xmin+BboxW]=(ResizedMask>128)*1
  return Result


def get_OutputDict_FromVino(result,original_image,image_tensor,KeepRatio=False):
  # Convert the OpenVINO result into a TF object-detection style output dict
  output_dict={}
  Detection=result['detection_output'][0][0] # (?,?,[batch], label, prob, x1, y1, x2, y2)
  Labels=Detection[:,1]
  Scores=Detection[:,2]
  coef=1
  h0,w0=original_image.shape[0],original_image.shape[1]
  if(KeepRatio):
    # Undo the zero padding added by image_to_tensor on the shorter side
    h,w=image_tensor.shape[2],image_tensor.shape[3]
    ratio=float(max(h,w))/float(max(h0,w0))
    coef=float(max(h,w))/(ratio*float(min(h0,w0)))
  # Boxes are stored as (y1, x1, y2, x2), as in the TF API
  if h0<w0:
    Bboxs=np.array([Detection[:,4]*coef,Detection[:,3],Detection[:,6]*coef,Detection[:,5]]).transpose()
  else:
    Bboxs=np.array([Detection[:,4],Detection[:,3]*coef,Detection[:,6],Detection[:,5]*coef]).transpose()
  output_dict['detection_boxes']=Bboxs
  output_dict['detection_classes']=Labels.astype(np.uint8)
  output_dict['detection_scores']=Scores
  # Masks holds the 15x15 image of the inside of each Bbox
  Masks=[]
  for i in range(len(result['masks'])):
    # Pass the i-th box with the i-th mask (the original passed the whole arrays)
    Masks.append(getMask_From_BBoxMask(Bboxs[i],result['masks'][i],original_image))

  output_dict['detection_masks']=np.array(Masks)
  return output_dict
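
The coef correction in get_OutputDict_FromVino appears to assume a square input tensor. The same idea can be written for both axes. Below is a rough sketch, assuming the image was resized with the same ratio as in image_to_tensor and placed in the top-left corner of the tensor, and that boxes are normalised (ymin, xmin, ymax, xmax). unpad_box is a hypothetical helper name, not part of OpenVINO or TensorFlow.

```python
def unpad_box(box, tensor_hw, orig_hw):
    """Map a box normalised to the padded tensor back to coordinates
    normalised to the original image."""
    h, w = tensor_hw
    h0, w0 = orig_hw
    # Same ratio as in image_to_tensor: the longer side fills the tensor
    ratio = max(h, w) / max(h0, w0)
    h1, w1 = int(ratio * h0), int(ratio * w0)  # size of the resized image
    scale_y, scale_x = h / h1, w / w1          # tensor size / image size

    def clip(value):
        return min(max(value, 0.0), 1.0)

    ymin, xmin, ymax, xmax = box
    return (clip(ymin * scale_y), clip(xmin * scale_x),
            clip(ymax * scale_y), clip(xmax * scale_x))

# A 400x800 image in an 800x800 tensor fills only the top half,
# so y coordinates are stretched by 2 and x coordinates are unchanged.
print(unpad_box((0.25, 0.5, 0.5, 1.0), (800, 800), (400, 800)))
```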

 

Please let me know if you run into the same issue as me!
I haven't solved it yet; I tried to go through the C++ code to solve the problem (without success so far).

Truong__Dien_Hoa

Thank you for your very detailed explanation @Francois. I did much the same as you: I also used the TensorFlow API, but with legacy/train.py instead of model_main.py, and with a different kind of dataset rather than VOC. I will look into this further.

So your issue with the Python API is that the result is worse than with mask_rcnn_demo, right? I don't have much experience with C++, so I think I will stick with Python. Have you thought about wrapping the C++ code for use inside a Python application?

Thank you,

Ponchon__François

That is exactly what I want to do! I don't have much experience in C++ either, but I am trying in this post: Last Issue

To label my images, I used this tool:

Happy to help!
You know, I'm a bit like you: when I ask for help myself, I'm glad when people answer!

Truong__Dien_Hoa

Sure @Francois, especially when I see the French comments in your code :D. Thanks to your help I now understand Mask R-CNN better.

I then came across this post, https://www.pyimagesearch.com/2018/11/19/mask-r-cnn-with-opencv/, to see how others process the mask outputs.

I will try to fix my problem converting the .pb model with MO, and then see why the results differ between the Python code and C++.

I will let you know if I have news,

Regards,

Hoa

Truong__Dien_Hoa

Hi @Francois, which tool did you use to create the VOC-like dataset? And did you then use dataset_tools/create_pascal_tf_record.py to create your record? I can't find the segmentation part in the VOC dataset. Also, which pipeline config file did you use? Thank you in advance.

Ponchon__François

Hey!
I used this tool:
http://artelab.dista.uninsubria.it/downloads/tools/voc_manager/voc_manager.html

But I think in any case you will have to change the content of create_pascal_tf_record.py, because it is not made for this format!
I wrote my own version, which works!

I used the pipeline config from the sample/config directory, because the one shipped with the network does not work.
To convert the model with the Model Optimizer, you can see how I did it in my previous post!
But I did not find any way to get the same performance as with TensorFlow or OpenCV.

Nowadays, I use an OpenCV-based inference engine, exactly like in the post you showed me!

Truong__Dien_Hoa

Thanks. So you mean mask_rcnn_inception_v2_coco.config in the sample/ folder?

Ponchon__François

Yes, exactly, if that is the model you use :)

Just be careful: the number of classes, the evaluation settings, etc. need to be consistent.

In order to retrain, you will have to edit the config file:

  • The number of classes.
  • The location of the original model.
  • The location of the training TF records.
  • The location of the eval TF records.
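
As an illustration of those four edits, here is a small sketch that patches the text of a pipeline config. It assumes the usual TF Object Detection API field names (num_classes, fine_tune_checkpoint, and an input_path inside train_input_reader and eval_input_reader) and that train_input_reader appears before eval_input_reader in the file. The SAMPLE text is a heavily shortened stand-in for a real config, and patch_pipeline_config and all paths are hypothetical.

```python
import re

# Shortened stand-in for a real pipeline config (assumption, not a full file)
SAMPLE = '''
model { faster_rcnn { num_classes: 90 } }
train_config { fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" }
train_input_reader { tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/train.record" } }
eval_input_reader { tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/val.record" } }
'''

def patch_pipeline_config(text, num_classes, checkpoint, train_record, eval_record):
    """Return the config text with the four retraining fields replaced."""
    text = re.sub(r"num_classes:\s*\d+",
                  lambda m: "num_classes: {}".format(num_classes), text)
    text = re.sub(r'fine_tune_checkpoint:\s*"[^"]*"',
                  lambda m: 'fine_tune_checkpoint: "{}"'.format(checkpoint), text)
    # ASSUMPTION: the first input_path is the training one, the second the eval one
    records = iter([train_record, eval_record])

    def replace_path(match):
        path = next(records, None)
        # Leave any further input_path entries untouched
        return match.group(0) if path is None else 'input_path: "{}"'.format(path)

    return re.sub(r'input_path:\s*"[^"]*"', replace_path, text)

patched = patch_pipeline_config(
    SAMPLE, 3, "pretrained/model.ckpt", "data/train.record", "data/eval.record")
print(patched)
```

Editing the file by hand works just as well; the point is only that these four values must agree with your dataset and label map.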

Also be careful with the training step: I had to reduce it because training diverged.
You will have to monitor the loss with TensorBoard.

Truong__Dien_Hoa

Hi @Francois, thank you so much for always paying attention to my questions. I was quite worried about asking too many questions (even during the weekend). I think I have finally found the problem: I was using TensorFlow 1.12, which is not yet compatible with OpenVINO. I now use version 1.9 and can successfully generate an IR model using mo.py. I still need to check whether the model is correct, but anyway, now I'm happy.

Now I can focus on improving my model and catching up with you ^^.

Thank you again and have a nice weekend,

Hoa
