Hi,
I am trying to convert a pretrained PyTorch model (https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) into OpenVINO IR.
First, I download the pretrained model and export it to ONNX:
import torch
from torchvision.models.resnet import resnet50

net = resnet50(pretrained=True)
x = torch.randn((1, 3, 224, 224))
torch.onnx._export(net, x, 'test_model.onnx', export_params=True)
and then use mo.py to convert it to IR:
python mo.py --input_model /home/tumh/pytorch-cifar/test_model.onnx --output_dir /home/tumh/test_model_FP32 --framework onnx --data_type FP32
Then I test with classification_sample.py. For simplicity, the input is randomly generated with a fixed seed.
Here is what I changed:
import numpy as np

# load net... success
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))

# fix seed
np.random.seed(133)
r = np.random.randn(3, 224, 224)
images[0] = r

# process output
# show only 10 of 1000
# [-0.4592718  -0.12941386 -0.11573323 -0.75521946  0.5491318  -1.4393116
#  -0.3861863  -0.40474018  0.401676   -1.4279357 ]
However, the result from PyTorch is quite different. Here is the output from PyTorch:
[-0.9367, -0.3480, -0.4053, -0.9139, -0.5280, -0.2631, -0.5692, 0.4866, 0.3328, -0.3892]
By the way, if I convert the ONNX model to Caffe first, and then convert the Caffe model to IR, the result is identical. So I suspect this might be an ONNX-to-OpenVINO issue.
And OpenVINO claims to support ResNet50. Any idea?
Hi Ming-Hsuan,
Have you solved this issue? Please make sure you are using the correct mean values / scale values / reverse_input_channels. What happens, for example, if you convert using:
mo_onnx.py --input_model test_model.onnx --data_type FP32 --mean_values [104.0,117.0,123.0] --scale 255 --reverse_input_channels
Not sure what the actual values are - the above is just an example of using --mean_values, --reverse_input_channels and --scale.
nikos
@nikos
I have tried adding the mean, scale and reverse-channel options, but the result is still different from PyTorch.
Have you tried reproducing my steps and looking at the result? It's easy to reproduce.
@nikos
It would be much better if OpenVINO provided some examples in the documentation of converting pretrained PyTorch ONNX models.
Hi Ming-Hsuan,
Yes, I did try to reproduce your steps using the details you included. That's how I realized that missing --mean_values and --scale could cause issues.
> I have tried to add mean, scale and reverse channel.
What values did you use?
> Have you ever tried to reproduce my steps and see the result? it's easy to reproduce.
Not so easy to test your end-to-end pipeline without your test code. Could you please attach the test code you use for inference? Since it was missing, I tested with my own C++ test code on my own test data and found no issues when appropriate parameters were used.
cheers,
nikos
@nikos
Here is my testing code:
import numpy as np
from torchvision.models.resnet import resnet50
import torchvision.transforms as transforms
from PIL import Image

transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

net = resnet50(pretrained=True)

im = Image.open('/home/tumh/dog.jpeg')
x = transform_test(im)
x = x.unsqueeze(dim=0)
print(net(x)[0][0:10])
and the conversion command:
python mo.py --input_model /home/tumh/pytorch-cifar/test_model.onnx --output_dir /home/tumh/test_model_FP32 --scale_values [51.5865,50.847,51.255] --mean_values [125.307,122.961,113.8575] --framework onnx --data_type FP32 --reverse_input_channels
I switched to a real image and tested with classification_sample.py (see the attachment):
python classification_sample.py -m /home/tumh/test_model_FP32/test_model.xml -i ~/dog.jpeg
The result from OpenVINO:
[ 0.93463564 -3.0111456 -2.6969056 -0.7203666 -3.5170152 -1.5799323 -4.9336224 -0.02198645 0.77573025 -0.03216804]
The result from PyTorch:
[-0.9243, -0.3208, -0.4276, -0.9591, -0.6213, -0.2226, -0.7890, 0.6524, 0.4050, -0.5093]
Excellent - thank you for the additional information. I will work on this over the weekend - too busy at work right now. I want to root-cause this too because I have similar issues.
> it's much better if openvino can give some examples to convert some pretrained pytorch onnx models in the document.
I agree that would be nice, but on the other hand I would prefer they spend their time optimizing the SDK and working on new features instead of writing samples for every possible combination of framework conversion.
In the meantime, a few thoughts:
- This is a multi-stage pipeline using many frameworks and I would NOT expect the same numbers. Maybe we have to compare results statistically. On top of that, after a model optimization process we expect minor discrepancies, correct?
- Are the original test networks trained on ImageNet or CIFAR? Are the weights loaded properly?
- Let's ignore the output vectors for now. What is the classification result from PyTorch, from running ONNX inference directly (have you tried, e.g. with onnxruntime, as sketched after this list?), and from OpenVINO FP32?
- Have you tried the validation tool to get a better overall idea of accuracy?
- Still not convinced those are the right parameters - will test --scale_values [51.5865,50.847,51.255] --mean_values [125.307,122.961,113.8575]
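For the ONNX check, a minimal onnxruntime script along these lines should work (a sketch - it assumes onnxruntime is installed and reuses your fixed seed):

import numpy as np
import onnxruntime as ort

# run the exported model directly, bypassing the Model Optimizer conversion
sess = ort.InferenceSession('test_model.onnx')
input_name = sess.get_inputs()[0].name

np.random.seed(133)
x = np.random.randn(1, 3, 224, 224).astype(np.float32)

print(sess.run(None, {input_name: x})[0][0][:10])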
Cheers,
nikos
> This is a multi-stage pipeline using many frameworks and I would NOT expect the same numbers. Maybe we have to compare results statistically. On top of that, after a model optimization process we expect minor discrepancies, correct?
Well, a minor difference (on the order of 1e-6) is acceptable.
> Are the original test networks trained on ImageNet or CIFAR? Are the weights loaded properly?
> Let's ignore the output vectors for now. What is the classification result from PyTorch, from running ONNX inference directly (have you tried?), and from OpenVINO FP32?
The original weights are for ImageNet; they come from the official PyTorch model zoo. Indeed there are 1000 output values, but for simplicity I only print 10 of them. I have not verified the classification result (whether it's a dog or something else).
> Have you tried the validation tool to get a better overall idea of accuracy?
Not yet. But it seems this issue should be solved first before I calculate the overall accuracy.
> Still not convinced those are the right parameters - will test --scale_values [51.5865,50.847,51.255] --mean_values [125.307,122.961,113.8575]
It's simple math. In PyTorch the input is first scaled to [0, 1], and in my transform the mean is (0.4914, 0.4822, 0.4465) and the std is (0.2023, 0.1994, 0.2010).
So the overall preprocessing for the R channel is ((r/255) - 0.4914) / 0.2023; to get the equivalent in OpenVINO, we have (r - 0.4914*255) / (0.2023*255).
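In code form, deriving the Model Optimizer values from the PyTorch Normalize parameters is just:

mean = [0.4914, 0.4822, 0.4465]
std = [0.2023, 0.1994, 0.2010]

# Model Optimizer sees 0-255 pixel values, so rescale the normalization constants
mo_mean_values = [m * 255 for m in mean]   # [125.307, 122.961, 113.8575]
mo_scale_values = [s * 255 for s in std]   # [51.5865, 50.847, 51.255]

# for comparison, torchvision's standard ImageNet normalization is
# mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225]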
What do you get if you set the network to evaluation mode with net.eval()? (resnet50 contains BatchNorm layers, so in training mode the batch statistics are used instead of the stored running statistics, which changes the output.) For example:
net = resnet50(pretrained=True)
net.eval()
#################################
im = Image.open('/home/tumh/dog.jpeg')
x = transform_test(im)
x = x.unsqueeze(dim=0)
print(net(x)[0][0:10])
My output now seems closer to OpenVINO:
pytorch:  [ 0.89 -3.04 -2.70 -0.74 -3.56 -1.79 -4.84  0.10  0.86  0.05]
openvino: [ 0.93 -3.01 -2.69 -0.72 -3.51 -1.57 -4.93 -0.02  0.77 -0.03]
and the classification result in OpenVINO is correct too:
python3 classification_sample.py --labels test_model.labels -m test_model.xml -i dog.jpeg
[ INFO ] Loading network files:
    test_model.xml
    test_model.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image dog.jpeg is resized from (216, 233) to (224, 224)
[ INFO ] Batch size is 1
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference (1 iterations)
[ INFO ] Average running time of one iteration: 16.50834083557129 ms
[ INFO ] Processing output blob
[ 0.93463564 -3.0111456  -2.6969056  -0.7203666  -3.5170152  -1.5799323
 -4.9336224  -0.02198645  0.77573025 -0.03216804]
[ INFO ] Top 10 results:
Image dog.jpeg
15.3578529 label German shepherd
11.2073421 label Leonberg
10.9584837 label malinois
9.9125881 label Norwegian elkhound, elkhound
8.9993887 label Irish wolfhound
8.9059830 label groenendael
8.5530519 label African hunting dog
8.4389133 label Afghan hound
7.9750319 label borzoi
7.9166555 label kelpie
Same labels as pytorch:
15.4252 n02106662 German shepherd, German shepherd dog, German police dog, alsatian
11.2401 n02111129 Leonberg
11.0313 n02105162 malinois
9.7304 n02091467 Norwegian elkhound, elkhound
8.9736 n02090721 Irish wolfhound
8.8621 n02105056 groenendael
8.5262 n02116738 African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus
8.4578 n02088094 Afghan hound, Afghan
7.9833 n02090622 borzoi, Russian wolfhound
7.8347 n02105412 kelpie
@nikos
It's good to see a better result after adding .eval(). I get the same result as you.
However, don't you think the error is too large? For example, the difference between 0.86 and 0.77 is about 0.1, and the sign differs too (0.10 vs. -0.02).
I think an issue may occur when the output of the network is an embedding (for example, a typical face embedding has dimension 128). In face re-identification we compare the distance between two embedding vectors to decide whether they belong to the same person, so the true-positive verification rate may change if the embedding changes after conversion.
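For example, a toy sketch of the kind of check that is affected (cosine similarity between two embeddings; the numbers are purely illustrative):

import numpy as np

def cosine_similarity(a, b):
    # cosine similarity between two embedding vectors
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

np.random.seed(0)
emb_a = np.random.randn(128)                # embedding from the original model
emb_b = emb_a + 0.1 * np.random.randn(128)  # same embedding with a small per-element shift
print(cosine_similarity(emb_a, emb_b))      # may cross a fixed verification threshold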
By the way, if I convert the ONNX model to Caffe, and then convert the Caffe model to IR, the result is almost identical (only a 10e-5 difference). So don't you think there might be a numerical issue when converting ONNX to IR?
Hi Ming-Hsuan,
> However, don't you think the error is too large? for example, the difference between 0.86 and 0.77 is about 0.1 and the sign (0.10 and -0.02 ) is different too
In my experience I do expect this kind of error, but let's make sure. For that we would have to compare outputs on the same input - the image processing is different in the two pipelines, so it is not a fair comparison. Even within OpenVINO, comparing cv2.resize to the auto-resize in OpenVINO gives different results.
Now that .eval() has fixed most of the issues, let's go back to your original idea of pushing the same fixed input vector:
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))

# fix seed
np.random.seed(133)
r = np.random.randn(3, 224, 224)
images[0] = r
We also have to make sure the input is in the same layout (NCHW vs NHWC).
I have the code running for OpenVINO based on your modification above. Could you attach the PyTorch test script you used for the random seed so that we can compare with a fixed input?
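Something along these lines would do (a sketch; the float32 cast and the eval() call are assumptions on my side):

import numpy as np
import torch
from torchvision.models.resnet import resnet50

# reproduce the fixed input used on the OpenVINO side
np.random.seed(133)
r = np.random.randn(3, 224, 224).astype(np.float32)

net = resnet50(pretrained=True)
net.eval()  # use BatchNorm running statistics, as discussed above

x = torch.from_numpy(r).unsqueeze(0)  # add batch dimension -> (1, 3, 224, 224)
with torch.no_grad():
    print(net(x)[0][:10])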
Cheers,
Nikos
The problem seems to be solved when you save the input vector from PyTorch and load it into OpenVINO, so it seems it was the image processing that caused the discrepancies. Try this:
net = resnet50(pretrained=True)
net.eval()

im = Image.open('dog.jpeg')
x = transform_test(im)
x = x.unsqueeze(dim=0)
print(x.shape)
np.save("test_in_vector", x)
and then load it on the OpenVINO side to ensure the same input:
r = np.load("test_in_vector.npy")
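On the OpenVINO side, the change inside classification_sample.py could look roughly like this (a sketch; exec_net and input_blob are the names the sample already uses):

import numpy as np

r = np.load("test_in_vector.npy")             # NCHW, shape (1, 3, 224, 224), saved by the pytorch script
res = exec_net.infer(inputs={input_blob: r})  # bypasses the sample's image reading and resizing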
You would have to change the way we create the IR for this experiment; also try disabling optimizations for the first test:
mo_onnx.py --input_model test_model.onnx --data_type FP32 --disable_resnet_optimization --disable_fusing --disable_gfusing --data_type=FP32
OpenVINO now agrees with PyTorch:
[ INFO ] Average running time of one iteration: 18.108606338500977 ms
[ INFO ] Processing output blob
[ 0.8969836  -3.0496185  -2.7041526  -0.7479727  -3.562203   -1.7981005
 -4.8486257   0.10939903  0.86848104  0.05356242]
[ INFO ] Top 10 results:
Image dog.jpeg
15.4252462 label German shepherd
11.2401333 label Leonberg
11.0313234 label malinois
9.7304478 label Norwegian elkhound
8.9735994 label Irish wolfhound
8.8621044 label groenendael
8.5262327 label African hunting dog
8.4578342 label Afghan hound
7.9833093 label borzoi
7.8347163 label kelpie
Same output as PyTorch:
15.4252 n02106662 German shepherd, German shepherd dog, German police dog, alsatian
11.2401 n02111129 Leonberg
11.0313 n02105162 malinois
9.7304 n02091467 Norwegian elkhound, elkhound
8.9736 n02090721 Irish wolfhound
8.8621 n02105056 groenendael
8.5262 n02116738 African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus
8.4578 n02088094 Afghan hound, Afghan
7.9833 n02090622 borzoi, Russian wolfhound
7.8347 n02105412 kelpie
Finally, try with the model optimizer without the disable flags - very similar output:
mo_onnx.py --input_model test_model.onnx --data_type FP32 --data_type=FP32
15.4252472 label German shepherd
11.2401333 label Leonberg
11.0313206 label malinois
9.7304420 label Norwegian elkhound
8.9735994 label Irish wolfhound
8.8621025 label groenendael
8.5262289 label African hunting dog
8.4578362 label Afghan hound
7.9833107 label borzoi
7.8347163 label kelpie
JFTR, I captured some of this in https://github.com/ngeorgis/pytorch_onnx_openvino
Can you verify? Could you also check whether there is a normalization issue (https://github.com/ngeorgis/pytorch_onnx_openvino/issues/1)?
Any more issues?
PyTorch version is 1.0.1 (the newest),
OpenVINO version is 2018 R5,
and the inference results are totally different between PyTorch and OpenVINO!
I use code like this:
-------- convert the pytorch model to onnx
import onnx
import torch
from torchvision.models.resnet import resnet50
net = resnet50(pretrained=True)
x=torch.randn((1,3,224,224))
torch.onnx._export(net, x, 'test_model.onnx', export_params=True)
-------- convert to openvino
python3 mo.py --input_model /home/forum-test/test_model.onnx --output_dir /home/forum-test/mymodel --framework onnx --data_type FP32
-------- test the model in pytorch
import numpy as np
from torchvision.models.resnet import resnet50
import torchvision.transforms as transforms
from PIL import Image
transform_test = transforms.Compose([transforms.Resize(( 224,224 )),transforms.ToTensor(),transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994,0.2010)),])
net = resnet50(pretrained=True)
net.eval()
im = Image.open('1.jpg')
x = transform_test(im)
x = x.unsqueeze(dim=0)
print (net(x)[0][0:10])
result in pytorch:
tensor([-1.1601, -0.5671, -0.4668, -1.2231, -0.6918, -0.3618, -0.7984, 0.3102, 0.1104, -0.7210], grad_fn=<SliceBackward>)
-------- test the xml/bin in openvino
python3 classification_sample.py -m /home/forum-test/mymodel/test_model.xml -i 1.jpg
result in openvino:
[-3.3632674 -2.8450186 -1.418541 -3.3199158 -3.919244 -1.2973417
-0.56975985 -0.22444369 1.0697088 -2.761873]
Hi guys,
Actually, the main problem relates to the different implementations of the resize() function in PIL and OpenCV, combined with .jpeg images.
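To see how large that difference can be, here is a quick sketch comparing the two resize paths on the same .jpeg (the file name is just an example):

import numpy as np
import cv2 as cv
from PIL import Image

# resize the same image with PIL and OpenCV and compare pixel values
pil_img = np.array(Image.open('dog.jpeg').convert('RGB').resize((224, 224), Image.BILINEAR))
cv_img = cv.cvtColor(cv.resize(cv.imread('dog.jpeg'), (224, 224)), cv.COLOR_BGR2RGB)
print(np.abs(pil_img.astype(np.int32) - cv_img.astype(np.int32)).mean())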
Let's test on a 224x224 .png image (so we use neither resize nor .jpeg images).
----------- resnet_export.py -------------
import onnx
import torch
from torchvision.models.resnet import resnet50

x = torch.randn((1, 3, 224, 224))
net = resnet50(pretrained=True)
net.eval()
torch.onnx.export(net, x, 'resnet_test.onnx', export_params=True)
Run the Model Optimizer on the exported .onnx model:
python /opt/intel/openvino/deployment_tools/model_optimizer/mo_onnx.py --input_model ./resnet_test.onnx --data_type=FP32 --mean_values [123.675,116.28,103.53] --scale_values [58.395,57.12,57.375] --reverse_input_channels
(These mean/scale values are the standard torchvision ImageNet normalization, [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225], multiplied by 255.)
----------- resnet_sample.py -------------
import numpy as np
from torchvision.models.resnet import resnet50
import torchvision.transforms as transforms
from PIL import Image
import cv2 as cv

transforms_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

net = resnet50(pretrained=True)
net.eval()

img = Image.open('2.png')
x = transforms_test(img)
x = x.unsqueeze(dim=0)
print(net(x)[0][0:10])
Output:
tensor([-2.6386, 1.3077, -2.2054, -2.5019, -1.6188, -2.0899, -1.2727, -0.5596,
-1.7433, -3.0217], grad_fn=<SliceBackward>)
In the classification sample I added a line to print the first 10 elements:
# Processing output blob
log.info("Processing output blob")
res = res[out_blob]
log.info("Top {} results: ".format(args.number_top))
print(res[0][:10])
python classification_sample.py -m resnet_test.xml -i 2.png
[ INFO ] Creating Inference Engine
[ INFO ] Loading network files:
resnet_test.xml
resnet_test.bin
[ INFO ] Preparing input blobs
[ INFO ] Batch size is 1
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference in synchronous mode
[ INFO ] Processing output blob
[ INFO ] Top 10 results:
[-2.638615 1.3077483 -2.2053905 -2.5018773 -1.6187974 -2.089907
-1.2726829 -0.5595517 -1.7432679 -3.0216992]
Image 2.png

classid probability
------- -----------
159 13.1708260
168 11.3585939
211 8.2475233
167 7.9942780
166 7.8777084
162 7.6203313
237 7.4518971
165 7.2671924
434 6.9654808
171 6.7230196
Input image is attached.
Class 159 in ImageNet corresponds to 'Rhodesian ridgeback' and class 168 to 'redbone', so it seems the classification result is correct :)
Hope it helps!
