Solved: The difference between onnx results and openvino results.

datapro · ‎11-04-2021

[Images: Original | ONNX Runtime | OpenVINO]

I converted the Real-ESRGAN model to OpenVINO IR and ran it.

The ONNX model gives the same result as the original PyTorch model, but the model run with OpenVINO gives a different result, as shown in the third picture.
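
One way to put a number on "differs" is to compare the two outputs element by element. The sketch below is not from the thread: `compare_outputs` and its arguments are hypothetical names, and it assumes both outputs have been flattened to plain sequences of floats in the 0..255 range that the model returns.

```python
import math


def compare_outputs(reference, candidate, peak=255.0):
    """Compare two flat, equal-length sequences of output values.

    Returns (max_abs_diff, psnr_db, mean_ratio). A mean_ratio far from 1.0
    (for example near 1/255 or 255) would hint at a scaling mismatch.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    if not reference:
        raise ValueError("outputs must not be empty")

    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs_diff = max(diffs)

    # Mean squared error and peak signal-to-noise ratio in decibels.
    mse = sum(d * d for d in diffs) / len(diffs)
    psnr_db = math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

    # Ratio of the mean values, undefined when the reference mean is zero.
    ref_mean = sum(reference) / len(reference)
    cand_mean = sum(candidate) / len(candidate)
    mean_ratio = cand_mean / ref_mean if ref_mean != 0 else math.nan

    return max_abs_diff, psnr_db, mean_ratio
```

Running this once with the ONNX Runtime output as `reference` and the OpenVINO output as `candidate` would show whether the gap is a constant factor or something more structural.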

I suspect one of two problems:
1. A scaling problem.
2. The model's Resize operation behaves differently in OpenVINO.
I'd appreciate it if you could check it out!
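
To reason about the second suspicion: a nearest-neighbour resize is fully determined by how each output index is mapped back to an input index, and the ONNX Resize operator allows several such conventions. The sketch below compares two of them in one dimension. The mode labels and function names are made up for illustration, and which convention a given exporter or runtime actually uses is an assumption to verify, not something the thread confirms.

```python
import math


def nearest_source_index(dst, in_size, out_size, mode):
    """Map an output index to the input index a nearest-neighbour resize reads."""
    if mode == "asymmetric_floor":
        # src = floor(dst * in_size / out_size), done in exact integer arithmetic.
        src = (dst * in_size) // out_size
    elif mode == "half_pixel_round_prefer_floor":
        # Pixel centres are aligned; ties round towards the lower index.
        x = (dst + 0.5) * in_size / out_size - 0.5
        src = math.ceil(x - 0.5)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Clamp to the valid input range.
    return min(max(src, 0), in_size - 1)


def index_map(in_size, out_size, mode):
    """Input index read by every output position along one axis."""
    return [nearest_source_index(d, in_size, out_size, mode) for d in range(out_size)]


if __name__ == "__main__":
    for sizes in [(4, 8), (3, 4)]:
        for mode in ("asymmetric_floor", "half_pixel_round_prefer_floor"):
            print(sizes, mode, index_map(*sizes, mode))
```

For an exact 2x upscale, as used by both `F.interpolate` calls in the model, these two conventions pick the same source pixels; they only diverge for non-integer ratios such as 3 to 4. If the Resize step really is the culprit, that would point to a different effective scale or output size at runtime rather than to the rounding rule alone.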

Peh_Intel · ‎11-08-2021

Hi datapro,

Thanks for reaching out to us.

I downloaded the pre-trained model RealESRGAN_x4plus.pth and converted it to an ONNX model by running the pytorch2onnx.py script. Next, I converted the ONNX model to an IR model by running Model Optimizer without specifying any parameters.

However, I am unable to run inference with the ONNX and IR models because the inference methods in the GitHub repository (the portable executable files and inference_realesrgan.py) do not support these model formats.

Please share the inference method you use to run the ONNX and IR models so that we can replicate the issue on our side.

Regards,

Peh

datapro · ‎11-08-2021

Hello Peh

Here's the code I used.

Here is the model structure code:

import torch
import functools
from torch import nn as nn
from torch.nn import functional as F

def pixel_unshuffle(x, scale):
    """ Pixel unshuffle.
    Args:
        x (Tensor): Input feature with shape (b, c, hh, hw).
        scale (int): Downsample ratio.
    Returns:
        Tensor: the pixel unshuffled feature.
    """
    b, c, hh, hw = x.size()
    out_channel = c * (scale**2)
    assert hh % scale == 0 and hw % scale == 0
    h = hh // scale
    w = hw // scale
    x_view = x.view(b, c, h, scale, w, scale)
    return x_view.permute(0, 1, 3, 5, 2, 4).reshape(b, out_channel, h, w)

def make_layer(basic_block, num_basic_block, **kwarg):
    """Make layers by stacking the same blocks.
    Args:
        basic_block (nn.module): nn.module class for basic block.
        num_basic_block (int): number of blocks.
    Returns:
        nn.Sequential: Stacked blocks in nn.Sequential.
    """
    layers = []
    for _ in range(num_basic_block):
        layers.append(basic_block(**kwarg))
    return nn.Sequential(*layers)


class ResidualDenseBlock(nn.Module):
    """Residual Dense Block.
    Used in RRDB block in ESRGAN.
    Args:
        num_feat (int): Channel number of intermediate features.
        num_grow_ch (int): Channels for each growth.
    """

    def __init__(self, num_feat=64, num_grow_ch=32):
        super(ResidualDenseBlock, self).__init__()
        self.conv1 = nn.Conv2d(num_feat, num_grow_ch, 3, 1, 1)
        self.conv2 = nn.Conv2d(num_feat + num_grow_ch, num_grow_ch, 3, 1, 1)
        self.conv3 = nn.Conv2d(num_feat + 2 * num_grow_ch, num_grow_ch, 3, 1, 1)
        self.conv4 = nn.Conv2d(num_feat + 3 * num_grow_ch, num_grow_ch, 3, 1, 1)
        self.conv5 = nn.Conv2d(num_feat + 4 * num_grow_ch, num_feat, 3, 1, 1)

        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)

        # initialization
        # default_init_weights([self.conv1, self.conv2, self.conv3, self.conv4, self.conv5], 0.1)

    def forward(self, x):
        x1 = self.lrelu(self.conv1(x))
        x2 = self.lrelu(self.conv2(torch.cat((x, x1), 1)))
        x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1)))
        x4 = self.lrelu(self.conv4(torch.cat((x, x1, x2, x3), 1)))
        x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
        # Empirically, we use 0.2 to scale the residual for better performance
        return x5 * 0.2 + x


class RRDB(nn.Module):
    """Residual in Residual Dense Block.
    Used in RRDB-Net in ESRGAN.
    Args:
        num_feat (int): Channel number of intermediate features.
        num_grow_ch (int): Channels for each growth.
    """

    def __init__(self, num_feat, num_grow_ch=32):
        super(RRDB, self).__init__()
        self.rdb1 = ResidualDenseBlock(num_feat, num_grow_ch)
        self.rdb2 = ResidualDenseBlock(num_feat, num_grow_ch)
        self.rdb3 = ResidualDenseBlock(num_feat, num_grow_ch)

    def forward(self, x):
        out = self.rdb1(x)
        out = self.rdb2(out)
        out = self.rdb3(out)
        # Empirically, we use 0.2 to scale the residual for better performance
        return out * 0.2 + x


class RRDBNet(nn.Module):
    """Networks consisting of Residual in Residual Dense Block, which is used
    in ESRGAN.
    ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.
    We extend ESRGAN for scale x2 and scale x1.
    Note: This is one option for scale 1, scale 2 in RRDBNet.
    We first employ the pixel-unshuffle (an inverse operation of pixelshuffle) to reduce the spatial size
    and enlarge the channel size before feeding inputs into the main ESRGAN architecture.
    Args:
        num_in_ch (int): Channel number of inputs.
        num_out_ch (int): Channel number of outputs.
        num_feat (int): Channel number of intermediate features.
            Default: 64
        num_block (int): Block number in the trunk network. Defaults: 23
        num_grow_ch (int): Channels for each growth. Default: 32.
    """

    def __init__(self, num_in_ch, num_out_ch, scale=4, num_feat=64, num_block=23, num_grow_ch=32):
        super(RRDBNet, self).__init__()
        self.scale = scale
        RRDB_block_f = functools.partial(RRDB, num_feat=num_feat, num_grow_ch=num_grow_ch)
        if scale == 2:
            num_in_ch = num_in_ch * 4
        elif scale == 1:
            num_in_ch = num_in_ch * 16
        self.conv_first = nn.Conv2d(num_in_ch, num_feat, 3, 1, 1,bias=True)
        self.body = make_layer(RRDB_block_f, num_block)
        self.conv_body = nn.Conv2d(num_feat, num_feat, 3, 1, 1,bias=True)
        # upsample
        self.conv_up1 = nn.Conv2d(num_feat, num_feat, 3, 1, 1,bias=True)
        self.conv_up2 = nn.Conv2d(num_feat, num_feat, 3, 1, 1,bias=True)
        self.conv_hr = nn.Conv2d(num_feat, num_feat, 3, 1, 1,bias=True)
        self.conv_last = nn.Conv2d(num_feat, num_out_ch, 3, 1, 1,bias=True)

        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)
        self.pixel_unshuffle = nn.PixelUnshuffle(2)

    def forward(self, x):
        # input b h w c
        x = torch.div(x,255)
        x = x.permute(0,3,1,2)
        if self.scale == 2:
            feat = pixel_unshuffle(x,2)
        elif self.scale == 1:
            # __init__ multiplies num_in_ch by 16 for scale 1, so unshuffle by 4 (4**2 == 16)
            feat = pixel_unshuffle(x,4)
        else:
            feat = x
        feat = self.conv_first(feat)
        body_feat = self.conv_body(self.body(feat))
        feat = feat + body_feat
        # upsample
        upsample1 = F.interpolate(feat, scale_factor=2, mode='nearest')

        feat = self.lrelu(self.conv_up1(upsample1))
        
        upsample2 = F.interpolate(feat, scale_factor=2, mode='nearest')
        feat = self.lrelu(self.conv_up2(upsample2))
        out = self.conv_last(self.lrelu(self.conv_hr(feat)))
        
        out = out.permute(0,2,3,1)
        out = torch.mul(out,255)
        # 
        return out
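
The channel arithmetic behind `pixel_unshuffle` can be checked without running the network. This is a minimal sketch with a hypothetical helper, `pixel_unshuffle_shape`, that only mirrors the shape computation of the function above; it does not touch any tensor data.

```python
def pixel_unshuffle_shape(shape, scale):
    """Output shape of pixel_unshuffle for an input of shape (b, c, hh, hw)."""
    b, c, hh, hw = shape
    if hh % scale != 0 or hw % scale != 0:
        raise ValueError("height and width must be divisible by scale")
    # Each scale x scale spatial patch is folded into the channel axis.
    return (b, c * scale ** 2, hh // scale, hw // scale)


if __name__ == "__main__":
    # A 3-channel 64x64 input: factor 2 gives 12 channels, factor 4 gives 48.
    print(pixel_unshuffle_shape((1, 3, 64, 64), 2))
    print(pixel_unshuffle_shape((1, 3, 64, 64), 4))
```

Since the constructor widens `conv_first` to `num_in_ch * 4` for scale 2 and `num_in_ch * 16` for scale 1, the matching unshuffle factors are 2 and 4. The export in this thread uses scale 4, which skips the unshuffle entirely.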

Here is the PyTorch-to-ONNX conversion code:

import argparse
import sys
import numpy as np
import os
import torch.backends.cudnn as cudnn
import torch.utils.data.distributed
from PIL import Image
# from models.EDSR import EDSR
from models.Real_NET import RRDBNet
# from models.bsrgan import RRDBNet
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Real-Time Single Image and Video Super-Resolution")
    parser.add_argument("--scale", default=2, type=int, choices=[2, 3, 4, 8],
                        help="Super resolution upscale factor. (default: 2)")
    parser.add_argument("--weights", type=str, default="weights/edsr_baseline_x2-1bc95232.pt",
                        help="Path to the generator weights. (default: `weights/edsr_baseline_x2-1bc95232.pt`)")
    parser.add_argument("--batch", type=int, default=1, help="number of batch")
    parser.add_argument("--channel", type=int, default=1,
                        help="number of input channels (pass 3 for the RRDBNet built below)")
    parser.add_argument("--width", type=int, required=True, help="Input width")
    parser.add_argument("--height", type=int, required=True, help="Input height")
    parser.add_argument("--cuda", action="store_true", help="Enables cuda")

    args = parser.parse_args()

    cudnn.benchmark = True

    if torch.cuda.is_available() and not args.cuda:
        print("WARNING: You have a CUDA device, so you should probably run with --cuda")

    device = torch.device("cuda:0" if args.cuda else "cpu")

    # create model
    # model = EDSR(scale=args.scale).to(device)
    
    # Load state dicts
    # model.load_state_dict(torch.load(args.weights, map_location=device))
    #model.load_state_dict(torch.load(args.weights, map_location=device))
    
    model = RRDBNet(num_in_ch=3, num_out_ch=3,scale=4).to(device)
    # map_location lets a checkpoint saved on GPU load on a CPU-only machine
    model.load_state_dict(torch.load(args.weights, map_location=device)["params_ema"])
    
    # model = RRDBNet(in_nc=3,out_nc=3,sf=2).to(device)
    # model.load_state_dict(torch.load(args.weights))
    # Set eval mode
    model.eval()

    # dummy NHWC input, created on the same device as the model
    x = torch.randn(args.batch, args.height, args.width, args.channel, dtype=torch.float32, device=device)
    print(f'batch : {args.batch}, channel : {args.channel}, height : {args.height}, width : {args.width}')
    dynamic_ax = {'input':{1:"height",2:"width"},'output':{1:"height",2:"width"}}

    torch.onnx.export(model,               # model to export
                        x,                         # model input (a tuple or multiple inputs also work)
                        "BSRGAN.onnx",   # output path (a file or a file-like object)
                        export_params=True,        # store the trained weights inside the model file
                        opset_version=11,          # ONNX opset version to export with
                        do_constant_folding=True,  # apply constant folding as an optimization
                        input_names = ['input'],   # name of the model input
                        output_names = ['output'], # name of the model output
                        dynamic_axes=dynamic_ax
                        )

datapro · ‎11-08-2021

I don't know if it's a hint, but BSRGAN, which has a very similar model structure, produces good results.

Here is my code

import functools
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init


# def initialize_weights(net_l, scale=1):
#     if not isinstance(net_l, list):
#         net_l = [net_l]
#     for net in net_l:
#         for m in net.modules():
#             if isinstance(m, nn.Conv2d):
#                 init.kaiming_normal_(m.weight, a=0, mode='fan_in')
#                 m.weight.data *= scale  # for residual block
#                 if m.bias is not None:
#                     m.bias.data.zero_()
#             elif isinstance(m, nn.Linear):
#                 init.kaiming_normal_(m.weight, a=0, mode='fan_in')
#                 m.weight.data *= scale
#                 if m.bias is not None:
#                     m.bias.data.zero_()
#             elif isinstance(m, nn.BatchNorm2d):
#                 init.constant_(m.weight, 1)
#                 init.constant_(m.bias.data, 0.0)


def make_layer(block, n_layers):
    layers = []
    for _ in range(n_layers):
        layers.append(block())
    return nn.Sequential(*layers)


class ResidualDenseBlock_5C(nn.Module):
    def __init__(self, nf=64, gc=32, bias=True):
        super(ResidualDenseBlock_5C, self).__init__()
        # gc: growth channel, i.e. intermediate channels
        self.conv1 = nn.Conv2d(nf, gc, 3, 1, 1, bias=bias)
        self.conv2 = nn.Conv2d(nf + gc, gc, 3, 1, 1, bias=bias)
        self.conv3 = nn.Conv2d(nf + 2 * gc, gc, 3, 1, 1, bias=bias)
        self.conv4 = nn.Conv2d(nf + 3 * gc, gc, 3, 1, 1, bias=bias)
        self.conv5 = nn.Conv2d(nf + 4 * gc, nf, 3, 1, 1, bias=bias)
        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)

        # initialization
        # initialize_weights([self.conv1, self.conv2, self.conv3, self.conv4, self.conv5], 0.1)

    def forward(self, x):
        x1 = self.lrelu(self.conv1(x))
        x2 = self.lrelu(self.conv2(torch.cat((x, x1), 1)))
        x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1)))
        x4 = self.lrelu(self.conv4(torch.cat((x, x1, x2, x3), 1)))
        x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
        return x5 * 0.2 + x


class RRDB(nn.Module):
    '''Residual in Residual Dense Block'''

    def __init__(self, nf, gc=32):
        super(RRDB, self).__init__()
        self.RDB1 = ResidualDenseBlock_5C(nf, gc)
        self.RDB2 = ResidualDenseBlock_5C(nf, gc)
        self.RDB3 = ResidualDenseBlock_5C(nf, gc)

    def forward(self, x):
        out = self.RDB1(x)
        out = self.RDB2(out)
        out = self.RDB3(out)
        return out * 0.2 + x


class RRDBNet(nn.Module):
    def __init__(self, in_nc=3, out_nc=3, nf=64, nb=23, gc=32, sf=4):
        super(RRDBNet, self).__init__()
        RRDB_block_f = functools.partial(RRDB, nf=nf, gc=gc)
        self.sf = sf
        print([in_nc, out_nc, nf, nb, gc, sf])

        self.conv_first = nn.Conv2d(in_nc, nf, 3, 1, 1, bias=True)
        self.RRDB_trunk = make_layer(RRDB_block_f, nb)
        self.trunk_conv = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
        #### upsampling
        self.upconv1 = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
        if self.sf==4:
            self.upconv2 = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
        self.HRconv = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
        self.conv_last = nn.Conv2d(nf, out_nc, 3, 1, 1, bias=True)

        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)

    def forward(self, x):
        x = torch.div(x,255)
        x = x.permute(0,3,1,2)
        fea = self.conv_first(x)
        trunk = self.trunk_conv(self.RRDB_trunk(fea))
        fea = fea + trunk

        fea = self.lrelu(self.upconv1(F.interpolate(fea, scale_factor=2, mode='nearest')))
        if self.sf==4:
            fea = self.lrelu(self.upconv2(F.interpolate(fea, scale_factor=2, mode='nearest')))
        out = self.conv_last(self.lrelu(self.HRconv(fea)))
        out = out.permute(0,2,3,1)
        out = torch.mul(out,255)
        return out

The pretrained weights are BSRGAN.pth.

Peh_Intel · ‎11-11-2021

Hi datapro,

Thanks for bringing up these two models, ESRGAN and BSRGAN. Their topologies are not on the supported list. Although the two models have quite similar structures and can be converted to Intermediate Representation (IR), they have yet to be validated by the OpenVINO developers.

We do believe that these models are great for our developers to explore and validate with OpenVINO. However, I cannot comment on if or when support will be implemented.

Glad to know that the BSRGAN model produces good results; it can serve as your alternative.

Regards,

Peh

datapro · ‎11-14-2021

I'm glad it helped you. Additionally, I will share the super-resolution models that work successfully.

ECBSR https://github.com/xindongzhang/ECBSR

EDSR https://github.com/sanghyun-son/EDSR-PyTorch

I thought about why Real-ESRGAN is not working. Since BSRGAN works properly, I guess it's a precision problem.
Or I think the Model Optimizer may ignore a specific operator, depending on the model name, while reading the model.
Anyway, I hope it gets solved.
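
A rough way to explore the precision guess is to measure how much error a round trip through half precision introduces for typical values. The thread does not say whether FP16 was involved at any point, so treat that as an assumption; `to_half` and `max_half_error` are hypothetical helper names that use only the standard library.

```python
import struct


def to_half(value):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def max_half_error(values):
    """Largest absolute error introduced by storing the values as half floats."""
    return max(abs(v - to_half(v)) for v in values)


if __name__ == "__main__":
    # Normalised pixel values, as produced by the division by 255 in forward().
    pixels = [p / 255.0 for p in range(256)]
    print("max error on normalised pixels:", max_half_error(pixels))
```

If the error measured on real intermediate activations were large compared with the differences seen in the output image, that would support the precision explanation; a tiny error would make it less likely.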

Thank you.

Peh_Intel · ‎11-16-2021

Hi datapro,

Thanks for sharing additional models that produce good results. Yes, I totally agree with you that some work needs to be done for the Real-ESRGAN model to produce good results.

All these shared models are very helpful for our developers to explore and validate with OpenVINO. However, I cannot comment on their roadmap.

As such, I would like to request closure of this case, as there is nothing else I can provide for these super-resolution models at the current stage.

Regards,

Peh

Peh_Intel · ‎11-18-2021

Hi datapro,

Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.

Regards,

Peh