Relating to this post:
> .... such as filter pruning. The important advantage of this method is that it is generic and does not require special HW instructions. Currently, two filter pruning techniques are supported:
and a paper: https://arxiv.org/pdf/2002.08679.pdf
> Filter pruning ... NNCF also supports structured pruning for convolutional neural networks in the form of filter pruning.
Does this mean that, by applying structured pruning, I can get a smaller model that allows faster inference? However, when reading the instructions on NNCF pruning at https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Pruning.md, I did not find any mention of structured pruning itself, or of how much structured pruning improves inference speed... Could you comment on structured pruning? Is it available?
Hello Timosy,
Thank you for reaching out to us.
We are checking this with our development team and will get back to you when we receive feedback.
Sincerely,
Zulkifli
Hi Timosy,
Thank you for your patience.
This documentation explains filter pruning and how it can help optimize the model and speed up inference. We do not have a table or graph showing the inference speed before and after applying the compression method, since the speed varies depending on the hardware and the model.
Sincerely,
Zulkifli
Thanks for the kind additional information.
According to the "Filter Pruning" section of the page you linked, filter pruning (structured pruning) is certainly stated to be supported. That means a model can be shrunk with this method and gets a bit faster.
I will try to use and check filter pruning further.
My previous usage might have been improper.
I tested the config file below, which uses filter pruning:
nncf_config_pruning_dict = {
    "model": "testnet",
    "num_classes": classes,
    "batch_size": g_batch_size,
    "pretrained": True,
    "epochs": 100,
    "input_info": {"sample_size": [1, 3, image_size, image_size]},
    "optimizer": {
        "type": "SGD",
        "base_lr": 0.1,
        "weight_decay": 1e-4,
        "schedule_type": "multistep",
        "steps": [20, 40, 60, 80],
        "optimizer_params": {
            "momentum": 0.9,
            "nesterov": True
        }
    },
    "compression": [
        {
            "algorithm": "filter_pruning",
            "pruning_init": 0.1,
            "params": {
                "schedule": "exponential",
                "pruning_target": 0.6,
                "pruning_steps": 15,
                "filter_importance": "geometric_median"
            }
        }
    ]
}
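For reference, here is a minimal sketch (not from the original post) of how a dict like this is typically fed to NNCF's PyTorch flow; NNCFConfig.from_dict, create_compressed_model, and export_model are public NNCF calls, while `model` stands in for the user's network:

# Minimal sketch: wrap a torch model with the filter-pruning config above.
from nncf import NNCFConfig
from nncf.torch import create_compressed_model

nncf_config = NNCFConfig.from_dict(nncf_config_pruning_dict)
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# ... fine-tune `compressed_model`, calling compression_ctrl.scheduler.epoch_step()
# once per epoch so the pruning level ramps up to pruning_target ...

# The exported ONNX keeps the original tensor shapes: pruned filters are only
# zero-filled here and are physically removed later by Model Optimizer.
compression_ctrl.export_model("model_prun.onnx")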
The files I got before and after pruning look like this:
-rwxrwxrwx 1 user user 242031227 Aug 31 19:59 model_fp32.onnx
-rwxrwxrwx 1 user user 900032 Sep 12 13:26 model_fp32_layer2.bin
-rwxrwxrwx 1 user user 940 Sep 12 13:26 model_fp32_layer2.mapping
-rwxrwxrwx 1 user user 11266 Sep 12 13:26 model_fp32_layer2.xml
-rwxrwxrwx 1 user user 242031626 Sep 12 13:18 model_prun.onnx
-rwxrwxrwx 1 user user 900032 Sep 12 13:27 model_prun_layer2.bin
-rwxrwxrwx 1 user user 1012 Sep 12 13:27 model_prun_layer2.mapping
-rwxrwxrwx 1 user user 11362 Sep 12 13:27 model_prun_layer2.xml
It seems the files are not compressed... Did I make a mistake somewhere?
Hi Timosy,
We are currently investigating this issue and will get back to you with our findings.
Sincerely,
Zulkifli
Hi timosy,
Could you provide us with the model that you are trying to compress with filter pruning, so that we can understand the model's layers and the cause of the issue?
If you could also provide some details on your compression settings, that would help us investigate the issue you are facing further.
Thank you
Thanks for your kind help.
My model is actually the simple AlexNet model shown below, because what I wanted to check first was whether the filter/structured pruning function works at all with a simple model: whether the file gets compressed and whether inference gets faster.
The cut model I listed in my message above corresponds to a model truncated just after the 2nd convolution layer; it is also quite simple.
Since the maximum file size I can upload here is <100 MB, I cannot attach the model. But if AlexNet can be compressed with your filter/structured pruning configuration, that will already be great information for me. Eventually, I would like to test filter/structured pruning + quantization on AlexNet.
import torch
from torch import nn


class AlexNet_A(nn.Module):
    def __init__(self, num_classes: int = 1000, dropout: float = 0.5):
        super(AlexNet_A, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=14, stride=4, padding=0),
            nn.ReLU(inplace=True),
            nn.AvgPool2d(kernel_size=3, stride=2, ceil_mode=True),
            nn.Conv2d(96, 256, kernel_size=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(p=dropout),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=dropout),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    # forward pass (standard AlexNet flow; added here for completeness)
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
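Regarding the filter pruning + quantization combination mentioned above: in NNCF the "compression" field accepts a list of algorithms, so both can be requested at once. A minimal sketch of such a combined section, reusing the parameter values from the poster's earlier config:

"compression": [
    {
        "algorithm": "filter_pruning",
        "pruning_init": 0.1,
        "params": {
            "schedule": "exponential",
            "pruning_target": 0.6,
            "pruning_steps": 15,
            "filter_importance": "geometric_median"
        }
    },
    {
        # added for illustration: int8 quantization applied alongside pruning
        "algorithm": "quantization"
    }
]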
Hi timosy,
Thank you for the information you provided. We are still investigating this and will get back to you very soon.
Thank you
Hi @timosy,
Thank you for your interest in NNCF!
Sorry for the inconvenience; we have a gap in our documentation.
The NNCF filter pruning algorithm only puts zeros inside the convolutional and linear layer parameters of a model and does not reduce the size of the model. To actually remove channels/rows from the model, one needs to apply an additional nGraph pruning transformation. To do this, the extra argument `--transform=Pruning` should be added to the Model Optimizer command line when converting the model to IR. Ref: https://www.intel.com/content/www/us/en/developer/articles/release-notes/openvino-relnotes.html
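One way to see this zero-filling concretely (a sketch added here for illustration, not part of the original reply): load the exported ONNX file with the onnx package from the poster's environment and count convolution output channels whose weights are entirely zero; the file name model_prun.onnx is taken from the earlier post.

# Sketch: count all-zero output channels per Conv in a pruned ONNX export.
# The file size stays the same; channels are removed only by --transform=Pruning.
import numpy as np
import onnx
from onnx import numpy_helper

m = onnx.load("model_prun.onnx")
weights = {init.name: init for init in m.graph.initializer}

for node in m.graph.node:
    if node.op_type == "Conv" and node.input[1] in weights:
        w = numpy_helper.to_array(weights[node.input[1]])  # [out_ch, in_ch, kH, kW]
        zeroed = int((np.abs(w).sum(axis=(1, 2, 3)) == 0).sum())
        print(f"{node.name or node.input[1]}: {zeroed}/{w.shape[0]} channels zeroed")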
Please try this argument and reach out to us in case anything goes wrong.
Best Regards,
Daniil Liakhov, NNCF team member
Thanks for your kind comment.
I tested it, but unfortunately it does not seem to work...
The model size does not change after converting to the IR model.
The packages I'm using are:
onnx 1.11.0
onnxruntime 1.9.0
opencv-python 4.5.5.64
openvino 2022.1.0
openvino-dev 2022.1.0
openvino-telemetry 2022.1.1
The configuration and the messages printed when applying NNCF are the following:
"compression": [
{
"algorithm": "filter_pruning",
"pruning_init": 0.1,
"params": {
"schedule": "exponential",
"pruning_target": 0.8,
"pruning_steps": 15,
"filter_importance": "geometric_median"
}
}
]
INFO:nncf:Please, provide execution parameters for optimal model initialization
INFO:nncf:Wrapping module AlexNet/Sequential[features]/Conv2d[0] by AlexNet/Sequential[features]/NNCFConv2d[0]
INFO:nncf:Wrapping module AlexNet/Sequential[features]/Conv2d[3] by AlexNet/Sequential[features]/NNCFConv2d[3]
INFO:nncf:Wrapping module AlexNet/Sequential[features]/Conv2d[6] by AlexNet/Sequential[features]/NNCFConv2d[6]
INFO:nncf:Wrapping module AlexNet/Sequential[features]/Conv2d[8] by AlexNet/Sequential[features]/NNCFConv2d[8]
INFO:nncf:Wrapping module AlexNet/Sequential[features]/Conv2d[10] by AlexNet/Sequential[features]/NNCFConv2d[10]
INFO:nncf:Wrapping module AlexNet/Sequential[classifier]/Linear[1] by AlexNet/Sequential[classifier]/NNCFLinear[1]
INFO:nncf:Wrapping module AlexNet/Sequential[classifier]/Linear[4] by AlexNet/Sequential[classifier]/NNCFLinear[4]
INFO:nncf:Wrapping module AlexNet/Sequential[classifier]/Linear[6] by AlexNet/Sequential[classifier]/NNCFLinear[6]
INFO:nncf:Group of nodes [AlexNet/Sequential[features]/NNCFConv2d[0]/conv2d_0] can't be pruned, because some nodes should't be pruned, error messages for this nodes: ignored adding Weight Pruner in: AlexNet/Sequential[features]/NNCFConv2d[0]/conv2d_0 because this scope is one of the first convolutions.
INFO:nncf:Group of nodes [AlexNet/Sequential[features]/NNCFConv2d[3]/conv2d_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[features]/NNCFConv2d[6]/conv2d_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[features]/NNCFConv2d[8]/conv2d_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[features]/NNCFConv2d[10]/conv2d_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[classifier]/NNCFLinear[1]/linear_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[classifier]/NNCFLinear[4]/linear_0] will be pruned together.
INFO:nncf:Group of nodes [AlexNet/Sequential[classifier]/NNCFLinear[6]/linear_0] can't be pruned, because some nodes should't be pruned, error messages for this nodes: ignored adding Weight Pruner in: AlexNet/Sequential[classifier]/NNCFLinear[6]/linear_0 because this scope is convolution with output which directly affects model output dimensions.
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[features]/NNCFConv2d[3]/conv2d_0
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[features]/NNCFConv2d[6]/conv2d_0
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[features]/NNCFConv2d[8]/conv2d_0
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[features]/NNCFConv2d[10]/conv2d_0
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[classifier]/NNCFLinear[1]/linear_0
INFO:nncf:Adding Weight Pruner in scope: AlexNet/Sequential[classifier]/NNCFLinear[4]/linear_0
INFO:nncf:Computing filter importance scores and binary masks...
NNCF ONNX model exported.
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: model_nncf.onnx
- Path for generated IR: ./
- IR output name: model_nncf
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: 32
- Input shapes: [1, 3, 2048, 2048]
- Source layout: Not specified
- Target layout: Not specified
- Layout: Not specified
- Mean values: Not specified
- Scale values: Not specified
- Scale factor: Not specified
- Precision of IR: FP16
- Enable fusing: True
- User transformations: Pruning
- Reverse input channels: False
- Enable IR generation for fixed input shape: False
- Use the transformations config file: None
I also checked "mo --help" and found "--transform", but there is no explanation of Pruning:
--data_type {FP16,FP32,half,float}
Data type for all intermediate tensors and weights. If original
model is in FP32 and --data_type=FP16 is specified, all model
weights and biases are compressed to FP16.
--transform TRANSFORM
Apply additional transformations. Usage: "--transform transformation_name1[args],transformation_name2..."
where [args] is key=value pairs separated by semicolon. Examples: "--transform LowLatency2" or
"--transform LowLatency2[use_const_initializer=False]" or
"--transform "MakeStateful[param_res_names={'input_name_1':'output_name_1','input_name_2':'output_name_2'}]""
Available transformations: "LowLatency2", "MakeStateful"
....
Am I doing something wrong?
I got your previous message: you couldn't run the pruning transformation.
Please try this example:
# Assume you have openvino_dev==2022.1 and nncf installed
# Go to examples dir
cd nncf/examples/torch/classification
# Export pruned model
python main.py --config configs/pruning/resnet50_imagenet_accuracy_aware.json --mode export --to-onnx resnet50_pruned.onnx --cpu-only
# Convert without pruning
mo --input_model resnet50_pruned.onnx -o not_pruned
# Convert with pruning
mo --input_model resnet50_pruned.onnx --transform=Pruning -o pruned
# Check IR sizes
du -h
# my output is
# ...
# 89M ./pruned
# 98M ./not_pruned
My openvino version:
$ pip freeze | grep openvino
openvino==2022.1.0
openvino-dev==2022.1.0
openvino-telemetry==2022.1.1
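To connect this back to the original question about inference speed, the two IRs can be compared with a simple timing loop in the OpenVINO Python API. A minimal sketch, assuming the directory layout produced by the mo calls above (by default mo names the IR after the input model, i.e. pruned/resnet50_pruned.xml and not_pruned/resnet50_pruned.xml):

# Sketch: compare average CPU latency of the pruned vs. not-pruned IR.
import time
import numpy as np
from openvino.runtime import Core

core = Core()
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # ResNet-50 input shape

for tag in ("not_pruned", "pruned"):
    compiled = core.compile_model(f"{tag}/resnet50_pruned.xml", "CPU")
    req = compiled.create_infer_request()
    req.infer({0: x})  # warm-up
    start = time.perf_counter()
    for _ in range(100):
        req.infer({0: x})
    print(f"{tag}: {(time.perf_counter() - start) / 100 * 1000:.2f} ms/iter")

(openvino-dev also ships benchmark_app, which gives more rigorous numbers.)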
Thank you for the demonstration.
Finally, I could also apply pruning + quantization to my model, and I could see that the model data was compressed.
I actually had to set "prune_first_conv" to True because I am pruning a shallow layer.
I appreciate your help, and I will close this question.
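For readers who hit the same "first convolutions" message in the NNCF log above: this is roughly where that flag lives, shown as a sketch based on the poster's earlier config ("prune_first_conv" is the parameter named in the post):

"compression": [
    {
        "algorithm": "filter_pruning",
        "pruning_init": 0.1,
        "params": {
            "schedule": "exponential",
            "pruning_target": 0.8,
            "pruning_steps": 15,
            "filter_importance": "geometric_median",
            # let NNCF prune the first convolutions too,
            # which it skips by default (see the log above)
            "prune_first_conv": True
        }
    }
]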
Hi Timosy,
This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.
Sincerely,
Zulkifli